Re: Hey good looking toPandas () error stack

2020-06-21 Thread Anwar AliKhan
The only change I am making is the Spark directory name.
It keeps failing in the same cell: df.toPandas()


findspark.init('/home/spark-2.4.6-bin-hadoop2.7')   FAIL

findspark.init('/home/spark-3.0.0-bin-hadoop2.7')   PASS
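
Since 2.4.6 fails where 3.0.0 passes on identical code, the likely difference is the Java runtime each build tolerates: Spark 2.4.x requires Java 8, while Spark 3.0.0 also supports Java 11. A minimal sketch of pinning JAVA_HOME before initialising the 2.4.6 build (the Java 8 path below is hypothetical; adjust it for your machine):

```python
import os

# Hypothetical Java 8 install path -- adjust to wherever Java 8 lives
# on your machine (e.g. check with: update-alternatives --list java).
java8_home = "/usr/lib/jvm/java-8-openjdk-amd64"

# Spark 2.4.x must run on Java 8, so JAVA_HOME and PATH must point at it
# *before* findspark/pyspark launch the JVM.
os.environ["JAVA_HOME"] = java8_home
os.environ["PATH"] = os.path.join(java8_home, "bin") + os.pathsep + os.environ["PATH"]

# import findspark
# findspark.init('/home/spark-2.4.6-bin-hadoop2.7')  # now starts on Java 8
```

The key detail is ordering: once the JVM gateway is up, changing JAVA_HOME has no effect, so the environment must be set before the first Spark import.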





On Sun, 21 Jun 2020, 19:51 randy clinton,  wrote:

> You can see from the GitHub history for "toPandas()" that the function has
> been in the code for 5 years.
>
> https://github.com/apache/spark/blame/a075cd5b700f88ef447b559c6411518136558d78/python/pyspark/sql/dataframe.py#L923
>
> When I google IllegalArgumentException: 'Unsupported class file major
> version 55'
>
> I see posts about the Java version being used. Are you sure your configs
> are right?
>
> https://stackoverflow.com/questions/53583199/pyspark-error-unsupported-class-file-major-version
>
> On Sat, Jun 20, 2020 at 6:17 AM Anwar AliKhan
> wrote:
>
>> [quoted message trimmed; the full original post appears below in the thread]

Re: Hey good looking toPandas () error stack

2020-06-21 Thread Sean Owen
That part isn't related to Spark. It means you have some code compiled for
Java 11, but are running Java 8.
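
For reference, the "major version" in that message is a number in the header of a compiled .class file, and it maps to a Java release as major − 44: 52 is Java 8, 55 is Java 11. A small sketch (not part of Spark) that reads it directly:

```python
import struct

def class_file_java_release(path):
    """Return the Java release a compiled .class file targets.

    A class file starts with the 4-byte magic 0xCAFEBABE, followed by a
    2-byte minor and a 2-byte major version, all big-endian. The major
    version maps to a Java release as major - 44: 52 -> Java 8,
    55 -> Java 11.
    """
    with open(path, "rb") as f:
        magic, minor, major = struct.unpack(">IHH", f.read(8))
    if magic != 0xCAFEBABE:
        raise ValueError(f"{path} is not a .class file")
    return major - 44
```

So "Unsupported class file major version 55" means a class compiled for Java 11 was handed to bytecode tooling (here, Spark 2.4's bundled ASM 6) that only understands class files up to Java 8's format.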

On Sun, Jun 21, 2020 at 1:51 PM randy clinton 
wrote:

> [quoted message trimmed; the full original post appears below in the thread]

Re: Hey good looking toPandas () error stack

2020-06-21 Thread randy clinton
You can see from the GitHub history for "toPandas()" that the function has
been in the code for 5 years.
https://github.com/apache/spark/blame/a075cd5b700f88ef447b559c6411518136558d78/python/pyspark/sql/dataframe.py#L923

When I google IllegalArgumentException: 'Unsupported class file major
version 55'

I see posts about the Java version being used. Are you sure your configs
are right?
https://stackoverflow.com/questions/53583199/pyspark-error-unsupported-class-file-major-version
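
One quick check along those lines: the format of the `java.version` property changed between Java 8 ("1.8.0_252") and Java 9+ ("11.0.7"), which trips up naive parsing. A small hypothetical helper for normalising it:

```python
def java_major(version_string: str) -> int:
    """Extract the major Java release from a `java.version` string.

    Java 8 and earlier report '1.<major>...' (e.g. '1.8.0_252'),
    while Java 9+ report '<major>...' (e.g. '11.0.7' or '14-ea').
    """
    parts = version_string.split(".")
    if parts[0] == "1":
        return int(parts[1])
    return int(parts[0].split("-")[0])
```

You can obtain the string from a running session with `spark.sparkContext._jvm.java.lang.System.getProperty("java.version")` (an internal attribute, but handy for debugging), or from the shell with `java -version`.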

On Sat, Jun 20, 2020 at 6:17 AM Anwar AliKhan 
wrote:

> [quoted message trimmed; the full original post appears below in the thread]

Re: Hey good looking toPandas () error stack

2020-06-20 Thread Anwar AliKhan
Two versions of Spark running against the same code:

https://towardsdatascience.com/your-first-apache-spark-ml-model-d2bb82b599dd

Version spark-2.4.6-bin-hadoop2.7 produces an error for toPandas(); see the
error stack below.

Jupyter Notebook

import findspark

findspark.init('/home/spark-3.0.0-bin-hadoop2.7')

cell "spark"

cell output

SparkSession - in-memory

SparkContext

Spark UI

Version

v3.0.0

Master

local[*]

AppName

Titanic Data


import findspark

findspark.init('/home/spark-2.4.6-bin-hadoop2.7')

cell  "spark"



cell output

SparkSession - in-memory

SparkContext

Spark UI

Version

v2.4.6

Master

local[*]

AppName

Titanic Data

cell "df.show(5)"

+-----------+--------+------+--------------------+------+---+-----+-----+----------------+-------+-----+--------+
|PassengerId|Survived|Pclass|                Name|   Sex|Age|SibSp|Parch|          Ticket|   Fare|Cabin|Embarked|
+-----------+--------+------+--------------------+------+---+-----+-----+----------------+-------+-----+--------+
|          1|       0|     3|Braund, Mr. Owen ...|  male| 22|    1|    0|       A/5 21171|   7.25| null|       S|
|          2|       1|     1|Cumings, Mrs. Joh...|female| 38|    1|    0|        PC 17599|71.2833|  C85|       C|
|          3|       1|     3|Heikkinen, Miss. ...|female| 26|    0|    0|STON/O2. 3101282|  7.925| null|       S|
|          4|       1|     1|Futrelle, Mrs. Ja...|female| 35|    1|    0|          113803|   53.1| C123|       S|
|          5|       0|     3|Allen, Mr. Willia...|  male| 35|    0|    0|          373450|   8.05| null|       S|
+-----------+--------+------+--------------------+------+---+-----+-----+----------------+-------+-----+--------+

only showing top 5 rows

cell "df.toPandas()"

cell output

---

Py4JJavaError Traceback (most recent call last)

/home/spark-2.4.6-bin-hadoop2.7/python/pyspark/sql/utils.py in deco(*a,
**kw)

 62 try:

---> 63 return f(*a, **kw)

 64 except py4j.protocol.Py4JJavaError as e:

/home/spark-2.4.6-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py
in get_return_value(answer, gateway_client, target_id, name)

327 "An error occurred while calling {0}{1}{2}.\n".

--> 328 format(target_id, ".", name), value)

329 else:

Py4JJavaError: An error occurred while calling o33.collectToPython.

: java.lang.IllegalArgumentException: Unsupported class file major version
55

at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:166)

at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:148)

at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:136)

at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:237)

at
org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:50)

at
org.apache.spark.util.FieldAccessFinder$$anon$4$$anonfun$visitMethodInsn$7.apply(ClosureCleaner.scala:845)

at
org.apache.spark.util.FieldAccessFinder$$anon$4$$anonfun$visitMethodInsn$7.apply(ClosureCleaner.scala:828)

at
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)

at
scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)

at
scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)

at
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)

at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)

at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:134)

at
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)

at
org.apache.spark.util.FieldAccessFinder$$anon$4.visitMethodInsn(ClosureCleaner.scala:828)

at org.apache.xbean.asm6.ClassReader.readCode(ClassReader.java:2175)

at org.apache.xbean.asm6.ClassReader.readMethod(ClassReader.java:1238)

at org.apache.xbean.asm6.ClassReader.accept(ClassReader.java:631)

at org.apache.xbean.asm6.ClassReader.accept(ClassReader.java:355)

at
org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:272)

at
org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:271)

at scala.collection.immutable.List.foreach(List.scala:392)

at
org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:271)

at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:163)

at org.apache.spark.SparkContext.clean(SparkContext.scala:2326)

at org.apache.spark.SparkContext.runJob(SparkContext.scala:2100)

at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)

at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:990)

at