Re: Hey good looking toPandas () error stack
The only change I am making is the Spark directory name. It keeps failing in this same cell, df.toPandas():

findspark.init('/home/spark-2.4.6-bin-hadoop2.7')  FAIL
findspark.init('/home/spark-3.0.0-bin-hadoop2.7')  PASS

On Sun, 21 Jun 2020, 19:51 randy clinton wrote:
> [snip]
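The FAIL/PASS pattern above matches the Java support matrix of the two Spark lines: the 2.4 line runs only on Java 8, while the 3.0 line runs on Java 8 or 11. Below is a minimal sketch of that check; the names `SUPPORTED_JAVA` and `spark_supports_java` are illustrative, not part of any Spark API, and the support table is an assumption drawn from the Spark documentation.

```python
# Sketch: why the same notebook passes under Spark 3.0.0 but fails
# under Spark 2.4.6 on a Java 11 machine. The table below is an
# assumption based on the Spark docs: 2.4.x supports Java 8 only,
# 3.0.x supports Java 8 and 11.

SUPPORTED_JAVA = {
    "2.4": {8},
    "3.0": {8, 11},
}

def spark_supports_java(spark_version: str, java_major: int) -> bool:
    """Return True if this Spark release line supports the Java major version."""
    line = ".".join(spark_version.split(".")[:2])
    return java_major in SUPPORTED_JAVA.get(line, set())

# On a machine whose default JVM is Java 11:
print(spark_supports_java("2.4.6", 11))  # False -> toPandas() fails
print(spark_supports_java("3.0.0", 11))  # True  -> toPandas() passes
```

So switching the findspark directory only worked because Spark 3.0.0 tolerates the Java 11 runtime; the alternative fix is to point JAVA_HOME at a Java 8 install and keep Spark 2.4.6.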
Re: Hey good looking toPandas () error stack
That part isn't related to Spark. It means you have some code compiled for Java 11, but are running Java 8.

On Sun, Jun 21, 2020 at 1:51 PM randy clinton wrote:
> [snip]
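The "major version 55" in the exception comes straight from the class file header that ASM is parsing: every JVM class file starts with the magic number 0xCAFEBABE followed by a minor and a major version, and since Java 1.2 the major version is the Java release plus 44 (52 = Java 8, 55 = Java 11). A small self-contained demonstration; the helper names `class_file_major` and `java_release` are made up for this sketch.

```python
import struct

# A JVM class file begins with: u4 magic (0xCAFEBABE), u2 minor, u2 major.
# Major 52 corresponds to Java 8, major 55 to Java 11 (major = release + 44).

def class_file_major(header: bytes) -> int:
    """Decode the class-file major version from the first 8 header bytes."""
    magic, minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a class file")
    return major

def java_release(major: int) -> int:
    """Map a class-file major version to its Java release number."""
    return major - 44

# A fabricated 8-byte header as it would appear in a Java 11 class file:
hdr = struct.pack(">IHH", 0xCAFEBABE, 0, 55)
print(class_file_major(hdr))  # 55
print(java_release(55))       # 11 -> compiled for Java 11
print(java_release(52))       # 8  -> compiled for Java 8
```

A Java 8 JVM only understands majors up to 52, so when Spark 2.4's bundled ASM 6 meets a major of 55 it throws exactly the IllegalArgumentException shown in the stack.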
Re: Hey good looking toPandas () error stack
You can see from the GitHub history for "toPandas()" that the function has been in the code for 5 years.

https://github.com/apache/spark/blame/a075cd5b700f88ef447b559c6411518136558d78/python/pyspark/sql/dataframe.py#L923

When I google IllegalArgumentException: 'Unsupported class file major version 55' I see posts about the Java version being used. Are you sure your configs are right?

https://stackoverflow.com/questions/53583199/pyspark-error-unsupported-class-file-major-version

On Sat, Jun 20, 2020 at 6:17 AM Anwar AliKhan wrote:
> [snip]
Re: Hey good looking toPandas () error stack
Two versions of Spark running against the same code:

https://towardsdatascience.com/your-first-apache-spark-ml-model-d2bb82b599dd

Version spark-2.4.6-bin-hadoop2.7 is producing an error for toPandas(). See the error stack below.

Jupyter Notebook:

import findspark
findspark.init('/home/spark-3.0.0-bin-hadoop2.7')

cell "spark"

cell output:

SparkSession - in-memory
SparkContext
Spark UI
Version: v3.0.0
Master: local[*]
AppName: Titanic Data

import findspark
findspark.init('/home/spark-2.4.6-bin-hadoop2.7')

cell "spark"

cell output:

SparkSession - in-memory
SparkContext
Spark UI
Version: v2.4.6
Master: local[*]
AppName: Titanic Data

cell "df.show(5)"

+-----------+--------+------+--------------------+------+---+-----+-----+----------------+-------+-----+--------+
|PassengerId|Survived|Pclass|                Name|   Sex|Age|SibSp|Parch|          Ticket|   Fare|Cabin|Embarked|
+-----------+--------+------+--------------------+------+---+-----+-----+----------------+-------+-----+--------+
|          1|       0|     3|Braund, Mr. Owen ...|  male| 22|    1|    0|       A/5 21171|   7.25| null|       S|
|          2|       1|     1|Cumings, Mrs. Joh...|female| 38|    1|    0|        PC 17599|71.2833|  C85|       C|
|          3|       1|     3|Heikkinen, Miss. ...|female| 26|    0|    0|STON/O2. 3101282|  7.925| null|       S|
|          4|       1|     1|Futrelle, Mrs. Ja...|female| 35|    1|    0|          113803|   53.1| C123|       S|
|          5|       0|     3|Allen, Mr. Willia...|  male| 35|    0|    0|          373450|   8.05| null|       S|
+-----------+--------+------+--------------------+------+---+-----+-----+----------------+-------+-----+--------+
only showing top 5 rows

cell "df.toPandas()"

cell output:

---
Py4JJavaError                             Traceback (most recent call last)

/home/spark-2.4.6-bin-hadoop2.7/python/pyspark/sql/utils.py in deco(*a, **kw)
     62         try:
---> 63             return f(*a, **kw)
     64         except py4j.protocol.Py4JJavaError as e:

/home/spark-2.4.6-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329                 else:

Py4JJavaError: An error occurred while calling o33.collectToPython.
: java.lang.IllegalArgumentException: Unsupported class file major version 55
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:166)
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:148)
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:136)
    at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:237)
    at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:50)
    at org.apache.spark.util.FieldAccessFinder$$anon$4$$anonfun$visitMethodInsn$7.apply(ClosureCleaner.scala:845)
    at org.apache.spark.util.FieldAccessFinder$$anon$4$$anonfun$visitMethodInsn$7.apply(ClosureCleaner.scala:828)
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
    at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
    at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:134)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
    at org.apache.spark.util.FieldAccessFinder$$anon$4.visitMethodInsn(ClosureCleaner.scala:828)
    at org.apache.xbean.asm6.ClassReader.readCode(ClassReader.java:2175)
    at org.apache.xbean.asm6.ClassReader.readMethod(ClassReader.java:1238)
    at org.apache.xbean.asm6.ClassReader.accept(ClassReader.java:631)
    at org.apache.xbean.asm6.ClassReader.accept(ClassReader.java:355)
    at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:272)
    at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:271)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:271)
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:163)
    at org.apache.spark.SparkContext.clean(SparkContext.scala:2326)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2100)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:990)
    at org.apache.spark.rdd.RDDOperationScope$.withS
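The traceback above shows the error crossing the Py4J bridge: the Java exception is raised on the JVM, and PySpark's `deco` wrapper in pyspark/sql/utils.py catches the Py4JJavaError and re-raises it into the notebook, which is why a Java stack trace appears in a Python cell. Below is a minimal pure-Python sketch of that translation pattern; `Py4JJavaError` here is a local stand-in class (py4j is not assumed to be installed), and re-raising as ValueError is an illustrative simplification of what PySpark actually does.

```python
# Sketch of the error-translation pattern visible in the traceback:
# a decorator catches the gateway's Java-side error and re-raises it
# as a Python exception. Py4JJavaError is a stand-in, not the real
# py4j.protocol.Py4JJavaError.

class Py4JJavaError(Exception):
    """Stand-in for py4j.protocol.Py4JJavaError."""
    def __init__(self, msg, java_exception):
        super().__init__(msg)
        self.java_exception = java_exception

def deco(f):
    def wrapper(*a, **kw):
        try:
            return f(*a, **kw)
        except Py4JJavaError as e:
            # PySpark inspects the Java exception text and surfaces it
            # to the user as a Python-level error.
            raise ValueError(f"JVM error: {e.java_exception}") from e
    return wrapper

@deco
def collect_to_python():
    # Simulate the JVM rejecting Java 11 bytecode under a Java 8 runtime.
    raise Py4JJavaError(
        "An error occurred while calling o33.collectToPython.",
        "java.lang.IllegalArgumentException: Unsupported class file major version 55",
    )

try:
    collect_to_python()
except ValueError as e:
    print(e)
```

The practical consequence: the fix is never in the Python code of the notebook cell; it is in the JVM that the Spark gateway launched.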
Re: Hey good looking toPandas ()
I got an IllegalArgumentException with 2.4.6. I then pointed my Jupyter notebook to the 3.0 version and it worked as expected, using the same .ipynb file.

I was following this machine learning example, "Your First Apache Spark ML Model" by Favio Vázquez:

https://towardsdatascience.com/your-first-apache-spark-ml-model-d2bb82b599dd

In the example he is using version 3.0, so I assumed I got the error because I am using a different version (2.4.6).

On Fri, 19 Jun 2020, 08:06 Stephen Boesch wrote:
> afaik It has been there since Spark 2.0 in 2015. Not certain about
> Spark 1.5/1.6
>
> [snip]
Re: Hey good looking toPandas ()
afaik It has been there since Spark 2.0 in 2015. Not certain about Spark 1.5/1.6.

On Thu, 18 Jun 2020 at 23:56, Anwar AliKhan wrote:
> [snip]
Hey good looking toPandas ()
I first ran the command df.show() for a sanity check of my DataFrame. I wasn't impressed with the display.

I then ran df.toPandas() in a Jupyter notebook. Now the display is really good looking.

Is toPandas() a new function which became available in Spark 3.0?