[ https://issues.apache.org/jira/browse/SPARK-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251507#comment-14251507 ]
Shixiong Zhu commented on SPARK-2075:
-------------------------------------

I dug deeper and found something odd:

If I compile with `mvn -Dhadoop.version=1.2.1 -DskipTests clean package -pl core -am`, `saveAsTextFile` is:

{noformat}
public void saveAsTextFile(java.lang.String);
  Code:
   0: aload_0
   1: new #1577; //class org/apache/spark/rdd/RDD$$anonfun$27
   4: dup
   5: aload_0
   6: invokespecial #1578; //Method org/apache/spark/rdd/RDD$$anonfun$27."<init>":(Lorg/apache/spark/rdd/RDD;)V
   9: getstatic #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   12: ldc_w #441; //class scala/Tuple2
   15: invokevirtual #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   18: invokevirtual #447; //Method map:(Lscala/Function1;Lscala/reflect/ClassTag;)Lorg/apache/spark/rdd/RDD;
   21: astore_2
   22: getstatic #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   25: ldc_w #1580; //class org/apache/hadoop/io/NullWritable
   28: invokevirtual #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   31: astore_3
   32: getstatic #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   35: ldc_w #1582; //class org/apache/hadoop/io/Text
   38: invokevirtual #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   41: astore 4
   43: getstatic #21; //Field org/apache/spark/rdd/RDD$.MODULE$:Lorg/apache/spark/rdd/RDD$;
   46: aload_2
   47: invokevirtual #23; //Method org/apache/spark/rdd/RDD$.rddToPairRDDFunctions$default$4:(Lorg/apache/spark/rdd/RDD;)Lscala/runtime/Null$;
   50: astore 5
   52: getstatic #21; //Field org/apache/spark/rdd/RDD$.MODULE$:Lorg/apache/spark/rdd/RDD$;
   55: aload_2
   56: aload_3
   57: aload 4
   59: aload 5
   61: pop
   62: aconst_null
   63: invokevirtual #47; //Method org/apache/spark/rdd/RDD$.rddToPairRDDFunctions:(Lorg/apache/spark/rdd/RDD;Lscala/reflect/ClassTag;Lscala/reflect/ClassTag;Lscala/math/Ordering;)Lorg/apache/spark/rdd/PairRDDFunctions;
   66: aload_1
   67: getstatic #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   70: ldc_w #1584; //class org/apache/hadoop/mapred/TextOutputFormat
   73: invokevirtual #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   76: invokevirtual #1588; //Method org/apache/spark/rdd/PairRDDFunctions.saveAsHadoopFile:(Ljava/lang/String;Lscala/reflect/ClassTag;)V
   79: return
{noformat}

If I compile with `mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package -pl core -am`, `saveAsTextFile` is different:

{noformat}
public void saveAsTextFile(java.lang.String);
  Code:
   0: getstatic #21; //Field org/apache/spark/rdd/RDD$.MODULE$:Lorg/apache/spark/rdd/RDD$;
   3: aload_0
   4: new #1577; //class org/apache/spark/rdd/RDD$$anonfun$saveAsTextFile$1
   7: dup
   8: aload_0
   9: invokespecial #1578; //Method org/apache/spark/rdd/RDD$$anonfun$saveAsTextFile$1."<init>":(Lorg/apache/spark/rdd/RDD;)V
   12: getstatic #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   15: ldc_w #441; //class scala/Tuple2
   18: invokevirtual #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   21: invokevirtual #447; //Method map:(Lscala/Function1;Lscala/reflect/ClassTag;)Lorg/apache/spark/rdd/RDD;
   24: getstatic #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   27: ldc_w #1580; //class org/apache/hadoop/io/NullWritable
   30: invokevirtual #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   33: getstatic #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   36: ldc_w #1582; //class org/apache/hadoop/io/Text
   39: invokevirtual #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   42: getstatic #1587; //Field scala/math/Ordering$.MODULE$:Lscala/math/Ordering$;
   45: getstatic #471; //Field scala/Predef$.MODULE$:Lscala/Predef$;
   48: invokevirtual #1591; //Method scala/Predef$.conforms:()Lscala/Predef$$less$colon$less;
   51: invokevirtual #1595; //Method scala/math/Ordering$.ordered:(Lscala/Function1;)Lscala/math/Ordering;
   54: invokevirtual #47; //Method org/apache/spark/rdd/RDD$.rddToPairRDDFunctions:(Lorg/apache/spark/rdd/RDD;Lscala/reflect/ClassTag;Lscala/reflect/ClassTag;Lscala/math/Ordering;)Lorg/apache/spark/rdd/PairRDDFunctions;
   57: aload_1
   58: getstatic #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   61: ldc_w #1597; //class org/apache/hadoop/mapred/TextOutputFormat
   64: invokevirtual #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   67: invokevirtual #1601; //Method org/apache/spark/rdd/PairRDDFunctions.saveAsHadoopFile:(Ljava/lang/String;Lscala/reflect/ClassTag;)V
   70: return
{noformat}

Note: in the Hadoop 1.2.1 build, `saveAsTextFile` uses the default `Ordering` value `null`, while in the Hadoop 2.2.0 build, `saveAsTextFile` uses `Ordering.ordered` to create a new `Ordering`.

> Anonymous classes are missing from Spark distribution
> -----------------------------------------------------
>
> Key: SPARK-2075
> URL: https://issues.apache.org/jira/browse/SPARK-2075
> Project: Spark
> Issue Type: Bug
> Components: Build, Spark Core
> Affects Versions: 1.0.0
> Reporter: Paul R. Brown
> Priority: Critical
>
> Running a job built against the Maven dep for 1.0.0 and the hadoop1 distribution produces:
> {code}
> java.lang.ClassNotFoundException: org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1
> {code}
> Here's what's in the Maven dep as of 1.0.0:
> {code}
> jar tvf ~/.m2/repository/org/apache/spark/spark-core_2.10/1.0.0/spark-core_2.10-1.0.0.jar | grep 'rdd/RDD' | grep 'saveAs'
> 1519 Mon May 26 13:57:58 PDT 2014 org/apache/spark/rdd/RDD$anonfun$saveAsTextFile$1.class
> 1560 Mon May 26 13:57:58 PDT 2014 org/apache/spark/rdd/RDD$anonfun$saveAsTextFile$2.class
> {code}
> And here's what's in the hadoop1 distribution:
> {code}
> jar tvf spark-assembly-1.0.0-hadoop1.0.4.jar| grep 'rdd/RDD' | grep 'saveAs'
> {code}
> I.e., it's not there. It is in the hadoop2 distribution:
> {code}
> jar tvf spark-assembly-1.0.0-hadoop2.2.0.jar| grep 'rdd/RDD' | grep 'saveAs'
> 1519 Mon May 26 07:29:54 PDT 2014 org/apache/spark/rdd/RDD$anonfun$saveAsTextFile$1.class
> 1560 Mon May 26 07:29:54 PDT 2014 org/apache/spark/rdd/RDD$anonfun$saveAsTextFile$2.class
> {code}
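
For readers following the `Ordering` note in the comment above, here is a minimal Scala sketch of how an implicit parameter with a `null` default can resolve differently depending on what the compiler knows about the key type. The bytecode shows that the fourth parameter of `rddToPairRDDFunctions` is an `Ordering` with a `null` default. Everything else here is an assumption for illustration: `OrderingDefaultSketch`, `RawKey`, `ComparableKey`, and `pickOrdering` are hypothetical names, not Spark or Hadoop code. The idea that the two builds differ because the key type is visible as `Comparable` in one and not the other is a guess that the comment itself does not establish.

```scala
object OrderingDefaultSketch {
  // Hypothetical key types, not Hadoop classes.
  // RawKey gives the compiler no way to derive an Ordering.
  final class RawKey

  // ComparableKey implements Comparable of itself, so the implicit
  // `Ordering.ordered` in the Scala standard library applies to it.
  final class ComparableKey extends Comparable[ComparableKey] {
    override def compareTo(other: ComparableKey): Int = 0
  }

  // Hypothetical helper that mirrors only the shape of the last parameter
  // seen in the bytecode: an implicit Ordering whose default is null.
  def pickOrdering[K](key: K)(implicit ord: Ordering[K] = null): Option[Ordering[K]] =
    Option(ord) // None when the null default was used

  def main(args: Array[String]): Unit = {
    // No implicit Ordering[RawKey] exists, so the default null is passed.
    println(pickOrdering(new RawKey).isDefined) // false
    // An Ordering is derived from Comparable, so the default is not used.
    println(pickOrdering(new ComparableKey).isDefined) // true
  }
}
```

The two call sites look identical in source, yet the compiler emits different argument-building code for each, which is the same kind of divergence visible between the two `javap` listings.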