Hi,
I am using Hadoop 2.5.2. My code is listed below. I also ran some further tests and found the following interesting results:
1. I hit those exceptions when I set the key class to NullWritable, LongWritable, or IntWritable and use the PairRDDFunctions.saveAsNewAPIHadoopFile API.
2. I do not hit those exceptions when I set the key class to Text or BytesWritable and use the PairRDDFunctions.saveAsNewAPIHadoopFile API.
3. I do not hit those exceptions when I use the SequenceFileRDDFunctions.saveAsSequenceFile API, no matter which class I use as the key class (see the sketch after my code below). Since SequenceFileRDDFunctions.saveAsSequenceFile calls PairRDDFunctions.saveAsHadoopFile, I suppose the PairRDDFunctions.saveAsHadoopFile API is also fine.
My code follows; it is very simple. Please help me figure out whether I did something wrong, or whether the PairRDDFunctions.saveAsNewAPIHadoopFile API has a bug. Many thanks!
******************My Code**********************
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
val sc = new SparkContext(conf)
val rdd = sc.textFile("hdfs://bgdt-dev-hrb/user/spark/tst/charset/A_utf8.txt")
// Pair every line with a NullWritable key and write the pairs out as a
// SequenceFile through the new Hadoop API.
rdd.map(s => { val value = new Text(s); (NullWritable.get(), value) })
  .saveAsNewAPIHadoopFile("hdfs://bgdt-dev-hrb/user/spark/tst/seq.output.02",
    classOf[NullWritable], classOf[Text],
    classOf[SequenceFileOutputFormat[NullWritable, Text]])
sc.stop()
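
For comparison, here is a minimal sketch of the saveAsSequenceFile call from test 3, reusing the rdd defined above. The output path is only a placeholder I made up for this example.

import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.spark.SparkContext._  // SequenceFileRDDFunctions implicits (should already be in scope on Spark 1.3+)

// Same (NullWritable, Text) pairs, but written through saveAsSequenceFile,
// which goes through the old-API saveAsHadoopFile path and did not throw for me.
rdd.map(s => (NullWritable.get(), new Text(s)))
  .saveAsSequenceFile("hdfs://bgdt-dev-hrb/user/spark/tst/seq.output.03")  // placeholder output path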
------------------ Original ------------------
From: "yuzhihong";<[email protected]>;
Send time: Sunday, May 10, 2015 10:44 PM
To: "donhoff_h"<[email protected]>;
Cc: "user"<[email protected]>;
Subject: Re: Does NullWritable can not be used in Spark?
Looking at ./core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala:
 * Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and
 * BytesWritable values that contain a serialized partition. This is still an experimental storage
...
def objectFile[T](path: String, minPartitions: Int): JavaRDD[T] = {
and ./core/src/main/scala/org/apache/spark/rdd/RDD.scala:
def saveAsTextFile(path: String): Unit = withScope {
...
// Therefore, here we provide an explicit Ordering `null` to make sure the compiler generate
// same bytecodes for `saveAsTextFile`.
Which Hadoop release are you using?
Can you show us your code so that we can have more context?
Cheers
On Sat, May 9, 2015 at 9:58 PM, donhoff_h <[email protected]> wrote:
Hi, experts.
I wrote a Spark program that writes a sequence file. I found that if I used NullWritable as the key class of the SequenceFile, the program threw exceptions, but if I used BytesWritable or Text as the key class, it did not.
Does Spark not support the NullWritable class? The Spark version I use is 1.3.0, and the exceptions are as follows:
ERROR yarn.ApplicationMaster: User class threw exception: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
at dhao.test.SeqFile.TestWriteSeqFile02$.main(TestWriteSeqFile02.scala:21)
at dhao.test.SeqFile.TestWriteSeqFile02.main(TestWriteSeqFile02.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
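
For reference, here is a minimal sketch of the Text-key variant that did not report these exceptions for me. The input and output paths are placeholders, and keying each record by the line itself is only illustrative:

import org.apache.hadoop.io.Text
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf())
val rdd = sc.textFile("hdfs://.../input.txt")  // placeholder input path
// Keying each line by itself is arbitrary; the point is that a Text (or
// BytesWritable) key class did not trigger the NoSuchMethodError.
rdd.map(s => (new Text(s), new Text(s)))
  .saveAsNewAPIHadoopFile("hdfs://.../seq.output.text",  // placeholder output path
    classOf[Text], classOf[Text],
    classOf[SequenceFileOutputFormat[Text, Text]])
sc.stop()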