Hi,
I am using Hadoop 2.5.2. My code is listed below. I also ran some further tests and found the following interesting results:
1. I hit those exceptions when I set the key class to NullWritable, LongWritable, or IntWritable and use the PairRDDFunctions.saveAsNewAPIHadoopFile API.
2. I do not hit those exceptions when I set the key class to Text or BytesWritable and use the PairRDDFunctions.saveAsNewAPIHadoopFile API.
3. I do not hit those exceptions when I use the SequenceFileRDDFunctions.saveAsSequenceFile API, no matter which class I use as the key class (see the sketch after my code below). Since SequenceFileRDDFunctions.saveAsSequenceFile calls PairRDDFunctions.saveAsHadoopFile, I suppose the PairRDDFunctions.saveAsHadoopFile API is also fine.
My code follows; it is very simple. Please help me figure out whether I did something wrong, or whether the PairRDDFunctions.saveAsNewAPIHadoopFile API has a bug. Many thanks!
******************My Code**********************
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
val sc = new SparkContext(conf)
val rdd = sc.textFile("hdfs://bgdt-dev-hrb/user/spark/tst/charset/A_utf8.txt")
// Pair every line with a NullWritable key and write the pairs out as a
// SequenceFile through the new Hadoop API.
rdd.map(s => { val value = new Text(s); (NullWritable.get(), value) })
  .saveAsNewAPIHadoopFile("hdfs://bgdt-dev-hrb/user/spark/tst/seq.output.02",
    classOf[NullWritable], classOf[Text],
    classOf[SequenceFileOutputFormat[NullWritable, Text]])
sc.stop()
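
For comparison, here is a minimal sketch of the saveAsSequenceFile call from test 3, reusing the rdd defined above. The output path is only a placeholder I made up for this example.

import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.spark.SparkContext._  // SequenceFileRDDFunctions implicits (should already be in scope on Spark 1.3+)

// Same (NullWritable, Text) pairs, but written through saveAsSequenceFile,
// which goes through the old-API saveAsHadoopFile path and did not throw for me.
rdd.map(s => (NullWritable.get(), new Text(s)))
  .saveAsSequenceFile("hdfs://bgdt-dev-hrb/user/spark/tst/seq.output.03")  // placeholder output path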
------------------ Original ------------------
From: "yuzhihong";<[email protected]>;
Send time: Sunday, May 10, 2015 10:44 PM
To: "donhoff_h"<[email protected]>;
Cc: "user"<[email protected]>;
Subject: Re: Does NullWritable can not be used in Spark?
Looking at ./core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala:
 * Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and
 * BytesWritable values that contain a serialized partition. This is still an experimental storage
...
def objectFile[T](path: String, minPartitions: Int): JavaRDD[T] = {
and ./core/src/main/scala/org/apache/spark/rdd/RDD.scala:
def saveAsTextFile(path: String): Unit = withScope {
...
// Therefore, here we provide an explicit Ordering `null` to make sure the compiler generate
// same bytecodes for `saveAsTextFile`.
Which Hadoop release are you using?
Can you show us your code so that we can have more context?
Cheers
On Sat, May 9, 2015 at 9:58 PM, donhoff_h <[email protected]> wrote:
Hi, experts.
I wrote a Spark program that writes a sequence file. I found that if I used NullWritable as the key class of the SequenceFile, the program threw exceptions, but if I used BytesWritable or Text as the key class, it did not.
Does Spark not support the NullWritable class? The Spark version I use is 1.3.0, and the exceptions are as follows:
ERROR yarn.ApplicationMaster: User class threw exception: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
at dhao.test.SeqFile.TestWriteSeqFile02$.main(TestWriteSeqFile02.scala:21)
at dhao.test.SeqFile.TestWriteSeqFile02.main(TestWriteSeqFile02.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
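
For reference, here is a minimal sketch of the Text-key variant that did not report these exceptions for me. The input and output paths are placeholders, and keying each record by the line itself is only illustrative:

import org.apache.hadoop.io.Text
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf())
val rdd = sc.textFile("hdfs://.../input.txt")  // placeholder input path
// Keying each line by itself is arbitrary; the point is that a Text (or
// BytesWritable) key class did not trigger the NoSuchMethodError.
rdd.map(s => (new Text(s), new Text(s)))
  .saveAsNewAPIHadoopFile("hdfs://.../seq.output.text",  // placeholder output path
    classOf[Text], classOf[Text],
    classOf[SequenceFileOutputFormat[Text, Text]])
sc.stop()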