Please take a look at JavaPairDStream.scala:

  def saveAsHadoopFiles[F <: OutputFormat[_, _]](
      prefix: String,
      suffix: String,
      keyClass: Class[_],
      valueClass: Class[_],
      outputFormatClass: Class[F]) {
Did you intend to use outputPath as prefix?

Cheers

On Fri, Aug 14, 2015 at 1:36 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:

> Spark 1.3
>
> Code:
>
> wordCounts.foreachRDD(new Function2<JavaPairRDD<String, Integer>, Time, Void>() {
>   @Override
>   public Void call(JavaPairRDD<String, Integer> rdd, Time time) throws IOException {
>     String counts = "Counts at time " + time + " " + rdd.collect();
>     System.out.println(counts);
>     System.out.println("Appending to " + outputFile.getAbsolutePath());
>     Files.append(counts + "\n", outputFile, Charset.defaultCharset());
>     return null;
>   }
> });
>
> wordCounts.saveAsHadoopFiles(outputPath, "txt", Text.class, Text.class,
> TextOutputFormat.class);
>
> What do I need to check in the namenode? I see 0-byte entries like this:
>
> drwxr-xr-x - ec2-user supergroup 0 2015-08-13 15:45 /tmp/out-1439495124000.txt
> drwxr-xr-x - ec2-user supergroup 0 2015-08-13 15:45 /tmp/out-1439495125000.txt
> drwxr-xr-x - ec2-user supergroup 0 2015-08-13 15:45 /tmp/out-1439495126000.txt
> drwxr-xr-x - ec2-user supergroup 0 2015-08-13 15:45 /tmp/out-1439495127000.txt
> drwxr-xr-x - ec2-user supergroup 0 2015-08-13 15:45 /tmp/out-1439495128000.txt
>
> However, I also wrote the data to a file on the local file system for
> verification, and there I do see the data:
>
> $ ls -ltr !$
> ls -ltr /tmp/out
> -rw-r--r-- 1 yarn yarn 5230 Aug 13 15:45 /tmp/out
>
> On Fri, Aug 14, 2015 at 6:15 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Which Spark release are you using?
>>
>> Can you show us a snippet of your code?
>>
>> Have you checked the namenode log?
>>
>> Thanks
>>
>> On Aug 13, 2015, at 10:21 PM, Mohit Anchlia <mohitanch...@gmail.com>
>> wrote:
>>
>> I was able to get this working by using an alternative method; however, I
>> only see 0-byte files in Hadoop. I've verified that the output does exist
>> in the logs, but it's missing from HDFS.
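[Editor's note: a hedged sketch, not from the thread. The quoted code saves a JavaPairDStream<String, Integer> while declaring Text.class for both key and value, which won't match the RDD's actual types; the usual pattern is to map the pairs to Writable types first. Also note the `/tmp/out-*.txt` entries above are directories (`drwxr-xr-x`): each batch's output lands in per-partition `part-*` files inside them, so the `0` size on the directory itself is expected. The names `wordCounts` and `outputPath` are reused from the thread.]

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import scala.Tuple2;

// Convert the String/Integer pairs to Hadoop Writables so the key/value
// classes passed to saveAsHadoopFiles match the RDD contents.
JavaPairDStream<Text, IntWritable> writables = wordCounts.mapToPair(
    t -> new Tuple2<>(new Text(t._1()), new IntWritable(t._2())));

// One output directory per batch: <outputPath>-<batchTime>.txt/part-*
writables.saveAsHadoopFiles(outputPath, "txt",
    Text.class, IntWritable.class, TextOutputFormat.class);
```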
>>
>> On Thu, Aug 13, 2015 at 10:49 AM, Mohit Anchlia <mohitanch...@gmail.com>
>> wrote:
>>
>>> I have this call trying to save to HDFS 2.6:
>>>
>>> wordCounts.saveAsNewAPIHadoopFiles("prefix", "txt");
>>>
>>> but I am getting the following:
>>>
>>> java.lang.RuntimeException: class scala.runtime.Nothing$ not
>>> org.apache.hadoop.mapreduce.OutputFormat
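[Editor's note: a hedged sketch of the likely fix, not from the thread. With the two-argument overload, nothing pins down the OutputFormat type parameter, so Scala infers it as Nothing, and at runtime `scala.runtime.Nothing$` fails the OutputFormat check quoted above. Passing the classes explicitly avoids that; for the new-API method, TextOutputFormat must come from the `org.apache.hadoop.mapreduce.lib.output` package, not `org.apache.hadoop.mapred`. `wordCounts` is the thread's name; the key/value classes shown assume the pairs have already been mapped to those Writable types.]

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
// New-API (mapreduce) output format, required by saveAsNewAPIHadoopFiles:
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Supplying the classes explicitly prevents the type parameter from
// being inferred as Nothing.
wordCounts.saveAsNewAPIHadoopFiles("prefix", "txt",
    Text.class, IntWritable.class, TextOutputFormat.class);
```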