6 at 4:17 PM
To: Andrew Davidson <a...@santacruzintegration.com>
Cc: "user @spark" <user@spark.apache.org>
Subject: Re: Saving Spark streaming RDD with saveAsTextFiles ends up
creating empty files on HDFS
> I agree every time an OS file is created, it requires a context
> From: Mich Talebzadeh <mich.talebza...@gmail.com>
> Date: Tuesday, April 5, 2016 at 3:49 PM
> To: Andrew Davidson <a...@santacruzintegration.com>
> Cc: "user @spark" <user@spark.apache.org>
> Subject: Re: Saving Spark streaming RDD with saveAsTextF
n.com>
Cc: "user @spark" <user@spark.apache.org>
Subject: Re: Saving Spark streaming RDD with saveAsTextFiles ends up
creating empty files on HDFS
> Thanks Andy.
>
> Do we know if this is a known bug or simply a feature that on the face of it
> Spark cannot save RDD o
>
> rdd = rdd.repartition(tmp.intValue());
>
> return rdd;
>
> }
>
> });
>
>
>
> }
>
>
>
> From: Mich Talebzadeh <mich.talebza...@gmail.com>
> Date: Tuesday, April 5, 2016 at 3:06 PM

From: Mich Talebzadeh <mich.talebza...@gmail.com>
Date: Tuesday, April 5, 2016 at 3:06 PM
To: "user @spark" <user@spark.apache.org>
Subject: Saving Spark streaming RDD with saveAsTextFiles ends up creating
empty files on HDFS
Spark 1.6.1
The following creates empty files, although the lines print fine with println:

  val result = lines.filter(_.contains("ASE 15"))
    .flatMap(line => line.split("\n,"))
    .map(word => (word, 1))
    .reduceByKey(_ + _)
  result.saveAsTextFiles("/tmp/rdd_stuff")

I am getting zero-length files:
drwxr-xr-x -
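
One way to rule out the transformation chain itself as the source of the empty output is to run the same logic on a plain Scala collection, outside Spark. The sketch below is only an illustration under stated assumptions: the sample lines and the names SplitCheck and wordCounts are hypothetical, and groupBy plus sum stands in for reduceByKey, which exists only on pair DStreams/RDDs. It also shows that split("\n,") treats its argument as a regular expression matching a newline followed by a comma, so an ordinary single line comes back unsplit.

```scala
object SplitCheck {
  // Hypothetical sample input; not taken from the original thread.
  val lines: Seq[String] = Seq(
    "ASE 15 server started",
    "unrelated line",
    "ASE 15 server started"
  )

  // Same chain as in the post, applied to a local Seq.
  // groupBy + sum stands in for Spark's reduceByKey(_ + _).
  def wordCounts(input: Seq[String]): Map[String, Int] =
    input
      .filter(_.contains("ASE 15"))
      // The pattern is newline followed by comma; a line without that
      // sequence is returned whole as a single-element array.
      .flatMap(line => line.split("\n,"))
      .map(word => (word, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit =
    wordCounts(lines).foreach(println)
}
```

If the local run produces counts for the matching lines, the chain is producing records whenever matching input arrives. That would point the investigation at how and when each batch is written rather than at the filter or split, though it does not by itself explain the zero-length files.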