Right now, you can't. You could load each file as a partition into Hive, or pack the files together with another tool or a separate Spark job.
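As a sketch of the "pack the files together" approach: `saveAsTextFiles` writes one directory per batch, each containing `part-*` files, so a small script can concatenate them into one file Hive can read. This is plain stdlib Python for a local or mounted filesystem; the paths are hypothetical.

```python
import glob
import os

def merge_stream_output(stream_dir, merged_path):
    """Concatenate the part-* files from every batch directory
    (as written by DStream.saveAsTextFiles) into a single file."""
    # Each streaming batch lands in its own subdirectory of stream_dir.
    part_files = sorted(glob.glob(os.path.join(stream_dir, "*", "part-*")))
    with open(merged_path, "w") as out:
        for part in part_files:
            with open(part) as src:
                out.write(src.read())
    return len(part_files)

# Hypothetical paths; adjust to your actual output location:
# merge_stream_output("/data/stream-output", "/data/merged/output.txt")
```

On HDFS proper, `hdfs dfs -getmerge <dir> <localfile>` gives the same effect without a custom script.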
On Tue, Jan 13, 2015 at 10:35 AM, Tamas Jambor <jambo...@gmail.com> wrote:
> Thanks. The problem is that we'd like it to be picked up by Hive.
>
> On Tue Jan 13 2015 at 18:15:15 Davies Liu <dav...@databricks.com> wrote:
>>
>> On Tue, Jan 13, 2015 at 10:04 AM, jamborta <jambo...@gmail.com> wrote:
>> > Hi all,
>> >
>> > Is there a way to save DStream RDDs to a single file so that another
>> > process can pick it up as a single RDD?
>>
>> It does not need to be a single file; Spark can pick up any directory as
>> a single RDD.
>>
>> Also, it's easy to union multiple RDDs into a single one.
>>
>> > It seems that each slice is saved to a separate folder when using the
>> > saveAsTextFiles method.
>> >
>> > I'm using Spark 1.2 with PySpark.
>> >
>> > Thanks,
>> >
>> > --
>> > View this message in context:
>> > http://apache-spark-user-list.1001560.n3.nabble.com/save-spark-streaming-output-to-single-file-on-hdfs-tp21124.html
>> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> > For additional commands, e-mail: user-h...@spark.apache.org
>> > ---------------------------------------------------------------------