Hi,
You can use the FileUtil.copyMerge API and point it at the folder where
saveAsTextFile wrote the part files.
Suppose your directory is /a/b/c/:
use FileUtil.copyMerge(FileSystem of source, a/b/c, FileSystem of
destination, path to the merged file, say a/b/c.txt, true (to delete the
source directory after merging), the Hadoop Configuration, and null for the
addString argument).
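To make the semantics concrete, here is a minimal local-filesystem sketch of what Hadoop's FileUtil.copyMerge does (concatenate the part files in order, optionally delete the source directory). The function name, paths, and demo files below are hypothetical; the real API operates on HDFS via FileSystem objects:

```python
import os
import shutil
import tempfile

def copy_merge(src_dir, dst_file, delete_source=False):
    """Local sketch of FileUtil.copyMerge: concatenate every
    part file under src_dir (in sorted order) into dst_file,
    and optionally delete the source directory afterwards."""
    with open(dst_file, "wb") as out:
        for name in sorted(os.listdir(src_dir)):
            if name.startswith("part-"):
                with open(os.path.join(src_dir, name), "rb") as part:
                    shutil.copyfileobj(part, out)
    if delete_source:
        shutil.rmtree(src_dir)

# Demo with fake part files standing in for saveAsTextFile output
src = tempfile.mkdtemp()
for i, text in enumerate(["line1\n", "line2\n"]):
    with open(os.path.join(src, "part-%05d" % i), "w") as f:
        f.write(text)
dst = os.path.join(tempfile.mkdtemp(), "c.txt")

copy_merge(src, dst, delete_source=True)
print(open(dst).read())     # the two part files, concatenated
print(os.path.exists(src))  # False: source directory was deleted
```

Note that copyMerge was removed in Hadoop 3, so on newer clusters you would need to hand-roll something like the above against the FileSystem API.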
Thanks for the replies, very useful.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/save-spark-streaming-output-to-single-file-on-hdfs-tp21124p21176.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Thanks. The problem is that we'd like it to be picked up by hive.
On Tue Jan 13 2015 at 18:15:15 Davies Liu dav...@databricks.com wrote:
On Tue, Jan 13, 2015 at 10:04 AM, jamborta jambo...@gmail.com wrote:
Hi all,
Is there a way to save dstream RDDs to a single file so that another process
can pick it up as a single RDD?
It does not need to be a single file; Spark can pick up any directory as a single RDD.
Also, it's easy to union multiple RDDs.
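The "any directory as a single RDD" point can be sketched in plain Python: sc.textFile("/a/b/c") treats every part file under the directory as one logical dataset, so no physical merge is needed. The helper name and demo files below are hypothetical:

```python
import glob
import os
import tempfile

def read_dir_as_one(path):
    """Sketch of how sc.textFile(dir) treats a directory: each
    part file under it contributes its lines to one logical
    dataset, read in sorted file order."""
    lines = []
    for name in sorted(glob.glob(os.path.join(path, "part-*"))):
        with open(name) as f:
            lines.extend(f.read().splitlines())
    return lines

# Demo with fake part files
d = tempfile.mkdtemp()
with open(os.path.join(d, "part-00000"), "w") as f:
    f.write("a\nb\n")
with open(os.path.join(d, "part-00001"), "w") as f:
    f.write("c\n")
print(read_dir_as_one(d))  # all three lines as one dataset
```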
Right now, you can't. You could load each file as a partition into
Hive, or you would need to pack the files together with other tools or a
Spark job.
On Tue, Jan 13, 2015 at 10:35 AM, Tamas Jambor jambo...@gmail.com wrote:
Thanks. The problem is that we'd like it to be picked up by hive.