Each RDD has multiple partitions, and each partition produces one HDFS file
when the output is saved. I don't think HDFS allows multiple file handles to
write to the same file. You can still load multiple files into Hive tables,
right?
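
If you really need one file per batch, a workaround worth trying (just a
sketch on my part, not an official recipe) is to coalesce each micro-batch's
RDD down to a single partition before saving. The sketch below substitutes a
socket source for your Flume stream, and the app name, host/port, and HDFS
path are all placeholder assumptions:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Sketch: coalesce each micro-batch to one partition so that
    // saveAsTextFile writes exactly one part file per batch directory.
    // The socket source and the warehouse path are placeholders.
    object SingleFilePerBatch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("SingleFilePerBatch")
        val ssc = new StreamingContext(conf, Seconds(10))

        val counts = ssc.socketTextStream("localhost", 9999)
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _)

        counts.foreachRDD { (rdd, time) =>
          if (!rdd.isEmpty()) {
            // One partition => a single part-00000 file in this
            // batch's output directory.
            rdd.coalesce(1).saveAsTextFile(
              s"hdfs:///user/hive/warehouse/wordcount/batch-${time.milliseconds}")
          }
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }

Note that each batch still lands in its own directory. Hive reads all the
files directly under a table's location, so many part files as such are not a
problem; it is the per-batch subdirectories that you would load one at a
time, and the main cost of lots of small files is NameNode metadata pressure.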

Thanks.

Zhan Zhang

On Mar 15, 2015, at 7:31 AM, tarek_abouzeid <tarek.abouzei...@yahoo.com> wrote:

> I am running the word count example on a Flume stream and trying to save the
> output as text files in HDFS, but the save directory ends up with multiple
> sub-directories, each containing small files. Is there a way to append to one
> large file instead of saving many small ones? I intend to save the output in
> a Hive HDFS directory so I can query the result with Hive.
> 
> I hope someone has a workaround for this issue. Thanks in advance.

