Re: Saving Dstream into a single file

2015-03-23 Thread Dean Wampler
You can use the coalesce method to reduce the number of partitions. You can reduce to one if the data is not too big. Then write the output.

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition http://shop.oreilly.com/product/0636920033073.do (O'Reilly)
Typesafe http://typesafe.com
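The coalesce suggestion might look roughly like the following in PySpark. This is only a sketch under stated assumptions: the helper names and the `prefix` parameter are hypothetical, the batch `time` is assumed to arrive as a `datetime` (as PySpark's two-argument `foreachRDD` callback provides), and each batch is assumed small enough to be written by one task. Note that the result is still a directory per batch, containing a single `part-00000` file.

```python
def save_batch_as_single_file(time, rdd, prefix):
    """Write one micro-batch to its own directory holding a single part file.

    `prefix` is a hypothetical output location, e.g. "hdfs:///tmp/out/batch".
    """
    # Skip empty batches so no empty output directories are created.
    if rdd.isEmpty():
        return None
    # One directory per batch: saveAsTextFile fails if the path already exists.
    path = "%s-%s" % (prefix, time.strftime("%Y%m%d-%H%M%S-%f"))
    # coalesce(1) merges all partitions into one, so a single task
    # writes a single part file. Only sensible for small batches.
    rdd.coalesce(1).saveAsTextFile(path)
    return path


def save_stream_as_single_files(dstream, prefix):
    """Register the per-batch writer on a DStream (hypothetical helper)."""
    dstream.foreachRDD(
        lambda time, rdd: save_batch_as_single_file(time, rdd, prefix)
    )
```

Calling `save_stream_as_single_files(lines, "hdfs:///tmp/out/batch")` before `ssc.start()` would register the writer, where `lines` and `ssc` stand for a DStream and StreamingContext created elsewhere.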

Re: Saving Dstream into a single file

2015-03-16 Thread Zhan Zhang
Each RDD has multiple partitions, and each partition produces one HDFS file when the output is saved. I don’t think you are allowed to have multiple file handles writing to the same HDFS file. You can still load multiple files into Hive tables, right?

Thanks.

Zhan Zhang

On Mar 15, 2015, at 7:31
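To make the one-file-per-partition layout concrete: a saved RDD is a directory of `part-NNNNN` files, one per partition, plus a `_SUCCESS` marker. If a single file is needed afterwards, the parts can be concatenated once the job has finished. The sketch below assumes the output directory has been copied to (or was written on) a local filesystem; the function name is hypothetical. For data that stays on HDFS, `hadoop fs -getmerge` does a similar job.

```python
import glob
import os
import shutil


def merge_part_files(output_dir, merged_path):
    """Concatenate part-* files from a saved RDD directory into one file.

    Returns the number of part files merged. Local filesystem only.
    """
    # Sorting keeps partition order: part-00000, part-00001, ...
    # The "part-*" pattern skips _SUCCESS and hidden checksum files.
    parts = sorted(glob.glob(os.path.join(output_dir, "part-*")))
    with open(merged_path, "wb") as merged:
        for part in parts:
            with open(part, "rb") as src:
                # Stream the copy so large parts are not held in memory.
                shutil.copyfileobj(src, merged)
    return len(parts)
```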