Hi, I have a couple of Spark jobs that read data from Hive table partitions and
process each partition independently in separate threads on the driver. The
data volume is now in the TB range, and the jobs are no longer scaling; they
run slowly. So I am thinking of using Spark Streaming to pick up data as and
when it is added to the Hive partitions, so that I only need to process the
newly loaded partitions instead of the whole table.
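
Here is roughly what my current jobs do (a simplified sketch; the table name,
partition column, dates, and thread count are placeholders, not my real setup):

    import java.util.concurrent.Executors
    import scala.concurrent.duration.Duration
    import scala.concurrent.{Await, ExecutionContext, Future}
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object PartitionJobs {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("partition-jobs"))
        val hc = new HiveContext(sc)

        // One driver-side thread per partition; each Future submits its own
        // Spark job for that partition's data.
        implicit val ec: ExecutionContext =
          ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4))

        val dates = Seq("2015-09-28", "2015-09-29", "2015-09-30")
        val jobs = dates.map { dt =>
          Future {
            hc.sql(s"SELECT * FROM my_db.events WHERE dt = '$dt'")
              .foreachPartition(_.foreach(row => ()))  // placeholder for the real per-row work
          }
        }
        jobs.foreach(Await.ready(_, Duration.Inf))
        sc.stop()
      }
    }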

Can Spark Streaming read data directly from Hive table partitions?
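I could not find a built-in Hive source for Spark Streaming, so the closest
idea I have is watching the table's underlying warehouse directory for newly
added files, along these lines (a minimal sketch; the HDFS path, the dt=
partition layout, and text-file storage are assumptions about my setup, and I
am not sure the glob picks up files in partition directories created after the
stream starts):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object HivePartitionStream {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("hive-partition-stream")
        val ssc = new StreamingContext(conf, Seconds(60))

        // Monitor the table's warehouse directory; the glob over the dt=
        // partition directories is meant to catch files landing in any
        // partition. This path is an assumption about my cluster layout,
        // not a documented Hive integration.
        val lines = ssc.textFileStream(
          "hdfs:///user/hive/warehouse/my_db.db/events/dt=*")

        lines.foreachRDD { rdd =>
          rdd.foreach(record => ())  // placeholder for the real processing
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }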
Please guide. Also, please share best practices for processing TBs of data
generated every day. Thanks in advance.


