Hi
I am currently trying to parallelize reading multiple tables from Hive. As
part of an archival framework, I need to convert a few hundred tables that
are stored in text format to Parquet. For now I am calling Spark SQL inside
a for loop to do the conversion, but this executes sequentially and the
whole process takes a long time to finish.
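
For reference, here is a minimal sketch of what the loop looks like; the
database name, table names, and output path below are placeholders:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("TextToParquetArchive")
  .enableHiveSupport()
  .getOrCreate()

// A few hundred tables in reality; two placeholder names shown here
val tables = Seq("archive_db.table1", "archive_db.table2")

for (t <- tables) {
  // Each write blocks until it finishes, so the tables convert one at a time
  spark.sql(s"SELECT * FROM $t")
    .write
    .mode("overwrite")
    .parquet(s"/archive/parquet/$t")
}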

I tried submitting 4 different Spark jobs (each with its own set of tables
to convert), and that did give me some parallelism, but I would like to do
this in a single Spark job due to a few limitations of our cluster and
process.
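
One idea I have been considering for a single job is to submit the
conversions from multiple threads against the same SparkSession, for
example with Scala Futures. This is only a sketch; the pool size of 4 and
the output path are assumptions:

import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// Thread pool controlling how many conversions are submitted concurrently
implicit val ec: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4))

val conversions = tables.map { t =>
  Future {
    spark.sql(s"SELECT * FROM $t")
      .write
      .mode("overwrite")
      .parquet(s"/archive/parquet/$t")
  }
}

// Block the driver until every table conversion has completed
Await.result(Future.sequence(conversions), Duration.Inf)

My understanding is that SparkSession is thread-safe and that jobs submitted
from separate threads can be scheduled concurrently, but I am not sure
whether this is the right approach here.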

Any help will be greatly appreciated.
