I tried threading, but many tasks failed and were not reprocessed; I suspect
the driver lost track of the threaded tasks. I had also tried Scala's Future
and parallel collections, with the same result as above.
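
For reference, a minimal sketch of the Future-based variant with explicit
error handling, so failures stay visible in the driver and can be
re-submitted (the pool size, table names, and output path below are
hypothetical; `spark` is the SparkSession from the reply below):

import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import scala.util.{Failure, Try}

// bounded pool so the driver does not flood the cluster with concurrent jobs
implicit val ec: ExecutionContext =
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4))

val tables = Seq("table1", "table2") // hypothetical table names

// launch one conversion per table, then block until all complete
val futures = tables.map { t =>
  t -> Future {
    spark.sql(s"select * from $t")
      .write.mode("overwrite").parquet(s"/archive/$t") // hypothetical path
  }
}
val results = futures.map { case (t, f) => t -> Try(Await.result(f, Duration.Inf)) }

// failed tables are collected here and can be retried in a second pass
val failed = results.collect { case (t, Failure(e)) => t -> e }
failed.foreach { case (t, e) => println(s"$t failed: ${e.getMessage}") }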

On Mon, Jul 17, 2017 at 5:56 PM, Pralabh Kumar <pralabhku...@gmail.com>
wrote:

> Run the Spark context in a multithreaded way.
>
> Something like this:
>
> val spark = SparkSession.builder()
>   .appName("practice")
>   .config("spark.scheduler.mode", "FAIR")
>   .enableHiveSupport()
>   .getOrCreate()
> val sc = spark.sparkContext
> val hc = spark.sqlContext
>
> val thread1 = new Thread {
>   override def run(): Unit = {
>     // an action (show, collect, write, ...) is needed to actually launch a job
>     hc.sql("select * from table1").show()
>   }
> }
>
> val thread2 = new Thread {
>   override def run(): Unit = {
>     hc.sql("select * from table2").show()
>   }
> }
>
> thread1.start()
> thread2.start()
>
> // wait for both queries so the driver sees their outcome
> thread1.join()
> thread2.join()
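>
> With FAIR mode enabled as above, each thread can additionally be pinned to a
> scheduler pool so the concurrent jobs share executors fairly. A minimal
> sketch (the pool name "archival" and "table3" are hypothetical;
> setLocalProperty is the standard SparkContext API, and the pool itself is
> defined in the fair-scheduler XML file):
>
> val thread3 = new Thread {
>   override def run(): Unit = {
>     // properties set here apply only to jobs submitted from this thread
>     sc.setLocalProperty("spark.scheduler.pool", "archival")
>     hc.sql("select * from table3").show()
>   }
> }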
>
>
>
> On Mon, Jul 17, 2017 at 5:42 PM, FN <nuson.fr...@gmail.com> wrote:
>
>> Hi,
>> I am currently trying to parallelize reading multiple tables from Hive. As
>> part of an archival framework, I need to convert a few hundred tables from
>> text format to Parquet. For now I am calling Spark SQL inside a for loop
>> for the conversion, but this executes sequentially and the whole process
>> takes a long time to finish.
>>
>> I tried submitting 4 separate Spark jobs (each with its own set of tables
>> to convert), which did give me some parallelism, but I would like to do
>> this in a single Spark job due to a few limitations of our cluster and
>> process.
>>
>> Any help will be greatly appreciated.
>>