Shark used to have the shark.map.tasks variable. Is there an equivalent for Spark SQL?
We are trying a scenario with heavily partitioned Hive tables. We end up with a UnionRDD that has a large number of partitions underneath, and hence far too many tasks: https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala#L202. Is there a good way to tell Spark SQL to coalesce these? Thanks for any pointers.
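For context, one workaround we have been experimenting with is calling `coalesce` on the RDD that comes back from the query. This is only a sketch: it assumes an existing `SparkContext` named `sc`, a placeholder table name `partitioned_table`, and a placeholder target partition count, and it requires a live Spark/Hive deployment to actually run.

```scala
import org.apache.spark.sql.hive.HiveContext

// `sc` is an existing SparkContext; `partitioned_table` is a
// placeholder for the heavily partitioned Hive table.
val hiveContext = new HiveContext(sc)
val rows = hiveContext.sql("SELECT * FROM partitioned_table")

// Collapse the many underlying Hive partitions into a smaller
// number of Spark partitions, and hence fewer tasks.
// coalesce avoids a full shuffle when reducing the count;
// the target of 64 here is an arbitrary example value.
val fewer = rows.coalesce(64)
```

This only reduces the task count after the scan RDD is built, though, so we would still prefer a way to control it at the TableReader level.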