Shark used to have a shark.map.tasks variable. Is there an equivalent in
Spark SQL?

We are testing a scenario with heavily partitioned Hive tables. We end up
with a UnionRDD with a large number of partitions underneath, and hence too
many tasks:
https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala#L202

Is there a good way to tell Spark SQL to coalesce these?
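
(For illustration, here is a rough sketch of what coalescing the result by
hand would look like; this assumes the HiveContext API from around Spark
1.0, and the app name, table name, and partition count are just
placeholders, not our actual job.)

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.hive.HiveContext

  // Rough sketch only -- app name, table name, and partition count are placeholders.
  val sc = new SparkContext(new SparkConf().setAppName("coalesce-sketch"))
  val hive = new HiveContext(sc)

  // Scanning a heavily partitioned Hive table yields a UnionRDD with roughly
  // one partition (and so one task) per Hive partition.
  val rows = hive.hql("SELECT * FROM some_partitioned_table")

  // Coalescing the resulting RDD by hand cuts the task count down,
  // but it has to be repeated for every query.
  val coalesced = rows.coalesce(200)
  coalesced.count()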

Thanks for any pointers.
