Hi,
I run a SQL query over about 10,000 partitioned ORC files. Because of the
partitioning scheme, the files cannot be merged any further (to reduce the
total file count).
When I run hiveContext.sql(sqlText), 10K tasks are created, one per file.
Is it possible to use fewer tasks? How can I force Spark SQL to use fewer
tasks?
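For illustration, here is a sketch of the kind of thing I mean (this is only an assumption of a possible direction, using the standard DataFrame coalesce API; the target partition count of 200 and the output path are placeholders, not real values from my job). coalesce reduces the number of partitions, and hence downstream tasks, after the scan, but the initial read would presumably still launch one task per file:

```scala
// Sketch only: assumes hiveContext and sqlText are already defined
// as in the job described above.
val df = hiveContext.sql(sqlText)

// Reduce the number of partitions for downstream stages.
// 200 is an arbitrary target; the read itself is unaffected.
val fewer = df.coalesce(200)

// "/tmp/output" is a hypothetical path for illustration.
fewer.write.parquet("/tmp/output")
```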
BR,
Patcharee
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org