Hi there,
I have run into a problem that makes Hive on Spark perform very poorly.
I'm using Spark 1.6.2 and Hive 2.1.0, and I specified
spark.shuffle.service.enabled true
spark.dynamicAllocation.enabled true
in my spark-default.conf file (the file is in both the Spark and Hive conf
folders) so that Spark jobs acquire executors dynamically.
The configuration works correctly when I run Spark jobs, but when I use Hive on
Spark, it starts only a few executors even though there are more than enough
cores and memory to start more.
For example, for the same SQL query, Spark SQL starts more than 20 executors,
but Hive on Spark starts only 3.
How can I improve the performance of Hive on Spark? Any suggestions would be
appreciated.
Thanks,
Minghao Feng