In my opinion, this ultimately becomes a resource balance issue that you'll
need to test.
You have a fixed amount of memory (although you haven't said what it is). As
you increase the number of tasks, the available memory per task will decrease.
If the tasks run out of memory, they will either page to disk or fail.
If they fail, you'll be forced to increase the memory per task until you find
how much each task needs to succeed; that per-task requirement then becomes the
limiting factor on how many tasks you can run concurrently.
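To make the trade-off concrete, here is a back-of-the-envelope sketch. All of
the numbers (per-node YARN memory, container size) are hypothetical and not
from this thread; only the 40-node cluster size comes from the question below.

```shell
#!/bin/sh
# Hypothetical capacity math: how container size caps concurrent tasks.
NODE_MEM_MB=57344   # assumed YARN memory per node (yarn.nodemanager.resource.memory-mb)
CONTAINER_MB=4096   # assumed memory requested per task container
NODES=40            # cluster size from the original question

PER_NODE=$((NODE_MEM_MB / CONTAINER_MB))
TOTAL=$((PER_NODE * NODES))
echo "containers per node: $PER_NODE"   # 14
echo "cluster-wide tasks:  $TOTAL"      # 560

# Halving the container size doubles the task count -- until tasks start
# failing with OOM, which is exactly the limiting factor described above.
echo "with 2048 MB containers: $(( (NODE_MEM_MB / 2048) * NODES ))"  # 1120
```

The point of the arithmetic: more tasks only helps if each task can still
succeed within its smaller memory share.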
Hive is generally good about falling back rather than failing outright, but
performance may suffer.
But given the situation you describe, I'd generally recommend increasing the
number of tasks and monitoring to ensure that actually improves the performance.
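For reference, the knobs that usually control task count under Tez are split
grouping (mappers) and bytes-per-reducer (reducers). A sketch of the relevant
settings follows; the values are illustrative assumptions, not recommendations,
and should be tuned against your own workload:

```sql
-- Illustrative values only; tune against your own data sizes.
-- Smaller containers = more concurrent tasks per node (higher VCore use),
-- but each task gets less memory.
SET hive.tez.container.size=4096;

-- Mapper parallelism: Tez groups input splits; a lower max group size
-- produces more, smaller map tasks.
SET tez.grouping.min-size=16777216;    -- 16 MB
SET tez.grouping.max-size=134217728;   -- 128 MB

-- Reducer parallelism: fewer bytes per reducer = more reducers.
SET hive.exec.reducers.bytes.per.reducer=134217728;

-- Let Tez shrink the reducer count at runtime if it over-provisioned.
SET hive.tez.auto.reducer.parallelism=true;
```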
-----Original Message-----
From: mahender bigdata [mailto:mahender.bigd...@outlook.com]
Sent: Monday, March 28, 2016 6:04 PM
To: user@hive.apache.org
Subject: make best use of VCore in Hive
Hi,
Currently we are joining 2-3 big tables, plus a couple of left joins, on a
40-node cluster. During query execution we can see that memory is fully
utilized (100%), which is perfect, but fewer than 50% of the VCores are in
use. Is there a way to increase core usage to almost 90-100%? Does setting
the number of mapper and reducer tasks under Tez execution effectively
increase VCore usage? Any guidance on maximizing VCore utilization would be
appreciated.