I was trying to benchmark some Hive queries using the Tez execution
engine. I varied the values of the following properties:

   1. hive.tez.container.size
   2. tez.task.resource.memory.mb
   3. tez.task.resource.cpu.vcores

Changes to the value of property 1 are reflected properly. However, it
seems that Hive does not respect changes to the value of property 3; it
always allocates one vcore per requested container (the RM is configured to
use the DominantResourceCalculator). This got me thinking about the
precedence of property values in Hive and Tez.
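
A minimal sketch of how such settings could be varied per query, assuming
all three are set through the 'set' command at the top of the .hql file.
The memory values are the ones used in the questions below; the vcore
value is purely illustrative:

```sql
-- Property 1: Hive-side container size in MB (changes are reflected)
set hive.tez.container.size=2048;

-- Property 2: Tez-side task memory in MB
set tez.task.resource.memory.mb=1024;

-- Property 3: Tez-side task vcores (illustrative value; the observed
-- allocation stays at one vcore per container regardless)
set tez.task.resource.cpu.vcores=2;
```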

I have the following questions about these configurations:

   1. Does Hive respect the values set for properties 2 and 3 at all?

   2. If I set property 1 to, say, 2048 MB and property 2 to, say, 1024 MB,
      does this mean that I am wasting about a GB of memory for each
      spawned container?

   3. Is there a property in Hive, similar to property 1, that allows me to
      use the 'set' command in the .hql file to specify the number of
      vcores to use per container?

   4. Changes in the value of the property tez.am.resource.cpu.vcores are
      reflected at runtime. However, I do not observe the same behaviour
      with property 3. Are there other configurations that take precedence
      over it?

Your inputs and suggestions would be highly appreciated.

Thanks!


PS: Tests were conducted on a 5-node cluster running HDP 2.3.0.
