Sahil Takiar created HIVE-19525:
-----------------------------------

             Summary: Spark task logs print PLAN PATH excessive number of times
                 Key: HIVE-19525
                 URL: https://issues.apache.org/jira/browse/HIVE-19525
             Project: Hive
          Issue Type: Sub-task
          Components: Spark
            Reporter: Sahil Takiar


The Spark task logs contain a large number of lines like this: {{Utilities - PLAN PATH = 
hdfs://localhost:59527/.../apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/6ebceb49-7a76-4159-9082-5bba44391e30/hive_2018-05-14_07-28-44_672_8205774950452575544-1/-mr-10006/bf14c0b5-a014-4ee8-8ddf-fdb7453eb0f0/map.xml}}

It seems to be printed multiple times per task exception. I'm not sure where it 
is coming from, but it's too verbose and should be changed to DEBUG level. 
Furthermore, given that we use {{Utilities#getBaseWork}} any time we need to 
access a {{MapWork}} or {{ReduceWork}} object, we should make the method 
slightly more efficient. Right now it borrows a {{Kryo}} from a pool and does a 
bunch of work to set the classloader, and only then checks the cache to see 
whether the work object has already been created. It should check the cache 
before doing any of that.
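
A rough sketch of the proposed ordering, for illustration only. This is not the actual {{Utilities#getBaseWork}} code: the class {{PlanCacheSketch}}, its fields, and the {{deserialize}} placeholder are all hypothetical stand-ins, and the real method's {{Kryo}} pool and classloader handling are reduced to a counter. The only point is that the cache lookup happens before any of the expensive setup.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the lookup logic; not Hive's actual code.
class PlanCacheSketch {
    // Already-deserialized work objects, keyed by plan path.
    private final Map<String, Object> workCache = new ConcurrentHashMap<>();
    // Counts how often the expensive path (serializer borrow, classloader setup) runs.
    private final AtomicInteger expensiveSetups = new AtomicInteger();

    Object getBaseWork(String planPath) {
        // Check the cache first, so a hit skips the pool and classloader work.
        Object cached = workCache.get(planPath);
        if (cached != null) {
            return cached;
        }
        // Cache miss: only now pay for the serializer and deserialization.
        expensiveSetups.incrementAndGet();
        Object work = deserialize(planPath);
        // If another thread populated the entry meanwhile, return its object.
        Object raced = workCache.putIfAbsent(planPath, work);
        return raced != null ? raced : work;
    }

    // Placeholder for borrowing a serializer, setting the classloader,
    // and reading the plan file (assumed, not shown here).
    private Object deserialize(String planPath) {
        return "work for " + planPath;
    }

    int expensiveSetupCount() {
        return expensiveSetups.get();
    }

    public static void main(String[] args) {
        PlanCacheSketch sketch = new PlanCacheSketch();
        Object first = sketch.getBaseWork("map.xml");
        Object second = sketch.getBaseWork("map.xml");
        // The second call is served from the cache.
        System.out.println(first == second);
        System.out.println(sketch.expensiveSetupCount());
    }
}
```

With this ordering, repeated lookups of the same plan path pay the setup cost only once. Whether the real cache can safely be consulted before the classloader is set would need to be checked against the actual code.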



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)