Gopal V created HIVE-14060:
------------------------------

             Summary: Hive: Remove bogus "localhost" from Hive splits
                 Key: HIVE-14060
                 URL: https://issues.apache.org/jira/browse/HIVE-14060
             Project: Hive
          Issue Type: Bug
          Components: Tez
    Affects Versions: 2.1.0, 2.2.0
            Reporter: Gopal V
            Assignee: Gopal V


On remote filesystems like Azure, GCP and S3, the splits contain a filler 
location of "localhost".

This is worse than having no location information at all - on large clusters 
yarn waits upto 200[1] seconds for heartbeat from "localhost" before allocating 
a container.

To speed up this process, the split affinity provider should scrub the bogus 
"localhost" from the locations and allow for the allocation of "*" containers 
instead on each heartbeat.

[1] - yarn.scheduler.capacity.node-locality-delay=40 x heartbeat of 5s



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to