Gopal V created HIVE-14060:
------------------------------

             Summary: Hive: Remove bogus "localhost" from Hive splits
                 Key: HIVE-14060
                 URL: https://issues.apache.org/jira/browse/HIVE-14060
             Project: Hive
          Issue Type: Bug
          Components: Tez
    Affects Versions: 2.1.0, 2.2.0
            Reporter: Gopal V
            Assignee: Gopal V

On remote filesystems such as Azure, GCP and S3, the splits contain a filler location of "localhost". This is worse than having no location information at all: on large clusters, YARN waits up to 200 seconds [1] for a heartbeat from "localhost" before allocating a container.

To speed up this process, the split affinity provider should scrub the bogus "localhost" from the locations, so that "*" containers can be allocated on each heartbeat instead.

[1] - yarn.scheduler.capacity.node-locality-delay=40 x heartbeat of 5s

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
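
For illustration, the scrubbing step proposed in this report could be sketched roughly as below. This is not the actual Hive patch: the class name SplitLocationScrubber and the method scrub are hypothetical, and the sketch assumes that a split with an empty location list is what lets the container request fall through to "*". The final lines only reproduce the arithmetic from footnote [1].

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not the code committed for HIVE-14060.
public class SplitLocationScrubber {

    // Filler host that remote filesystems report instead of a real node.
    private static final String FILLER_HOST = "localhost";

    // Returns the locations with every "localhost" entry removed.
    // An empty result is assumed to mean "no locality preference".
    static String[] scrub(String[] locations) {
        if (locations == null) {
            return new String[0];
        }
        List<String> kept = new ArrayList<>();
        for (String host : locations) {
            if (host != null && !FILLER_HOST.equalsIgnoreCase(host.trim())) {
                kept.add(host);
            }
        }
        return kept.toArray(new String[0]);
    }

    // Worst-case wait from footnote [1]: missed scheduling opportunities x heartbeat interval.
    static int worstCaseDelaySeconds(int nodeLocalityDelay, int heartbeatSeconds) {
        return nodeLocalityDelay * heartbeatSeconds;
    }

    public static void main(String[] args) {
        // A split from a remote filesystem carries only the filler host.
        String[] remote = scrub(new String[] {"localhost"});
        // A split with a real host keeps it (host name is made up).
        String[] mixed = scrub(new String[] {"node1.example.com", "localhost"});

        System.out.println("remote split locations: " + remote.length);
        System.out.println("mixed split locations: " + String.join(",", mixed));
        System.out.println("worst-case delay (s): " + worstCaseDelaySeconds(40, 5));
    }
}
```

Matching on the literal host name is the simplest possible rule. A real implementation might also need to decide what to do when a cluster node is genuinely named "localhost", which this sketch does not attempt.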