[ https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033116#comment-13033116 ]
Jeffrey Naisbitt commented on MAPREDUCE-2489:
---------------------------------------------

Honestly, I'm not sure what caching was enabled at the time. How would caching have helped in this case though - where we have basically tons of lookups on garbage hostnames? (none of these strings are repeated)

> Jobsplits with random hostnames can make the queue unusable
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-2489
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>            Reporter: Jeffrey Naisbitt
>            Assignee: Jeffrey Naisbitt
>
> We saw an issue where a custom InputSplit was returning invalid hostnames for
> the splits that were then causing the JobTracker to attempt to excessively
> resolve host names. This caused a major slowdown for the JobTracker. We
> should prevent invalid InputSplit hostnames from affecting everyone else.
> I propose we implement some verification for the hostnames to try to ensure
> that we only do DNS lookups on valid hostnames (and fail otherwise). We
> could also fail the job after a certain number of failures in the resolve.
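
For illustration only, the verification proposed in the description could look roughly like the sketch below. It is not taken from any patch on this issue: the class SplitHostValidator, its methods, and the maxFailures threshold are hypothetical names, and the syntax rule is the usual RFC 1123 label check (a trailing dot is rejected for simplicity). Note that a random but well-formed name would still pass the syntax check, so the failure counter is written so that a caller could also charge failed DNS resolutions against the same limit.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Hadoop: screens split hostnames before any DNS lookup. */
final class SplitHostValidator {
    // One DNS label: 1-63 chars, letters/digits/hyphens, no leading or trailing hyphen.
    private static final Pattern LABEL =
        Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    private final int maxFailures; // assumed per-job limit
    private int failures = 0;

    SplitHostValidator(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    /** Purely syntactic check; performs no DNS lookup. */
    static boolean isPlausibleHostname(String host) {
        if (host == null || host.isEmpty() || host.length() > 253) {
            return false;
        }
        // Limit -1 keeps empty labels, so "a..b" and "a." are rejected.
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }

    /** Counts one failure (bad syntax or failed lookup); throws once the limit is exceeded. */
    void recordFailure(String host) {
        failures++;
        if (failures > maxFailures) {
            throw new IllegalArgumentException(
                "Too many bad split hostnames (" + failures + "), last: " + host);
        }
    }

    /** Returns true if the host may be passed on to DNS resolution. */
    boolean accept(String host) {
        if (isPlausibleHostname(host)) {
            return true;
        }
        recordFailure(host);
        return false;
    }
}
```

A caller would run accept() on each split location and skip resolution when it returns false; on an UnknownHostException from the resolver it could call recordFailure() as well, so that a job full of garbage hostnames fails early instead of tying up the JobTracker.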