[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13083529#comment-13083529
 ] 

Hudson commented on MAPREDUCE-2489:
---

Integrated in Hadoop-Common-trunk-Commit #728 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/728/])
MAPREDUCE-2489. Jobsplits with random hostnames can make the queue unusable 
(jeffrey naisbit via mahadev)

mahadev : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1156821
Files : 
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobTracker.java
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobInProgress.java


 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s-v4.patch, 
 MAPREDUCE-2489-0.20s-v5.patch, MAPREDUCE-2489-0.20s-v6.patch, 
 MAPREDUCE-2489-0.20s.patch, MAPREDUCE-2489-mapred-v2.patch, 
 MAPREDUCE-2489-mapred-v3.patch, MAPREDUCE-2489-mapred-v4.patch, 
 MAPREDUCE-2489-mapred-v5.patch, MAPREDUCE-2489-mapred-v6.patch, 
 MAPREDUCE-2489-mapred-v7.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13083809#comment-13083809
 ] 

Hudson commented on MAPREDUCE-2489:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #761 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/761/])
MAPREDUCE-2489. Jobsplits with random hostnames can make the queue unusable 
(jeffrey naisbit via mahadev)

mahadev : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1156821
Files : 
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobTracker.java
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobInProgress.java


 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s-v4.patch, 
 MAPREDUCE-2489-0.20s-v5.patch, MAPREDUCE-2489-0.20s-v6.patch, 
 MAPREDUCE-2489-0.20s.patch, MAPREDUCE-2489-mapred-v2.patch, 
 MAPREDUCE-2489-mapred-v3.patch, MAPREDUCE-2489-mapred-v4.patch, 
 MAPREDUCE-2489-mapred-v5.patch, MAPREDUCE-2489-mapred-v6.patch, 
 MAPREDUCE-2489-mapred-v7.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13083814#comment-13083814
 ] 

Hudson commented on MAPREDUCE-2489:
---

Integrated in Hadoop-Mapreduce-trunk #752 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/752/])
MAPREDUCE-2489. Jobsplits with random hostnames can make the queue unusable 
(jeffrey naisbit via mahadev)

mahadev : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1156821
Files : 
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobTracker.java
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobInProgress.java


 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s-v4.patch, 
 MAPREDUCE-2489-0.20s-v5.patch, MAPREDUCE-2489-0.20s-v6.patch, 
 MAPREDUCE-2489-0.20s.patch, MAPREDUCE-2489-mapred-v2.patch, 
 MAPREDUCE-2489-mapred-v3.patch, MAPREDUCE-2489-mapred-v4.patch, 
 MAPREDUCE-2489-mapred-v5.patch, MAPREDUCE-2489-mapred-v6.patch, 
 MAPREDUCE-2489-mapred-v7.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-08-09 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081917#comment-13081917
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2489:
-

The only test failures on trunk are unrelated to this patch and fail with a 
clean checkout of trunk as well.

trunk test-patch results (since this patch will fail until HADOOP-7499 has been 
committed):
 [exec] -1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] -1 findbugs.  The patch appears to introduce 1 new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 system test framework.  The patch failed system test 
framework compile.


The extra findbugs warning is unrelated to this bug (Incorrect lazy 
initialization of static field org.apache.hadoop.mapred.TaskLog.localFS in 
org.apache.hadoop.mapred.TaskLog.writeToIndexFile(String, boolean))

The system test framework is because HADOOP-7499 has not been committed yet.

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s-v4.patch, 
 MAPREDUCE-2489-0.20s-v5.patch, MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred-v3.patch, 
 MAPREDUCE-2489-mapred-v4.patch, MAPREDUCE-2489-mapred-v5.patch, 
 MAPREDUCE-2489-mapred-v6.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-08-06 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13080438#comment-13080438
 ] 

Mahadev konar commented on MAPREDUCE-2489:
--

+1 for 0.20S patch. I have committed that to the branch. Thanks Jeffrey. I'll 
wait for the trunk common and mapred patches before closing this jira.



 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s-v4.patch, 
 MAPREDUCE-2489-0.20s-v5.patch, MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred-v3.patch, 
 MAPREDUCE-2489-mapred-v4.patch, MAPREDUCE-2489-mapred-v5.patch, 
 MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-08-04 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13079472#comment-13079472
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2489:
-

I meant 'ant test' instead of 'mvn test'.
I was running with ant-1.8.2, which apparently caused the test failures. 
All of the tests pass with ant-1.7.1

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s-v4.patch, 
 MAPREDUCE-2489-0.20s-v5.patch, MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred-v3.patch, 
 MAPREDUCE-2489-mapred-v4.patch, MAPREDUCE-2489-mapred-v5.patch, 
 MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-08-04 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13079540#comment-13079540
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2489:
-

To clarify, all the tests pass for the 0.20s patch.  I am still looking at the 
patch for trunk.

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s-v4.patch, 
 MAPREDUCE-2489-0.20s-v5.patch, MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred-v3.patch, 
 MAPREDUCE-2489-mapred-v4.patch, MAPREDUCE-2489-mapred-v5.patch, 
 MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-08-02 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13076195#comment-13076195
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2489:
-

I agree.  It would definitely fit better in NetUtils.  I'll post new patches 
soon.


 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred-v3.patch, 
 MAPREDUCE-2489-mapred-v4.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-08-02 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13076283#comment-13076283
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2489:
-

20-security branch test-patch results:

 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s-v4.patch, 
 MAPREDUCE-2489-0.20s.patch, MAPREDUCE-2489-mapred-v2.patch, 
 MAPREDUCE-2489-mapred-v3.patch, MAPREDUCE-2489-mapred-v4.patch, 
 MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-08-01 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073838#comment-13073838
 ] 

Mahadev konar commented on MAPREDUCE-2489:
--

Jeffrey,
 One minor nit, 

 The method:

{code}

  static void verifyHostnames(String[] names) throws UnknownHostException {
{code}

does not seem appropriate for JobInProgress class. It needs to be moved out to 
some helper class. NetUtils seems more appropriate for this helper method. What 
do you think?


 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred-v3.patch, 
 MAPREDUCE-2489-mapred-v4.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-07-25 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070456#comment-13070456
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2489:
-

Mahadev,
There are two parts to this patch.  The most important is the use of the new 
resolveValidHosts function and throwing the UnknownHostException up the 
chain.  That part ensures that the hostnames are actually valid and is needed 
to resolve the problem this Jira was filed for. 

The second part (that you describe above) is meant to only be an early sanity 
check.  All it is trying to do is ensure that the hostnames appear to be 
valid.  With this, I'm just trying to check the hostname string itself to see 
that it contains valid characters and such so we can fail early on if the 
hostname string contains garbage (like in the original case this Jira was filed 
for).  The URI object seemed the easiest and most standard way I could find to 
do this.  The URI object creation will fail, or the uri.getHost() method will 
return null if the string passed in doesn't match their regular expressions for 
what a URI should look like (which includes checking the hostname portion).  
The URI object requires a schema (http://, ftp://, etc.), so I add one on if it 
doesn't already have one in the string.  Again, this is only really meant to 
catch the garbage string cases, and it's not meant to be a perfect check that 
will catch all hostname problems (those are caught by the first part of the 
patch mentioned above).

Does that make sense?  If you have other questions or suggestions, please let 
me know :) 
Thanks for looking at it!

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred-v3.patch, 
 MAPREDUCE-2489-mapred-v4.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-07-24 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070318#comment-13070318
 ] 

Mahadev konar commented on MAPREDUCE-2489:
--

Jeffrey,
 Sorry, I am a little unclear on what the patch is doing. Can you please 
specify what you are trying to achieve with the patch? The patch seems to 
create a URI with hostname and checking if its a valid URI or not? How is that 
verifying if a hostname is valid or not? 


 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.20.205.0, 0.23.0

 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred-v3.patch, 
 MAPREDUCE-2489-mapred-v4.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13065433#comment-13065433
 ] 

Hadoop QA commented on MAPREDUCE-2489:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12486481/MAPREDUCE-2489-mapred-v4.patch
  against trunk revision 1146517.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause tar ant target to fail.

-1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:


-1 contrib tests.  The patch failed contrib unit tests.

-1 system test framework.  The patch failed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/467//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/467//console

This message is automatically generated.

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s-v3.patch, MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred-v3.patch, 
 MAPREDUCE-2489-mapred-v4.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-06-23 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13053997#comment-13053997
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2489:
-

test-patch results for the 0.20s patch:
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 9 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.


 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s.patch, MAPREDUCE-2489-mapred-v2.patch, 
 MAPREDUCE-2489-mapred-v3.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-06-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054000#comment-13054000
 ] 

Hadoop QA commented on MAPREDUCE-2489:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12483623/MAPREDUCE-2489-mapred-v3.patch
  against trunk revision 1138301.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause tar ant target to fail.

-1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:


-1 contrib tests.  The patch failed contrib unit tests.

-1 system test framework.  The patch failed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/413//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/413//console

This message is automatically generated.

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s.patch, MAPREDUCE-2489-mapred-v2.patch, 
 MAPREDUCE-2489-mapred-v3.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-06-23 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054002#comment-13054002
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2489:
-

Again, this patch will fail until HADOOP-7314 has been committed.

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2489-0.20s-v2.patch, 
 MAPREDUCE-2489-0.20s.patch, MAPREDUCE-2489-mapred-v2.patch, 
 MAPREDUCE-2489-mapred-v3.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-06-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049876#comment-13049876
 ] 

Hadoop QA commented on MAPREDUCE-2489:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12482679/MAPREDUCE-2489-0.20s.patch
  against trunk revision 1136000.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/397//console

This message is automatically generated.

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-06-15 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049941#comment-13049941
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2489:
-

Test-patch results for 20-security patch:
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 6 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.205.0, 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2489-0.20s.patch, 
 MAPREDUCE-2489-mapred-v2.patch, MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-05-23 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038008#comment-13038008
 ] 

jirapos...@reviews.apache.org commented on MAPREDUCE-2489:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/776/
---

Review request for hadoop-mapreduce.


Summary
---

We saw an issue where a custom InputSplit was returning invalid hostnames 
(non-repeating) for the splits that were then causing the JobTracker to attempt 
to excessively resolve host names. This caused a major slowdown for the 
JobTracker. We should prevent invalid InputSplit hostnames from affecting 
everyone else.

I propose we implement some verification for the hostnames to try to ensure 
that we only do DNS lookups on valid hostnames (and fail otherwise). We could 
also fail the job after a certain number of failures in the resolve.

NOTE: This requires the changes in HADOOP-7314


This addresses bug MAPREDUCE-2489.
https://issues.apache.org/jira/browse/MAPREDUCE-2489


Diffs
-

  trunk/ivy.xml 1125074 
  trunk/ivy/libraries.properties 1125074 
  
trunk/src/contrib/mumak/src/java/org/apache/hadoop/mapred/SimulatorJobTracker.java
 1125074 
  trunk/src/java/org/apache/hadoop/mapred/JobInProgress.java 1125074 
  trunk/src/java/org/apache/hadoop/mapred/JobTracker.java 1125074 

Diff: https://reviews.apache.org/r/776/diff


Testing
---


Thanks,

Jeffrey



 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-05-20 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036847#comment-13036847
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2489:
-

@Allen, apologies for not being clear in the description.

Also, I expect this patch will fail in the Hudson build until HADOOP-7314 has 
been committed.

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Attachments: MAPREDUCE-2489-mapred.patch


 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-05-13 Thread Jeffrey Naisbitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13033116#comment-13033116
 ] 

Jeffrey Naisbitt commented on MAPREDUCE-2489:
-

Honestly, I'm not sure what caching was enabled at the time.  How would caching 
have helped in this case though - where we have basically tons of lookups on 
garbage hostnames?  (none of these strings are repeated)

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt

 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-05-13 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13033129#comment-13033129
 ] 

Allen Wittenauer commented on MAPREDUCE-2489:
-

Ah, ok, the non-repeating hostnames wasn't clear in the description.

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt

 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

2011-05-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032738#comment-13032738
 ] 

Allen Wittenauer commented on MAPREDUCE-2489:
-

I'm assuming no nscd or dns cache was running on the JT?

 Jobsplits with random hostnames can make the queue unusable
 ---

 Key: MAPREDUCE-2489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt

 We saw an issue where a custom InputSplit was returning invalid hostnames for 
 the splits that were then causing the JobTracker to attempt to excessively 
 resolve host names.  This caused a major slowdown for the JobTracker.  We 
 should prevent invalid InputSplit hostnames from affecting everyone else.
 I propose we implement some verification for the hostnames to try to ensure 
 that we only do DNS lookups on valid hostnames (and fail otherwise).  We 
 could also fail the job after a certain number of failures in the resolve.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira