[jira] Updated: (MAPREDUCE-408) TestKillSubProcesses fails with assertion failure sometimes

2009-07-13 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-408:
---

Attachment: MR-408.v1.patch

Attaching a new patch that cleans up the test case code, per Vinod's offline 
comments.

> TestKillSubProcesses fails with assertion failure sometimes
> ---
>
> Key: MAPREDUCE-408
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-408
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Amareshwari Sriramadasu
>Assignee: Ravi Gummadi
> Attachments: MR-408.patch, MR-408.v1.patch
>
>
> org.apache.hadoop.mapred.TestKillSubProcesses.testJobKillFailAndSucceed fails 
> sometimes with the following error message:
> {noformat}
> Unexpected: The subprocess at level 3 in the subtree is not alive before Job 
> completion
> {noformat}
> Stacktrace
> {noformat}
> junit.framework.AssertionFailedError: Unexpected: The subprocess at level 3 
> in the subtree is not alive before Job completion
>   at 
> org.apache.hadoop.mapred.TestKillSubProcesses.runJobAndSetProcessHandle(TestKillSubProcesses.java:221)
>   at 
> org.apache.hadoop.mapred.TestKillSubProcesses.runFailingJobAndValidate(TestKillSubProcesses.java:112)
>   at 
> org.apache.hadoop.mapred.TestKillSubProcesses.runTests(TestKillSubProcesses.java:327)
>   at 
> org.apache.hadoop.mapred.TestKillSubProcesses.testJobKillFailAndSucceed(TestKillSubProcesses.java:310)
> {noformat}
> One such failure is at 
> http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/495/testReport/org.apache.hadoop.mapred/TestKillSubProcesses/testJobKillFailAndSucceed/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-757) JobConf will not be deleted from the logs folder if job retires from finalizeJob()

2009-07-13 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-757:
-

Attachment: MAPREDUCE-757-v1.0.patch

Attaching a patch that factors out the job-expiry code. The test cases are 
changed so that they run faster. Result of test-patch:
[exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 6 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.

Testing and running ant tests now.

> JobConf will not be deleted from the logs folder if job retires from 
> finalizeJob()
> --
>
> Key: MAPREDUCE-757
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-757
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Attachments: MAPREDUCE-757-v1.0.patch
>
>
> MAPREDUCE-130 fixed the case where the job is retired from the retire jobs 
> thread. But jobs can also retire when the num-job-per-user limit is exceeded. 
> In such cases the conf file will not be deleted.




[jira] Created: (MAPREDUCE-758) JobInProgressListener events might be garbled

2009-07-13 Thread Amar Kamat (JIRA)
JobInProgressListener events might be garbled
-

 Key: MAPREDUCE-758
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-758
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Amar Kamat


Consider the following scenario:
# EagerTaskInitializer calls jobtracker.initJob(job1).
# initJob() snapshots the job run-state, which is PREP.
# Before initJob() issues job1.initTask(), the user issues a kill and the job 
moves to the KILLED state. The jobtracker notifies the listener of the 
PREP->KILLED event.
# initJob() now issues job1.initTask(), which returns normally.
# initJob() snapshots the job state again; it is now KILLED.
# The jobtracker notifies the listener of a second PREP->KILLED event, which is 
incorrect.
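
The interleaving above can be replayed on a single thread. This is a minimal sketch, assuming a simplified state model; ListenerRaceSketch, RunState and replayScenario are hypothetical names and none of this is JobTracker code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the scenario; not the JobTracker code.
class ListenerRaceSketch {
    enum RunState { PREP, RUNNING, KILLED }

    /** Replays the six steps in order and returns the events the listener sees. */
    static List<String> replayScenario() {
        List<String> events = new ArrayList<>();
        RunState current = RunState.PREP;

        // Steps 1-2: initJob() snapshots the state before initializing tasks.
        RunState before = current;

        // Step 3: the user's kill lands first and is reported to the listener.
        RunState previous = current;
        current = RunState.KILLED;
        events.add(previous + "->" + current);

        // Step 4: initTask() runs and returns without changing the state.

        // Step 5: initJob() snapshots the state again.
        RunState after = current;

        // Step 6: the two snapshots differ, so a second, stale event is sent.
        if (before != after) {
            events.add(before + "->" + after);
        }
        return events;
    }
}
```

The listener ends up with the same PREP->KILLED transition twice, because the second notification is derived from a snapshot taken before the kill.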




[jira] Updated: (MAPREDUCE-677) TestNodeRefresh timesout

2009-07-13 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-677:


Attachment: TEST-org.apache.hadoop.mapred.TestNodeRefresh.txt

Test log showing a timeout

> TestNodeRefresh timesout
> 
>
> Key: MAPREDUCE-677
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-677
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-677-v1.0.patch, 
> MAPREDUCE-677-v1.1-branch-0.20.patch, MAPREDUCE-677-v1.1.patch, 
> TEST-org.apache.hadoop.mapred.TestNodeRefresh.txt
>
>





[jira] Reopened: (MAPREDUCE-677) TestNodeRefresh timesout

2009-07-13 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan reopened MAPREDUCE-677:
-


I am still seeing a timeout with this test.

> TestNodeRefresh timesout
> 
>
> Key: MAPREDUCE-677
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-677
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-677-v1.0.patch, 
> MAPREDUCE-677-v1.1-branch-0.20.patch, MAPREDUCE-677-v1.1.patch, 
> TEST-org.apache.hadoop.mapred.TestNodeRefresh.txt
>
>





[jira] Commented: (MAPREDUCE-750) Extensible ConnManager factory API

2009-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730711#action_12730711
 ] 

Hadoop QA commented on MAPREDUCE-750:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12413182/MAPREDUCE-750.patch
  against trunk revision 793457.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 316 release audit warnings 
(more than the trunk's current 315 warnings).

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/383/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/383/artifact/trunk/current/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/383/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/383/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/383/console

This message is automatically generated.

> Extensible ConnManager factory API
> --
>
> Key: MAPREDUCE-750
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-750
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-750.patch
>
>
> Sqoop uses the ConnFactory class to instantiate a ConnManager implementation 
> based on the connect string and other arguments supplied by the user. This 
> allows per-database logic to be encapsulated in different ConnManager 
> instances, and dynamically chosen based on which database the user is 
> actually importing from. But adding new ConnManager implementations requires 
> modifying the source of a common ConnFactory class. An indirection layer 
> should be used to delegate instantiation to a number of factory 
> implementations which can be specified in the static configuration or at 
> runtime.
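
A minimal sketch of the indirection layer described above, assuming each factory either accepts a connect string or declines it by returning null. All type and method names here (ManagerFactorySketch, ConnFactorySketch, forPrefix and so on) are hypothetical and are not taken from MAPREDUCE-750.patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a per-database connection manager.
interface ConnManagerSketch {
    String name();
}

// Hypothetical pluggable factory: returns a manager, or null to decline.
interface ManagerFactorySketch {
    ConnManagerSketch accept(String connectString);
}

class ConnFactorySketch {
    private final List<ManagerFactorySketch> factories = new ArrayList<>();

    /** Factories could be registered from static configuration or at runtime. */
    void register(ManagerFactorySketch factory) {
        factories.add(factory);
    }

    /** Asks each registered factory in order; the first one that accepts wins. */
    ConnManagerSketch getManager(String connectString) {
        for (ManagerFactorySketch factory : factories) {
            ConnManagerSketch manager = factory.accept(connectString);
            if (manager != null) {
                return manager;
            }
        }
        throw new IllegalArgumentException(
            "No manager for connect string: " + connectString);
    }

    /** Helper for the example: a factory keyed on a connect-string prefix. */
    static ManagerFactorySketch forPrefix(final String prefix, final String managerName) {
        return connectString -> {
            if (!connectString.startsWith(prefix)) {
                return null;
            }
            return () -> managerName;
        };
    }
}
```

With this shape, supporting a new database means registering another factory rather than editing the common ConnFactory class. Registration order matters, so more specific factories would go first.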




[jira] Updated: (MAPREDUCE-465) Deprecate org.apache.hadoop.mapred.lib.MultithreadedMapRunner

2009-07-13 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-465:
--

Attachment: patch-465-0.20.txt

Patch for 0.20.1 without the deprecation.
TestMultithreadedMapper passed on 0.20.1 as well.

> Deprecate org.apache.hadoop.mapred.lib.MultithreadedMapRunner
> -
>
> Key: MAPREDUCE-465
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-465
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: patch-465-0.20.txt, patch-465.txt, patch-6023.txt
>
>
> Deprecate org.apache.hadoop.mapred.lib.MultithreadedMapRunner to use 
> org.apache.hadoop.mapreduce.lib.MultithreadedMapRunner 




[jira] Commented: (MAPREDUCE-353) Allow shuffle read and connection timeouts to be configurable

2009-07-13 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730697#action_12730697
 ] 

Ravi Gummadi commented on MAPREDUCE-353:


ant test-patch gave

 [exec] -1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] -1 tests included.  The patch doesn't appear to include any new 
or modified tests.
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.

No test case is added. I tested manually by setting small values for these 
timeouts and observed the SocketTimeoutException.

Unit tests passed on my local machine.

> Allow shuffle read and connection timeouts to be configurable
> -
>
> Key: MAPREDUCE-353
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-353
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Ravi Gummadi
> Fix For: 0.21.0
>
> Attachments: MR-353.patch, MR-353.v1.patch
>
>
> It would be good for latency-sensitive applications to tune the shuffle 
> read/connection timeouts... in fact this made a huge difference to terasort 
> since we were seeing individual shuffles stuck for upwards of 60s and had to 
> have a very small read timeout.




[jira] Updated: (MAPREDUCE-565) Partitioner does not work with new API

2009-07-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-565:


Status: Patch Available  (was: Open)

All unit tests passed, save TestStreamingExitStatus, which is known-bad 
(MAPREDUCE-587).

> Partitioner does not work with new API
> --
>
> Key: MAPREDUCE-565
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-565
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Reporter: Jothi Padmanabhan
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.20.1
>
> Attachments: h5750.patch, h5750.patch, h5750.patch, h5750.patch, 
> h5750.patch
>
>
>  Partitioner does not work with the new API. MapTask.java looks for 
> "mapred.partitioner.class" whereas the new API sets it to 
> mapreduce.partitioner.class
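
To make the key mismatch concrete, here is a minimal sketch using java.util.Properties as a stand-in for the job configuration. The two key names come from the description above; the class, the method names and the tolerant lookup are hypothetical illustrations and are not the fix in h5750.patch.

```java
import java.util.Properties;

// Illustration only; Properties stands in for the real job configuration.
class PartitionerKeySketch {
    static final String OLD_KEY = "mapred.partitioner.class";
    static final String NEW_KEY = "mapreduce.partitioner.class";

    /** Reads only the old key, which is the behavior the issue reports. */
    static String lookupOldOnly(Properties conf, String defaultClass) {
        return conf.getProperty(OLD_KEY, defaultClass);
    }

    /** Hypothetical tolerant lookup: prefer the new key, fall back to the old. */
    static String lookupEither(Properties conf, String defaultClass) {
        String value = conf.getProperty(NEW_KEY);
        return value != null ? value : conf.getProperty(OLD_KEY, defaultClass);
    }
}
```

When a job sets only the new key, lookupOldOnly silently returns the default, so the user's partitioner is never used.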




[jira] Resolved: (MAPREDUCE-681) Some testcases wait forever on a condition which might result into timeouts

2009-07-13 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat resolved MAPREDUCE-681.
--

Resolution: Duplicate

Incorporated in MAPREDUCE-757

> Some testcases wait forever on a condition which might result into timeouts
> ---
>
> Key: MAPREDUCE-681
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-681
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: test
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Attachments: MAPREDUCE-681-v1.0.patch
>
>
> The MAPREDUCE-502 and MAPREDUCE-130 test cases should be changed to fail 
> instead of timing out upon failure.
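
One common way to get a failure instead of a hang is to bound every wait with a deadline. A minimal sketch, assuming the test can express its condition as a boolean check; BoundedWaitSketch and waitFor are hypothetical names and are not taken from MAPREDUCE-681-v1.0.patch.

```java
import java.util.function.BooleanSupplier;

// Hypothetical test helper; not code from the attached patch.
class BoundedWaitSketch {
    /**
     * Polls the condition until it holds or timeoutMs elapses.
     * Returns whether the condition held, so the caller can fail the test
     * explicitly instead of waiting forever.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // deadline passed without the condition holding
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

A test would then assert on the return value with a descriptive message, which reports what was being waited for rather than a generic test timeout.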




[jira] Created: (MAPREDUCE-757) JobConf will not be deleted from the logs folder if job retires from finalizeJob()

2009-07-13 Thread Amar Kamat (JIRA)
JobConf will not be deleted from the logs folder if job retires from 
finalizeJob()
--

 Key: MAPREDUCE-757
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-757
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Amar Kamat
Assignee: Amar Kamat


MAPREDUCE-130 fixed the case where the job is retired from the retire jobs 
thread. But jobs can also retire when the num-job-per-user limit is exceeded. 
In such cases the conf file will not be deleted.




[jira] Updated: (MAPREDUCE-656) Change org.apache.hadoop.mapred.SequenceFile* classes to use new api

2009-07-13 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-656:
--

Attachment: patch-656-1.txt

Removed an unnecessary change from the earlier patch

> Change org.apache.hadoop.mapred.SequenceFile* classes to use new api
> 
>
> Key: MAPREDUCE-656
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-656
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Attachments: patch-656-1.txt, patch-656.txt
>
>





[jira] Updated: (MAPREDUCE-656) Change org.apache.hadoop.mapred.SequenceFile* classes to use new api

2009-07-13 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-656:
--

Status: Patch Available  (was: Open)

> Change org.apache.hadoop.mapred.SequenceFile* classes to use new api
> 
>
> Key: MAPREDUCE-656
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-656
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Attachments: patch-656-1.txt, patch-656.txt
>
>





[jira] Updated: (MAPREDUCE-353) Allow shuffle read and connection timeouts to be configurable

2009-07-13 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-353:
---

Fix Version/s: 0.21.0
 Assignee: Ravi Gummadi  (was: Arun C Murthy)
Affects Version/s: 0.21.0
 Release Note: This patch adds the expert-level config properties 
mapred.shuffle.connect.timeout and mapred.shuffle.read.timeout, which are to be 
used at cluster level.
   Status: Patch Available  (was: Open)
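
A minimal sketch of how two such properties might be read and applied to a connection. The property names come from the release note; the millisecond unit, the fallback values and the class itself are assumptions for illustration and are not taken from MR-353.v1.patch.

```java
import java.net.URLConnection;
import java.util.Properties;

// Illustrative only; Properties stands in for the cluster configuration.
class ShuffleTimeoutSketch {
    // Property names as given in the release note.
    static final String CONNECT_KEY = "mapred.shuffle.connect.timeout";
    static final String READ_KEY = "mapred.shuffle.read.timeout";

    // Assumed fallback values in milliseconds; the real defaults may differ.
    static final int ASSUMED_DEFAULT_CONNECT_MS = 180000;
    static final int ASSUMED_DEFAULT_READ_MS = 180000;

    /** Returns the configured timeout, or fallbackMs when the key is unset. */
    static int timeoutMs(Properties conf, String key, int fallbackMs) {
        String raw = conf.getProperty(key);
        return raw == null ? fallbackMs : Integer.parseInt(raw.trim());
    }

    /** Applies both timeouts to a connection that has not been opened yet. */
    static void applyTimeouts(URLConnection conn, Properties conf) {
        conn.setConnectTimeout(timeoutMs(conf, CONNECT_KEY, ASSUMED_DEFAULT_CONNECT_MS));
        conn.setReadTimeout(timeoutMs(conf, READ_KEY, ASSUMED_DEFAULT_READ_MS));
    }
}
```

With a small read timeout, a stalled read raises java.net.SocketTimeoutException, so the caller can give up on that transfer and retry instead of waiting on it.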

> Allow shuffle read and connection timeouts to be configurable
> -
>
> Key: MAPREDUCE-353
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-353
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Ravi Gummadi
> Fix For: 0.21.0
>
> Attachments: MR-353.patch, MR-353.v1.patch
>
>
> It would be good for latency-sensitive applications to tune the shuffle 
> read/connection timeouts... in fact this made a huge difference to terasort 
> since we were seeing individual shuffles stuck for upwards of 60s and had to 
> have a very small read timeout.




[jira] Commented: (MAPREDUCE-353) Allow shuffle read and connection timeouts to be configurable

2009-07-13 Thread Jothi Padmanabhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730678#action_12730678
 ] 

Jothi Padmanabhan commented on MAPREDUCE-353:
-

+1. Changes look good.

> Allow shuffle read and connection timeouts to be configurable
> -
>
> Key: MAPREDUCE-353
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-353
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: MR-353.patch, MR-353.v1.patch
>
>
> It would be good for latency-sensitive applications to tune the shuffle 
> read/connection timeouts... in fact this made a huge difference to terasort 
> since we were seeing individual shuffles stuck for upwards of 60s and had to 
> have a very small read timeout.




[jira] Commented: (MAPREDUCE-743) Progress of map phase in map task is not updated properly

2009-07-13 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730677#action_12730677
 ] 

Ravi Gummadi commented on MAPREDUCE-743:


When compressed files are given as input to maps, the progress is not updated 
because the size of the input file (its uncompressed size) is taken to be 
Long.MAX_VALUE. The progress of a map task with a compressed input file is 
therefore effectively ignored, because each update is a very small value of the 
order of 1/Long.MAX_VALUE. Progress values seen are of the order of 10^-17 to 
10^-11.

I saw a page on the web, 
http://www.abeel.be/content/determine-uncompressed-size-gzip-file that says 
that the last 4 bytes of a gzipped file contain the uncompressed file size. But 
this works only if the size is < 4GB.

Any thoughts on getting the uncompressed file size of compressed files (at 
least for gzipped files)?
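
For reference, a minimal sketch of reading that trailer field. The gzip format stores the input size modulo 2^32 in the last 4 bytes, least significant byte first, which is why the value is only reliable below 4GB. GzipSizeSketch and its methods are hypothetical names, and the in-memory gzip helper exists only to make the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

// Illustration only; operates on an in-memory gzip stream.
class GzipSizeSketch {
    /**
     * Reads the last 4 bytes of a gzip stream as a little-endian unsigned
     * 32-bit value: the uncompressed length modulo 2^32.
     */
    static long uncompressedSizeMod32(byte[] gz) {
        if (gz.length < 18) { // 10-byte header plus 8-byte trailer
            throw new IllegalArgumentException("too short for a gzip stream");
        }
        int n = gz.length;
        return (gz[n - 4] & 0xFFL)
            | ((gz[n - 3] & 0xFFL) << 8)
            | ((gz[n - 2] & 0xFFL) << 16)
            | ((gz[n - 1] & 0xFFL) << 24);
    }

    /** Compresses raw bytes with gzip, for use in the example. */
    static byte[] gzip(byte[] raw) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
                out.write(raw);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that for a file made of several concatenated gzip members, the trailer describes only the last member, so this would under-report the total size.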

> Progress of map phase in map task is not updated properly
> -
>
> Key: MAPREDUCE-743
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-743
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 0.21.0
>Reporter: Ravi Gummadi
>Assignee: Ravi Gummadi
> Fix For: 0.21.0
>
> Attachments: MR-743.patch, MR-743.v1.patch
>
>
> The progress of the map phase in a map task is not updated properly. The 
> progress set by TrackedRecordReader and NewTrackingRecordReader should update 
> the progress object of the map phase. Instead it was set as the progress of 
> the whole task and, because the task is divided into phases, it was not 
> counted as part of the map task's progress.




[jira] Updated: (MAPREDUCE-656) Change org.apache.hadoop.mapred.SequenceFile* classes to use new api

2009-07-13 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-656:
--

Status: Open  (was: Patch Available)

> Change org.apache.hadoop.mapred.SequenceFile* classes to use new api
> 
>
> Key: MAPREDUCE-656
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-656
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Attachments: patch-656.txt
>
>





[jira] Commented: (MAPREDUCE-355) Change org.apache.hadoop.mapred.join to use new api

2009-07-13 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730672#action_12730672
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-355:
---

-1 release audit is spurious: releaseAuditDiffWarnings.txt shows the diff is in 
jdiff files.
-1 core tests is not related to the patch. Raised MAPREDUCE-756 for the same.
-1 contrib tests: streaming test failures on trunk are a known issue.

> Change org.apache.hadoop.mapred.join to use new api
> ---
>
> Key: MAPREDUCE-355
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-355
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-355-1.txt, patch-355-2.txt, patch-355.txt
>
>
> To change org.apache.hadoop.examples.Join to use the new api, we need to 
> change org.apache.hadoop.mapred.join to use the new api. So:
> Deprecate the code in org.apache.hadoop.mapred.join. 
> Copy the code to org.apache.hadoop.mapreduce.lib.join and change it to use 
> the new api. 
> Thoughts?




[jira] Created: (MAPREDUCE-756) TestSpeculativeExecution.testAtSpeculativeCap timed out in one of the runs

2009-07-13 Thread Amareshwari Sriramadasu (JIRA)
TestSpeculativeExecution.testAtSpeculativeCap timed out in one of the runs
--

 Key: MAPREDUCE-756
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-756
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.21.0
Reporter: Amareshwari Sriramadasu
 Fix For: 0.21.0


TestSpeculativeExecution.testAtSpeculativeCap timed out in one of the hudson 
runs
@ 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/379/testReport/org.apache.hadoop.mapred/TestSpeculativeExecution/testAtSpeculativeCap/




[jira] Commented: (MAPREDUCE-739) Allow relative paths to be created inside archives.

2009-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730650#action_12730650
 ] 

Hadoop QA commented on MAPREDUCE-739:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12413168/MAPREDUCE-739.patch
  against trunk revision 793457.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 317 release audit warnings 
(more than the trunk's current 315 warnings).

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/382/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/382/artifact/trunk/current/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/382/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/382/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/382/console

This message is automatically generated.

> Allow relative paths to be created inside archives.
> ---
>
> Key: MAPREDUCE-739
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-739
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.21.0
>
> Attachments: HADOOP-3663.patch, HADOOP-3663.patch, HADOOP-3663.patch, 
> MAPREDUCE-739.patch
>
>
> Archives currently store the full path from the input sources, since they 
> allow multiple sources and regular expressions as inputs. So the created 
> archives have the full path of the input sources. This is unintuitive and a 
> user hassle. We should get rid of it and allow users to say that the created 
> archive should be relative to some absolute path, and throw an exception if 
> the input does not conform to the relative absolute path.
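
A minimal sketch of the requested behavior, assuming '/'-separated paths and a user-supplied absolute parent. ArchiveRelativePathSketch and relativeTo are hypothetical names and are not taken from MAPREDUCE-739.patch.

```java
// Illustration only; works on plain '/'-separated path strings.
class ArchiveRelativePathSketch {
    /**
     * Returns the part of input below parent, or throws if input does not
     * lie under parent. parent must be an absolute path.
     */
    static String relativeTo(String parent, String input) {
        if (!parent.startsWith("/")) {
            throw new IllegalArgumentException("parent must be absolute: " + parent);
        }
        // Compare against "parent/" so that /user/a does not match /user/ab.
        String base = parent.endsWith("/") ? parent : parent + "/";
        if (!input.startsWith(base)) {
            throw new IllegalArgumentException(
                "input " + input + " is not under parent " + parent);
        }
        return input.substring(base.length());
    }
}
```

For example, with parent /user/hadoop, the input /user/hadoop/dir1/a.txt would be stored in the archive as dir1/a.txt, while an input under /user/other would be rejected.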




[jira] Updated: (MAPREDUCE-565) Partitioner does not work with new API

2009-07-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-565:


Attachment: h5750.patch

This uses a trivial Partitioner for map-only jobs. Verified no javac warnings.

{noformat}
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 6 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
{noformat}

> Partitioner does not work with new API
> --
>
> Key: MAPREDUCE-565
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-565
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Reporter: Jothi Padmanabhan
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.20.1
>
> Attachments: h5750.patch, h5750.patch, h5750.patch, h5750.patch, 
> h5750.patch
>
>
>  Partitioner does not work with the new API. MapTask.java looks for 
> "mapred.partitioner.class" whereas the new API sets it to 
> mapreduce.partitioner.class




[jira] Updated: (MAPREDUCE-565) Partitioner does not work with new API

2009-07-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-565:


Status: Open  (was: Patch Available)

Looking more closely at the patch, it shouldn't call the partitioner for 
map-only jobs.

> Partitioner does not work with new API
> --
>
> Key: MAPREDUCE-565
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-565
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Reporter: Jothi Padmanabhan
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.20.1
>
> Attachments: h5750.patch, h5750.patch, h5750.patch, h5750.patch
>
>
>  Partitioner does not work with the new API. MapTask.java looks for 
> "mapred.partitioner.class" whereas the new API sets it to 
> mapreduce.partitioner.class




[jira] Updated: (MAPREDUCE-565) Partitioner does not work with new API

2009-07-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-565:


Attachment: h5750.patch

Last patch introduced javac warnings not caught by test-patch; this resolves 
them.

> Partitioner does not work with new API
> --
>
> Key: MAPREDUCE-565
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-565
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Reporter: Jothi Padmanabhan
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.20.1
>
> Attachments: h5750.patch, h5750.patch, h5750.patch, h5750.patch
>
>
>  Partitioner does not work with the new API. MapTask.java looks for 
> "mapred.partitioner.class" whereas the new API sets it to 
> mapreduce.partitioner.class




[jira] Commented: (MAPREDUCE-565) Partitioner does not work with new API

2009-07-13 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730628#action_12730628
 ] 

Chris Douglas commented on MAPREDUCE-565:
-

{noformat}
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 6 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
{noformat}

> Partitioner does not work with new API
> --
>
> Key: MAPREDUCE-565
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-565
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Reporter: Jothi Padmanabhan
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.20.1
>
> Attachments: h5750.patch, h5750.patch, h5750.patch
>
>
>  Partitioner does not work with the new API. MapTask.java looks for 
> "mapred.partitioner.class" whereas the new API sets it to 
> mapreduce.partitioner.class
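
The key mismatch described above can be illustrated with a small sketch. It is not taken from the attached patch: a plain Map stands in for the job configuration, and the helper name resolvePartitioner and the class name com.example.MyPartitioner are hypothetical. It only shows one way a lookup could honour both property names.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a plain Map stands in for the job configuration.
class PartitionerKeyLookup {
    static final String OLD_KEY = "mapred.partitioner.class";
    static final String NEW_KEY = "mapreduce.partitioner.class";

    // Hypothetical helper: prefer the new-API key, fall back to the old one,
    // and only then use the supplied default.
    static String resolvePartitioner(Map<String, String> conf, String defaultClass) {
        String value = conf.get(NEW_KEY);
        if (value == null) {
            value = conf.get(OLD_KEY);
        }
        return value != null ? value : defaultClass;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(NEW_KEY, "com.example.MyPartitioner"); // hypothetical class name
        System.out.println(resolvePartitioner(conf, "HashPartitioner"));
    }
}
```

A lookup that reads only the old key would silently return the default here, which is the symptom the issue describes.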




[jira] Updated: (MAPREDUCE-565) Partitioner does not work with new API

2009-07-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-565:


Attachment: h5750.patch

Merged with trunk

> Partitioner does not work with new API
> --
>
> Key: MAPREDUCE-565
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-565
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Reporter: Jothi Padmanabhan
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.20.1
>
> Attachments: h5750.patch, h5750.patch, h5750.patch
>
>
>  Partitioner does not work with the new API. MapTask.java looks for 
> "mapred.partitioner.class" whereas the new API sets it to 
> mapreduce.partitioner.class




[jira] Updated: (MAPREDUCE-565) Partitioner does not work with new API

2009-07-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-565:


Status: Patch Available  (was: Open)

> Partitioner does not work with new API
> --
>
> Key: MAPREDUCE-565
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-565
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Reporter: Jothi Padmanabhan
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.20.1
>
> Attachments: h5750.patch, h5750.patch, h5750.patch
>
>
>  Partitioner does not work with the new API. MapTask.java looks for 
> "mapred.partitioner.class" whereas the new API sets it to 
> mapreduce.partitioner.class




[jira] Commented: (MAPREDUCE-257) Preventing node from swapping

2009-07-13 Thread Hong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730568#action_12730568
 ] 

Hong Tang commented on MAPREDUCE-257:
-

I am pretty sure we track memory by walking down a process tree. Of course, a 
user can defeat this by fork->fork->exec (aka daemonize). We should also be 
able to (if not already so) first collect the set of processes derived from the 
child task (inclusive), and then kill them all. But we may still miss some 
processes that are created between the checking and killing.
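
The tree walk described above can be sketched as follows. This is a simplified model and not the TaskTracker code: the process table is a hypothetical pid-to-parent-pid snapshot (the kind of data one might read from /proc), and all names are illustrative. It also shows why a daemonized process escapes: once it is reparented, it no longer hangs off the task's subtree.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ProcessTreeSketch {
    // parentOf is a snapshot mapping each pid to its parent pid.
    // Returns the root plus every process reachable from it (inclusive).
    static Set<Integer> subtree(Map<Integer, Integer> parentOf, int root) {
        // Invert the snapshot into parent -> children.
        Map<Integer, List<Integer>> children = new HashMap<>();
        for (Map.Entry<Integer, Integer> e : parentOf.entrySet()) {
            children.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        Set<Integer> seen = new LinkedHashSet<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            int pid = stack.pop();
            if (!seen.add(pid)) {
                continue; // already visited
            }
            for (int child : children.getOrDefault(pid, Collections.emptyList())) {
                stack.push(child);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> parentOf = new HashMap<>();
        parentOf.put(101, 100); // child of the task
        parentOf.put(102, 101); // grandchild
        parentOf.put(200, 1);   // daemonized: reparented to init, so it is missed
        System.out.println(subtree(parentOf, 100));
    }
}
```

Because the set is computed from a snapshot, any process forked after the snapshot is taken is also missed, which is the race mentioned in the comment.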

> Preventing node from swapping
> -
>
> Key: MAPREDUCE-257
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-257
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Hong Tang
>
> When a node swaps, it slows everything: maps running on that node, reducers 
> fetching output from the node, and DFS clients reading from the DN. We should 
> just treat it the same way as if OS exhausts memory and kill some tasks to 
> free up memory.




[jira] Commented: (MAPREDUCE-705) User-configurable quote and delimiter characters for Sqoop records and record reparsing

2009-07-13 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730552#action_12730552
 ] 

Aaron Kimball commented on MAPREDUCE-705:
-

None of these are sqoop-related bugs.

> User-configurable quote and delimiter characters for Sqoop records and record 
> reparsing
> ---
>
> Key: MAPREDUCE-705
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-705
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-705.2.patch, MAPREDUCE-705.patch
>
>
> Sqoop needs a mechanism for users to govern how fields are quoted and what 
> delimiter characters separate fields and records. With delimiters providing 
> an unambiguous format, a parse method can reconstitute the generated record 
> data object from a text-based representation of the same record.




[jira] Commented: (MAPREDUCE-257) Preventing node from swapping

2009-07-13 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730549#action_12730549
 ] 

dhruba borthakur commented on MAPREDUCE-257:


This can also be a problem when the map task itself forks off some external 
script. The external script might consume lots of memory and might not get 
killed even if the TaskTrackerChild is killed. I wish there was the concept of 
process groups in Java so that when you kill the leader, all the processes in 
the process group are killed by the OS.

> Preventing node from swapping
> -
>
> Key: MAPREDUCE-257
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-257
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Hong Tang
>
> When a node swaps, it slows everything: maps running on that node, reducers 
> fetching output from the node, and DFS clients reading from the DN. We should 
> just treat it the same way as if OS exhausts memory and kill some tasks to 
> free up memory.




[jira] Commented: (MAPREDUCE-257) Preventing node from swapping

2009-07-13 Thread Milind Bhandarkar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730539#action_12730539
 ] 

Milind Bhandarkar commented on MAPREDUCE-257:
-

I have seen this in one of our production clusters. The java task itself is 
killed due to memory limits, but there is a runaway task consuming lots of 
memory. So, I think killing the entire process tree did not work.

> Preventing node from swapping
> -
>
> Key: MAPREDUCE-257
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-257
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Hong Tang
>
> When a node swaps, it slows everything: maps running on that node, reducers 
> fetching output from the node, and DFS clients reading from the DN. We should 
> just treat it the same way as if OS exhausts memory and kill some tasks to 
> free up memory.




[jira] Commented: (MAPREDUCE-467) Collect information about number of tasks succeeded / total per time unit for a tasktracker.

2009-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730454#action_12730454
 ] 

Hudson commented on MAPREDUCE-467:
--

Integrated in Hadoop-Mapreduce-trunk #21 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/21/])
. Provide ability to collect statistics about total tasks and succeeded 
tasks in different time windows.
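
A minimal sketch of per-window task counting, assuming a simple sliding window over timestamped task completions. The class and method names are hypothetical and this is not the data structure used by the committed patch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding-window counter: tracks total and succeeded tasks
// whose completion time falls within the last windowMillis.
class TaskWindowStats {
    private final long windowMillis;
    // Each entry is {timestampMillis, 1 if succeeded else 0}, oldest first.
    private final Deque<long[]> events = new ArrayDeque<>();
    private int total;
    private int succeeded;

    TaskWindowStats(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    void record(long nowMillis, boolean ok) {
        expire(nowMillis);
        events.addLast(new long[] {nowMillis, ok ? 1 : 0});
        total++;
        if (ok) {
            succeeded++;
        }
    }

    int total(long nowMillis) {
        expire(nowMillis);
        return total;
    }

    int succeeded(long nowMillis) {
        expire(nowMillis);
        return succeeded;
    }

    // Drop events that are windowMillis old or older.
    private void expire(long nowMillis) {
        while (!events.isEmpty() && events.peekFirst()[0] <= nowMillis - windowMillis) {
            long[] old = events.removeFirst();
            total--;
            if (old[1] == 1) {
                succeeded--;
            }
        }
    }
}
```

One instance per window length (for example one hour and one day) plus a pair of plain counters for "since start" would cover the three views mentioned in the issue.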


> Collect information about number of tasks succeeded / total per time unit for 
> a tasktracker. 
> -
>
> Key: MAPREDUCE-467
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-467
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Hemanth Yamijala
>Assignee: Sharad Agarwal
> Fix For: 0.21.0
>
> Attachments: 467_branch_0.20.patch, 467_v4.patch, 467_v5.patch, 
> 467_v6.patch, 467_v7.patch, 5931_v1.patch, 5931_v2.patch, 5931_v3.patch
>
>
> Collecting information of number of tasks succeeded / total per tasktracker 
> and being able to see these counts per hour, day and since start time will 
> help reason about things like the blacklisting strategy.




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-13 Thread Jiaqi Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiaqi Tan updated MAPREDUCE-479:


Attachment: MAPREDUCE-479-2.patch

Updated, correct patch.

> Add reduce ID to shuffle clienttrace
> 
>
> Key: MAPREDUCE-479
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-479
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
>Reporter: Jiaqi Tan
>Assignee: Jiaqi Tan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HADOOP-6013.patch, MAPREDUCE-479-1.patch, 
> MAPREDUCE-479-2.patch, MAPREDUCE-479.patch
>
>
> Current clienttrace messages from shuffles note only the destination map ID 
> but not the source reduce ID. Having both source and destination ID of each 
> shuffle enables full tracing of execution. 




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-13 Thread Jiaqi Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiaqi Tan updated MAPREDUCE-479:


Status: Patch Available  (was: Open)

> Add reduce ID to shuffle clienttrace
> 
>
> Key: MAPREDUCE-479
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-479
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
>Reporter: Jiaqi Tan
>Assignee: Jiaqi Tan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HADOOP-6013.patch, MAPREDUCE-479-1.patch, 
> MAPREDUCE-479-2.patch, MAPREDUCE-479.patch
>
>
> Current clienttrace messages from shuffles note only the destination map ID 
> but not the source reduce ID. Having both source and destination ID of each 
> shuffle enables full tracing of execution. 




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-13 Thread Jiaqi Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiaqi Tan updated MAPREDUCE-479:


Status: Open  (was: Patch Available)

Sorry, tainted patch, not the latest. Will upload a cleaned-up one.

> Add reduce ID to shuffle clienttrace
> 
>
> Key: MAPREDUCE-479
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-479
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
>Reporter: Jiaqi Tan
>Assignee: Jiaqi Tan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HADOOP-6013.patch, MAPREDUCE-479-1.patch, 
> MAPREDUCE-479.patch
>
>
> Current clienttrace messages from shuffles note only the destination map ID 
> but not the source reduce ID. Having both source and destination ID of each 
> shuffle enables full tracing of execution. 




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-13 Thread Jiaqi Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiaqi Tan updated MAPREDUCE-479:


Attachment: MAPREDUCE-479-1.patch

Adds reduce attempt ID to shuffle call to mapOutputServlet's query string to 
enable true causal tracing. Eliminates assumption in tracing that no two 
attempts of the same task can run on the same node; even if we allow two 
attempts of the same task to run on the same node, we need globally 
synchronized clocks to disambiguate them.  
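
As a rough illustration of carrying the reduce attempt ID in the query string, the sketch below builds such a URL. The parameter names (job, map, reduce), the /mapOutput path and the helper name are assumptions made for the example and are not confirmed by the patch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class ShuffleUrlSketch {
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical: the full reduce attempt ID travels with the request, so the
    // serving side can log exactly which attempt fetched which map output.
    static String mapOutputUrl(String host, int port, String jobId,
                               String mapAttemptId, String reduceAttemptId) {
        return "http://" + host + ":" + port + "/mapOutput"
            + "?job=" + enc(jobId)
            + "&map=" + enc(mapAttemptId)
            + "&reduce=" + enc(reduceAttemptId);
    }

    public static void main(String[] args) {
        System.out.println(mapOutputUrl("tt1", 50060,
            "job_200907130001_0001",
            "attempt_200907130001_0001_m_000003_0",
            "attempt_200907130001_0001_r_000002_1"));
    }
}
```

The extra bytes per request are a few dozen characters, which is the "minimal" network cost the release note argues for.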

> Add reduce ID to shuffle clienttrace
> 
>
> Key: MAPREDUCE-479
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-479
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
>Reporter: Jiaqi Tan
>Assignee: Jiaqi Tan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HADOOP-6013.patch, MAPREDUCE-479-1.patch, 
> MAPREDUCE-479.patch
>
>
> Current clienttrace messages from shuffles note only the destination map ID 
> but not the source reduce ID. Having both source and destination ID of each 
> shuffle enables full tracing of execution. 




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-13 Thread Jiaqi Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiaqi Tan updated MAPREDUCE-479:


Release Note: Adds Reduce Attempt ID to ClientTrace log messages, and adds 
Reduce Attempt ID to HTTP query string sent to mapOutputServlet.  (was: Adds 
Reduce ID to ClientTrace log messages. Explicitly uses new mapreduce.JobID for 
compatibility with updated TaskID constructor.)
  Status: Patch Available  (was: Open)

I would prefer adding the reduce attempt ID to the HTTP query string because 
this eliminates the need for assuming that no two attempts of the same task can 
run on the same node; I can see scenarios where a custom scheduler may break 
this assumption and make tracing very complicated. The incremental cost in 
terms of additional network traffic of adding the reduce attempt ID should be 
minimal and much smaller than the total data shuffled in a typical job. 

> Add reduce ID to shuffle clienttrace
> 
>
> Key: MAPREDUCE-479
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-479
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
>Reporter: Jiaqi Tan
>Assignee: Jiaqi Tan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HADOOP-6013.patch, MAPREDUCE-479-1.patch, 
> MAPREDUCE-479.patch
>
>
> Current clienttrace messages from shuffles note only the destination map ID 
> but not the source reduce ID. Having both source and destination ID of each 
> shuffle enables full tracing of execution. 




[jira] Created: (MAPREDUCE-755) TestMRKeyFieldBasedComparator might not work as expected

2009-07-13 Thread Amar Kamat (JIRA)
TestMRKeyFieldBasedComparator might not work as expected


 Key: MAPREDUCE-755
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-755
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Amar Kamat


TestMRKeyFieldBasedComparator.testWithoutMRJob() tests 
KeyFieldBasedComparator.compare(), which expects a byte[] with a specific encoding. 
Passing String.getBytes() is incorrect.
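
To make the difference concrete, here is a simplified sketch. It assumes the comparator is handed serialized Text keys, that is, a variable-length integer length prefix followed by UTF-8 bytes; that assumption is not stated in the issue. The helper only handles short strings, where the prefix fits in a single byte, and its name is hypothetical.

```java
import java.nio.charset.StandardCharsets;

class TextBytesSketch {
    // Simplified stand-in for a serialized Text value. Only valid when the
    // UTF-8 form is at most 127 bytes, so the length prefix is one byte.
    static byte[] serializeShortText(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > 127) {
            throw new IllegalArgumentException("sketch only handles short strings");
        }
        byte[] out = new byte[utf8.length + 1];
        out[0] = (byte) utf8.length;                    // length prefix
        System.arraycopy(utf8, 0, out, 1, utf8.length); // payload
        return out;
    }

    public static void main(String[] args) {
        String key = "3.6.2";
        // String.getBytes() has no length prefix and uses the platform charset.
        System.out.println("raw length:        " + key.getBytes().length);
        System.out.println("serialized length: " + serializeShortText(key).length);
    }
}
```

Under that assumption, a comparator that skips a length prefix before reading fields would start one byte too late when given raw String.getBytes() output.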




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-13 Thread Jiaqi Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiaqi Tan updated MAPREDUCE-479:


Status: Open  (was: Patch Available)

Will submit a new patch to add reduce attempt ID to eliminate assumption that 
no 2 attempts will run on same host, in case the assumption breaks in post-0.20 
scheduling.

> Add reduce ID to shuffle clienttrace
> 
>
> Key: MAPREDUCE-479
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-479
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
>Reporter: Jiaqi Tan
>Assignee: Jiaqi Tan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HADOOP-6013.patch, MAPREDUCE-479.patch
>
>
> Current clienttrace messages from shuffles note only the destination map ID 
> but not the source reduce ID. Having both source and destination ID of each 
> shuffle enables full tracing of execution. 




[jira] Commented: (MAPREDUCE-656) Change org.apache.hadoop.mapred.SequenceFile* classes to use new api

2009-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730363#action_12730363
 ] 

Hadoop QA commented on MAPREDUCE-656:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12413106/patch-656.txt
  against trunk revision 793457.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 15 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 323 release audit warnings 
(more than the trunk's current 315 warnings).

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/380/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/380/artifact/trunk/current/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/380/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/380/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/380/console

This message is automatically generated.

> Change org.apache.hadoop.mapred.SequenceFile* classes to use new api
> 
>
> Key: MAPREDUCE-656
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-656
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Attachments: patch-656.txt
>
>





[jira] Updated: (MAPREDUCE-40) Memory management variables need a backwards compatibility option after HADOOP-5881

2009-07-13 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh updated MAPREDUCE-40:
---

Attachment: hadoop-5919-13-20.patch
hadoop-5919-13.patch

Uploading the new patch to rectify the javadoc warning.

 [exec] +1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 6 new or 
modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec]
 [exec] ==
 [exec] Finished build.
 [exec] ==

> Memory management variables need a backwards compatibility option after 
> HADOOP-5881
> ---
>
> Key: MAPREDUCE-40
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-40
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hemanth Yamijala
>Assignee: rahul k singh
>Priority: Blocker
> Attachments: hadoop-5919-1.patch, hadoop-5919-10.patch, 
> hadoop-5919-11.patch, hadoop-5919-12-20.patch, hadoop-5919-12.patch, 
> hadoop-5919-13-20.patch, hadoop-5919-13.patch, hadoop-5919-2.patch, 
> hadoop-5919-3.patch, hadoop-5919-4.patch, hadoop-5919-5.patch, 
> hadoop-5919-6.patch, hadoop-5919-7.patch, hadoop-5919-8.patch, 
> hadoop-5919-9.patch
>
>
> HADOOP-5881 modified variables related to memory management without looking 
> at the backwards compatibility angle. This JIRA is to address the gap. Marking 
> it a blocker for 0.20.1




[jira] Commented: (MAPREDUCE-735) ArrayIndexOutOfBoundsException is thrown by KeyFieldBasedPartitioner

2009-07-13 Thread Iyappan Srinivasan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730331#action_12730331
 ] 

Iyappan Srinivasan commented on MAPREDUCE-735:
--

Tested the below scenarios and found them to PASS:

Input for some of the below scenarios for comparator:

Input :
3.6.2.8.9.12.43
3.6.1.8.9.12.43
3.6.6.8.9.12.43
3.6.5.8.9.12.43
3.6.8.8.9.12.43
3.6.8.8.9.12.43
3.6.2.8.9.12.43
3.6.9.8.9.12.43
3.6.3.8.9.12.43
3.6.1.8.9.12.43
3.6.5.8.9.12.43
3.6.2.8.9.12.43
3.6.1.8.9.12.43
1.7.8.6.3.2.4.7


1) bin/hadoop jar hadoop-dev-streaming.jar -Dmapred.reduce.tasks=1 
-Dmapred.text.key.partitioner.options=-k1,1 
-Dmapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator
 -Dmap.output.key.field.separator=. 
-Dmapred.text.key.comparator.options=-k3,3nr -input input1/inputfile2  -mapper 
/bin/cat -reducer org.apache.hadoop.mapred.lib.IdentityReducer -output output2

- This sorts it numerically on the third field and reverses it.

Output:
3.6.9.8.9.12.43
3.6.8.8.9.12.43
3.6.8.8.9.12.43
1.7.8.6.3.2.4.7
3.6.6.8.9.12.43
3.6.5.8.9.12.43
3.6.5.8.9.12.43
3.6.3.8.9.12.43
3.6.2.8.9.12.43
3.6.2.8.9.12.43
3.6.2.8.9.12.43
3.6.1.8.9.12.43
3.6.1.8.9.12.43
3.6.1.8.9.12.43
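
The ordering produced by "-k3,3nr" in scenario 1 can be reproduced outside Hadoop with a plain Java comparator. This is only a sketch of the expected ordering, not the KeyFieldBasedComparator implementation; ties are left in input order by the stable sort, which happens to match the output reported above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class KeyFieldSortSketch {
    // Returns the given 1-based dot-separated field as a number.
    static long field(String key, int oneBasedIndex) {
        return Long.parseLong(key.split("\\.")[oneBasedIndex - 1]);
    }

    // Equivalent in spirit to "-k3,3nr": numeric on field 3, reversed.
    static List<String> sortByThirdFieldNumericReverse(List<String> keys) {
        List<String> out = new ArrayList<>(keys);
        out.sort(Comparator.comparingLong((String k) -> field(k, 3)).reversed());
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "3.6.2.8.9.12.43", "3.6.1.8.9.12.43", "3.6.6.8.9.12.43",
            "3.6.5.8.9.12.43", "3.6.8.8.9.12.43", "3.6.8.8.9.12.43",
            "3.6.2.8.9.12.43", "3.6.9.8.9.12.43", "3.6.3.8.9.12.43",
            "3.6.1.8.9.12.43", "3.6.5.8.9.12.43", "3.6.2.8.9.12.43",
            "3.6.1.8.9.12.43", "1.7.8.6.3.2.4.7");
        for (String line : sortByThirdFieldNumericReverse(input)) {
            System.out.println(line);
        }
    }
}
```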


2) Sort on the third field with a normal (ascending) sort. No reverse.

bin/hadoop jar hadoop-dev-streaming.jar -Dmapred.reduce.tasks=1 
-Dmapred.text.key.partitioner.options=-k1,1 
-Dmapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator
 -Dmap.output.key.field.separator=. -Dmapred.text.key.comparator.options=-k3,3n 
-input input1/inputfile2  -mapper /bin/cat 
-reducer=org.apache.hadoop.mapred.lib.IdentityReducer -output output3

3.6.1.8.9.12.43
3.6.1.8.9.12.43
3.6.1.8.9.12.43
3.6.2.8.9.12.43
3.6.2.8.9.12.43
3.6.2.8.9.12.43
3.6.3.8.9.12.43
3.6.5.8.9.12.43
3.6.5.8.9.12.43
3.6.6.8.9.12.43
3.6.8.8.9.12.43
3.6.8.8.9.12.43
1.7.8.6.3.2.4.7
3.6.9.8.9.12.43

3) Sort on the 7th field, and then within that result sort on the 3rd field.

bin/hadoop jar hadoop-dev-streaming.jar -Dmapred.reduce.tasks=1 
-Dmapred.text.key.partitioner.options=-k1,1 
-Dmapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator
 -Dmap.output.key.field.separator=. 
-Dmapred.text.key.comparator.options="-k7,7nr -k3,3n" -input input1/inputfile2  
-mapper /bin/cat -reducer org.apache.hadoop.mapred.lib.IdentityReducer -output 
output8

3.6.1.8.9.12.43
3.6.1.8.9.12.43
3.6.1.8.9.12.43
3.6.2.8.9.12.43
3.6.2.8.9.12.43
3.6.2.8.9.12.43
3.6.3.8.9.12.43
3.6.5.8.9.12.43
3.6.5.8.9.12.43
3.6.6.8.9.12.43
3.6.8.8.9.12.43
3.6.8.8.9.12.43
3.6.9.8.9.12.43
1.7.8.6.3.2.4.7
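
The two-key ordering of scenario 3 ("-k7,7nr -k3,3n") corresponds to a chained comparator: seventh field numeric descending first, third field numeric ascending as the tie-breaker. The sketch below is plain Java meant only to show the expected ordering; it is not the Hadoop comparator and its names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class MultiKeySortSketch {
    // Returns the given 1-based dot-separated field as a number.
    static long field(String key, int oneBasedIndex) {
        return Long.parseLong(key.split("\\.")[oneBasedIndex - 1]);
    }

    static List<String> sort(List<String> keys) {
        // Primary key: field 7, numeric, reversed ("-k7,7nr").
        Comparator<String> bySeventhDesc =
            Comparator.comparingLong((String k) -> field(k, 7)).reversed();
        // Secondary key: field 3, numeric, ascending ("-k3,3n").
        Comparator<String> byThirdAsc =
            Comparator.comparingLong((String k) -> field(k, 3));
        List<String> out = new ArrayList<>(keys);
        out.sort(bySeventhDesc.thenComparing(byThirdAsc));
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "3.6.2.8.9.12.43", "1.7.8.6.3.2.4.7",
            "3.6.1.8.9.12.43", "3.6.9.8.9.12.43");
        for (String line : sort(input)) {
            System.out.println(line);
        }
    }
}
```

The key 1.7.8.6.3.2.4.7 has 4 in its seventh field, so it sorts after every key ending in 43, as in the output listed above.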


4) Check that the global option is overridden when a key specifies local options.

bin/hadoop jar hadoop-dev-streaming.jar -Dmapred.reduce.tasks=1 
-Dmapred.text.key.partitioner.options=-k1,1 
-Dmapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator
 -Dmap.output.key.field.separator=. -Dmapred.text.key.comparator.options="-n 
-k7,7r -k3,3n" -input input1/inputfile2  -mapper /bin/cat -reducer 
org.apache.hadoop.mapred.lib.IdentityReducer -output output15

3.6.1.8.9.12.43
3.6.1.8.9.12.43
3.6.1.8.9.12.43
3.6.2.8.9.12.43
3.6.2.8.9.12.43
3.6.2.8.9.12.43
3.6.3.8.9.12.43
3.6.5.8.9.12.43
3.6.5.8.9.12.43
3.6.6.8.9.12.43
3.6.8.8.9.12.43
3.6.8.8.9.12.43
3.6.9.8.9.12.43
1.7.8.6.3.2.4.7

5) For special characters such as "^" and "p", and for letters instead of 
numeric values, it still sorts.

6) Breaking the file into two also gives correct results. The output is 
divided into two parts and each part is sorted correctly, even for very large 
files. This is true for all the options.

7) If the column to be sorted is "i", "^", " ", or "" (null), 
then it should be put at the end.

8) Introduction of "-Dnum.key.fields.for.partition=5" does not make any 
difference. Does not cause any exception.

Scenarios for  KeyFieldBasedPartitioner :

1) bin/hadoop jar hadoop-streaming.jar 
-Dmapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator
 -Dmapred.text.key.comparator.options="-k5,5"  -Dmapred.reduce.tasks=2 
-Dmapred.text.key.partitioner.options=-k5,5 -Dmap.output.key.field.separator=" 
" -input input1/inputfile2 -mapper org.apache.hadoop.mapred.lib.IdentityMapper 
-reducer org.apache.hadoop.mapred.lib.IdentityReducer -inputformat 
org.apache.hadoop.mapred.KeyValueTextInputFormat -partitioner 
org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner -output output8

It sorts text on the fifth field. I had also tested for other fields.

2) Even if "-Dnum.key.fields.for.partition=5" is added, it still works properly 
without exception.

3) If the column to be sorted is "i", "^", " ", or "", it 
sorts it without giving any errors. 
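
The partitioner scenarios above can be pictured with the following sketch of field-based partitioning. It is a simplification and not the KeyFieldBasedPartitioner code: it hashes the selected field with String.hashCode, and it treats a missing field as empty instead of indexing past the end of the array. All names are hypothetical.

```java
import java.util.regex.Pattern;

class FieldPartitionSketch {
    // Returns the 1-based field, or "" when the key has fewer fields than
    // requested, so a short key never causes an out-of-bounds access.
    static String fieldOrEmpty(String key, String separator, int oneBasedIndex) {
        String[] parts = key.split(Pattern.quote(separator), -1);
        return oneBasedIndex <= parts.length ? parts[oneBasedIndex - 1] : "";
    }

    // Keys with the same selected field always land in the same partition.
    static int partition(String key, String separator, int fieldIndex, int numPartitions) {
        int hash = fieldOrEmpty(key, separator, fieldIndex).hashCode();
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        System.out.println(partition("a b c d e f", " ", 5, 2));
        System.out.println(partition("x y z w e q", " ", 5, 2)); // same fifth field
        System.out.println(partition("a b", " ", 5, 2));         // fewer than five fields
    }
}
```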

Some points to note are:
1) If  "-rn" option is used anywhere instead of "-nr" , it does not work. This 
is as per requirement.
2) if -D options spelling is wrong it just gets

[jira] Created: (MAPREDUCE-754) NPE in expiry thread when a TT is lost

2009-07-13 Thread Ramya R (JIRA)
NPE in expiry thread when a TT is lost
--

 Key: MAPREDUCE-754
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-754
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Ramya R
Priority: Minor


NullPointerException is obtained in Tracker Expiry Thread. Below is the 
exception obtained in the JT logs 
{noformat}
ERROR org.apache.hadoop.mapred.JobTracker: Tracker Expiry Thread got exception: 
java.lang.NullPointerException
at 
org.apache.hadoop.mapred.JobTracker.updateTaskTrackerStatus(JobTracker.java:2971)
at org.apache.hadoop.mapred.JobTracker.access$300(JobTracker.java:104)
at 
org.apache.hadoop.mapred.JobTracker$ExpireTrackers.run(JobTracker.java:381)
at java.lang.Thread.run(Thread.java:619)
{noformat}
The steps to reproduce this issue are:
* Blacklist a TT. 
* Restart it. 
* The above exception is obtained when the first instance of TT is marked as 
lost.

However the above exception does not break any functionality.







[jira] Updated: (MAPREDUCE-681) Some testcases wait forever on a condition which might result into timeouts

2009-07-13 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-681:
-

Description: MAPREDUCE-502 and MAPREDUCE-130 testcases should change to 
fail instead of timeout upon failure.  (was: HADOOP-502 and HADOOP-130 
testcases should change to fail instead of timeout upon failure.)
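
A bounded wait of the kind this description asks for might look like the sketch below. The helper name, its parameters and the polling approach are assumptions for illustration and are not taken from the attached patch.

```java
import java.util.function.BooleanSupplier;

class BoundedWaitSketch {
    // Polls the condition until it holds or the timeout elapses. On timeout it
    // fails with an AssertionError so the test reports a failure rather than
    // hanging until the build framework kills it.
    static void waitFor(BooleanSupplier condition, long timeoutMillis,
                        long pollMillis, String what) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new AssertionError("Timed out waiting for " + what);
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("Interrupted while waiting for " + what);
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Hypothetical condition: becomes true after about 50 ms.
        waitFor(() -> System.currentTimeMillis() - start >= 50, 1000, 10, "50 ms to pass");
        System.out.println("condition met");
    }
}
```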

> Some testcases wait forever on a condition which might result into timeouts
> ---
>
> Key: MAPREDUCE-681
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-681
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: test
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Attachments: MAPREDUCE-681-v1.0.patch
>
>
> MAPREDUCE-502 and MAPREDUCE-130 testcases should change to fail instead of 
> timeout upon failure.




[jira] Commented: (MAPREDUCE-355) Change org.apache.hadoop.mapred.join to use new api

2009-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730286#action_12730286
 ] 

Hadoop QA commented on MAPREDUCE-355:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12413098/patch-355-2.txt
  against trunk revision 793457.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 18 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 333 release audit warnings 
(more than the trunk's current 315 warnings).

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/379/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/379/artifact/trunk/current/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/379/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/379/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/379/console

This message is automatically generated.

> Change org.apache.hadoop.mapred.join to use new api
> ---
>
> Key: MAPREDUCE-355
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-355
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-355-1.txt, patch-355-2.txt, patch-355.txt
>
>
> To change org.apache.hadoop.examples.Join to use new api, we need to change 
> org.apache.hadoop.mapred.join to use new api. So,
> Deprecate the code in org.apache.hadoop.mapred.join. 
> Copy the code to org.apache.hadoop.mapreduce.lib.join and Change it to use 
> new api. 
> Thoughts ?




[jira] Updated: (MAPREDUCE-467) Collect information about number of tasks succeeded / total per time unit for a tasktracker.

2009-07-13 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal updated MAPREDUCE-467:
-

Attachment: 467_branch_0.20.patch

Patch for Yahoo's distribution for branch 20.

> Collect information about number of tasks succeeded / total per time unit for 
> a tasktracker. 
> -
>
> Key: MAPREDUCE-467
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-467
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Hemanth Yamijala
>Assignee: Sharad Agarwal
> Fix For: 0.21.0
>
> Attachments: 467_branch_0.20.patch, 467_v4.patch, 467_v5.patch, 
> 467_v6.patch, 467_v7.patch, 5931_v1.patch, 5931_v2.patch, 5931_v3.patch
>
>
> Collecting information of number of tasks succeeded / total per tasktracker 
> and being able to see these counts per hour, day and since start time will 
> help reason about things like the blacklisting strategy.




[jira] Updated: (MAPREDUCE-430) Task stuck in cleanup with OutOfMemoryErrors

2009-07-13 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-430:
-

Attachment: MAPREDUCE-430-v1.6-branch-0.20.patch

Attaching a patch for branch-0.20.

> Task stuck in cleanup with OutOfMemoryErrors
> 
>
> Key: MAPREDUCE-430
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-430
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Amareshwari Sriramadasu
> Attachments: MAPREDUCE-430-v1.6-branch-0.20.patch, 
> MAPREDUCE-430-v1.6.patch
>
>
> Observed a task with an OutOfMemoryError, stuck in cleanup.




[jira] Commented: (MAPREDUCE-735) ArrayIndexOutOfBoundsException is thrown by KeyFieldBasedPartitioner

2009-07-13 Thread Amar Kamat (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730270#action_12730270
 ] 

Amar Kamat commented on MAPREDUCE-735:
--

The mapred tests passed on my box. The contrib tests passed except 
TestStreamingExitStatus (FAILED) and TestStreamingStderr (FAILED-timeout). 


> ArrayIndexOutOfBoundsException is thrown by KeyFieldBasedPartitioner
> 
>
> Key: MAPREDUCE-735
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-735
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Suman Sehgal
>Assignee: Amar Kamat
> Attachments: HADOOP-6130-v1.0.patch, MAPREDUCE-735-v1.2.patch, 
> MAPREDUCE-735-v1.4-branch-0.20.patch, MAPREDUCE-735-v1.4.patch, 
> MAPREDUCE-735-v1.5.patch
>
>
> KeyFieldBasedPartitioner throws "ArrayIndexOutOfBoundsException" when some 
> part of the specified key is missing. 
> Scenario :
> ===
> When the value of num.key.fields.for.partition is greater than the number of 
> separators provided in the input.
> Command:
> 
> hadoop jar streaming.jar -Dmapred.reduce.tasks=3 
> -Dnum.key.fields.for.partition=5 -input   -output  
> -mapper org.apache.hadoop.mapred.lib.IdentityMapper -reducer 
> org.apache.hadoop.mapred.lib.IdentityReducer -inputformat 
> org.apache.hadoop.mapred.KeyValueTextInputFormat -partitioner 
> org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner
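
To illustrate the scenario above: the sketch below is not Hadoop's KeyFieldBasedPartitioner, only a hypothetical stand-in that shows one way to tolerate a key with fewer fields than requested. The function names, the tab separator and the use of CRC32 as the hash are all assumptions made for the example.

```python
import zlib


def partition_key(key, num_fields, separator="\t"):
    """Return the first num_fields fields of key.

    If the key has fewer fields than requested, the whole key is used
    instead of indexing past the last separator.
    """
    fields = key.split(separator)
    return separator.join(fields[:num_fields])


def partition(key, num_fields, num_reducers, separator="\t"):
    """Map a key to a reducer index using only its leading fields."""
    prefix = partition_key(key, num_fields, separator)
    return zlib.crc32(prefix.encode("utf-8")) % num_reducers


# A key with two fields, partitioned on five fields across three reducers.
reducer = partition("a\tb", num_fields=5, num_reducers=3)
```

Slicing clamps to the available fields, so a short key falls back to being partitioned on everything it has rather than raising an index error.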




[jira] Updated: (MAPREDUCE-735) ArrayIndexOutOfBoundsException is thrown by KeyFieldBasedPartitioner

2009-07-13 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-735:
-

Attachment: MAPREDUCE-735-v1.5.patch

Attaching a patch with an improved test case and some additional tests. 
Test-patch result:
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 6 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.


> ArrayIndexOutOfBoundsException is thrown by KeyFieldBasedPartitioner
> 
>
> Key: MAPREDUCE-735
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-735
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Suman Sehgal
>Assignee: Amar Kamat
> Attachments: HADOOP-6130-v1.0.patch, MAPREDUCE-735-v1.2.patch, 
> MAPREDUCE-735-v1.4-branch-0.20.patch, MAPREDUCE-735-v1.4.patch, 
> MAPREDUCE-735-v1.5.patch
>
>
> KeyFieldBasedPartitioner throws "ArrayIndexOutOfBoundsException" when some 
> part of the specified key is missing. 
> Scenario :
> ===
> When the value of num.key.fields.for.partition is greater than the number of 
> separators provided in the input.
> Command:
> 
> hadoop jar streaming.jar -Dmapred.reduce.tasks=3 
> -Dnum.key.fields.for.partition=5 -input   -output  
> -mapper org.apache.hadoop.mapred.lib.IdentityMapper -reducer 
> org.apache.hadoop.mapred.lib.IdentityReducer -inputformat 
> org.apache.hadoop.mapred.KeyValueTextInputFormat -partitioner 
> org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner




[jira] Updated: (MAPREDUCE-18) Under load the shuffle sometimes gets incorrect data

2009-07-13 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-18?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-18:
--

Attachment: MR-18.v1.patch

Attaching patch with the suggested changes.

> Under load the shuffle sometimes gets incorrect data
> 
>
> Key: MAPREDUCE-18
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-18
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Owen O'Malley
>Assignee: Ravi Gummadi
>Priority: Blocker
> Attachments: MR-18.patch, MR-18.v1.patch
>
>
> While testing HADOOP-5223 under load, we found reduces receiving completely 
> incorrect data. It was often random, but sometimes it was the output of the 
> wrong map for the wrong reduce. It appears to be either a Jetty or a JVM bug, 
> but it is clearly happening on the server side. In the HADOOP-5223 code, I 
> added information about the map and reduce that were included, and we should 
> add similar protection to 0.20 and trunk.
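
A rough sketch of the kind of protection described above, assuming each shuffle response carries the identity of the map output it contains and the reduce it was produced for. The ShuffleHeader type, its field names and the example map id are hypothetical and do not reflect the actual wire format in HADOOP-5223 or the attached patch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShuffleHeader:
    """Hypothetical metadata sent ahead of a map output segment."""
    map_id: str
    for_reduce: int


class ShuffleMismatch(Exception):
    """Raised when a response does not match what the reduce asked for."""


def verify_header(header, expected_map_id, expected_reduce):
    """Reject data that belongs to a different map or reduce."""
    if header.map_id != expected_map_id:
        raise ShuffleMismatch(
            f"expected map {expected_map_id}, got {header.map_id}")
    if header.for_reduce != expected_reduce:
        raise ShuffleMismatch(
            f"expected reduce {expected_reduce}, got {header.for_reduce}")


# The reduce checks the header before accepting any of the payload.
verify_header(ShuffleHeader("map_000007", 3),
              expected_map_id="map_000007", expected_reduce=3)
```

A check like this catches data served for the wrong map or reduce; it would not by itself detect random corruption inside an otherwise correctly labelled segment.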




[jira] Updated: (MAPREDUCE-153) TestJobInProgressListener sometimes timesout

2009-07-13 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-153:
-

Attachment: MAPREDUCE-153-v1.1-branch-0.20.patch

Attaching a patch for branch-0.20.

> TestJobInProgressListener sometimes timesout
> 
>
> Key: MAPREDUCE-153
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-153
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-153-v1.0.patch, 
> MAPREDUCE-153-v1.1-branch-0.20.patch, MAPREDUCE-153-v1.1.patch
>
>
> It times out with "Could not find /taskTracker/jobcache/jobid/work in any of 
> the configured local directories".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-753) In Streaming, "comparator options" column does not document the global options.

2009-07-13 Thread Iyappan Srinivasan (JIRA)
In Streaming, "comparator options" column does not document the global options.
---

 Key: MAPREDUCE-753
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-753
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Reporter: Iyappan Srinivasan
Priority: Minor


In streaming, the "-Dmapred.text.key.partitioner.options" has some options 
like "-r -k7,7 -k3,3", which work like this: first sort on the seventh column, 
then within that ordering subsort on the 3rd column, and reverse both sorts.

The documentation for this is not found anywhere. Please document it.




[jira] Commented: (MAPREDUCE-753) In Streaming, "comparator options" column does not document the global options.

2009-07-13 Thread Iyappan Srinivasan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730247#action_12730247
 ] 

Iyappan Srinivasan commented on MAPREDUCE-753:
--

Sorry for the typo.

In streaming, the "-Dmapred.text.key.partitioner.options" has some options like 
"-r -k7,7 -k3,3", which work like this: first sort on the seventh column, 
then within that ordering subsort on the 3rd column, and reverse both sorts. 
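
The ordering described in this comment can be sketched outside Hadoop as follows. This is only an illustration of the stated behaviour, not the streaming comparator itself; the function name, the tab separator and plain text comparison of fields are assumptions.

```python
def sort_records(lines, separator="\t"):
    """Order records as described for "-r -k7,7 -k3,3":
    primary key is column 7, secondary key is column 3, both reversed.
    Every line is assumed to have at least seven fields."""
    def sort_key(line):
        fields = line.split(separator)
        return (fields[6], fields[2])  # columns are 1-based in the options
    return sorted(lines, key=sort_key, reverse=True)


records = [
    "r1\t-\tx\t-\t-\t-\tm",
    "r2\t-\ty\t-\t-\t-\tm",
    "r3\t-\tx\t-\t-\t-\tn",
]
ordered = sort_records(records)
first_fields = [line.split("\t")[0] for line in ordered]
```

Here the record with the largest seventh column comes first, and ties on the seventh column are broken by the third column, also in descending order.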

> In Streaming, "comparator options" column does not document the global 
> options.
> ---
>
> Key: MAPREDUCE-753
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-753
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Reporter: Iyappan Srinivasan
>Priority: Minor
>
> In streaming, the "-Dmapred.text.key.partitioner.options" has some options 
> like "-r -k7,7 -k3,3", which work like this: first sort on the seventh 
> column, then within that ordering subsort on the 3rd column, and reverse both 
> sorts.
> The documentation for this is not found anywhere. Please document it.
