[jira] Commented: (MAPREDUCE-706) Support for FIFO pools in the fair scheduler

2009-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740847#action_12740847
 ] 

Hadoop QA commented on MAPREDUCE-706:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12415798/mapreduce-706.v4.patch
  against trunk revision 801959.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/456/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/456/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/456/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/456/console

This message is automatically generated.

> Support for FIFO pools in the fair scheduler
> 
>
> Key: MAPREDUCE-706
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-706
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/fair-share
>Reporter: Matei Zaharia
>Assignee: Matei Zaharia
> Attachments: fsdesigndoc.pdf, fsdesigndoc.tex, mapreduce-706.patch, 
> mapreduce-706.v1.patch, mapreduce-706.v2.patch, mapreduce-706.v3.patch, 
> mapreduce-706.v4.patch
>
>
> The fair scheduler should support making the internal scheduling algorithm 
> for some pools be FIFO instead of fair sharing in order to work better for 
> batch workloads. FIFO pools will behave exactly like the current default 
> scheduler, sorting jobs by priority and then submission time. Pools will have 
> their scheduling algorithm set through the pools config file, and it will be 
> changeable at runtime.
> To support this feature, I'm also changing the internal logic of the fair 
> scheduler to no longer use deficits. Instead, for fair sharing, we will 
> assign tasks to the job farthest below its share as a ratio of its share. 
> This is easier to combine with other scheduling algorithms and leads to a 
> more stable sharing situation, avoiding unfairness issues brought up in 
> MAPREDUCE-543 and MAPREDUCE-544 that happen when some jobs have long tasks. 
> The new preemption (MAPREDUCE-551) will ensure that critical jobs can gain 
> their fair share within a bounded amount of time.
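
A minimal sketch of the selection rule described above: the next task goes to the job farthest below its share, measured as a ratio of that share. JobShare, its fields, and pickNext are hypothetical names for illustration and are not taken from the patch; the real scheduler presumably tracks more state (pools, priorities, minimum shares).

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a schedulable job; not a fair scheduler class.
final class JobShare {
    final String name;
    final int runningTasks;
    final double fairShare;

    JobShare(String name, int runningTasks, double fairShare) {
        this.name = name;
        this.runningTasks = runningTasks;
        this.fairShare = fairShare;
    }

    // Fraction of its fair share the job currently holds (lower = more starved).
    double ratio() {
        return runningTasks / fairShare;
    }
}

final class FairShareRatioSketch {
    // Picks the job with the smallest runningTasks / fairShare ratio.
    // Jobs with a non-positive share are skipped to avoid dividing by zero.
    static Optional<JobShare> pickNext(List<JobShare> jobs) {
        return jobs.stream()
                .filter(job -> job.fairShare > 0)
                .min(Comparator.comparingDouble(JobShare::ratio));
    }
}
```

Because the ordering depends only on current state, not on accumulated history such as a deficit, a comparator like this can sit alongside a FIFO comparator (priority, then submission time) behind a common interface.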

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-64) Map-side sort is hampered by io.sort.record.percent

2009-08-07 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740837#action_12740837
 ] 

Todd Lipcon commented on MAPREDUCE-64:
--

Hi Arun,

Have you guys worked on this at all already? I'm interested in playing around 
with rewriting part of the map-side sort to get rid of this tunable. Like you 
said, for a lot of applications the default values are *way* off. 350K records 
in 95MB = 271 bytes average record size, which is probably larger than the 
records in the majority of jobs we see in practice. If you have already worked 
on this I don't want to duplicate your effort, but if not, I think it would be 
a good step towards better average performance without expert tuning.

> Map-side sort is hampered by io.sort.record.percent
> ---
>
> Key: MAPREDUCE-64
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-64
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arun C Murthy
>
> Currently io.sort.record.percent is a fairly obscure, per-job configurable, 
> expert-level parameter which controls how much accounting space is available 
> for records in the map-side sort buffer (io.sort.mb). Typically values for 
> io.sort.mb (100) and io.sort.record.percent (0.05) imply that we can store 
> ~350,000 records in the buffer before necessitating a sort/combine/spill.
> However for many applications which deal with small records, e.g. the 
> world-famous wordcount and its family, this implies we can only use 5-10% of 
> io.sort.mb, i.e. 5-10M, before we spill in spite of having _much_ more memory 
> available in the sort buffer. Wordcount, for example, results in ~12 spills 
> (given an HDFS block size of 64M). The presence of a combiner exacerbates the 
> problem by piling on serialization/deserialization of records too...
> Sure, jobs can configure io.sort.record.percent, but it's tedious and 
> obscure; we really can do better by getting the framework to automagically 
> pick it by using all available memory (up to io.sort.mb) for either the data 
> or accounting.
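
The arithmetic behind the figures above can be sketched as follows. The 16 bytes of accounting space per record is an assumption that this thread does not state, and SortBufferMath and its method names are hypothetical.

```java
final class SortBufferMath {
    // Assumption: each buffered record costs 16 bytes of accounting space.
    static final int ACCOUNTING_BYTES_PER_RECORD = 16;

    // Number of records the accounting region can track before a spill.
    static long recordCapacity(int ioSortMb, double recordPercent) {
        long bufferBytes = (long) ioSortMb << 20;
        long accountingBytes = Math.round(bufferBytes * recordPercent);
        return accountingBytes / ACCOUNTING_BYTES_PER_RECORD;
    }

    // Average record size at which the data and accounting regions fill up
    // together; smaller records exhaust the accounting region first.
    static long breakEvenRecordSize(int ioSortMb, double recordPercent) {
        long bufferBytes = (long) ioSortMb << 20;
        long dataBytes = bufferBytes - Math.round(bufferBytes * recordPercent);
        return dataBytes / recordCapacity(ioSortMb, recordPercent);
    }
}
```

Under that assumption, io.sort.mb = 100 and io.sort.record.percent = 0.05 give a capacity of 327,680 records and a break-even record size of 304 bytes, in the same range as the ~350,000 records and 271 bytes quoted in this thread. Jobs with records much smaller than the break-even size spill while most of the data region is still empty.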




[jira] Created: (MAPREDUCE-841) Protect Job Tracker against memory exhaustion due to very large InputSplit or JobConf objects

2009-08-07 Thread Hong Tang (JIRA)
Protect Job Tracker against memory exhaustion due to very large InputSplit or 
JobConf objects
-

 Key: MAPREDUCE-841
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-841
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Hong Tang
 Fix For: 0.21.0


JobTracker only needs to examine a subset of the information contained in 
InputSplit or JobConf objects, but it currently loads the complete 
user-defined InputSplit and JobConf objects into memory. This design leaves 
JobTracker susceptible to memory exhaustion, particularly when bugs in user 
code result in very large input splits or job conf objects (e.g. PIG-901).




[jira] Commented: (MAPREDUCE-799) Some of MRUnit's self-tests were not being run

2009-08-07 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740816#action_12740816
 ] 

Aaron Kimball commented on MAPREDUCE-799:
-

The contrib failures are only in streaming.

> Some of MRUnit's self-tests were not being run
> --
>
> Key: MAPREDUCE-799
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-799
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-799.patch
>
>
> Due to method naming issues, some test cases were not being executed.




[jira] Commented: (MAPREDUCE-799) Some of MRUnit's self-tests were not being run

2009-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740808#action_12740808
 ] 

Hadoop QA commented on MAPREDUCE-799:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12414378/MAPREDUCE-799.patch
  against trunk revision 801959.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/455/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/455/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/455/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/455/console

This message is automatically generated.

> Some of MRUnit's self-tests were not being run
> --
>
> Key: MAPREDUCE-799
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-799
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-799.patch
>
>
> Due to method naming issues, some test cases were not being executed.




[jira] Updated: (MAPREDUCE-750) Extensible ConnManager factory API

2009-08-07 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-750:


Status: Open  (was: Patch Available)

> Extensible ConnManager factory API
> --
>
> Key: MAPREDUCE-750
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-750
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-750.2.patch, MAPREDUCE-750.3.patch, 
> MAPREDUCE-750.patch
>
>
> Sqoop uses the ConnFactory class to instantiate a ConnManager implementation 
> based on the connect string and other arguments supplied by the user. This 
> allows per-database logic to be encapsulated in different ConnManager 
> instances, and dynamically chosen based on which database the user is 
> actually importing from. But adding new ConnManager implementations requires 
> modifying the source of a common ConnFactory class. An indirection layer 
> should be used to delegate instantiation to a number of factory 
> implementations which can be specified in the static configuration or at 
> runtime.
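
A minimal sketch of the indirection layer proposed above, assuming each factory either recognises a connect string or declines it. SketchManager, SketchManagerFactory, and DelegatingFactory are hypothetical names and do not reflect the API in the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a per-database manager.
interface SketchManager {
    String describe();
}

// Hypothetical factory contract: return a manager, or null to decline.
interface SketchManagerFactory {
    SketchManager accept(String connectString);
}

final class DelegatingFactory {
    private final List<SketchManagerFactory> factories = new ArrayList<>();

    // Factories could be registered from static configuration or at runtime.
    void register(SketchManagerFactory factory) {
        factories.add(factory);
    }

    // Asks each registered factory in order; the first non-null answer wins.
    SketchManager getManager(String connectString) {
        for (SketchManagerFactory factory : factories) {
            SketchManager manager = factory.accept(connectString);
            if (manager != null) {
                return manager;
            }
        }
        throw new IllegalArgumentException(
                "No manager for connect string: " + connectString);
    }
}
```

With this shape, supporting a new database means registering another factory rather than editing a shared class.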




[jira] Updated: (MAPREDUCE-750) Extensible ConnManager factory API

2009-08-07 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-750:


Attachment: MAPREDUCE-750.3.patch

New patch resync'd with trunk

> Extensible ConnManager factory API
> --
>
> Key: MAPREDUCE-750
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-750
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-750.2.patch, MAPREDUCE-750.3.patch, 
> MAPREDUCE-750.patch
>
>
> Sqoop uses the ConnFactory class to instantiate a ConnManager implementation 
> based on the connect string and other arguments supplied by the user. This 
> allows per-database logic to be encapsulated in different ConnManager 
> instances, and dynamically chosen based on which database the user is 
> actually importing from. But adding new ConnManager implementations requires 
> modifying the source of a common ConnFactory class. An indirection layer 
> should be used to delegate instantiation to a number of factory 
> implementations which can be specified in the static configuration or at 
> runtime.




[jira] Updated: (MAPREDUCE-750) Extensible ConnManager factory API

2009-08-07 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-750:


Status: Patch Available  (was: Open)

> Extensible ConnManager factory API
> --
>
> Key: MAPREDUCE-750
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-750
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-750.2.patch, MAPREDUCE-750.3.patch, 
> MAPREDUCE-750.patch
>
>
> Sqoop uses the ConnFactory class to instantiate a ConnManager implementation 
> based on the connect string and other arguments supplied by the user. This 
> allows per-database logic to be encapsulated in different ConnManager 
> instances, and dynamically chosen based on which database the user is 
> actually importing from. But adding new ConnManager implementations requires 
> modifying the source of a common ConnFactory class. An indirection layer 
> should be used to delegate instantiation to a number of factory 
> implementations which can be specified in the static configuration or at 
> runtime.




[jira] Updated: (MAPREDUCE-840) DBInputFormat leaves open transaction

2009-08-07 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-840:


Status: Patch Available  (was: Open)

> DBInputFormat leaves open transaction
> -
>
> Key: MAPREDUCE-840
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-840
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Minor
> Attachments: MAPREDUCE-840.patch
>
>
> DBInputFormat.getSplits() does not call connection.commit() after the COUNT query. 
> This can leave an open transaction against the database which interferes with 
> other connections to the same table.
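
A minimal JDBC sketch of the pattern described above: run the COUNT query, then commit so that no transaction is left open. CountThenCommit and countRows are hypothetical names, this is not the attached patch, and it assumes auto-commit has been disabled on the connection.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class CountThenCommit {
    // Runs a COUNT query and commits afterwards.
    // Assumption: auto-commit is off, so the query opens a transaction
    // that stays open until commit() or rollback() is called.
    static long countRows(Connection connection, String countQuery)
            throws SQLException {
        long count;
        try (Statement statement = connection.createStatement();
             ResultSet results = statement.executeQuery(countQuery)) {
            results.next();
            count = results.getLong(1);
        }
        // End the transaction so it cannot interfere with other connections.
        connection.commit();
        return count;
    }
}
```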




[jira] Updated: (MAPREDUCE-840) DBInputFormat leaves open transaction

2009-08-07 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-840:


Attachment: MAPREDUCE-840.patch

Attaching a trivial patch for this issue. No new tests because I've only seen 
this issue manifest when interacting with PostgreSQL. I've verified that with 
this fix in place, it works with PostgreSQL. The TestDBJob unit test also passes.

> DBInputFormat leaves open transaction
> -
>
> Key: MAPREDUCE-840
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-840
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Minor
> Attachments: MAPREDUCE-840.patch
>
>
> DBInputFormat.getSplits() does not call connection.commit() after the COUNT query. 
> This can leave an open transaction against the database which interferes with 
> other connections to the same table.




[jira] Created: (MAPREDUCE-840) DBInputFormat leaves open transaction

2009-08-07 Thread Aaron Kimball (JIRA)
DBInputFormat leaves open transaction
-

 Key: MAPREDUCE-840
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-840
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Aaron Kimball
Assignee: Aaron Kimball
Priority: Minor


DBInputFormat.getSplits() does not call connection.commit() after the COUNT query. 
This can leave an open transaction against the database which interferes with 
other connections to the same table.




[jira] Commented: (MAPREDUCE-825) JobClient completion poll interval of 5s causes slow tests in local mode

2009-08-07 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740743#action_12740743
 ] 

Aaron Kimball commented on MAPREDUCE-825:
-

Failures are in streaming only.

> JobClient completion poll interval of 5s causes slow tests in local mode
> 
>
> Key: MAPREDUCE-825
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-825
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Minor
> Attachments: completion-poll-interval.patch, MAPREDUCE-825.2.patch
>
>
> The JobClient.NetworkedJob.waitForCompletion() method polls for job 
> completion every 5 seconds. When running a set of short tests in 
> pseudo-distributed mode, this is unnecessarily slow and causes lots of wasted 
> time. When bandwidth is not scarce, setting the poll interval to 100 ms 
> results in a 4x speedup in some tests.  This interval should be parametrized 
> to allow users to control the interval for testing purposes.
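
A minimal sketch of a completion loop with a caller-supplied poll interval, as suggested above. CompletionPoller and its parameters are hypothetical; this is not the JobClient code and it does not show how the interval would be read from configuration.

```java
import java.util.function.BooleanSupplier;

final class CompletionPoller {
    // Polls isComplete until it returns true, sleeping pollIntervalMillis
    // between checks. Returns the number of checks performed.
    static int waitForCompletion(BooleanSupplier isComplete,
                                 long pollIntervalMillis)
            throws InterruptedException {
        int polls = 1;
        while (!isComplete.getAsBoolean()) {
            Thread.sleep(pollIntervalMillis);
            polls++;
        }
        return polls;
    }
}
```

A test harness could pass a short interval such as 100 ms, while production callers keep a longer one to limit load on the job tracker.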




[jira] Commented: (MAPREDUCE-839) unit test TestMiniMRChildTask fails on mac os-x

2009-08-07 Thread Hong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740727#action_12740727
 ] 

Hong Tang commented on MAPREDUCE-839:
-

The problem was discovered on Mac OS-X, but I have tried to list root causes 
that could also affect other platforms:

Line 66:  assertEquals(tmp, new Path(System.getProperty("java.io.tmpdir")).makeQualified(localFs).toString());
expected = "file:/[private/]tmp/hadoop-htang/map...", actual = 
"file:/[]tmp/hadoop-htang/map...".
Root cause: on Mac OS-X, /tmp is a symlink to /private/tmp. The test would 
probably also fail on other Unix systems where /tmp is a symlink.

Line 160:   assertTrue("LD doesnt contain pwd", System.getenv("LD_LIBRARY_PATH").contains(pwd));
Root cause: the environment variable for the dynamic library path on Mac OS-X 
is DYLD_LIBRARY_PATH instead of LD_LIBRARY_PATH.
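
One possible way to make both checks platform-independent, sketched under assumptions: compare directories by canonical path so that symlinks such as /tmp are resolved, and choose the library-path variable from the os.name system property. PlatformTestHelpers and its methods are hypothetical and are not part of TestMiniMRChildTask.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Locale;

final class PlatformTestHelpers {
    // True if both paths resolve to the same location once symlinks
    // (for example /tmp -> /private/tmp) have been followed.
    static boolean sameDirectory(File first, File second) {
        try {
            return first.getCanonicalFile().equals(second.getCanonicalFile());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Name of the dynamic-library search path variable for the given
    // os.name value. Only Mac OS X and the LD_LIBRARY_PATH default are
    // handled in this sketch.
    static String libraryPathVariable(String osName) {
        boolean isMac = osName.toLowerCase(Locale.ROOT).startsWith("mac");
        return isMac ? "DYLD_LIBRARY_PATH" : "LD_LIBRARY_PATH";
    }
}
```

A test could then assert on System.getenv(PlatformTestHelpers.libraryPathVariable(System.getProperty("os.name"))) instead of hard-coding LD_LIBRARY_PATH.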


> unit test TestMiniMRChildTask fails on mac os-x
> ---
>
> Key: MAPREDUCE-839
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-839
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Hong Tang
>Priority: Minor
>
> The unit test TestMiniMRChildTask fails on Mac OS-X (10.5.8)




[jira] Commented: (MAPREDUCE-825) JobClient completion poll interval of 5s causes slow tests in local mode

2009-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740726#action_12740726
 ] 

Hadoop QA commented on MAPREDUCE-825:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12415772/MAPREDUCE-825.2.patch
  against trunk revision 801959.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/console

This message is automatically generated.

> JobClient completion poll interval of 5s causes slow tests in local mode
> 
>
> Key: MAPREDUCE-825
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-825
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Minor
> Attachments: completion-poll-interval.patch, MAPREDUCE-825.2.patch
>
>
> The JobClient.NetworkedJob.waitForCompletion() method polls for job 
> completion every 5 seconds. When running a set of short tests in 
> pseudo-distributed mode, this is unnecessarily slow and causes lots of wasted 
> time. When bandwidth is not scarce, setting the poll interval to 100 ms 
> results in a 4x speedup in some tests.  This interval should be parametrized 
> to allow users to control the interval for testing purposes.




[jira] Created: (MAPREDUCE-839) unit test TestMiniMRChildTask fails on mac os-x

2009-08-07 Thread Hong Tang (JIRA)
unit test TestMiniMRChildTask fails on mac os-x
---

 Key: MAPREDUCE-839
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-839
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Hong Tang
Priority: Minor


The unit test TestMiniMRChildTask fails on Mac OS-X (10.5.8)




[jira] Updated: (MAPREDUCE-838) Task succeeds even when committer.commitTask fails with IOException

2009-08-07 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-838:


Priority: Critical  (was: Major)

Could be a serious bug, raising priority.

> Task succeeds even when committer.commitTask fails with IOException
> ---
>
> Key: MAPREDUCE-838
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-838
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 0.20.1
>Reporter: Koji Noguchi
>Priority: Critical
>
> In MAPREDUCE-837, the job succeeded with empty output even though all the 
> tasks were throwing IOException at committer.commitTask.
> {noformat}
> 2009-08-07 17:51:47,458 INFO org.apache.hadoop.mapred.TaskRunner: Task 
> attempt_200907301448_8771_r_00_0 is allowed to commit now
> 2009-08-07 17:51:47,466 WARN org.apache.hadoop.mapred.TaskRunner: Failure 
> committing: java.io.IOException: Can not get the relative path: \
> base = 
> hdfs://mynamenode:8020/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0
>  \
> child = 
> hdfs://mynamenode/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0/_index
>   at 
> org.apache.hadoop.mapred.FileOutputCommitter.getFinalPath(FileOutputCommitter.java:150)
>   at 
> org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:106)
>   at 
> org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:126)
>   at 
> org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:86)
>   at 
> org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:171)
>   at org.apache.hadoop.mapred.Task.commit(Task.java:768)
>   at org.apache.hadoop.mapred.Task.done(Task.java:692)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)
> 2009-08-07 17:51:47,468 WARN org.apache.hadoop.mapred.TaskRunner: Failure 
> asking whether task can commit: java.io.IOException: \
> Can not get the relative path: base = 
> hdfs://mynamenode:8020/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0
>  \
> child = 
> hdfs://mynamenode/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0/_index
>   at 
> org.apache.hadoop.mapred.FileOutputCommitter.getFinalPath(FileOutputCommitter.java:150)
>   at 
> org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:106)
>   at 
> org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:126)
>   at 
> org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:86)
>   at 
> org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:171)
>   at org.apache.hadoop.mapred.Task.commit(Task.java:768)
>   at org.apache.hadoop.mapred.Task.done(Task.java:692)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)
> 2009-08-07 17:51:47,469 INFO org.apache.hadoop.mapred.TaskRunner: Task 
> attempt_200907301448_8771_r_00_0 is allowed to commit now
> 2009-08-07 17:51:47,472 INFO org.apache.hadoop.mapred.TaskRunner: Task 
> 'attempt_200907301448_8771_r_00_0' done.
> {noformat}




[jira] Commented: (MAPREDUCE-837) harchive fail when output directory has URI with default port of 8020

2009-08-07 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740697#action_12740697
 ] 

Koji Noguchi commented on MAPREDUCE-837:


bq. I'll create a separate Jira for the 0.20 job succeeding part.

Created MAPREDUCE-838

> harchive fail when output directory has URI with default port of 8020
> -
>
> Key: MAPREDUCE-837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-837
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Affects Versions: 0.20.1
>Reporter: Koji Noguchi
>Priority: Minor
>
> % hadoop archive -archiveName abc.har /user/knoguchi/abc 
> hdfs://mynamenode:8020/user/knoguchi
> doesn't work on 0.18 or 0.20




[jira] Created: (MAPREDUCE-838) Task succeeds even when committer.commitTask fails with IOException

2009-08-07 Thread Koji Noguchi (JIRA)
Task succeeds even when committer.commitTask fails with IOException
---

 Key: MAPREDUCE-838
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-838
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Affects Versions: 0.20.1
Reporter: Koji Noguchi


In MAPREDUCE-837, the job succeeded with empty output even though all the 
tasks were throwing IOException at committer.commitTask.

{noformat}
2009-08-07 17:51:47,458 INFO org.apache.hadoop.mapred.TaskRunner: Task 
attempt_200907301448_8771_r_00_0 is allowed to commit now
2009-08-07 17:51:47,466 WARN org.apache.hadoop.mapred.TaskRunner: Failure 
committing: java.io.IOException: Can not get the relative path: \
base = 
hdfs://mynamenode:8020/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0
 \
child = 
hdfs://mynamenode/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0/_index
  at 
org.apache.hadoop.mapred.FileOutputCommitter.getFinalPath(FileOutputCommitter.java:150)
  at 
org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:106)
  at 
org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:126)
  at 
org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:86)
  at 
org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:171)
  at org.apache.hadoop.mapred.Task.commit(Task.java:768)
  at org.apache.hadoop.mapred.Task.done(Task.java:692)
  at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
  at org.apache.hadoop.mapred.Child.main(Child.java:170)

2009-08-07 17:51:47,468 WARN org.apache.hadoop.mapred.TaskRunner: Failure 
asking whether task can commit: java.io.IOException: \
Can not get the relative path: base = 
hdfs://mynamenode:8020/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0
 \
child = 
hdfs://mynamenode/user/knoguchi/test2.har/_temporary/_attempt_200907301448_8771_r_00_0/_index
  at 
org.apache.hadoop.mapred.FileOutputCommitter.getFinalPath(FileOutputCommitter.java:150)
  at 
org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:106)
  at 
org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:126)
  at 
org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:86)
  at 
org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:171)
  at org.apache.hadoop.mapred.Task.commit(Task.java:768)
  at org.apache.hadoop.mapred.Task.done(Task.java:692)
  at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
  at org.apache.hadoop.mapred.Child.main(Child.java:170)

2009-08-07 17:51:47,469 INFO org.apache.hadoop.mapred.TaskRunner: Task 
attempt_200907301448_8771_r_00_0 is allowed to commit now
2009-08-07 17:51:47,472 INFO org.apache.hadoop.mapred.TaskRunner: Task 
'attempt_200907301448_8771_r_00_0' done.


{noformat}





[jira] Created: (MAPREDUCE-837) harchive fail when output directory has URI with default port of 8020

2009-08-07 Thread Koji Noguchi (JIRA)
harchive fail when output directory has URI with default port of 8020
-

 Key: MAPREDUCE-837
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-837
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: harchive
Affects Versions: 0.20.1
Reporter: Koji Noguchi
Priority: Minor


% hadoop archive -archiveName abc.har /user/knoguchi/abc 
hdfs://mynamenode:8020/user/knoguchi

doesn't work on 0.18 or 0.20





[jira] Commented: (MAPREDUCE-837) harchive fail when output directory has URI with default port of 8020

2009-08-07 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740692#action_12740692
 ] 

Koji Noguchi commented on MAPREDUCE-837:


hadoop archive -archiveName abc.har /user/knoguchi/abc 
hdfs://mynamenode:8020/user/knoguchi

In 0.18, the job fails with
{noformat}
09/08/07 19:41:57 INFO mapred.JobClient: Task Id :
attempt_200908071938_0001_m_00_2, Status : FAILED
Failed to rename output with the exception: java.io.IOException: Can not get the
relative path: base =
hdfs://mynamenode:8020/user/knoguchi/abc.har/_temporary/_attempt_200908071938_0001_m_00_2
child =
hdfs://mynamenode/user/knoguchi/abc.har/_temporary/_attempt_200908071938_0001_m_00_2/part-0
at org.apache.hadoop.mapred.Task.getFinalPath(Task.java:590)
at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:603)
at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:621)
at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:565)
at
org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2616)
{noformat}

In 0.20, it logs the above as a warning, but the job succeeds with an empty 
output directory (which is worse).

I'll create a separate Jira for the 0.20 job succeeding part.
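
The mismatch in the exception above (base with ":8020", child without a port) can be reproduced outside Hadoop. The following is a minimal sketch using only java.net.URI; it is not the code in Task.getFinalPath. The class name, the helper names, and the choice of 8020 as the default port are assumptions made for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class DefaultPortDemo {
    // Assumption for this sketch: 8020 is the file system's default port.
    static final int ASSUMED_DEFAULT_PORT = 8020;

    /** Returns the URI with the assumed default port filled in when none is given. */
    static URI withExplicitPort(URI u) {
        if (u.getPort() != -1) {
            return u;
        }
        try {
            return new URI(u.getScheme(), u.getUserInfo(), u.getHost(),
                    ASSUMED_DEFAULT_PORT, u.getPath(), u.getQuery(), u.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Returns the path of child relative to base, or null when the two do not relate. */
    static String relativePath(URI base, URI child) {
        URI rel = base.relativize(child);
        // relativize() returns the child unchanged when scheme or authority differ.
        return rel.isAbsolute() ? null : rel.getPath();
    }

    public static void main(String[] args) {
        URI base = URI.create("hdfs://mynamenode:8020/user/knoguchi/abc.har/_temporary");
        URI child = URI.create("hdfs://mynamenode/user/knoguchi/abc.har/_temporary/part-0");
        // Authorities differ ("mynamenode:8020" vs "mynamenode"), so no relative path is found.
        System.out.println(relativePath(base, child));
        // After normalizing both sides, the child resolves relative to the base.
        System.out.println(relativePath(withExplicitPort(base), withExplicitPort(child)));
    }
}
```

Normalizing both URIs before comparing them is only one possible remedy; the actual fix may take a different approach.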



> harchive fail when output directory has URI with default port of 8020
> -
>
> Key: MAPREDUCE-837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-837
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Affects Versions: 0.20.1
>Reporter: Koji Noguchi
>Priority: Minor
>
> % hadoop archive -archiveName abc.har /user/knoguchi/abc 
> hdfs://mynamenode:8020/user/knoguchi
> doesn't work on 0.18 nor 0.20




[jira] Commented: (MAPREDUCE-375) Change org.apache.hadoop.mapred.lib.NLineInputFormat and org.apache.hadoop.mapred.MapFileOutputFormat to use new api.

2009-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740657#action_12740657
 ] 

Hudson commented on MAPREDUCE-375:
--

Integrated in Hadoop-Mapreduce-trunk #41 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/41/])
. Change org.apache.hadoop.mapred.lib.NLineInputFormat and 
org.apache.hadoop.mapred.MapFileOutputFormat to use new api. Contributed by 
Amareshwari Sriramadasu.


>  Change org.apache.hadoop.mapred.lib.NLineInputFormat and 
> org.apache.hadoop.mapred.MapFileOutputFormat to use new api.
> --
>
> Key: MAPREDUCE-375
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-375
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-375-1.txt, patch-375-2.txt, patch-375.txt
>
>





[jira] Commented: (MAPREDUCE-796) Encountered "ClassCastException" on tasktracker while running wordcount with MultithreadedMapRunner

2009-08-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740658#action_12740658
 ] 

Hudson commented on MAPREDUCE-796:
--

Integrated in Hadoop-Mapreduce-trunk #41 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/41/])
. Fixes a ClassCastException in an exception log in MultiThreadedMapRunner. 
Contributed by Amar Kamat.
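
The stack trace quoted in this issue reports that java.lang.OutOfMemoryError cannot be cast to java.lang.RuntimeException. The sketch below shows how that kind of failure can arise and one common way to avoid it. It is not the committed patch; the class and method names are hypothetical.

```java
final class RethrowDemo {
    /** Mirrors the failing pattern: assumes every stored failure is a RuntimeException. */
    static void rethrowUnchecked(Throwable t) {
        throw (RuntimeException) t; // ClassCastException when t is an Error
    }

    /** Rethrows without a blind cast: Errors and RuntimeExceptions pass through unchanged. */
    static void rethrowSafely(Throwable t) {
        if (t instanceof RuntimeException) {
            throw (RuntimeException) t;
        }
        if (t instanceof Error) {
            throw (Error) t;
        }
        throw new RuntimeException(t); // wrap checked exceptions
    }

    public static void main(String[] args) {
        Throwable failure = new OutOfMemoryError("simulated");
        try {
            rethrowUnchecked(failure);
        } catch (ClassCastException e) {
            System.out.println("blind cast failed: " + e.getClass().getSimpleName());
        }
        try {
            rethrowSafely(failure);
        } catch (OutOfMemoryError e) {
            System.out.println("original error preserved: " + e.getMessage());
        }
    }
}
```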


> Encountered "ClassCastException" on tasktracker while running wordcount with 
> MultithreadedMapRunner
> ---
>
> Key: MAPREDUCE-796
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-796
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Affects Versions: 0.20.1
>Reporter: Suman Sehgal
>Assignee: Amar Kamat
> Fix For: 0.20.1
>
> Attachments: MAPREDUCE-796-v1.0.patch
>
>
> ClassCastException for OutOfMemoryError is encountered on tasktracker while 
> running wordcount example with MultithreadedMapRunner. 
> Stack trace :
> =
> java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to 
> java.lang.RuntimeException
>   at 
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)




[jira] Commented: (MAPREDUCE-814) Move completed Job history files to HDFS

2009-08-07 Thread Sharad Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740613#action_12740613
 ] 

Sharad Agarwal commented on MAPREDUCE-814:
--

test-patch and ant test passed.

> Move completed Job history files to HDFS
> 
>
> Key: MAPREDUCE-814
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-814
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: jobtracker
>Reporter: Sharad Agarwal
>Assignee: Sharad Agarwal
> Attachments: 814_v1.patch, 814_v2.patch, 814_v3.patch, 814_v4.patch, 
> 814_v5.patch
>
>
> Currently completed job history files remain on the jobtracker node. Having 
> the files available on HDFS will enable clients to access these files more 
> easily.




[jira] Commented: (MAPREDUCE-798) MRUnit should be able to test a succession of MapReduce passes

2009-08-07 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740609#action_12740609
 ] 

Aaron Kimball commented on MAPREDUCE-798:
-

The test failures are in streaming.

> MRUnit should be able to test a succession of MapReduce passes
> --
>
> Key: MAPREDUCE-798
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-798
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-798.2.patch, MAPREDUCE-798.patch
>
>
> MRUnit can currently test that the inputs to a given (mapper, reducer) "job" 
> produce certain outputs at the end of the reducer. It would be good to 
> support more end-to-end tests of a series of MapReduce jobs that form a 
> longer pipeline surrounding some data.




[jira] Commented: (MAPREDUCE-800) MRUnit should support the new API

2009-08-07 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740606#action_12740606
 ] 

Aaron Kimball commented on MAPREDUCE-800:
-

The failures are in streaming.

> MRUnit should support the new API
> -
>
> Key: MAPREDUCE-800
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-800
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-800.2.patch, MAPREDUCE-800.patch
>
>
> MRUnit's TestDriver implementations use the old 
> org.apache.hadoop.mapred-based classes. TestDrivers and associated mock 
> object implementations are required for org.apache.hadoop.mapreduce-based 
> code.




[jira] Commented: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-08-07 Thread Jiaqi Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740602#action_12740602
 ] 

Jiaqi Tan commented on MAPREDUCE-479:
-

The QA tests failed because of another patch that was in the same build; this 
patch did not change anything in mapred.lib, but the failed tests were in 
mapred.lib.

> Add reduce ID to shuffle clienttrace
> 
>
> Key: MAPREDUCE-479
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-479
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
>Reporter: Jiaqi Tan
>Assignee: Jiaqi Tan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HADOOP-6013.patch, MAPREDUCE-479-1.patch, 
> MAPREDUCE-479-2.patch, MAPREDUCE-479-3.patch, MAPREDUCE-479-4.patch, 
> MAPREDUCE-479.patch
>
>
> Current clienttrace messages from shuffles note only the destination map ID 
> but not the source reduce ID. Having both source and destination ID of each 
> shuffle enables full tracing of execution. 




[jira] Commented: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740584#action_12740584
 ] 

Hadoop QA commented on MAPREDUCE-479:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12415747/MAPREDUCE-479-4.patch
  against trunk revision 801959.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/453/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/453/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/453/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/453/console

This message is automatically generated.

> Add reduce ID to shuffle clienttrace
> 
>
> Key: MAPREDUCE-479
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-479
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
>Reporter: Jiaqi Tan
>Assignee: Jiaqi Tan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HADOOP-6013.patch, MAPREDUCE-479-1.patch, 
> MAPREDUCE-479-2.patch, MAPREDUCE-479-3.patch, MAPREDUCE-479-4.patch, 
> MAPREDUCE-479.patch
>
>
> Current clienttrace messages from shuffles note only the destination map ID 
> but not the source reduce ID. Having both source and destination ID of each 
> shuffle enables full tracing of execution. 




[jira] Updated: (MAPREDUCE-805) Deadlock in Jobtracker

2009-08-07 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-805:
-

Attachment: MAPREDUCE-805-v1.7.patch

Attaching a patch incorporating Devaraj's offline comments. Result of test-patch:
 [exec] +1 overall.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 21 new or modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.


> Deadlock in Jobtracker
> --
>
> Key: MAPREDUCE-805
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-805
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Michael Tamm
> Attachments: MAPREDUCE-805-v1.1.patch, MAPREDUCE-805-v1.2.patch, 
> MAPREDUCE-805-v1.3.patch, MAPREDUCE-805-v1.6.patch, MAPREDUCE-805-v1.7.patch
>
>
> We are running a hadoop cluster (version 0.20.0) and have detected the 
> following deadlock on our jobtracker:
> {code}
> "IPC Server handler 51 on 9001":
>   at 
> org.apache.hadoop.mapred.JobInProgress.getCounters(JobInProgress.java:943)
>   - waiting to lock <0x7f2b6fb46130> (a 
> org.apache.hadoop.mapred.JobInProgress)
>   at 
> org.apache.hadoop.mapred.JobTracker.getJobCounters(JobTracker.java:3102)
>   - locked <0x7f2b5f026000> (a org.apache.hadoop.mapred.JobTracker)
>   at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>  "pool-1-thread-2":
>   at org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:2017)
>   - waiting to lock <0x7f2b5f026000> (a 
> org.apache.hadoop.mapred.JobTracker)
>   at 
> org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:2483)
>   - locked <0x7f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
>   at 
> org.apache.hadoop.mapred.JobInProgress.terminateJob(JobInProgress.java:2152)
>   - locked <0x7f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
>   at 
> org.apache.hadoop.mapred.JobInProgress.terminate(JobInProgress.java:2169)
>   - locked <0x7f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
>   at org.apache.hadoop.mapred.JobInProgress.fail(JobInProgress.java:2245)
>   - locked <0x7f2b6fb46130> (a org.apache.hadoop.mapred.JobInProgress)
>   at 
> org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:86)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:619)
> {code}
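
The trace above shows a lock-order inversion: one thread holds the JobTracker monitor and waits for the JobInProgress monitor, while the other holds JobInProgress and waits for JobTracker. A minimal sketch of one common remedy, always acquiring the two locks in the same order, follows. It is not the attached patch; the class, the two lock objects, and the method names are stand-ins invented for illustration.

```java
final class LockOrderDemo {
    // Stand-ins for the JobTracker and JobInProgress monitors in the trace.
    private final Object trackerLock = new Object();
    private final Object jobLock = new Object();
    private int counter = 0;

    /** Reader path: takes the tracker lock first, then the job lock. */
    int getCounter() {
        synchronized (trackerLock) {
            synchronized (jobLock) {
                return counter;
            }
        }
    }

    /** Failure path: uses the same order instead of taking the job lock first. */
    void failJob() {
        synchronized (trackerLock) {
            synchronized (jobLock) {
                counter++;
            }
        }
    }

    /** Runs both paths concurrently; returns true if both threads finish in time. */
    static boolean runBoth(int iterations) {
        LockOrderDemo demo = new LockOrderDemo();
        Thread reader = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                demo.getCounter();
            }
        });
        Thread failer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                demo.failJob();
            }
        });
        reader.start();
        failer.start();
        try {
            reader.join(5000);
            failer.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !reader.isAlive() && !failer.isAlive() && demo.getCounter() == iterations;
    }

    public static void main(String[] args) {
        System.out.println(runBoth(10_000));
    }
}
```

Because both paths take trackerLock before jobLock, neither thread can hold one lock while waiting on the other in the opposite order.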




[jira] Updated: (MAPREDUCE-835) hadoop-mapred examples,test and tools jar iles are being packaged when ant binary or bin-package is used

2009-08-07 Thread Karam Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karam Singh updated MAPREDUCE-835:
--

Description: 
When checking mapreduce trunk.
If run ant binary or ant bin-package commands-:
hadoop-mapred-test-0.21.0-dev.jar, hadoop-mapred-examples-0.21.0-dev.jar, 
hadoop-mapred-tools-0.21.0-dev.jar are being packaged in tar or 
build/hadoop-mapred-0.21.0-dev package directory. But they are present under 
build directory.

For ant tar and ant package they are being packaged correctly under the 
build/hadoop-mapred-0.21.0-dev directory and in the tar file.

  was:
When checking mapreduce trunk.
If run ant binary or ant bin-package commands-:
hadoop-mapred-test-0.21.0-dev.jar, hadoop-mapred-examples-0.21.0-dev.jar, 
hadoop-mapred-tools-0.21.0-dev.jar are being in tar or 
build/hadoop-mapred-0.21.0-dev packe directory. But they present under build 
directory.

For ant tar and ant package they are being packaged correclty. 
buid/hadoop-mapred-0.21.0-dev directory.

Summary: hadoop-mapred examples,test and tools jar iles are being 
packaged when ant binary or bin-package is used   (was: hadoop-mapred 
examples,test and tools jar iles are being packaged when ant binary or 
bin-package)

> hadoop-mapred examples,test and tools jar iles are being packaged when ant 
> binary or bin-package is used 
> -
>
> Key: MAPREDUCE-835
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-835
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Karam Singh
>
> When checking mapreduce trunk.
> If run ant binary or ant bin-package commands-:
> hadoop-mapred-test-0.21.0-dev.jar, hadoop-mapred-examples-0.21.0-dev.jar, 
> hadoop-mapred-tools-0.21.0-dev.jar are being packaged in tar or 
> build/hadoop-mapred-0.21.0-dev package directory. But they are present under 
> build directory.
> For ant tar and ant package they are being packaged correctly under the 
> build/hadoop-mapred-0.21.0-dev directory and in the tar file.




[jira] Created: (MAPREDUCE-836) Examples of hadoop pipes a even when -Dcompile.native=yes -Dcompile.c++=yes option are used while running ant package or tar or similar commands.

2009-08-07 Thread Karam Singh (JIRA)
Examples of hadoop pipes a even when -Dcompile.native=yes -Dcompile.c++=yes 
option are used while running ant package or tar or similar commands.
-

 Key: MAPREDUCE-836
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-836
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Affects Versions: 0.20.1, 0.21.0
Reporter: Karam Singh


Examples of hadoop pipes and python are not packaged even when the 
-Dcompile.native=yes -Dcompile.c++=yes options are used while running ant 
package, tar, or similar commands. 
The pipes examples are compiled and copied under build/c++-examples but are not 
being packaged. The same is the case with the python examples.




[jira] Created: (MAPREDUCE-835) hadoop-mapred examples,test and tools jar iles are being packaged when ant binary or bin-package

2009-08-07 Thread Karam Singh (JIRA)
hadoop-mapred examples,test and tools jar iles are being packaged when ant 
binary or bin-package


 Key: MAPREDUCE-835
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-835
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Karam Singh


When checking mapreduce trunk.
If run ant binary or ant bin-package commands-:
hadoop-mapred-test-0.21.0-dev.jar, hadoop-mapred-examples-0.21.0-dev.jar, 
hadoop-mapred-tools-0.21.0-dev.jar are being in tar or 
build/hadoop-mapred-0.21.0-dev packe directory. But they present under build 
directory.

For ant tar and ant package they are being packaged correclty. 
buid/hadoop-mapred-0.21.0-dev directory.




[jira] Updated: (MAPREDUCE-757) JobConf will not be deleted from the logs folder if job retires from finalizeJob()

2009-08-07 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-757:
-

Attachment: MAPREDUCE-757-v2.0-branch-0.20.patch

Attaching a patch for branch 0.20.

> JobConf will not be deleted from the logs folder if job retires from 
> finalizeJob()
> --
>
> Key: MAPREDUCE-757
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-757
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Attachments: MAPREDUCE-757-v1.0.patch, 
> MAPREDUCE-757-v2.0-branch-0.20.patch, MAPREDUCE-757-v2.0.patch
>
>
> MAPREDUCE-130 fixed the case where the job is retired from the retire jobs 
> thread. But jobs can also retire when the num-job-per-user limit is exceeded. 
> In such cases the conf file will not be deleted.




[jira] Reopened: (MAPREDUCE-757) JobConf will not be deleted from the logs folder if job retires from finalizeJob()

2009-08-07 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat reopened MAPREDUCE-757:
--


> JobConf will not be deleted from the logs folder if job retires from 
> finalizeJob()
> --
>
> Key: MAPREDUCE-757
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-757
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Attachments: MAPREDUCE-757-v1.0.patch, 
> MAPREDUCE-757-v2.0-branch-0.20.patch, MAPREDUCE-757-v2.0.patch
>
>
> MAPREDUCE-130 fixed the case where the job is retired from the retire jobs 
> thread. But jobs can also retire when the num-job-per-user limit is exceeded. 
> In such cases the conf file will not be deleted.




[jira] Updated: (MAPREDUCE-375) Change org.apache.hadoop.mapred.lib.NLineInputFormat and org.apache.hadoop.mapred.MapFileOutputFormat to use new api.

2009-08-07 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated MAPREDUCE-375:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amareshwari!

>  Change org.apache.hadoop.mapred.lib.NLineInputFormat and 
> org.apache.hadoop.mapred.MapFileOutputFormat to use new api.
> --
>
> Key: MAPREDUCE-375
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-375
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-375-1.txt, patch-375-2.txt, patch-375.txt
>
>





[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

2009-08-07 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740516#action_12740516
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-370:
---

bq. To achieve this, I think we could port MultipleOutputs, and change the 
semantics of getCollector() in the multi name case, so that the multi name is 
the full name of the name of the output file. This method is typically invoked 
in the reduce() method, where the key and value are available, and can be used 
to form the name.

If we do this, it will remove the generate* methods from the proposed api, and 
the api for writing would look like:
{code}
public void write(String namedOutput, K key, V value, String outputPath)
    throws IOException, InterruptedException;
public void write(String namedOutput, K key, V value)
    throws IOException, InterruptedException;
public void write(K key, V value, String outputPath)
    throws IOException, InterruptedException;
{code}

Let me know if this looks fine.
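
To make the three write() overloads above concrete, here is a small in-memory sketch of how a caller might use them. It is a hypothetical stand-in and not the Hadoop class: the class name MultiWriteSketch, the class-level type parameters, the recordsFor accessor, and the choice to store records in a map are all assumptions, and the IOException/InterruptedException declarations are left out because nothing here does I/O.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the proposed write() overloads. */
final class MultiWriteSketch<K, V> {
    // Records grouped by "namedOutput:outputPath"; an empty part means "not given".
    private final Map<String, List<String>> records = new LinkedHashMap<>();

    public void write(String namedOutput, K key, V value, String outputPath) {
        records.computeIfAbsent(namedOutput + ":" + outputPath, k -> new ArrayList<>())
               .add(key + "\t" + value);
    }

    public void write(String namedOutput, K key, V value) {
        write(namedOutput, key, value, "");
    }

    public void write(K key, V value, String outputPath) {
        write("", key, value, outputPath);
    }

    /** Returns the records written for the given named output and path. */
    public List<String> recordsFor(String namedOutput, String outputPath) {
        return records.getOrDefault(namedOutput + ":" + outputPath, new ArrayList<>());
    }

    public static void main(String[] args) {
        MultiWriteSketch<String, Integer> out = new MultiWriteSketch<>();
        out.write("errors", "job_1", 3, "2009/08/07"); // named output plus explicit path
        out.write("errors", "job_2", 1);               // named output only
        out.write("job_3", 7, "2009/08/07");           // explicit path only
        System.out.println(out.recordsFor("errors", "2009/08/07"));
    }
}
```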

> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> ---
>
> Key: MAPREDUCE-370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Attachments: patch-370.txt
>
>





[jira] Commented: (MAPREDUCE-478) separate jvm param for mapper and reducer

2009-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740513#action_12740513
 ] 

Hadoop QA commented on MAPREDUCE-478:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12415805/MAPREDUCE-478_1_20090806_yhadoop20.patch
  against trunk revision 801954.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 19 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/452/console

This message is automatically generated.

> separate jvm param for mapper and reducer
> -
>
> Key: MAPREDUCE-478
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-478
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Koji Noguchi
>Assignee: Arun C Murthy
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HADOOP-5684_0_20090420.patch, 
> MAPREDUCE-478_0_20090804.patch, MAPREDUCE-478_0_20090804_yhadoop20.patch, 
> MAPREDUCE-478_1_20090806.patch, MAPREDUCE-478_1_20090806_yhadoop20.patch
>
>
> Memory footprint of mapper and reducer can differ. 
> It would be nice if we can pass different jvm param (mapred.child.java.opts) 
> for mappers and reducers.
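
One way to picture the request is a lookup that prefers a task-type-specific setting and falls back to the shared mapred.child.java.opts value. The sketch below uses a plain Map in place of a job configuration. The two per-task-type key names are invented for this example and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

final class ChildOptsSketch {
    // Existing shared key named in the issue description.
    static final String SHARED_KEY = "mapred.child.java.opts";
    // Hypothetical per-task-type keys, invented for this sketch.
    static final String MAP_KEY = "example.map.child.java.opts";
    static final String REDUCE_KEY = "example.reduce.child.java.opts";

    /** Returns the task-type-specific options if set, else the shared options. */
    static String optsFor(Map<String, String> conf, boolean isMap) {
        String specific = conf.get(isMap ? MAP_KEY : REDUCE_KEY);
        if (specific != null) {
            return specific;
        }
        return conf.getOrDefault(SHARED_KEY, "");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(SHARED_KEY, "-Xmx200m");
        conf.put(REDUCE_KEY, "-Xmx1024m"); // reducers need more heap in this example
        System.out.println("map:    " + optsFor(conf, true));
        System.out.println("reduce: " + optsFor(conf, false));
    }
}
```

Falling back to the shared key keeps existing job configurations working unchanged, which is a design choice of this sketch rather than something stated in the issue.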




[jira] Commented: (MAPREDUCE-750) Extensible ConnManager factory API

2009-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740511#action_12740511
 ] 

Hadoop QA commented on MAPREDUCE-750:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12415690/MAPREDUCE-750.2.patch
  against trunk revision 801517.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/451/console

This message is automatically generated.

> Extensible ConnManager factory API
> --
>
> Key: MAPREDUCE-750
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-750
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-750.2.patch, MAPREDUCE-750.patch
>
>
> Sqoop uses the ConnFactory class to instantiate a ConnManager implementation 
> based on the connect string and other arguments supplied by the user. This 
> allows per-database logic to be encapsulated in different ConnManager 
> instances, and dynamically chosen based on which database the user is 
> actually importing from. But adding new ConnManager implementations requires 
> modifying the source of a common ConnFactory class. An indirection layer 
> should be used to delegate instantiation to a number of factory 
> implementations which can be specified in the static configuration or at 
> runtime.
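
The indirection layer described above can be sketched as a chain of pluggable factories that are tried in order. This is an illustration only and not Sqoop's code: the nested ConnManager and ManagerFactory interfaces, the accept and forPrefix methods, and the prefix matching rule are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class FactoryChainSketch {
    /** Hypothetical stand-in for a per-database connection manager. */
    interface ConnManager {
        String describe();
    }

    /** Hypothetical pluggable factory: returns null for connect strings it does not handle. */
    interface ManagerFactory {
        ConnManager accept(String connectString);
    }

    private final List<ManagerFactory> factories = new ArrayList<>();

    /** Registers a factory; in a real system the list could come from configuration. */
    void register(ManagerFactory factory) {
        factories.add(factory);
    }

    /** Asks each factory in registration order and returns the first match. */
    ConnManager getManager(String connectString) {
        for (ManagerFactory factory : factories) {
            ConnManager manager = factory.accept(connectString);
            if (manager != null) {
                return manager;
            }
        }
        throw new IllegalArgumentException("No manager for connect string: " + connectString);
    }

    /** Convenience factory that matches on a connect-string prefix. */
    static ManagerFactory forPrefix(String prefix, String name) {
        return connectString -> {
            if (!connectString.startsWith(prefix)) {
                return null;
            }
            return () -> name;
        };
    }

    public static void main(String[] args) {
        FactoryChainSketch chain = new FactoryChainSketch();
        chain.register(forPrefix("jdbc:mysql:", "mysql-specific manager"));
        chain.register(forPrefix("jdbc:", "generic jdbc manager"));
        System.out.println(chain.getManager("jdbc:mysql://host/db").describe());
        System.out.println(chain.getManager("jdbc:postgresql://host/db").describe());
    }
}
```

New database support is then added by registering another factory, without editing the class that performs the lookup.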




[jira] Resolved: (MAPREDUCE-796) Encountered "ClassCastException" on tasktracker while running wordcount with MultithreadedMapRunner

2009-08-07 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das resolved MAPREDUCE-796.
---

   Resolution: Fixed
Fix Version/s: 0.20.1
 Hadoop Flags: [Reviewed]

I just committed this. Thanks, Amar!

> Encountered "ClassCastException" on tasktracker while running wordcount with 
> MultithreadedMapRunner
> ---
>
> Key: MAPREDUCE-796
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-796
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Affects Versions: 0.20.1
>Reporter: Suman Sehgal
>Assignee: Amar Kamat
> Fix For: 0.20.1
>
> Attachments: MAPREDUCE-796-v1.0.patch
>
>
> ClassCastException for OutOfMemoryError is encountered on tasktracker while 
> running wordcount example with MultithreadedMapRunner. 
> Stack trace :
> =
> java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to 
> java.lang.RuntimeException
>   at 
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)




[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

2009-08-07 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740509#action_12740509
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-370:
---

bq. To achieve this, I think we could port MultipleOutputs, and change the 
semantics of getCollector() in the multi name case, so that the multi name is 
the full name of the name of the output file. This method is typically invoked 
in the reduce() method, where the key and value are available, and can be used 
to form the name.
Tom, are you saying that we should not have a protected generateOutputName() 
method that could be overridden to provide this functionality? If so, we need a 
way to find out whether the name is a namedOutput (I meant multiNamedOutputs) 
or an arbitrary name, so that we know which output format should be used for 
writing.
We should have something like:
{code}
  public  void write(String namedOutput, String outputPath, K key, V value)
  throws IOException, InterruptedException;
  public  void write(String outputPath, K key, V value)
  throws IOException, InterruptedException;
{code}

bq. Applications that want to add a unique suffix can call 
FileOutputFormat#getUniqueFile() themselves.
This should be done by the framework to support counters, as explained earlier.

> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> ---
>
> Key: MAPREDUCE-370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Attachments: patch-370.txt
>
>





[jira] Updated: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

2009-08-07 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-370:
--

Attachment: patch-370.txt

Attaching an early patch.

Patch does the following:
1. Adds an api in org.apache.hadoop.mapreduce.lib.output.FileOutputFormat to 
get a RecordWriter for a given filename. The current api does not support 
passing a filename.

2. Adds org.apache.hadoop.mapreduce.lib.output.MultipleOutputs with the 
following api:
{code}
public class MultipleOutputs {

  public MultipleOutputs(TaskInputOutputContext context);

  // Adds a named output for the job.
  public static void addNamedOutput(Job job, String namedOutput,
      Class outputFormatClass,
      Class keyClass, Class valueClass);

  // Enables counters for named outputs.
  public static void setCountersEnabled(Job job, boolean enabled);

  // Writes to a named output.
  // Writes to an output file name that depends on key, value, context and
  // namedOutput.
  // Gets the record writer from the output format added for the named output.
  public void write(String namedOutput, K key, V value)
      throws IOException, InterruptedException;

  // Writes to an output file name that depends on key, value and context.
  // Gets the record writer from the job's output format.
  // The job's output format should be a FileOutputFormat.
  public void write(KEYOUT key, VALUEOUT value)
      throws IOException, InterruptedException;

  protected String generateOutputName(K key, V value,
      TaskAttemptContext context, String name);

  protected K generateActualKey(K key, V value);
  protected V generateActualValue(K key, V value);
}
{code}

Users can add named outputs, with the corresponding OutputFormat and output 
key/value types, using addNamedOutput. 
The generateOutputName api can be overridden by the user to give the final 
output name. This gives the user complete control of the output name. A unique 
file name can be generated once the user gives this name (this can be done in 
the framework itself), as done in the patch. This lets the existing counter 
feature count the number of records written to each output name. The same 
method can be used to plug in the functionality of multiNamedOutputs.

I have illustrated the use of the api in the added test case. 

3. Deprecates org.apache.hadoop.mapred.lib.Multiple*Output*
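
The overridable output-name hook and the per-output record counts described in item 2 can be sketched without Hadoop. This is a minimal stand-in, not the class in the patch: NamedOutputRouter, FirstLetterRouter and their methods are hypothetical names, records are kept in memory instead of being passed to a RecordWriter, and keys are assumed to be non-empty strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the proposed MultipleOutputs; not the Hadoop class.
class NamedOutputRouter {
  // Records written so far, grouped by final output name.
  private final Map<String, List<String>> outputs = new LinkedHashMap<>();
  // Number of records written to each final output name.
  private final Map<String, Long> counters = new LinkedHashMap<>();

  // Hook that subclasses override to choose the final output name.
  // The default keeps the named output unchanged.
  protected String generateOutputName(String key, String value, String name) {
    return name;
  }

  public void write(String namedOutput, String key, String value) {
    String outputName = generateOutputName(key, value, namedOutput);
    outputs.computeIfAbsent(outputName, n -> new ArrayList<>())
        .add(key + "\t" + value);
    counters.merge(outputName, 1L, Long::sum);
  }

  public long count(String outputName) {
    return counters.getOrDefault(outputName, 0L);
  }

  public List<String> records(String outputName) {
    return outputs.getOrDefault(outputName, new ArrayList<>());
  }
}

// Example override: route each record by the first letter of its key.
// Assumes the key is non-empty.
class FirstLetterRouter extends NamedOutputRouter {
  @Override
  protected String generateOutputName(String key, String value, String name) {
    return name + "/" + key.substring(0, 1);
  }
}

class NamedOutputRouterDemo {
  public static void main(String[] args) {
    NamedOutputRouter router = new FirstLetterRouter();
    router.write("text", "apple", "1");
    router.write("text", "avocado", "2");
    router.write("text", "banana", "3");
    System.out.println(router.count("text/a")); // prints 2
    System.out.println(router.count("text/b")); // prints 1
  }
}
```

Because the counter is keyed by the name the hook returns, counting per final output name needs no extra bookkeeping from the user.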



> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> ---
>
> Key: MAPREDUCE-370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Attachments: patch-370.txt
>
>





[jira] Updated: (MAPREDUCE-814) Move completed Job history files to HDFS

2009-08-07 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal updated MAPREDUCE-814:
-

Attachment: 814_v5.patch

Incorporated Devaraj's offline comments and minimized the jobtracker init 
changes. Now passing the filesystem handle in JobHistory#getJobHistoryFileName.

> Move completed Job history files to HDFS
> 
>
> Key: MAPREDUCE-814
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-814
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: jobtracker
>Reporter: Sharad Agarwal
>Assignee: Sharad Agarwal
> Attachments: 814_v1.patch, 814_v2.patch, 814_v3.patch, 814_v4.patch, 
> 814_v5.patch
>
>
> Currently completed job history files remain on the jobtracker node. Having 
> the files available on HDFS will enable clients to access these files more 
> easily.




[jira] Commented: (MAPREDUCE-800) MRUnit should support the new API

2009-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740498#action_12740498
 ] 

Hadoop QA commented on MAPREDUCE-800:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12415679/MAPREDUCE-800.2.patch
  against trunk revision 801517.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 14 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/450/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/450/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/450/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/450/console

This message is automatically generated.

> MRUnit should support the new API
> -
>
> Key: MAPREDUCE-800
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-800
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-800.2.patch, MAPREDUCE-800.patch
>
>
> MRUnit's TestDriver implementations use the old 
> org.apache.hadoop.mapred-based classes. TestDrivers and associated mock 
> object implementations are required for org.apache.hadoop.mapreduce-based 
> code.




[jira] Updated: (MAPREDUCE-779) Add node health failures into JobTrackerStatistics

2009-08-07 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan updated MAPREDUCE-779:
-

Status: Patch Available  (was: Open)

> Add node health failures into JobTrackerStatistics
> --
>
> Key: MAPREDUCE-779
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-779
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Sreekanth Ramakrishnan
>Assignee: Sreekanth Ramakrishnan
> Attachments: mapreduce-779-1.patch, mapreduce-779-2.patch, 
> mapreduce-779-3.patch, mapreduce-779-4.patch
>
>
> Add the node health failure counts into {{JobTrackerStatistics}}.




[jira] Commented: (MAPREDUCE-767) to remove mapreduce dependency on commons-cli2

2009-08-07 Thread Amar Kamat (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740496#action_12740496
 ] 

Amar Kamat commented on MAPREDUCE-767:
--

Tested this patch with examples mentioned in [streaming 
docs|http://hadoop.apache.org/common/docs/r0.20.0/streaming.html]. All cases 
seem to pass. Doing further testing.

> to remove mapreduce dependency on commons-cli2
> --
>
> Key: MAPREDUCE-767
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-767
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/streaming
>Reporter: Giridharan Kesavan
>Assignee: Amar Kamat
> Attachments: MAPREDUCE-767-v1.1.patch
>
>
> mapreduce, streaming and eclipse plugin depend on commons-cli2 




[jira] Commented: (MAPREDUCE-796) Encountered "ClassCastException" on tasktracker while running wordcount with MultithreadedMapRunner

2009-08-07 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740493#action_12740493
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-796:
---

+1

> Encountered "ClassCastException" on tasktracker while running wordcount with 
> MultithreadedMapRunner
> ---
>
> Key: MAPREDUCE-796
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-796
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Affects Versions: 0.20.1
>Reporter: Suman Sehgal
>Assignee: Amar Kamat
> Attachments: MAPREDUCE-796-v1.0.patch
>
>
> ClassCastException for OutOfMemoryError is encountered on tasktracker while 
> running wordcount example with MultithreadedMapRunner. 
> Stack trace :
> =
> java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to 
> java.lang.RuntimeException
>   at 
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)




[jira] Updated: (MAPREDUCE-832) Too many WARN messages about deprecated memorty config variables in JobTacker log

2009-08-07 Thread Karam Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karam Singh updated MAPREDUCE-832:
--

Summary: Too many WARN messages about deprecated memorty config variables 
in JobTacker log  (was: Too man y WARN messages about deprecated memorty config 
variables in JobTacker log)

> Too many WARN messages about deprecated memorty config variables in JobTacker 
> log
> -
>
> Key: MAPREDUCE-832
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-832
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Karam Singh
>
> When a user submits a mapred job using the old memory config variable 
> (mapred.task.maxvmem), the following message appears too many times in the 
> JobTracker logs:
> [
> WARN org.apache.hadoop.mapred.JobConf: The variable mapred.task.maxvmem is no 
> longer used instead use  mapred.job.map.memory.mb and 
> mapred.job.reduce.memory.mb
> ]




[jira] Updated: (MAPREDUCE-796) Encountered "ClassCastException" on tasktracker while running wordcount with MultithreadedMapRunner

2009-08-07 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-796:
-

Attachment: MAPREDUCE-796-v1.0.patch

Attaching a simple fix.

> Encountered "ClassCastException" on tasktracker while running wordcount with 
> MultithreadedMapRunner
> ---
>
> Key: MAPREDUCE-796
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-796
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Affects Versions: 0.20.1
>Reporter: Suman Sehgal
>Assignee: Amar Kamat
> Attachments: MAPREDUCE-796-v1.0.patch
>
>
> ClassCastException for OutOfMemoryError is encountered on tasktracker while 
> running wordcount example with MultithreadedMapRunner. 
> Stack trace :
> =
> java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to 
> java.lang.RuntimeException
>   at 
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)
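
The stack trace above is consistent with a Throwable being cast to RuntimeException without first checking whether it is an Error. The attached patch is not shown in this message, so the following is only a sketch of the failure mode and of one defensive way to rethrow. RethrowSketch and its methods are hypothetical names and are not taken from MultithreadedMapper.

```java
import java.io.IOException;

// Hypothetical helper; not the code in MultithreadedMapper or in the patch.
class RethrowSketch {

  // Unsafe: assumes every failure that is not an IOException is a
  // RuntimeException. An Error such as OutOfMemoryError breaks that assumption.
  static void rethrowUnsafe(Throwable t) throws IOException {
    if (t instanceof IOException) {
      throw (IOException) t;
    }
    throw (RuntimeException) t; // ClassCastException when t is an Error
  }

  // Safer: check each type before casting and wrap anything unexpected.
  static void rethrowSafe(Throwable t) throws IOException {
    if (t instanceof IOException) {
      throw (IOException) t;
    }
    if (t instanceof RuntimeException) {
      throw (RuntimeException) t;
    }
    if (t instanceof Error) {
      throw (Error) t;
    }
    throw new RuntimeException(t);
  }

  public static void main(String[] args) throws IOException {
    // The Error is only constructed here; no memory is actually exhausted.
    Throwable failure = new OutOfMemoryError("simulated");
    try {
      rethrowUnsafe(failure);
    } catch (ClassCastException e) {
      System.out.println("unsafe rethrow hid the original error");
    }
    try {
      rethrowSafe(failure);
    } catch (OutOfMemoryError e) {
      System.out.println("safe rethrow kept the original error");
    }
  }
}
```

With the safe variant the original OutOfMemoryError reaches the caller instead of being hidden behind a ClassCastException.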




[jira] Assigned: (MAPREDUCE-796) Encountered "ClassCastException" on tasktracker while running wordcount with MultithreadedMapRunner

2009-08-07 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat reassigned MAPREDUCE-796:


Assignee: Amar Kamat

> Encountered "ClassCastException" on tasktracker while running wordcount with 
> MultithreadedMapRunner
> ---
>
> Key: MAPREDUCE-796
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-796
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Affects Versions: 0.20.1
>Reporter: Suman Sehgal
>Assignee: Amar Kamat
>
> ClassCastException for OutOfMemoryError is encountered on tasktracker while 
> running wordcount example with MultithreadedMapRunner. 
> Stack trace :
> =
> java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to 
> java.lang.RuntimeException
>   at 
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)




[jira] Reopened: (MAPREDUCE-796) Encountered "ClassCastException" on tasktracker while running wordcount with MultithreadedMapRunner

2009-08-07 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat reopened MAPREDUCE-796:
--


> Encountered "ClassCastException" on tasktracker while running wordcount with 
> MultithreadedMapRunner
> ---
>
> Key: MAPREDUCE-796
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-796
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Affects Versions: 0.20.1
>Reporter: Suman Sehgal
>Assignee: Amar Kamat
>
> ClassCastException for OutOfMemoryError is encountered on tasktracker while 
> running wordcount example with MultithreadedMapRunner. 
> Stack trace :
> =
> java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to 
> java.lang.RuntimeException
>   at 
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)




[jira] Created: (MAPREDUCE-834) When TaskTracker config uses old memory management values, its memory monitoring is disabled.

2009-08-07 Thread Karam Singh (JIRA)
When TaskTracker config uses old memory management values, its memory 
monitoring is disabled.
--

 Key: MAPREDUCE-834
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-834
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Karam Singh


TaskTracker memory config values:
mapred.tasktracker.vmem.reserved=8589934592
mapred.task.default.maxvmem=2147483648
mapred.task.limit.maxvmem=4294967296
mapred.tasktracker.pmem.reserved=2147483648

TaskTracker starts as:
2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.vmem.reserved is no longer used
2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.pmem.reserved is no longer used
2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.default.maxvmem is no longer used
2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.limit.maxvmem is no longer used
2009-08-05 12:39:03,308 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on 
2009-08-05 12:39:03,309 INFO org.apache.hadoop.mapred.TaskTracker:  Using MemoryCalculatorPlugin : org.apache.hadoop.util.linuxmemorycalculatorplu...@19be4777
2009-08-05 12:39:03,311 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
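
The report does not say why the total ends up as -1. One plausible reading is that the old keys are ignored and nothing else supplies a value, so a "not configured" sentinel switches monitoring off. The sketch below illustrates that reading only. MemoryMonitorSketch is a hypothetical class, and example.tasktracker.memory.mb is a placeholder key invented for illustration, not a real Hadoop configuration name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not TaskTracker code.
class MemoryMonitorSketch {
  // Sentinel meaning "no memory limit configured".
  static final long DISABLED = -1L;
  // Placeholder key for illustration only; not a real Hadoop key.
  static final String NEW_KEY = "example.tasktracker.memory.mb";

  // Only the new-style key is consulted; old keys are ignored entirely.
  static long totalMemoryAllottedForTasks(Map<String, String> conf) {
    String megabytes = conf.get(NEW_KEY);
    if (megabytes == null) {
      return DISABLED;
    }
    return Long.parseLong(megabytes) * 1024L * 1024L;
  }

  static boolean monitoringEnabled(Map<String, String> conf) {
    return totalMemoryAllottedForTasks(conf) != DISABLED;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // Old-style values taken from the report above.
    conf.put("mapred.tasktracker.vmem.reserved", "8589934592");
    conf.put("mapred.task.default.maxvmem", "2147483648");
    System.out.println(monitoringEnabled(conf)); // prints false
  }
}
```

Under this reading, a configuration that sets only the old values silently loses memory monitoring, which matches the final WARN line in the log.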






[jira] Created: (MAPREDUCE-833) Jobclient does not print any warning message when old memory config variable used with -D option from command line

2009-08-07 Thread Karam Singh (JIRA)
Jobclient does not print any warning message when old memory config variable 
used with -D option from command line
--

 Key: MAPREDUCE-833
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-833
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Karam Singh







[jira] Created: (MAPREDUCE-832) Too man y WARN messages about deprecated memorty config variables in JobTacker log

2009-08-07 Thread Karam Singh (JIRA)
Too man y WARN messages about deprecated memorty config variables in JobTacker 
log
--

 Key: MAPREDUCE-832
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-832
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Karam Singh


When a user submits a mapred job using the old memory config variable 
(mapred.task.maxvmem), the following message appears too many times in the 
JobTracker logs:
[
WARN org.apache.hadoop.mapred.JobConf: The variable mapred.task.maxvmem is no 
longer used instead use  mapred.job.map.memory.mb and 
mapred.job.reduce.memory.mb
]
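
One common way to limit this kind of log noise is to warn only once per deprecated key. The sketch below shows that idea under stated assumptions: DeprecationWarner is a hypothetical class, an in-memory list stands in for the real logger, and nothing here is taken from JobConf or from any patch for this issue.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; not part of JobConf.
class DeprecationWarner {
  // Keys that have already produced a warning.
  private final Set<String> alreadyWarned = new HashSet<>();
  // Stand-in for the logger so the behaviour can be inspected.
  private final List<String> messages = new ArrayList<>();

  // Emits a warning the first time a key is seen.
  // Returns true if a message was logged, false if it was suppressed.
  synchronized boolean warnOnce(String deprecatedKey, String replacementHint) {
    if (!alreadyWarned.add(deprecatedKey)) {
      return false;
    }
    messages.add("WARN The variable " + deprecatedKey
        + " is no longer used, instead use " + replacementHint);
    return true;
  }

  synchronized List<String> messages() {
    return new ArrayList<>(messages);
  }

  public static void main(String[] args) {
    DeprecationWarner warner = new DeprecationWarner();
    String hint = "mapred.job.map.memory.mb and mapred.job.reduce.memory.mb";
    for (int i = 0; i < 5; i++) {
      warner.warnOnce("mapred.task.maxvmem", hint);
    }
    System.out.println(warner.messages().size()); // prints 1
  }
}
```

The trade-off is that a warning logged once at startup is easier to miss, so whether this is appropriate depends on how the logs are monitored.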





[jira] Commented: (MAPREDUCE-798) MRUnit should be able to test a succession of MapReduce passes

2009-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740444#action_12740444
 ] 

Hadoop QA commented on MAPREDUCE-798:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12415678/MAPREDUCE-798.2.patch
  against trunk revision 801517.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/449/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/449/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/449/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/449/console

This message is automatically generated.

> MRUnit should be able to test a succession of MapReduce passes
> --
>
> Key: MAPREDUCE-798
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-798
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-798.2.patch, MAPREDUCE-798.patch
>
>
> MRUnit can currently test that the inputs to a given (mapper, reducer) "job" 
> produce certain outputs at the end of the reducer. It would be good to 
> support more end-to-end tests of a series of MapReduce jobs that form a 
> longer pipeline surrounding some data.




[jira] Updated: (MAPREDUCE-802) Simplify the job updated event notification between Jobtracker and schedulers

2009-08-07 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan updated MAPREDUCE-802:
-

Attachment: eventmodel-1.patch

Attaching a patch that changes the event model as described in this 
[comment|https://issues.apache.org/jira/browse/MAPREDUCE-802?focusedCommentId=12738226&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12738226]

I have introduced {{JobSchedulingInfoIndex}} so that a job can be removed 
based on its old {{JobSchedulingInfo}}, as I believe job updates happen while 
holding the {{JobTracker}} lock.
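
The patch is not shown here, so the following is only one interpretation of why an index keyed by job might help. If a scheduler keeps jobs in a structure sorted by their scheduling info, an update changes the sort key, and the stale entry can only be found with the old key. SchedulingQueueSketch and Info are hypothetical names, and the sort fields (priority, start time, job id) are assumptions.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the JobSchedulingInfoIndex from the patch.
class SchedulingQueueSketch {
  // Stand-in for the fields a queue might sort on.
  static final class Info {
    final String jobId;
    final int priority;   // lower number is scheduled first (assumption)
    final long startTime;

    Info(String jobId, int priority, long startTime) {
      this.jobId = jobId;
      this.priority = priority;
      this.startTime = startTime;
    }
  }

  private static final Comparator<Info> ORDER =
      Comparator.comparingInt((Info i) -> i.priority)
          .thenComparingLong(i -> i.startTime)
          .thenComparing(i -> i.jobId);

  // Queue sorted by scheduling info.
  private final TreeMap<Info, String> queue = new TreeMap<>(ORDER);
  // Index from job id to the info currently stored in the queue.
  private final Map<String, Info> index = new HashMap<>();

  // Single notification: remove the stale entry via the index, then re-add.
  synchronized void jobUpdated(String jobId, int priority, long startTime) {
    Info old = index.get(jobId);
    if (old != null) {
      queue.remove(old);
    }
    Info fresh = new Info(jobId, priority, startTime);
    queue.put(fresh, jobId);
    index.put(jobId, fresh);
  }

  synchronized void jobRemoved(String jobId) {
    Info old = index.remove(jobId);
    if (old != null) {
      queue.remove(old);
    }
  }

  // Job id at the head of the queue, or null if the queue is empty.
  synchronized String head() {
    return queue.isEmpty() ? null : queue.firstEntry().getValue();
  }

  synchronized int size() {
    return queue.size();
  }
}
```

The methods are synchronized here only to keep the sketch self-contained; if every update already runs under a single outer lock, that extra locking would be redundant.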

> Simplify the job updated event notification between Jobtracker and schedulers
> -
>
> Key: MAPREDUCE-802
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-802
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Hemanth Yamijala
>Assignee: Sreekanth Ramakrishnan
> Attachments: eventmodel-1.patch
>
>
> HADOOP-4053 and HADOOP-4149 added events that notify the scheduler of 
> updates to the state or properties of a job, such as its run state or 
> priority. We've seen some issues with this framework, such as the following:
> - Events are not raised correctly at all places. If a new code path is added 
> to kill a job, raising events is missed out.
> - Events are raised with incorrect event data. For example, the start time 
> value is typically missing.
> The resulting contract break between jobtracker and schedulers has led to 
> problems in the capacity scheduler, such as jobs remaining stuck in the 
> queue without ever being removed.
> It has proven complicated to get this right in the framework, and fixes have 
> typically still left dangling cases, or new code paths have introduced new 
> bugs.
> This JIRA is about trying to simplify the interaction model so that it is 
> more robust and works well.
