[jira] Commented: (MAPREDUCE-1656) JobStory should provide queue info.

2010-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852728#action_12852728
 ] 

Hadoop QA commented on MAPREDUCE-1656:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12440568/mr-1656-2.patch
  against trunk revision 930088.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/90/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/90/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/90/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/90/console

This message is automatically generated.

> JobStory should provide queue info.
> ---
>
> Key: MAPREDUCE-1656
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1656
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Hong Tang
>Assignee: Hong Tang
>Priority: Minor
> Attachments: mr-1656-2.patch, mr-1656.patch
>
>
> Add a method in JobStory to get the queue to which a job is submitted.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1594) Support for Sleep Jobs in gridmix

2010-04-01 Thread rahul k singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852690#action_12852690
 ] 

rahul k singh commented on MAPREDUCE-1594:
--

I have addressed all the review comments except the following:

-Structure wise, it would be better to rename GridmixJob to LoadJob, and create 
a common base (probably should be abstract) class for LoadJob and SleepJob and 
call it GridmixJob that only contains the shared parts of LoadJob and SleepJob. 
E.g. outdir may only belong to LoadJob but not SleepJob. (BTW, are 
File{Input,Output}Format.set{Input,Output}Path needed for SleepJob.call()?)

GridmixJob is now an abstract class, and outdir has been moved into 
GridmixJob because SleepJob also uses it. We need 
File{Input,Output}Format.set{Input,Output}Path for mapreduce. I am not sure 
whether this is a bug, but it is required.
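
A minimal sketch of the class layout discussed above, assuming an abstract 
base that holds the shared parts (including outdir) and two concrete job 
types. The class names carry a "Sketch" suffix, and the fields, constructor 
arguments and describeWork() method are hypothetical; none of this is taken 
from the attached patch.

```java
import java.util.concurrent.Callable;

/** Hypothetical shared base; holds only what both job types need. */
abstract class GridmixJobSketch implements Callable<String> {
  protected final String jobName;
  protected final String outdir; // kept in the base class because both job types use it

  protected GridmixJobSketch(String jobName, String outdir) {
    this.jobName = jobName;
    this.outdir = outdir;
  }

  /** Job-specific behaviour supplied by each subclass. */
  protected abstract String describeWork();

  /** Shared code path: every job reports its name, output directory and work. */
  @Override
  public String call() {
    return jobName + " -> " + outdir + " [" + describeWork() + "]";
  }
}

/** Job type that would read and write synthetic data. */
final class LoadJobSketch extends GridmixJobSketch {
  LoadJobSketch(String jobName, String outdir) {
    super(jobName, outdir);
  }

  @Override
  protected String describeWork() {
    return "read and write synthetic data";
  }
}

/** Job type whose tasks would only sleep. */
final class SleepJobSketch extends GridmixJobSketch {
  private final long sleepMillis;

  SleepJobSketch(String jobName, String outdir, long sleepMillis) {
    super(jobName, outdir);
    this.sleepMillis = sleepMillis;
  }

  @Override
  protected String describeWork() {
    return "sleep " + sleepMillis + " ms";
  }
}
```

Keeping call() in the base class means the output-path setup is written once, 
which matches the observation that both job types currently need it.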

> Support for Sleep Jobs in gridmix
> -
>
> Key: MAPREDUCE-1594
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1594
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/gridmix
>Reporter: rahul k singh
> Attachments: 1594-yhadoop-20-1xx-1-2.patch, 
> 1594-yhadoop-20-1xx-1.patch, 1594-yhadoop-20-1xx.patch
>
>
> Support for Sleep jobs in gridmix




[jira] Updated: (MAPREDUCE-1594) Support for Sleep Jobs in gridmix

2010-04-01 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh updated MAPREDUCE-1594:
-

Attachment: 1594-yhadoop-20-1xx-1-2.patch

Attaching a new patch that addresses Hong's comments.


> Support for Sleep Jobs in gridmix
> -
>
> Key: MAPREDUCE-1594
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1594
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/gridmix
>Reporter: rahul k singh
> Attachments: 1594-yhadoop-20-1xx-1-2.patch, 
> 1594-yhadoop-20-1xx-1.patch, 1594-yhadoop-20-1xx.patch
>
>
> Support for Sleep jobs in gridmix




[jira] Commented: (MAPREDUCE-1428) Make block size and the size of archive created files configurable.

2010-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852679#action_12852679
 ] 

Hadoop QA commented on MAPREDUCE-1428:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12440531/MAPREDUCE-1428.patch
  against trunk revision 930088.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/89/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/89/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/89/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/89/console

This message is automatically generated.

> Make block size and the size of archive created files configurable.
> ---
>
> Key: MAPREDUCE-1428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1428
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: BinaryFileGenerator.java, BinaryFileGenerator.java, 
> BinaryFileGenerator.java, MAPREDUCE-1428.patch, MAPREDUCE-1428.patch
>
>
> Currently the block size used by archives is the default block size of the 
> hdfs filesystem. We need to make it configurable so that the block size can 
> be higher for the part files that archives create.
> Also, we need to make the size of part files in archives configurable, again 
> so that the files can be larger and fewer of them are created.
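
As a rough worked example of why a larger part-file size yields fewer files, 
the count can be approximated by a ceiling division. This is only a sketch: 
the helper name is hypothetical, the sizes used with it are made up, and it 
ignores how the archive tool actually packs individual files into parts.

```java
final class PartFileCountSketch {
  /**
   * Approximate number of part files needed to hold totalBytes when each
   * part file holds at most partSizeBytes.
   */
  static long partFileCount(long totalBytes, long partSizeBytes) {
    if (partSizeBytes <= 0) {
      throw new IllegalArgumentException("partSizeBytes must be positive");
    }
    // Ceiling division without floating point.
    return (totalBytes + partSizeBytes - 1) / partSizeBytes;
  }
}
```

With 10 GiB of archived data, a 512 MiB part size gives 20 part files under 
this approximation, while a 2 GiB part size gives 5.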




[jira] Updated: (MAPREDUCE-1656) JobStory should provide queue info.

2010-04-01 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-1656:
-

Attachment: mr-1656-2.patch

Fixing a bug to handle cases where queue info is not available in existing 
logs.
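
A minimal sketch of the idea, assuming a queue accessor that falls back to a 
default when an older log has no queue entry. The interface, the class, the 
"queue" property key and the fallback value are all hypothetical and are not 
taken from the patch or from the real JobStory interface.

```java
import java.util.Map;

/** Hypothetical stand-in for a JobStory-like interface. */
interface QueueAwareJobStory {
  /** Returns the queue to which the job was submitted. */
  String getQueueName();
}

/** Job description backed by properties parsed from a history log. */
final class LoggedJobSketch implements QueueAwareJobStory {
  // Assumed fallback for logs written before queue info was recorded.
  static final String DEFAULT_QUEUE = "default";

  private final Map<String, String> loggedProperties;

  LoggedJobSketch(Map<String, String> loggedProperties) {
    this.loggedProperties = loggedProperties;
  }

  @Override
  public String getQueueName() {
    String queue = loggedProperties.get("queue");
    // Older logs may lack the field, so fall back instead of returning null.
    return (queue == null || queue.isEmpty()) ? DEFAULT_QUEUE : queue;
  }
}
```

Returning a fallback rather than null keeps callers that replay old traces 
from having to special-case missing queue info.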

> JobStory should provide queue info.
> ---
>
> Key: MAPREDUCE-1656
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1656
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Hong Tang
>Assignee: Hong Tang
>Priority: Minor
> Attachments: mr-1656-2.patch, mr-1656.patch
>
>
> Add a method in JobStory to get the queue to which a job is submitted.




[jira] Updated: (MAPREDUCE-1656) JobStory should provide queue info.

2010-04-01 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-1656:
-

Status: Patch Available  (was: Open)

> JobStory should provide queue info.
> ---
>
> Key: MAPREDUCE-1656
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1656
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Hong Tang
>Assignee: Hong Tang
>Priority: Minor
> Attachments: mr-1656-2.patch, mr-1656.patch
>
>
> Add a method in JobStory to get the queue to which a job is submitted.




[jira] Updated: (MAPREDUCE-1656) JobStory should provide queue info.

2010-04-01 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-1656:
-

Status: Open  (was: Patch Available)

> JobStory should provide queue info.
> ---
>
> Key: MAPREDUCE-1656
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1656
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Hong Tang
>Assignee: Hong Tang
>Priority: Minor
> Attachments: mr-1656.patch
>
>
> Add a method in JobStory to get the queue to which a job is submitted.




[jira] Updated: (MAPREDUCE-1654) Automate the job killing system test case.

2010-04-01 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated MAPREDUCE-1654:
--

Attachment: TEST-org.apache.hadoop.mapred.TestJobKill.txt
patch_1654.txt

Fixed patch
Log of failed test run

> Automate the job killing system test case. 
> ---
>
> Key: MAPREDUCE-1654
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1654
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
>Affects Versions: 0.20.3
> Environment: Herriot system test case development env. 
>Reporter: Balaji Rajagopalan
>Assignee: Balaji Rajagopalan
> Attachments: patch_1654.txt, patch_1654.txt, patch_1654.txt, 
> TEST-org.apache.hadoop.mapred.TestJobKill.txt
>
>   Original Estimate: 0.27h
>  Remaining Estimate: 0.27h
>





[jira] Commented: (MAPREDUCE-1654) Automate the job killing system test case.

2010-04-01 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852614#action_12852614
 ] 

Konstantin Boudnik commented on MAPREDUCE-1654:
---

- Not all of my comments were addressed. I'm attaching properly formatted Java 
source so you'll have a proper example in front of you.
- The test isn't passing (log is attached).
- This will fail if the directory already exists:
{noformat}
+fs = inDir.getFileSystem(cluster.getJTClient().getConf());
+fs.create(inDir);
{noformat}
- The @After method doesn't clean up after itself, i.e. it leaves behind the 
input directory created by @Before.
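
The last two points can be sketched as a fixture whose setup tolerates an 
existing directory and whose teardown removes it. To stay self-contained this 
uses java.nio.file on the local filesystem as a stand-in; a Herriot test 
would go through the Hadoop FileSystem obtained in the snippet above instead. 
The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class InputDirFixtureSketch {
  private final Path inDir;

  InputDirFixtureSketch(Path inDir) {
    this.inDir = inDir;
  }

  /** What a @Before method could do: succeeds whether or not the directory exists. */
  void setUp() {
    try {
      Files.createDirectories(inDir);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** What an @After method could do: remove the directory and everything under it. */
  void tearDown() {
    if (!Files.exists(inDir)) {
      return;
    }
    try {
      List<Path> deepestFirst;
      try (Stream<Path> paths = Files.walk(inDir)) {
        // Reverse order puts children before their parents.
        deepestFirst = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
      }
      for (Path p : deepestFirst) {
        Files.delete(p);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Calling setUp() twice in a row is harmless here, and tearDown() leaves 
nothing behind for the next run.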


> Automate the job killing system test case. 
> ---
>
> Key: MAPREDUCE-1654
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1654
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
>Affects Versions: 0.20.3
> Environment: Herriot system test case development env. 
>Reporter: Balaji Rajagopalan
>Assignee: Balaji Rajagopalan
> Attachments: patch_1654.txt, patch_1654.txt
>
>   Original Estimate: 0.27h
>  Remaining Estimate: 0.27h
>





[jira] Commented: (MAPREDUCE-1428) Make block size and the size of archive created files configurable.

2010-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852613#action_12852613
 ] 

Hadoop QA commented on MAPREDUCE-1428:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12440531/MAPREDUCE-1428.patch
  against trunk revision 930088.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/88/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/88/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/88/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/88/console

This message is automatically generated.

> Make block size and the size of archive created files configurable.
> ---
>
> Key: MAPREDUCE-1428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1428
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: BinaryFileGenerator.java, BinaryFileGenerator.java, 
> BinaryFileGenerator.java, MAPREDUCE-1428.patch, MAPREDUCE-1428.patch
>
>
> Currently the block size used by archives is the default block size of the 
> hdfs filesystem. We need to make it configurable so that the block size can 
> be higher for the part files that archives create.
> Also, we need to make the size of part files in archives configurable, again 
> so that the files can be larger and fewer of them are created.




[jira] Commented: (MAPREDUCE-1523) Sometimes rumen trace generator fails to extract the job finish time.

2010-04-01 Thread Hong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852609#action_12852609
 ] 

Hong Tang commented on MAPREDUCE-1523:
--

The failed tests seem unrelated.

> Sometimes rumen trace generator fails to extract the job finish time.
> -
>
> Key: MAPREDUCE-1523
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1523
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hong Tang
>Assignee: Dick King
> Attachments: mapreduce-1523--2010-03-31a-1612PDT.patch
>
>
> We saw sometimes (not very often) that rumen may fail to extract the job 
> finish time from Hadoop 0.20 history log.




[jira] Resolved: (MAPREDUCE-1666) job output chroot support

2010-04-01 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-1666.
-

Resolution: Invalid

Muhaha!  I'm quite stunned to see I'm held in such high regard!
 
Anyway, I've been told to close this out, as the people who really wanted to do 
this say this will work.

Thanks.

> job output chroot support
> -
>
> Key: MAPREDUCE-1666
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1666
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: job submission
>Affects Versions: 0.20.2
>Reporter: Allen Wittenauer
>Priority: Minor
>
> It would be useful to be able to submit the same job and have it chroot the 
> output to a different base directory before execution.  This would allow for 
> input to be the same, but output different for the same job over multiple 
> runs (potentially by different users).




[jira] Commented: (MAPREDUCE-1662) TaskRunner.prepare() and close() can be removed

2010-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852564#action_12852564
 ] 

Hadoop QA commented on MAPREDUCE-1662:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12440478/patch-1662.txt
  against trunk revision 929712.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/87/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/87/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/87/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/87/console

This message is automatically generated.

> TaskRunner.prepare() and close() can be removed
> ---
>
> Key: MAPREDUCE-1662
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1662
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1662.txt
>
>
> TaskRunner.prepare() and close() methods call only mapOutputFile.removeAll(). 
> The removeAll() call is always a no-op in prepare(), because the directory 
> is always empty during start up of the task. The removeAll() call in close() 
> is useless, because it is followed by an attempt directory cleanup. Since the 
> map output files are in the attempt directory, the call to close() is useless.
> After MAPREDUCE-842, these calls are under TaskTracker space, passing the 
> wrong conf. Now, the calls do not make sense at all.
> I think we can remove the methods.




[jira] Commented: (MAPREDUCE-1666) job output chroot support

2010-04-01 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852561#action_12852561
 ] 

Arun C Murthy commented on MAPREDUCE-1666:
--

No worries, once your highness is satisfied please close this one! :)

> job output chroot support
> -
>
> Key: MAPREDUCE-1666
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1666
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: job submission
>Affects Versions: 0.20.2
>Reporter: Allen Wittenauer
>Priority: Minor
>
> It would be useful to be able to submit the same job and have it chroot the 
> output to a different base directory before execution.  This would allow for 
> input to be the same, but output different for the same job over multiple 
> runs (potentially by different users).




[jira] Commented: (MAPREDUCE-1666) job output chroot support

2010-04-01 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852559#action_12852559
 ] 

Allen Wittenauer commented on MAPREDUCE-1666:
-

Oooo. Maybe.  I'll see if that works for the use case these guys have.  Thanks. 
 I didn't know that existed. :)

> job output chroot support
> -
>
> Key: MAPREDUCE-1666
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1666
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: job submission
>Affects Versions: 0.20.2
>Reporter: Allen Wittenauer
>Priority: Minor
>
> It would be useful to be able to submit the same job and have it chroot the 
> output to a different base directory before execution.  This would allow for 
> input to be the same, but output different for the same job over multiple 
> runs (potentially by different users).




[jira] Commented: (MAPREDUCE-1666) job output chroot support

2010-04-01 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852558#action_12852558
 ] 

Arun C Murthy commented on MAPREDUCE-1666:
--

Why doesn't '-output' switch (-Dmapred.output.dir) suffice then?
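
For illustration, a per-run output location could be passed through that 
property, assuming the job accepts generic -D options. The helper below only 
builds the command-line argument; its name and the base/user/run layout are 
hypothetical, and only the property name comes from the comment above.

```java
final class OutputDirArgSketch {
  /** Builds a -D argument that points the job output at base/user/runId. */
  static String outputDirArg(String baseDir, String user, String runId) {
    // Drop a trailing slash so the joined path has no empty segment.
    String base = baseDir.endsWith("/")
        ? baseDir.substring(0, baseDir.length() - 1)
        : baseDir;
    return "-Dmapred.output.dir=" + base + "/" + user + "/" + runId;
  }
}
```

The same job and input can then be submitted repeatedly, with only this 
argument changing between runs or users.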

> job output chroot support
> -
>
> Key: MAPREDUCE-1666
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1666
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: job submission
>Affects Versions: 0.20.2
>Reporter: Allen Wittenauer
>Priority: Minor
>
> It would be useful to be able to submit the same job and have it chroot the 
> output to a different base directory before execution.  This would allow for 
> input to be the same, but output different for the same job over multiple 
> runs (potentially by different users).




[jira] Commented: (MAPREDUCE-1666) job output chroot support

2010-04-01 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852553#action_12852553
 ] 

Allen Wittenauer commented on MAPREDUCE-1666:
-

Yes.  I'm not sure whether this should be an HDFS or MR bug, though.  So I started 
here. :)

> job output chroot support
> -
>
> Key: MAPREDUCE-1666
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1666
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: job submission
>Affects Versions: 0.20.2
>Reporter: Allen Wittenauer
>Priority: Minor
>
> It would be useful to be able to submit the same job and have it chroot the 
> output to a different base directory before execution.  This would allow for 
> input to be the same, but output different for the same job over multiple 
> runs (potentially by different users).




[jira] Commented: (MAPREDUCE-1514) Add documentation on permissions, limitations, error handling for archives.

2010-04-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852552#action_12852552
 ] 

Hudson commented on MAPREDUCE-1514:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #298 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/298/])
.  Add documentation on replication, permissions, new options, limitations 
and internals of har.  Contributed by mahadev


> Add documentation on permissions, limitations, error handling for archives.
> ---
>
> Key: MAPREDUCE-1514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1514
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1514.patch, MAPREDUCE-1514.patch
>
>
> Add documentation on the permissions aspect of archives and other limitations 
> that they might have. Also add documentation on error handling (with respect 
> to quotas or otherwise) to the Forrest docs.




[jira] Updated: (MAPREDUCE-1616) Automate system test case for checking the file permissions in mapred.local.dir

2010-04-01 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated MAPREDUCE-1616:
--

Attachment: patch_1616.txt

I still see some inconsistent formatting. I have uploaded a patch which fixes 
all these issues. The test is passing. One last comment: it seems that none of 
the assert statements has an error message. In case of a test failure, this 
will lead to something like this:

null
assert stack trace

which is hard to diagnose and requires reading the test source code. Please add 
meaningful messages where feasible.
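
A small sketch of the difference, assuming a helper that mirrors the 
message-first form of JUnit's assertTrue. The helper class is a stand-in 
written for this example so that it runs without JUnit on the classpath; the 
message text is made up.

```java
final class AssertMessageSketch {
  /** Minimal stand-in for an assertTrue(message, condition) style check. */
  static void assertTrue(String message, boolean condition) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  /** Returns the message of the AssertionError thrown by check, or null if none. */
  static String failureMessageOf(Runnable check) {
    try {
      check.run();
      return null;
    } catch (AssertionError e) {
      return e.getMessage();
    }
  }
}
```

A failing check written with a message reports that text, whereas an 
AssertionError raised without one has a null message, which is where the 
bare "null" in a failure report comes from.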

> Automate system test case for checking the file permissions in 
> mapred.local.dir
> ---
>
> Key: MAPREDUCE-1616
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1616
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.20.3
> Environment: Herriot framework is required for running the test. 
>Reporter: Balaji Rajagopalan
>Assignee: Balaji Rajagopalan
> Attachments: patch_1616.txt, patch_1616.txt, patch_1655.txt, 
> patch_3392207_7.txt
>
>   Original Estimate: 0.27h
>  Remaining Estimate: 0.27h
>
> The permissions of files under mapred.local.dir must be tested recursively 
> while the task is running. Use the controllable task for this, so that the 
> permissions of temporary files can be checked.
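
A rough sketch of a recursive permission check, assuming a POSIX filesystem 
and an owner-only policy. The issue does not state which permissions are 
expected, so the policy, the class name and the method name are assumptions 
made for illustration; the actual test runs against a cluster through Herriot 
rather than java.nio.file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LocalDirPermissionCheckSketch {
  // Assumed policy for this sketch: only the owner may have any access.
  private static final Set<PosixFilePermission> OWNER_ONLY = EnumSet.of(
      PosixFilePermission.OWNER_READ,
      PosixFilePermission.OWNER_WRITE,
      PosixFilePermission.OWNER_EXECUTE);

  /** Returns every path under root (root included) that grants more than owner access. */
  static List<Path> findTooPermissive(Path root) {
    try {
      List<Path> all;
      try (Stream<Path> paths = Files.walk(root)) {
        all = paths.collect(Collectors.toList());
      }
      List<Path> offenders = new ArrayList<>();
      for (Path p : all) {
        // Flag the path if it has any permission bit outside the owner bits.
        if (!OWNER_ONLY.containsAll(Files.getPosixFilePermissions(p))) {
          offenders.add(p);
        }
      }
      return offenders;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Returning the offending paths, rather than a boolean, gives an assertion 
something concrete to put in its failure message.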




[jira] Commented: (MAPREDUCE-1666) job output chroot support

2010-04-01 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852551#action_12852551
 ] 

Arun C Murthy commented on MAPREDUCE-1666:
--

By 'output' you mean on HDFS? 

> job output chroot support
> -
>
> Key: MAPREDUCE-1666
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1666
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: job submission
>Affects Versions: 0.20.2
>Reporter: Allen Wittenauer
>Priority: Minor
>
> It would be useful to be able to submit the same job and have it chroot the 
> output to a different base directory before execution.  This would allow for 
> input to be the same, but output different for the same job over multiple 
> runs (potentially by different users).




[jira] Updated: (MAPREDUCE-1514) Add documentation on permissions, limitations, error handling for archives.

2010-04-01 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1514:
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

+1 patch looks good.

I have committed this.  Thanks, Mahadev!

> Add documentation on permissions, limitations, error handling for archives.
> ---
>
> Key: MAPREDUCE-1514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1514
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1514.patch, MAPREDUCE-1514.patch
>
>
> Add documentation on the permissions aspect of archives and other limitations 
> that they might have. Also add documentation on error handling (with respect 
> to quotas or otherwise) to the Forrest docs.




[jira] Created: (MAPREDUCE-1666) job output chroot support

2010-04-01 Thread Allen Wittenauer (JIRA)
job output chroot support
-

 Key: MAPREDUCE-1666
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1666
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: job submission
Affects Versions: 0.20.2
Reporter: Allen Wittenauer
Priority: Minor


It would be useful to be able to submit the same job and have it chroot the 
output to a different base directory before execution.  This would allow for 
input to be the same, but output different for the same job over multiple runs 
(potentially by different users).




[jira] Commented: (MAPREDUCE-1646) Task Killing tests

2010-04-01 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852531#action_12852531
 ] 

Konstantin Boudnik commented on MAPREDUCE-1646:
---

One more thing: the class has about 10 unused import statements. Please optimize 
the import list.

> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch, TaskKilling_1646.patch, 
> TEST-org.apache.hadoop.mapred.TestTaskKilling.txt
>
>
> The following tasks are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that the output/_temporary/_attempt-id 
> directory is created. Kill the task. After the task is killed, make sure that 
> the output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that the output/_temporary/_attempt-id 
> directory is created. Fail the task by simulating the map. After the task has 
> failed, make sure that the output/_temporary/_attempt-id directory is cleaned 
> up. The important difference we are trying to check is between kill and fail; 
> there would be a subtle difference.




[jira] Updated: (MAPREDUCE-1646) Task Killing tests

2010-04-01 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated MAPREDUCE-1646:
--

Attachment: TEST-org.apache.hadoop.mapred.TestTaskKilling.txt

> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch, TaskKilling_1646.patch, 
> TEST-org.apache.hadoop.mapred.TestTaskKilling.txt
>
>
> The following tasks are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Kill the task. After 
> the task is killed, make sure that the
> output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Fail the task by 
> simulating the map. After the task has failed,
> make sure that the output/_temporary/_attempt-id directory is cleaned up. The 
> important difference we are trying to
> check is between kill and fail; there would be a subtle difference.




[jira] Commented: (MAPREDUCE-1646) Task Killing tests

2010-04-01 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852529#action_12852529
 ] 

Konstantin Boudnik commented on MAPREDUCE-1646:
---

Well, it looks like {noformat} did a bad job on the snippet above. What I meant 
is that in, say, line 128 all indentation is done with spaces, but in line 129 
and the three lines after it the indentation is tabs plus spaces. Different 
IDEs may be configured with different tab sizes; e.g. for Hadoop style the tab 
size is 2. So in some people's editors the tabs will render differently, which 
might lead to unnecessary reformatting of the code.
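
To make the tabs-versus-spaces point concrete, here is a minimal sketch of a check that flags lines whose leading whitespace contains a tab. The class IndentCheck and its method are hypothetical helpers written for this note, not part of Hadoop or of the patch, and the sample lines only imitate the snippet from the previous comment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Hadoop: finds lines indented with tabs. */
class IndentCheck {

  /** Returns the 1-based numbers of lines whose leading whitespace contains a tab. */
  static List<Integer> linesWithTabIndent(String source) {
    List<Integer> flagged = new ArrayList<>();
    String[] lines = source.split("\n", -1);
    for (int i = 0; i < lines.length; i++) {
      String line = lines[i];
      // Walk the leading whitespace only; stop at the first other character.
      for (int j = 0; j < line.length(); j++) {
        char c = line.charAt(j);
        if (c == '\t') {
          flagged.add(i + 1);
          break;
        }
        if (c != ' ') {
          break;
        }
      }
    }
    return flagged;
  }

  public static void main(String[] args) {
    String snippet =
        "    while (counter < 240) {\n"          // spaces only
        + "\t  UtilsForTests.waitFor(1000);\n"   // tab followed by spaces
        + "\t  counter++;\n"                     // tab followed by spaces
        + "    }\n";                             // spaces only
    // Prints the line numbers that would render differently per tab size.
    System.out.println(linesWithTabIndent(snippet));
  }
}
```

A check like this only reports where tabs occur; converting them to spaces is still best left to the IDE's formatter settings.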

> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch, TaskKilling_1646.patch
>
>
> The following tasks are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Kill the task. After 
> the task is killed, make sure that the
> output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Fail the task by 
> simulating the map. After the task has failed,
> make sure that the output/_temporary/_attempt-id directory is cleaned up. The 
> important difference we are trying to
> check is between kill and fail; there would be a subtle difference.




[jira] Commented: (MAPREDUCE-1646) Task Killing tests

2010-04-01 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852526#action_12852526
 ] 

Konstantin Boudnik commented on MAPREDUCE-1646:
---

- It seems like you have some issues with the IDE configuration.
{noformat}
+  while (counter  < 240) {
+UtilsForTests.waitFor(1000);
+counter ++;
+  }
+  if (counter == 240 ) {
+throw new IOException();
+  }
+}
{noformat}
and there are many places like this. Please fix.
- Also, I tried to run it and it failed (see the attached log).

> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch, TaskKilling_1646.patch
>
>
> The following tasks are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Kill the task. After 
> the task is killed, make sure that the
> output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Fail the task by 
> simulating the map. After the task has failed,
> make sure that the output/_temporary/_attempt-id directory is cleaned up. The 
> important difference we are trying to
> check is between kill and fail; there would be a subtle difference.




[jira] Updated: (MAPREDUCE-1514) Add documentation on permissions, limitations, error handling for archives.

2010-04-01 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-1514:
-

Attachment: MAPREDUCE-1514.patch

Addressed Nicholas's comments.

> Add documentation on permissions, limitations, error handling for archives.
> ---
>
> Key: MAPREDUCE-1514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1514
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1514.patch, MAPREDUCE-1514.patch
>
>
> Add documentation on the permissions aspect of archives and other limitations 
> that they might have. Also add documentation on error handling (with respect 
> to quotas/otherwise) to the forrest docs.




[jira] Commented: (MAPREDUCE-1635) ResourceEstimator does not work after MAPREDUCE-842

2010-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852499#action_12852499
 ] 

Hadoop QA commented on MAPREDUCE-1635:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12440447/patch-1635-1.txt
  against trunk revision 929712.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/86/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/86/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/86/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/86/console

This message is automatically generated.

> ResourceEstimator does not work after MAPREDUCE-842
> ---
>
> Key: MAPREDUCE-1635
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1635
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1635-1.txt, patch-1635.txt
>
>
> MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base 
> local directory. Also, the assumption is that
> org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. 
> But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which 
> does not find the output file. Thus TaskTracker.tryToGetOutputSize() always 
> returns -1.




[jira] Updated: (MAPREDUCE-1428) Make block size and the size of archive created files configurable.

2010-04-01 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1428:
--

Status: Patch Available  (was: Open)

> Make block size and the size of archive created files configurable.
> ---
>
> Key: MAPREDUCE-1428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1428
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: BinaryFileGenerator.java, BinaryFileGenerator.java, 
> BinaryFileGenerator.java, MAPREDUCE-1428.patch, MAPREDUCE-1428.patch
>
>
> Currently the block size used by archives is the default block size of the 
> hdfs filesystem. We need to make it configurable so that the block size can 
> be higher for the part files that archives create.
> Also, we need to make the size of part files in archives configurable, again 
> so that they can be made bigger and fewer such files are created.




[jira] Updated: (MAPREDUCE-1428) Make block size and the size of archive created files configurable.

2010-04-01 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-1428:
-

Status: Open  (was: Patch Available)

> Make block size and the size of archive created files configurable.
> ---
>
> Key: MAPREDUCE-1428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1428
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: BinaryFileGenerator.java, BinaryFileGenerator.java, 
> BinaryFileGenerator.java, MAPREDUCE-1428.patch, MAPREDUCE-1428.patch
>
>
> Currently the block size used by archives is the default block size of the 
> hdfs filesystem. We need to make it configurable so that the block size can 
> be higher for the part files that archives create.
> Also, we need to make the size of part files in archives configurable, again 
> so that they can be made bigger and fewer such files are created.




[jira] Updated: (MAPREDUCE-1428) Make block size and the size of archive created files configurable.

2010-04-01 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-1428:
-

Status: Patch Available  (was: Open)

> Make block size and the size of archive created files configurable.
> ---
>
> Key: MAPREDUCE-1428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1428
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: BinaryFileGenerator.java, BinaryFileGenerator.java, 
> BinaryFileGenerator.java, MAPREDUCE-1428.patch, MAPREDUCE-1428.patch
>
>
> Currently the block size used by archives is the default block size of the 
> hdfs filesystem. We need to make it configurable so that the block size can 
> be higher for the part files that archives create.
> Also, we need to make the size of part files in archives configurable, again 
> so that they can be made bigger and fewer such files are created.




[jira] Updated: (MAPREDUCE-1428) Make block size and the size of archive created files configurable.

2010-04-01 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-1428:
--

Status: Open  (was: Patch Available)

> Make block size and the size of archive created files configurable.
> ---
>
> Key: MAPREDUCE-1428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1428
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: BinaryFileGenerator.java, BinaryFileGenerator.java, 
> BinaryFileGenerator.java, MAPREDUCE-1428.patch, MAPREDUCE-1428.patch
>
>
> Currently the block size used by archives is the default block size of the 
> hdfs filesystem. We need to make it configurable so that the block size can 
> be higher for the part files that archives create.
> Also, we need to make the size of part files in archives configurable, again 
> so that they can be made bigger and fewer such files are created.




[jira] Updated: (MAPREDUCE-1428) Make block size and the size of archive created files configurable.

2010-04-01 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-1428:
-

Attachment: MAPREDUCE-1428.patch

This patch fixes the javac warning!

> Make block size and the size of archive created files configurable.
> ---
>
> Key: MAPREDUCE-1428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1428
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: BinaryFileGenerator.java, BinaryFileGenerator.java, 
> BinaryFileGenerator.java, MAPREDUCE-1428.patch, MAPREDUCE-1428.patch
>
>
> Currently the block size used by archives is the default block size of the 
> hdfs filesystem. We need to make it configurable so that the block size can 
> be higher for the part files that archives create.
> Also, we need to make the size of part files in archives configurable, again 
> so that they can be made bigger and fewer such files are created.




[jira] Commented: (MAPREDUCE-1514) Add documentation on permissions, limitations, error handling for archives.

2010-04-01 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852480#action_12852480
 ] 

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-1514:
---

Some suggestions:
- In Limitations of Hadoop Archive, "... input files with spaces ..." => "... 
input paths with spaces ..."
- In Internals of Hadoop Archive, do you want to mention what _masterindex 
contains?
- You probably don't want to change the format for the entire doc file.

> Add documentation on permissions, limitations, error handling for archives.
> ---
>
> Key: MAPREDUCE-1514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1514
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1514.patch
>
>
> Add documentation on the permissions aspect of archives and other limitations 
> that they might have. Also add documentation on error handling (with respect 
> to quotas/otherwise) to the forrest docs.




[jira] Commented: (MAPREDUCE-1428) Make block size and the size of archive created files configurable.

2010-04-01 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852465#action_12852465
 ] 

Mahadev konar commented on MAPREDUCE-1428:
--

The javac warning is caused by:
{code}
[deprecation] 
create(org.apache.hadoop.fs.Path,org.apache.hadoop.fs.permission.FsPermission,boolean,int,short,long,org.apache.hadoop.util.Progressable)
 in org.apache.hadoop.fs.FileSystem has been deprecated
[javac] partStream = destFs.create(tmpOutput, new 
FsPermission((short)0700),

{code}

I don't see a way around it.
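
For readers unfamiliar with this class of warning, the following is a minimal sketch of one general Java mechanism for keeping a deliberate call to a deprecated method out of the warning count: the standard `@SuppressWarnings("deprecation")` annotation, scoped to the calling method. LegacyStore and ArchiveWriter are hypothetical stand-ins invented for this example; nothing here is taken from the MAPREDUCE-1428 patch, and this thread does not say whether suppression is the approach that was eventually used.

```java
/** Stand-in for a library class with a deprecated overload; NOT Hadoop's FileSystem. */
class LegacyStore {
  /** @deprecated exists only to imitate a deprecated overload. */
  @Deprecated
  static String create(String path, short permission) {
    return path + ":" + Integer.toOctalString(permission);
  }
}

/** Hypothetical caller that has no non-deprecated alternative available. */
class ArchiveWriter {
  // The annotation is placed on this one method rather than on the class,
  // so deprecated calls elsewhere in the class are still reported.
  @SuppressWarnings("deprecation")
  static String openPart(String path) {
    return LegacyStore.create(path, (short) 0700);
  }

  public static void main(String[] args) {
    System.out.println(openPart("part-0"));
  }
}
```

Keeping the annotation on the smallest possible scope is a design choice: it documents that this particular call is intentional without hiding new deprecations.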

> Make block size and the size of archive created files configurable.
> ---
>
> Key: MAPREDUCE-1428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1428
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: BinaryFileGenerator.java, BinaryFileGenerator.java, 
> BinaryFileGenerator.java, MAPREDUCE-1428.patch
>
>
> Currently the block size used by archives is the default block size of the 
> hdfs filesystem. We need to make it configurable so that the block size can 
> be higher for the part files that archives create.
> Also, we need to make the size of part files in archives configurable, again 
> so that they can be made bigger and fewer such files are created.




[jira] Commented: (MAPREDUCE-1628) HarFileSystem shows incorrect replication numbers and permissions

2010-04-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852460#action_12852460
 ] 

Hudson commented on MAPREDUCE-1628:
---

Integrated in Hadoop-Mapreduce-trunk #276 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/276/])
. HarFileSystem shows incorrect replication numbers and permissions (tsz 
via mahadev)


> HarFileSystem shows incorrect replication numbers and permissions
> -
>
> Key: MAPREDUCE-1628
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1628
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.22.0
>
> Attachments: m1628_20100329.patch, m1628_20100329b.patch, 
> m1628_20100330.patch, m1628_20100331.patch
>
>
> In the har dir, the replication # of part-0 is 3.
> {noformat}
> -bash-3.1$ hadoop fs -ls  ${DIR}.har
> Found 3 items
> -rw---   5 tsz users   1141 2010-02-10 18:34 /user/tsz/t20.har/_index
> -rw---   5 tsz users 24 2010-02-10 18:34 
> /user/tsz/t20.har/_masterindex
> -rw---   3 tsz users  15052 2010-02-10 18:34 /user/tsz/t20.har/part-0
> {noformat}
> but the replication # of the individual har:// files is shown as 5.
> {noformat}
> -bash-3.1$ hadoop fs -lsr  ${HAR_FULL}/
> drw---   - tsz users  0 2010-02-10 18:34 /user/tsz/t20.har/t20
> -rw---   5 tsz users723 2010-02-10 18:34 
> /user/tsz/t20.har/t20/text-
> -rw---   5 tsz users779 2010-02-10 18:34 
> /user/tsz/t20.har/t20/text-0001
> -rw---   5 tsz users818 2010-02-10 18:34 
> /user/tsz/t20.har/t20/text-0002
> ...
> {noformat}
> The permission has a similar problem.  Clearly, the permission of 
> t20.har/t20 shown above is incorrect.




[jira] Commented: (MAPREDUCE-1602) When the src does not exist, archive shows IndexOutOfBoundsException

2010-04-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852461#action_12852461
 ] 

Hudson commented on MAPREDUCE-1602:
---

Integrated in Hadoop-Mapreduce-trunk #276 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/276/])
. Fix the error message for the case that src does not exist.


> When the src does not exist, archive shows IndexOutOfBoundsException
> 
>
> Key: MAPREDUCE-1602
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1602
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.22.0
>
> Attachments: m1602_20100329.patch, m1602_20100330.patch
>
>
> {noformat}
> -bash-3.1$ $H archive -archiveName foo.har -p / src-not-exists dst
> IndexOutOfBoundsException in archives
> Index: 0, Size: 0
> {noformat}




[jira] Commented: (MAPREDUCE-1100) User's task-logs filling up local disks on the TaskTrackers

2010-04-01 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852434#action_12852434
 ] 

Ravi Gummadi commented on MAPREDUCE-1100:
-

The patch currently attached to MAPREDUCE-1057 doesn't solve the whole problem. 
Currently, "tail -c" is run after the task process has finished execution, but 
the log.index file is created (and written to) by the task itself. So log.index 
cannot hold the correct index details of the task logs (i.e. startingOffset and 
length of stdout, stderr and syslog).

One way to solve this is to make the TT write the index details to the 
log.index file once a task is done.

Thoughts?
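
The proposal above can be sketched roughly as follows. This is only an illustration under assumptions the thread does not state: that each of stdout, stderr and syslog lives in its own file, that truncation keeps the last `limit` bytes of each file (as "tail -c" would), and that the index records a starting offset and a length per log. The class LogIndexSketch, its Entry type and the printed layout are hypothetical and are not Hadoop's actual log.index code or format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: index entries computed after the task has exited. */
class LogIndexSketch {

  /** Start offset and length of the retained bytes of one log file. */
  static final class Entry {
    final long startingOffset;
    final long length;

    Entry(long startingOffset, long length) {
      this.startingOffset = startingOffset;
      this.length = length;
    }
  }

  /** Entry for a file truncated to its last `limit` bytes once the task is done. */
  static Entry afterTail(long originalLength, long limit) {
    long kept = Math.min(originalLength, limit);
    // The truncated file holds only the kept bytes, so they start at offset 0.
    return new Entry(0L, kept);
  }

  /** Builds one entry per log name from the pre-truncation lengths. */
  static Map<String, Entry> buildIndex(Map<String, Long> lengths, long limit) {
    Map<String, Entry> index = new LinkedHashMap<>();
    for (Map.Entry<String, Long> e : lengths.entrySet()) {
      index.put(e.getKey(), afterTail(e.getValue(), limit));
    }
    return index;
  }

  public static void main(String[] args) {
    Map<String, Long> lengths = new LinkedHashMap<>();
    lengths.put("stdout", 5000L);
    lengths.put("stderr", 100L);
    lengths.put("syslog", 0L);
    // The writer runs after truncation, so the recorded lengths match the files on disk.
    for (Map.Entry<String, Entry> e : buildIndex(lengths, 1024L).entrySet()) {
      System.out.println(e.getKey() + " " + e.getValue().startingOffset
          + " " + e.getValue().length);
    }
  }
}
```

The point of the sketch is the ordering: whoever truncates the logs is also the one who knows the final offsets, which is why writing the index from the task itself cannot give correct values.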

> User's task-logs filling up local disks on the TaskTrackers
> ---
>
> Key: MAPREDUCE-1100
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1100
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Vinod K V
>Assignee: Vinod K V
> Attachments: MAPREDUCE-1100-20091102.txt, 
> MAPREDUCE-1100-20091106.txt, MAPREDUCE-1100-20091216.2.txt, 
> patch-1100-fix-ydist.2.txt
>
>
> Some user's jobs are filling up TT disks by outrageous logging. 
> mapreduce.task.userlog.limit.kb is not enabled on the cluster. Disks are 
> getting filled up before task-log cleanup via 
> mapred.task.userlog.retain.hours can kick in.




[jira] Commented: (MAPREDUCE-1638) Divide MapReduce into API and implementation source trees

2010-04-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852427#action_12852427
 ] 

Steve Loughran commented on MAPREDUCE-1638:
---

I know of people who are looking at alternate execution engines to the JT, so 
having that bit of the API decoupled from the implementation would be good. I 
don't (yet) see the need for separate JARs though, it only complicates things 
and creates new problems.

> Divide MapReduce into API and implementation source trees
> -
>
> Key: MAPREDUCE-1638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: build, client
>Reporter: Tom White
>Assignee: Tom White
>
> I think it makes sense to separate the MapReduce source into public API and 
> implementation trees. The public API could be broken further into kernel and 
> library trees.




[jira] Commented: (MAPREDUCE-1514) Add documentation on permissions, limitations, error handling for archives.

2010-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852418#action_12852418
 ] 

Hadoop QA commented on MAPREDUCE-1514:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12440433/MAPREDUCE-1514.patch
  against trunk revision 929712.

+1 @author.  The patch does not contain any @author tags.

+0 tests included.  The patch appears to be a documentation patch that 
doesn't require tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/85/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/85/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/85/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/85/console

This message is automatically generated.

> Add documentation on permissions, limitations, error handling for archives.
> ---
>
> Key: MAPREDUCE-1514
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1514
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1514.patch
>
>
> Add documentation on the permissions aspect of archives and other limitations 
> that they might have. Also add documentation on error handling (with respect 
> to quotas/otherwise) to the forrest docs.




[jira] Updated: (MAPREDUCE-1665) kill and modify should not be the same acl

2010-04-01 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-1665:


Summary: kill and modify should not be the same acl  (was: modify acl 
should not grant kill perms)

> kill and modify should not be the same acl
> --
>
> Key: MAPREDUCE-1665
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1665
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.22.0
>Reporter: Allen Wittenauer
>
> The permission to kill a job/task should be split out from modification.  
> There are definitely instances where someone who can kill a job should not be 
> able to modify it.  [Third person job monitoring, for example, such as we 
> have here at LinkedIn.]  




[jira] Created: (MAPREDUCE-1665) modify acl should not grant kill perms

2010-04-01 Thread Allen Wittenauer (JIRA)
modify acl should not grant kill perms
--

 Key: MAPREDUCE-1665
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1665
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Affects Versions: 0.22.0
Reporter: Allen Wittenauer


The permission to kill a job/task should be split out from modification.  There 
are definitely instances where someone who can kill a job should not be able to 
modify it.  [Third person job monitoring, for example, such as we have here at 
LinkedIn.]  
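
As an illustration of the requested split, here is a minimal sketch in which kill and modify are checked against separate user lists. The class JobAclSketch, the Op enum and the user names are hypothetical and do not correspond to any existing Hadoop ACL class or configuration key; the issue itself only states that the two permissions should be separable.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical per-job ACL with independent kill and modify permissions. */
class JobAclSketch {

  /** Operations that are granted separately in this sketch. */
  enum Op { KILL, MODIFY }

  private final Map<Op, Set<String>> allowed = new EnumMap<>(Op.class);

  /** Grants exactly one operation to a user; nothing else is implied. */
  void grant(Op op, String user) {
    allowed.computeIfAbsent(op, k -> new HashSet<>()).add(user);
  }

  /** Checks the list for the requested operation only. */
  boolean isAllowed(Op op, String user) {
    return allowed.getOrDefault(op, Collections.emptySet()).contains(user);
  }

  public static void main(String[] args) {
    JobAclSketch acl = new JobAclSketch();
    acl.grant(Op.KILL, "monitor");   // third-party monitoring may kill
    acl.grant(Op.KILL, "owner");
    acl.grant(Op.MODIFY, "owner");   // only the owner may modify
    System.out.println("monitor kill:   " + acl.isAllowed(Op.KILL, "monitor"));
    System.out.println("monitor modify: " + acl.isAllowed(Op.MODIFY, "monitor"));
  }
}
```

With separate lists, a monitoring account can be given kill rights without gaining the ability to change the job.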




[jira] Commented: (MAPREDUCE-1100) User's task-logs filling up local disks on the TaskTrackers

2010-04-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852395#action_12852395
 ] 

Steve Loughran commented on MAPREDUCE-1100:
---

OK, so what do you want here? I could +1 MAPREDUCE-1057 and, if Hudson is 
happy, push it out to the 0.21/0.22 branches. MAPREDUCE-1100 would be the patch 
that goes into trunk.

The other fix would be documentation. I've added something on logging to 
[http://wiki.apache.org/hadoop/DiskSetup] but it could be improved. More 
subtly, any production cluster that doesn't have a limit on log size is doomed. 
Some preflight checks could note this and perhaps warn.

> User's task-logs filling up local disks on the TaskTrackers
> ---
>
> Key: MAPREDUCE-1100
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1100
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Vinod K V
>Assignee: Vinod K V
> Attachments: MAPREDUCE-1100-20091102.txt, 
> MAPREDUCE-1100-20091106.txt, MAPREDUCE-1100-20091216.2.txt, 
> patch-1100-fix-ydist.2.txt
>
>
> Some user's jobs are filling up TT disks by outrageous logging. 
> mapreduce.task.userlog.limit.kb is not enabled on the cluster. Disks are 
> getting filled up before task-log cleanup via 
> mapred.task.userlog.retain.hours can kick in.




[jira] Updated: (MAPREDUCE-1646) Task Killing tests

2010-04-01 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-1646:
-

Attachment: TaskKilling_1646.patch

All the comments are incorporated into the latest patch.

> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch, TaskKilling_1646.patch
>
>
> The following tasks are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Kill the task. After 
> the task is killed, make sure that the
> output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Fail the task by 
> simulating the map. After the task has failed,
> make sure that the output/_temporary/_attempt-id directory is cleaned up. The 
> important difference we are trying to
> check is between kill and fail; there would be a subtle difference.




[jira] Updated: (MAPREDUCE-1646) Task Killing tests

2010-04-01 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-1646:
-

Attachment: (was: TaskKilling_1646.patch)

> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch, TaskKilling_1646.patch
>
>
> The following tasks are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Kill the task. After 
> the task is killed, make sure that the
> output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Fail the task by 
> simulating the map. After the task is failed,
> make sure that the output/_temporary/_attempt-id directory is cleaned up. The 
> important difference we are trying to
> check is between kill and fail; there would be a subtle difference.




[jira] Updated: (MAPREDUCE-1646) Task Killing tests

2010-04-01 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-1646:
-

Attachment: TaskKilling_1646.patch

All the comments incorporated into the latest patch.

> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch, TaskKilling_1646.patch
>
>
> The following tasks are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Kill the task. After 
> the task is killed, make sure that the
> output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Fail the task by 
> simulating the map. After the task is failed,
> make sure that the output/_temporary/_attempt-id directory is cleaned up. The 
> important difference we are trying to
> check is between kill and fail; there would be a subtle difference.




[jira] Commented: (MAPREDUCE-1523) Sometimes rumen trace generator fails to extract the job finish time.

2010-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852350#action_12852350
 ] 

Hadoop QA commented on MAPREDUCE-1523:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12440421/mapreduce-1523--2010-03-31a-1612PDT.patch
  against trunk revision 929712.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 13 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/84/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/84/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/84/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/84/console

This message is automatically generated.

> Sometimes rumen trace generator fails to extract the job finish time.
> -
>
> Key: MAPREDUCE-1523
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1523
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hong Tang
>Assignee: Dick King
> Attachments: mapreduce-1523--2010-03-31a-1612PDT.patch
>
>
> We saw sometimes (not very often) that rumen may fail to extract the job 
> finish time from Hadoop 0.20 history log.




[jira] Commented: (MAPREDUCE-1073) Progress reported for pipes tasks is incorrect.

2010-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852295#action_12852295
 ] 

Hadoop QA commented on MAPREDUCE-1073:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12440406/mapreduce-1073--2010-03-31.patch
  against trunk revision 929712.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/83/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/83/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/83/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/83/console

This message is automatically generated.

> Progress reported for pipes tasks is incorrect.
> ---
>
> Key: MAPREDUCE-1073
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1073
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: pipes
>Affects Versions: 0.20.1
>Reporter: Sreekanth Ramakrishnan
> Attachments: mapreduce-1073--2010-03-31.patch, 
> MAPREDUCE-1073_yhadoop20.patch
>
>
> Currently in pipes, 
> {{org.apache.hadoop.mapred.pipes.PipesMapRunner.run(RecordReader, 
> OutputCollector, Reporter)}} we do the following:
> {code}
> while (input.next(key, value)) {
>   downlink.mapItem(key, value);
>   if (skipping) {
>     downlink.flush();
>   }
> }
> {code}
> This consumes all the records for the current task and takes the task 
> progress to 100%, whereas the actual pipes application would still be 
> trailing behind.




[jira] Created: (MAPREDUCE-1664) Job Acls affect Queue Acls

2010-04-01 Thread Ravi Gummadi (JIRA)
Job Acls affect Queue Acls
--

 Key: MAPREDUCE-1664
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1664
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Affects Versions: 0.22.0
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
 Fix For: 0.22.0


MAPREDUCE-1307 introduced job ACLs for securing job-level operations. In 
current trunk, both queue ACLs and job ACLs are checked (ANDed together) before 
a job-level operation is allowed. So, for operations like killJob, killTask, 
and setJobPriority, the user must be listed in both 
mapred.queue.{queuename}.acl-administer-jobs and 
mapreduce.job.acl-modify-job. This needs to change so that a user listed in 
either mapred.queue.{queuename}.acl-administer-jobs or 
mapreduce.job.acl-modify-job can do killJob, killTask, and setJobPriority.
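
The difference between the current and the proposed check can be sketched as 
follows. This is only an illustration under stated assumptions: the class and 
method names are hypothetical, and the two sets stand in for the users listed 
in mapred.queue.{queuename}.acl-administer-jobs and 
mapreduce.job.acl-modify-job. It is not the JobTracker code.

```java
import java.util.Set;

// Hypothetical sketch; these are not Hadoop's actual ACL classes.
final class AclCheckSketch {

  // Behavior described for current trunk: the user must be in BOTH ACLs.
  static boolean allowedWithAnd(String user,
                                Set<String> queueAdmins,
                                Set<String> jobModifiers) {
    return queueAdmins.contains(user) && jobModifiers.contains(user);
  }

  // Proposed behavior: membership in EITHER ACL is sufficient.
  static boolean allowedWithOr(String user,
                               Set<String> queueAdmins,
                               Set<String> jobModifiers) {
    return queueAdmins.contains(user) || jobModifiers.contains(user);
  }

  public static void main(String[] args) {
    Set<String> queueAdmins = Set.of("alice");   // assumed queue admin
    Set<String> jobModifiers = Set.of("bob");    // assumed job modifier

    // alice administers the queue but is not in the job's modify ACL.
    System.out.println(allowedWithAnd("alice", queueAdmins, jobModifiers)); // false
    System.out.println(allowedWithOr("alice", queueAdmins, jobModifiers));  // true
  }
}
```

With the AND check, a queue administer-jobs user can be locked out of killJob 
by a job's own ACL; with the OR check, either list grants the operation.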




[jira] Updated: (MAPREDUCE-1662) TaskRunner.prepare() and close() can be removed

2010-04-01 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1662:
---

Status: Patch Available  (was: Open)

> TaskRunner.prepare() and close() can be removed
> ---
>
> Key: MAPREDUCE-1662
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1662
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1662.txt
>
>
> TaskRunner.prepare() and close() methods call only mapOutputFile.removeAll(). 
> The removeAll() call is always a no-op in prepare(), because the directory 
> is always empty during startup of the task. The removeAll() call in close() 
> is useless, because it is followed by an attempt directory cleanup. Since the 
> map output files are in the attempt directory, the call to close() is useless.
> After MAPREDUCE-842, these calls are under TaskTracker space, passing the 
> wrong conf. Now, the calls do not make sense at all.
> I think we can remove the methods.




[jira] Updated: (MAPREDUCE-1662) TaskRunner.prepare() and close() can be removed

2010-04-01 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1662:
---

Attachment: patch-1662.txt

Patch removes TaskRunner.prepare(). 
ReduceTaskRunner.close() sets a status "closed" on the Progress object. So, 
patch does not remove TaskRunner.close(), but removes mapOutputFile.removeAll() 
from close() methods in MapTaskRunner and ReduceTaskRunner.

> TaskRunner.prepare() and close() can be removed
> ---
>
> Key: MAPREDUCE-1662
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1662
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1662.txt
>
>
> TaskRunner.prepare() and close() methods call only mapOutputFile.removeAll(). 
> The removeAll() call is always a no-op in prepare(), because the directory 
> is always empty during startup of the task. The removeAll() call in close() 
> is useless, because it is followed by an attempt directory cleanup. Since the 
> map output files are in the attempt directory, the call to close() is useless.
> After MAPREDUCE-842, these calls are under TaskTracker space, passing the 
> wrong conf. Now, the calls do not make sense at all.
> I think we can remove the methods.




[jira] Assigned: (MAPREDUCE-1653) Add apache header to UserNamePermission.java

2010-04-01 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal reassigned MAPREDUCE-1653:
-

Assignee: Balaji Rajagopalan

> Add apache header to UserNamePermission.java
> 
>
> Key: MAPREDUCE-1653
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1653
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
>Affects Versions: 0.20.3
> Environment: Herriot
>Reporter: Balaji Rajagopalan
>Assignee: Balaji Rajagopalan
>Priority: Trivial
> Attachments: patch_1653.txt
>
>   Original Estimate: 0.02h
>  Remaining Estimate: 0.02h
>
> Add the missing header to the file. 




[jira] Assigned: (MAPREDUCE-1646) Task Killing tests

2010-04-01 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal reassigned MAPREDUCE-1646:
-

Assignee: Vinay Kumar Thota

> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch
>
>
> The following tasks are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Kill the task. After 
> the task is killed, make sure that the
> output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Fail the task by 
> simulating the map. After the task is failed,
> make sure that the output/_temporary/_attempt-id directory is cleaned up. The 
> important difference we are trying to
> check is between kill and fail; there would be a subtle difference.




[jira] Commented: (MAPREDUCE-1533) reduce or remove usage of String.format() usage in CapacityTaskScheduler.updateQSIObjects

2010-04-01 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852271#action_12852271
 ] 

Hemanth Yamijala commented on MAPREDUCE-1533:
-

A few comments on the patch:

- Move TaskSchedulingContext.JOB_SCHEDULING_INFO_FORMAT_STRING to 
JobSchedulingInfoHolder
- QueueSchedulingContext does not seem to be the right place to define 
JobSchedulingInfoHolder. The container class manages data related to queues, 
and not a specific Job. Maybe it should be a separate class.
- By having JobSchedulingInfoHolder.toString called outside heartbeats, 
we are addressing the core problem in the issue. But there still exist a few 
calls to this method from within a JobTracker lock, such as CLI APIs like 
getAllJobs, though they occur very infrequently compared to 
heartbeats. With that context, should we implement a more optimized version of 
toString, maybe using StringBuilder (as was suggested elsewhere)?
- The changes in JobQueue.updateStatsOnRunningJob and 
TaskDataView.getSlotsOccupied can be avoided, but the intent of the change can 
still be met, by changing the algorithm in TaskDataView.getSlotsPerTask. I am 
giving this idea based on preliminary patches in MAPREDUCE-1354. There, we 
optimized getNumSlotsPer{Map|Reduce} to be unsynchronized, by making the 
corresponding variables volatile. Hence, getSlotsPerTask can now be implemented 
as:
{code}
int getSlotsPerTask(JobInProgress job) {
  return job.getNumSlotsPerMap();
}
{code}
and likewise for reduces.
This reduces the number of changes in the patch.
- I would suggest a few documentation changes to better document the contract 
of the scheduling info.
-- Document that getSchedulingInfo returns a stringified representation of the 
job scheduling info set in setJobSchedulingInfo.
-- Document that getJobSchedulingInfo will return the stringified 
representation of the job scheduling info on the Client, but the actual object 
on the server. (Note that we are deserializing the stringified representation 
of the scheduling info on the client, not the actual object itself)
-- Document the intent of setJobSchedulingInfo - i.e. it is for optimization of 
heartbeats and allows lazy construction of the stringified representation on a 
need basis. This is an important design choice to capture, I think.
- Do we need mapred.JobStatus.{get|set}JobSchedulingInfo? Can we not define 
them only in mapreduce.JobStatus?
- I don't think we need the cast in JobStatus.readFields casting the scheduling 
info string to an Object, because this is allowed anyway.
- Given we are going to store the stringified representation of the scheduling 
info on the client, should we retain the name of the JobStatus variable as 
schedulingInfo only?
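
To make the lazy-stringification idea above concrete, here is a minimal 
sketch. It assumes a holder object with a few integer counters; the class 
names, fields, and output text are hypothetical and are not taken from the 
patch. Only the method names setJobSchedulingInfo and getSchedulingInfo echo 
the ones discussed above. The heartbeat path only stores the holder, and the 
string is built with StringBuilder when a caller actually asks for it.

```java
// Hypothetical sketch; class and field names are illustrative, not from the patch.
final class SchedulingInfoSketch {
  private final int runningMaps;
  private final int runningReduces;
  private final int waitingMaps;

  SchedulingInfoSketch(int runningMaps, int runningReduces, int waitingMaps) {
    this.runningMaps = runningMaps;
    this.runningReduces = runningReduces;
    this.waitingMaps = waitingMaps;
  }

  // Built on demand with StringBuilder, avoiding String.format()'s
  // pattern parsing on every call.
  @Override
  public String toString() {
    return new StringBuilder(64)
        .append(runningMaps).append(" running map tasks, ")
        .append(runningReduces).append(" running reduce tasks, ")
        .append(waitingMaps).append(" waiting map tasks")
        .toString();
  }
}

final class JobStatusSketch {
  // volatile so readers see the latest holder without taking a lock.
  private volatile Object schedulingInfo = "NA";

  // Cheap: called on the (assumed) heartbeat path, stores a reference only.
  void setJobSchedulingInfo(Object info) {
    this.schedulingInfo = info;
  }

  // Stringification happens here, on the rarely used read path.
  String getSchedulingInfo() {
    return schedulingInfo.toString();
  }
}
```

The design choice is that the cost of formatting moves from every heartbeat 
to the comparatively rare callers that need the text.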

> reduce or remove usage of String.format() usage in 
> CapacityTaskScheduler.updateQSIObjects
> -
>
> Key: MAPREDUCE-1533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1533
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Rajesh Balamohan
>Assignee: Amar Kamat
> Attachments: mapreduce-1533-v1.4.patch
>
>
> When short jobs are executed in Hadoop with OutOfBandHeardBeat=true, the JT 
> executes the heartBeat() method heavily. This internally makes a call to 
> CapacityTaskScheduler.updateQSIObjects(), which in turn calls String.format() 
> to set the job scheduling information. Depending on the size of the 
> "jobQueuesManager" and "queueInfoMap" data structures, the number of times 
> String.format() gets executed becomes very high. String.format() internally 
> does pattern matching, which turns out to be very heavy. (This was revealed 
> while profiling the JT: almost 57% of time was spent in 
> CapacityScheduler.assignTasks(), out of which String.format() took 46%.)
> Would it be possible to do String.format() only at the time of invoking 
> JobInProgress.getSchedulingInfo? This might reduce the pressure on the JT 
> while processing heartbeats.




[jira] Assigned: (MAPREDUCE-1654) Automate the job killing system test case.

2010-04-01 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal reassigned MAPREDUCE-1654:
-

Assignee: Balaji Rajagopalan

> Automate the job killing system test case. 
> ---
>
> Key: MAPREDUCE-1654
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1654
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
>Affects Versions: 0.20.3
> Environment: Herriot system test case development env. 
>Reporter: Balaji Rajagopalan
>Assignee: Balaji Rajagopalan
> Attachments: patch_1654.txt, patch_1654.txt
>
>   Original Estimate: 0.27h
>  Remaining Estimate: 0.27h
>

