[jira] Updated: (MAPREDUCE-1088) JobHistory files should have narrower 0600 perms
[ https://issues.apache.org/jira/browse/MAPREDUCE-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-1088:
-------------------------------------

    Attachment: MAPREDUCE-1088_yhadoop20.patch

Emergency bug-fix to the yahoo hadoop20 distribution, along with HADOOP-6304 - I'll upload one for trunk shortly with narrower perms.

> JobHistory files should have narrower 0600 perms
> ------------------------------------------------
>
>                 Key: MAPREDUCE-1088
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1088
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.20.1
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.20.2
>
>         Attachments: MAPREDUCE-1088_yhadoop20.patch
>
> Currently the perms on JobHistory files are 0740; I propose we make it 0600.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
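The change narrows the history files from 0740 (owner read/write/execute, group read) to 0600 (owner read/write only). As a hypothetical illustration of what that narrowing means, using plain java.nio rather than the Hadoop FsPermission API the patch itself uses:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class NarrowPerms {
    // Tighten a file from 0740 (rwxr-----) to 0600 (rw-------).
    // Illustrative sketch only: the real JobTracker code goes through
    // Hadoop's FileSystem/FsPermission, not java.nio.
    public static void narrow(Path historyFile) throws Exception {
        Set<PosixFilePermission> perms =
            PosixFilePermissions.fromString("rw-------"); // octal 0600
        Files.setPosixFilePermissions(historyFile, perms);
    }

    public static void main(String[] args) throws Exception {
        Path f = Files.createTempFile("jobhistory", ".log");
        // Simulate the old 0740 perms, then narrow them.
        Files.setPosixFilePermissions(f,
            PosixFilePermissions.fromString("rwxr-----"));
        narrow(f);
        System.out.println(
            PosixFilePermissions.toString(Files.getPosixFilePermissions(f)));
        Files.delete(f);
    }
}
```

With 0600, only the file owner (the JobTracker user) can read the history, which is the point of the fix: group members lose the read access that 0740 granted.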
[jira] Updated: (MAPREDUCE-1088) JobHistory files should have narrower 0600 perms
[ https://issues.apache.org/jira/browse/MAPREDUCE-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-1088:
-------------------------------------

    Affects Version/s: 0.20.1
        Fix Version/s: 0.20.2
             Assignee: Arun C Murthy
[jira] Updated: (MAPREDUCE-1103) Additional JobTracker metrics
[ https://issues.apache.org/jira/browse/MAPREDUCE-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sharad Agarwal updated MAPREDUCE-1103:
--------------------------------------

    Attachment: 1103.patch

Updated to trunk

> Additional JobTracker metrics
> -----------------------------
>
>                 Key: MAPREDUCE-1103
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1103
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>             Fix For: 0.21.0
>
>         Attachments: 1103.patch, 1103.patch
>
> It would be useful to track the following additional JobTracker metrics:
> running{map|reduce}tasks
> busy{map|reduce}slots
> reserved{map|reduce}slots
[jira] Commented: (MAPREDUCE-28) TestQueueManager takes too long and times out some times
[ https://issues.apache.org/jira/browse/MAPREDUCE-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767213#action_12767213 ]

Hadoop QA commented on MAPREDUCE-28:
------------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12422365/MAPREDUCE-28-8.patch
  against trunk revision 826565.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 34 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/181/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/181/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/181/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/181/console

This message is automatically generated.
> TestQueueManager takes too long and times out some times
> --------------------------------------------------------
>
>                 Key: MAPREDUCE-28
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-28
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Amareshwari Sriramadasu
>            Assignee: V.V.Chaitanya Krishna
>             Fix For: 0.21.0
>
>         Attachments: MAPREDUCE-28-1.txt, MAPREDUCE-28-2.txt, MAPREDUCE-28-3.txt, MAPREDUCE-28-4.txt, MAPREDUCE-28-5.txt, MAPREDUCE-28-6.txt, MAPREDUCE-28-7.txt, MAPREDUCE-28-8.patch
>
> TestQueueManager takes a long time to run and sometimes times out.
> See the failure at http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3875/testReport/.
> Looking at the console output, the test was timed out before it finished.
> On my machine, the test takes about 5 minutes.
[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767211#action_12767211 ]

Aaron Kimball commented on MAPREDUCE-972:
-----------------------------------------

Because there are other operations in the distcp job (e.g., the {{write()}} calls made during the actual upload) that should time out far faster than once per thirty minutes in the event of an error. Using a single timeout value for all operations makes overall program execution considerably less efficient than it should be. Writes and renames in distcp have different expected running times; we should treat them accordingly.

> distcp can timeout during rename operation to s3
> ------------------------------------------------
>
>                 Key: MAPREDUCE-972
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-972
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distcp
>    Affects Versions: 0.20.1
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, MAPREDUCE-972.4.patch, MAPREDUCE-972.5.patch, MAPREDUCE-972.patch
>
> rename() in S3 is implemented as copy + delete. The S3 copy operation can perform very slowly, which may cause task timeout.
[jira] Commented: (MAPREDUCE-1105) CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
[ https://issues.apache.org/jira/browse/MAPREDUCE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767210#action_12767210 ]

rahul k singh commented on MAPREDUCE-1105:
------------------------------------------

Summary for the yahoo distribution patch:

- Removed the existing "mapred.capacity-scheduler.queue..max.map.slots" and "mapred.capacity-scheduler.queue..max.reduce.slots" variables. These were used to throttle the queue, i.e., they were the hard limit and did not allow the queue to grow further.
- Added the new parameter "mapred.capacity-scheduler.queue..maximum-capacity". maximum-capacity defines a limit beyond which a queue cannot use the capacity of the cluster, which provides a means to limit how much excess capacity a queue can use. By default, there is no limit: the default value of -1 implies a queue can use the complete capacity of the cluster. The maximum-capacity of a queue can only be greater than or equal to its minimum capacity. This property can be used to keep long-running jobs from occupying more than a certain percentage of the cluster, which, in the absence of pre-emption, could affect the capacity guarantees of other queues. One important thing to note is that maximum-capacity is a percentage, so the maximum capacity in absolute terms changes with the cluster's capacity: if a large number of nodes or racks is added to the cluster, the absolute maximum capacity increases accordingly.
- Added some testcases for unit testing the maximum-capacity knob.
- Removed the testcases for max.map.slots and max.reduce.slots.

Summary of changes for the patch for 21:

- Removed the "mapred.capacity-scheduler.queue..max.map.slots" and "mapred.capacity-scheduler.queue..max.reduce.slots" entries.
- Removed the testcases for the same.

> CapacityScheduler: It should be possible to set queue hard-limit beyond it's actual capacity
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1105
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1105
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.21.0
>            Reporter: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPRED-1105-21-1.patch, MAPREDUCE-1105-version20.patch.txt
>
> Currently the CS caps a queue's capacity to its actual capacity if a hard-limit is specified to be greater than its actual capacity. We should allow the queue to go up to the hard-limit if specified.
> Also, I propose we change the hard-limit unit to be a percentage rather than #slots.
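As a hedged illustration of the new knob, a capacity-scheduler.xml fragment might look like the following (the property name embeds the queue name where the double dot appears above; the queue name "research" and the value are assumptions for illustration, not from the patch):

```xml
<!-- Hypothetical fragment: cap the "research" queue at 40% of the
     cluster even when excess capacity is available. The default of -1
     means no cap, i.e. the queue may grow to the whole cluster. -->
<property>
  <name>mapred.capacity-scheduler.queue.research.maximum-capacity</name>
  <value>40</value>
</property>
```

Because the value is a percentage, the 40% cap here scales automatically as nodes are added to the cluster, unlike the removed max.map.slots/max.reduce.slots limits, which were absolute slot counts.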
[jira] Updated: (MAPREDUCE-28) TestQueueManager takes too long and times out some times
[ https://issues.apache.org/jira/browse/MAPREDUCE-28?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rahul k singh updated MAPREDUCE-28:
-----------------------------------

    Status: Patch Available  (was: Open)
[jira] Commented: (MAPREDUCE-1070) Deadlock in FairSchedulerServlet
[ https://issues.apache.org/jira/browse/MAPREDUCE-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767186#action_12767186 ]

Hudson commented on MAPREDUCE-1070:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #84 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/84/])
. Prevent a deadlock in the fair scheduler servlet. Contributed by Todd Lipcon

> Deadlock in FairSchedulerServlet
> --------------------------------
>
>                 Key: MAPREDUCE-1070
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1070
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.20.2
>
>         Attachments: deadlock.png, mapreduce-1070-branch20.txt, mapreduce-1070.txt
>
> FairSchedulerServlet can cause a deadlock with the JobTracker
[jira] Commented: (MAPREDUCE-1113) mumak compiles aspects even if skip.contrib is true
[ https://issues.apache.org/jira/browse/MAPREDUCE-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767183#action_12767183 ]

Chris Douglas commented on MAPREDUCE-1113:
------------------------------------------

I wouldn't oppose adding the ant-contrib dep in principle, and binary tarballs without contrib would be great, but this is quickly expanding outside its original scope. If this is going into 0.21, then a light touch would be strongly preferred. Perhaps a separate issue for refactoring the build would be appropriate? Either that, or this can be appropriated/renamed for that purpose while MAPREDUCE-1038 tracks the excessive Mumak aspect generation.

> mumak compiles aspects even if skip.contrib is true
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-1113
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1113
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: build, contrib/mumak
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>             Fix For: 0.21.0, 0.22.0
>
>         Attachments: mapreduce-1113.txt
>
> The compile-aspects task in mumak's build.xml runs regardless of the skip.contrib property. Momentarily uploading a patch to fix this.
[jira] Updated: (MAPREDUCE-1112) Fix CombineFileInputFormat for hadoop 0.20
[ https://issues.apache.org/jira/browse/MAPREDUCE-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated MAPREDUCE-1112:
----------------------------------

    Resolution: Duplicate
        Status: Resolved  (was: Patch Available)

Duplicate of HADOOP-5759.

> Fix CombineFileInputFormat for hadoop 0.20
> ------------------------------------------
>
>                 Key: MAPREDUCE-1112
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1112
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>             Fix For: 0.20.2
>
>         Attachments: MAPREDUCE-1112.1.patch, MAPREDUCE-1112.2.patch
>
> HADOOP-5759 is already fixed as a part of MAPREDUCE-364 in hadoop 0.21.
> This will fix the same problem with CombineFileInputFormat for hadoop 0.20.
[jira] Updated: (MAPREDUCE-847) Adding Apache License Headers and reduce releaseaudit warnings to zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-847:
------------------------------------

    Status: Open  (was: Patch Available)

* This should exclude {{src/test/tools/data/rumen}}, perhaps even {{src/test/tools/data}}; some of the files are compressed in trunk (MAPREDUCE-1077)
* libhdfs may be moved to the HDFS project soon (MAPREDUCE-665); the excludes should be removed when/if it does

Other than that, this looks good

> Adding Apache License Headers and reduce releaseaudit warnings to zero
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-847
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-847
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Giridharan Kesavan
>            Assignee: Giridharan Kesavan
>         Attachments: MAPREDUCE-847-v1.PATCH, MAPREDUCE-847.PATCH
>
> [rat:report] Summary
> [rat:report] -------
> [rat:report] Notes: 14
> [rat:report] Binaries: 178
> [rat:report] Archives: 49
> [rat:report] Standards: 1364
> [rat:report]
> [rat:report] Apache Licensed: 1152
> [rat:report] Generated Documents: 9
> [rat:report]
> [rat:report] JavaDocs are generated and so license header is optional
> [rat:report] Generated files do not required license headers
> [rat:report]
> [rat:report] 203 Unknown Licenses
[jira] Commented: (MAPREDUCE-1041) TaskStatuses map in TaskInProgress should be made package private instead of protected
[ https://issues.apache.org/jira/browse/MAPREDUCE-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767178#action_12767178 ]

Hudson commented on MAPREDUCE-1041:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #83 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/83/])
. Make TaskInProgress::taskStatuses map package-private. Contributed by Jothi Padmanabhan

> TaskStatuses map in TaskInProgress should be made package private instead of protected
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1041
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1041
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>            Reporter: Jothi Padmanabhan
>            Assignee: Jothi Padmanabhan
>            Priority: Minor
>             Fix For: 0.21.0
>
>         Attachments: mapred-1041.patch
>
> MAPREDUCE-1028 made TaskStatuses protected. As Nigel pointed out in that Jira, making it package private would suffice.
[jira] Commented: (MAPREDUCE-1012) Context interfaces should be Public Evolving
[ https://issues.apache.org/jira/browse/MAPREDUCE-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767179#action_12767179 ]

Hudson commented on MAPREDUCE-1012:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #83 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/83/])
. Mark Context interfaces as public evolving. Contributed by Tom White

> Context interfaces should be Public Evolving
> --------------------------------------------
>
>                 Key: MAPREDUCE-1012
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1012
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 0.21.0
>            Reporter: Tom White
>            Assignee: Tom White
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: MAPREDUCE-1012.patch
>
> As discussed in MAPREDUCE-954 the nascent context interfaces should be marked as Public Evolving to facilitate future evolution.
[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767176#action_12767176 ]

Chris Douglas commented on MAPREDUCE-972:
-----------------------------------------

Why wouldn't one just set the task timeout for the distcp job to 30 minutes?
[jira] Commented: (MAPREDUCE-1112) Fix CombineFileInputFormat for hadoop 0.20
[ https://issues.apache.org/jira/browse/MAPREDUCE-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767174#action_12767174 ]

Zheng Shao commented on MAPREDUCE-1112:
---------------------------------------

My bad. Yes, it's the same as HADOOP-5759, except for a file name change. I will commit HADOOP-5759 to branch-0.20 and mark this one as a duplicate.
[jira] Updated: (MAPREDUCE-932) Rumen needs a job trace sorter
[ https://issues.apache.org/jira/browse/MAPREDUCE-932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-932:
------------------------------------

    Status: Open  (was: Patch Available)

Could this compress the data for the testcase, as in MAPREDUCE-1077?

> Rumen needs a job trace sorter
> ------------------------------
>
>                 Key: MAPREDUCE-932
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-932
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Dick King
>            Assignee: Dick King
>         Attachments: MAPREDUCE-932--2009-09-18-PM.patch, MAPREDUCE-932--2009-09-18.patch, patch-932--2009-08-31--1702.patch
>
> Rumen reads job history logs and produces job traces. The jobs in a job trace do not occur in any promised order. Certain tools need the jobs to be ordered by job submission time. We should include, in Rumen, a tool to sort job traces.
[jira] Updated: (MAPREDUCE-892) command line tool to list all tasktrackers and their status
[ https://issues.apache.org/jira/browse/MAPREDUCE-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-892:
------------------------------------

    Status: Open  (was: Patch Available)

Canceling patch, as it has gone stale.

> command line tool to list all tasktrackers and their status
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-892
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-892
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.21.0
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: MAPRED-892.patch.3, MAPREDUCE-892.patch, MAPREDUCE-892.patch, MAPREDUCE-892.patch.1
>
> The "hadoop mradmin -report" could list all the tasktrackers that the JobTracker knows about. It would also list a brief status summary for each TaskTracker. (This is similar to the hadoop dfsadmin -report command that lists all Datanodes)
[jira] Updated: (MAPREDUCE-1070) Deadlock in FairSchedulerServlet
[ https://issues.apache.org/jira/browse/MAPREDUCE-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-1070:
-------------------------------------

      Resolution: Fixed
   Fix Version/s: 0.20.2
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I committed this. Thanks, Todd!
[jira] Updated: (MAPREDUCE-962) NPE in ProcfsBasedProcessTree.destroy()
[ https://issues.apache.org/jira/browse/MAPREDUCE-962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-962:
------------------------------------

    Description:
This causes the following exception in TaskMemoryManagerThread. I observed this while running TestTaskTrackerMemoryManager.
{code}
2009-09-02 12:08:25,835 WARN mapred.TaskMemoryManagerThread (TaskMemoryManagerThread.java:run(239)) - \
Uncaught exception in TaskMemoryManager while managing memory of attempt_20090902120812252_0001_m_03_0 : \
java.lang.NullPointerException
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertPidPgrpidForMatch(ProcfsBasedProcessTree.java:234)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertAndDestroyProcessGroup(ProcfsBasedProcessTree.java:257)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.destroy(ProcfsBasedProcessTree.java:286)
        at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:229)
{code}

  was:
This causes the following exception in TaskMemoryManagerThread. I observed this while running TestTaskTrackerMemoryManager.
{code}
2009-09-02 12:08:25,835 WARN mapred.TaskMemoryManagerThread (TaskMemoryManagerThread.java:run(239)) - Uncaught exception in TaskMemoryManager while managing memory of attempt_20090902120812252_0001_m_03_0 : java.lang.NullPointerException
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertPidPgrpidForMatch(ProcfsBasedProcessTree.java:234)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.assertAndDestroyProcessGroup(ProcfsBasedProcessTree.java:257)
        at org.apache.hadoop.util.ProcfsBasedProcessTree.destroy(ProcfsBasedProcessTree.java:286)
        at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:229)
{code}

> NPE in ProcfsBasedProcessTree.destroy()
> ---------------------------------------
>
>                 Key: MAPREDUCE-962
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-962
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>            Reporter: Vinod K V
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-6232.patch, MR-962.patch, MR-962.v1.patch
>
> This causes the following exception in TaskMemoryManagerThread. I observed this while running TestTaskTrackerMemoryManager.
> {code}
> 2009-09-02 12:08:25,835 WARN mapred.TaskMemoryManagerThread (TaskMemoryManagerThread.java:run(239)) - \
> Uncaught exception in TaskMemoryManager while managing memory of attempt_20090902120812252_0001_m_03_0 : \
> java.lang.NullPointerException
>         at org.apache.hadoop.util.ProcfsBasedProcessTree.assertPidPgrpidForMatch(ProcfsBasedProcessTree.java:234)
>         at org.apache.hadoop.util.ProcfsBasedProcessTree.assertAndDestroyProcessGroup(ProcfsBasedProcessTree.java:257)
>         at org.apache.hadoop.util.ProcfsBasedProcessTree.destroy(ProcfsBasedProcessTree.java:286)
>         at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:229)
> {code}
[jira] Updated: (MAPREDUCE-1041) TaskStatuses map in TaskInProgress should be made package private instead of protected
[ https://issues.apache.org/jira/browse/MAPREDUCE-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated MAPREDUCE-1041:
-------------------------------------

      Resolution: Fixed
   Fix Version/s: 0.21.0
                      (was: 0.22.0)
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

+1

I committed this. Thanks, Jothi!
[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767166#action_12767166 ]

Aaron Kimball commented on MAPREDUCE-972:
-----------------------------------------

My proposal is slightly different from that. The progress thread is in one of three states:

1) {{inRename == true && isComplete == false}}
2) {{inRename == false && isComplete == false}}
3) {{isComplete == true}}

When inRename is set to true, the progress thread will call {{progress()}} every few seconds, for up to a maximum of {{distcp.rename.timeout}} seconds. If it is still in this state after {{distcp.rename.timeout}} seconds have elapsed since the state began, it will set inRename to false.

When inRename is false, it just sits there, waiting for another rename operation to start. It sleeps and occasionally polls for a state change on inRename or isComplete. Changing inRename back to true again returns to the previously-described state; {{distcp.rename.timeout}} starts anew from this time point.

If isComplete is true, the thread exits immediately. The {{Mapper.close()}} method will set isComplete to true to ensure that the thread shuts down. (As the thread is {{setDaemon(true)}}, the JVM will exit even without this detail, but it is good hygiene to do so anyway.)

It is not sufficient to simply call progress() right before rename(). Experience has shown that when uploading large files to S3, the rename() operation itself can take in excess of 10 minutes. rename() in S3 is implemented as copy-and-delete; for multi-GB files, this can take a long time.

If we just tell people to set their global task timeout to 30 minutes, then this will delay task restarts under other conditions where the timeout value is expected to be considerably shorter (e.g., an individual file {{write()}} operation). This can adversely affect distcp performance in the general case.
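The three-state proposal above could be sketched roughly as follows. This is a hypothetical illustration, not the attached patch: the class name, the plain Runnable standing in for Reporter.progress(), and the millisecond timeout parameter (standing in for distcp.rename.timeout) are all assumptions.

```java
// Hedged sketch of the three-state rename-progress thread described above.
public class RenameProgressThread extends Thread {
    private volatile boolean inRename = false;
    private volatile boolean isComplete = false;
    private final long renameTimeoutMs; // stands in for distcp.rename.timeout
    private final Runnable progress;    // stands in for Reporter.progress()

    public RenameProgressThread(long renameTimeoutMs, Runnable progress) {
        this.renameTimeoutMs = renameTimeoutMs;
        this.progress = progress;
        setDaemon(true); // JVM can exit even if the thread lingers
    }

    public void startRename() { inRename = true; }   // state 2 -> state 1
    public void endRename()   { inRename = false; }  // state 1 -> state 2
    public void complete()    { isComplete = true; } // -> state 3

    @Override
    public void run() {
        long renameStart = -1;
        while (!isComplete) {            // state 3: exit immediately
            if (inRename) {              // state 1: keep the task alive
                if (renameStart < 0) {
                    // Timeout window restarts for each new rename.
                    renameStart = System.currentTimeMillis();
                }
                if (System.currentTimeMillis() - renameStart > renameTimeoutMs) {
                    inRename = false;    // give up reporting for this rename
                } else {
                    progress.run();
                }
            } else {
                renameStart = -1;        // state 2: idle, just poll
            }
            try {
                Thread.sleep(50);        // "every few seconds" in real use
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}
```

Usage would bracket the slow call: startRename() before fs.rename(src, dst), endRename() after it returns, and complete() from Mapper.close().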
[jira] Updated: (MAPREDUCE-1012) Context interfaces should be Public Evolving
[ https://issues.apache.org/jira/browse/MAPREDUCE-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1012: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed this. Thanks, Tom! > Context interfaces should be Public Evolving > > > Key: MAPREDUCE-1012 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1012 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: client >Affects Versions: 0.21.0 >Reporter: Tom White >Assignee: Tom White >Priority: Blocker > Fix For: 0.21.0 > > Attachments: MAPREDUCE-1012.patch > > > As discussed in MAPREDUCE-954 the nascent context interfaces should be marked > as Public Evolving to facilitate future evolution. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767161#action_12767161 ] Chris Douglas commented on MAPREDUCE-972: - I see. Extending the FileSystem API is a non-starter, so we can move on from that. Progress threads in general are discouraged (e.g. HADOOP-5052). If I understand your proposal, the progress thread would report starting from the first rename, but stop after some configurable interval. In most cases, I'm not sure how this would differ from simply setting the task timeout higher, since progress is reported between renames. Also, this wouldn't help renames after the thread exits. Would it be sufficient to add a call to progress() right before the rename (after the delete)? In that case, setting the task timeout higher would extend the time allowed for each rename, which is the right level of granularity, anyway. It won't do this automatically for s3 destinations, but pushing that detail into distcp is not ideal, either. One could add a FilterFileSystem that resets a persistent progress thread for each rename, manage all the signaling/locking etc., but its behavior seems indistinguishable from this much simpler tweak. Would this be sufficient? > distcp can timeout during rename operation to s3 > > > Key: MAPREDUCE-972 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-972 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distcp >Affects Versions: 0.20.1 >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, > MAPREDUCE-972.4.patch, MAPREDUCE-972.5.patch, MAPREDUCE-972.patch > > > rename() in S3 is implemented as copy + delete. The S3 copy operation can > perform very slowly, which may cause task timeout. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-987) Exposing MiniDFS and MiniMR clusters as a single process command-line
[ https://issues.apache.org/jira/browse/MAPREDUCE-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767135#action_12767135 ] Chris Douglas commented on MAPREDUCE-987: - This seems appropriate for the test jar. Small notes: * This picks up \-D params like the generic parser; would it make sense to also accept \-conf? The other params make less sense in this context, though it may be worth considering Tool/ToolRunner * It'd be better if sleepForever monitored the Mini\*Cluster, rather than waking up every minute for no reason. Not sure if it makes sense to include a poison pill (Path?) + configurable polling interval that might signal an orderly shutdown. * If this is intended for tests, should {{start}} wait for the TT/DNs to come up before returning? > Exposing MiniDFS and MiniMR clusters as a single process command-line > - > > Key: MAPREDUCE-987 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-987 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: build, test >Reporter: Philip Zeyliger >Assignee: Philip Zeyliger >Priority: Minor > Attachments: HDFS-621-0.20-patch, HDFS-621.patch, MAPREDUCE-987.patch > > > It's hard to test non-Java programs that rely on significant mapreduce > functionality. The patch I'm proposing shortly will let you just type > "bin/hadoop jar hadoop-hdfs-hdfswithmr-test.jar minicluster" to start a > cluster (internally, it's using Mini{MR,HDFS}Cluster) with a specified number > of daemons, etc. A test that checks how some external process interacts with > Hadoop might start minicluster as a subprocess, run through its thing, and > then simply kill the java subprocess. > I've been using just such a system for a couple of weeks, and I like it. > It's significantly easier than developing a lot of scripts to start a > pseudo-distributed cluster, and then clean up after it. I figure others > might find it useful as well. > I'm at a bit of a loss as to where to put it in 0.21. 
hdfs-with-mr tests > have all the required libraries, so I've put it there. I could conceivably > split this into "minimr" and "minihdfs", but it's specifically the fact that > they're configured to talk to each other that I like about having them > together. And one JVM is better than two for my test programs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
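The "monitor the Mini\*Cluster rather than waking up every minute" suggestion above could look roughly like this (a sketch; {{MiniClusterWait}} and its methods are hypothetical, not from the attached patch):

```java
// Hypothetical sketch of the review suggestion: instead of a sleep loop that
// wakes periodically for no reason, block on a latch that a shutdown hook or
// poison-pill watcher counts down.
import java.util.concurrent.CountDownLatch;

public class MiniClusterWait {
    private final CountDownLatch shutdown = new CountDownLatch(1);

    // Blocks with no periodic wakeups until shutdown is requested.
    public void sleepForever() throws InterruptedException {
        shutdown.await();
    }

    // Called from a JVM shutdown hook, a signal handler, or a thread that
    // notices a poison-pill Path appearing on the MiniDFS cluster.
    public void requestShutdown() {
        shutdown.countDown();
    }
}
```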
[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767129#action_12767129 ] Aaron Kimball commented on MAPREDUCE-972: - As discussed earlier, the FileSystem API does not provide a means for operations such as rename() to get access to a Progressable. I do not see a straightforward way to improve the S3FS / S3N implementations without extending the FileSystem API to add operations such as {{rename(src, dst, progress)}}. Are you +1 on doing that? Either way, I agree with your criticisms of the progress thread implementation. I have the following plan for improving this: * Make the progress thread's lifetime equal to that of the mapper. The first rename() operation starts it, and the join() moves to close() * Progress thread is only active when a rename() operation is underway. Use a volatile boolean to track this state. Otherwise it just sleeps. * Use {{Thread.interrupt()}} / {{isInterrupted()}} to interrupt the sleep in the main loop, so that we don't have to wait the full three seconds before the thread exits. * Add {{distcp.rename.timeout}} as a parameter which sets a max lifetime for the inner loop of the progress thread. Default value will be 10 seconds, but if it detects that the destination filesystem is s3n:// or s3fs://, ups this to fifteen minutes. - Aaron > distcp can timeout during rename operation to s3 > > > Key: MAPREDUCE-972 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-972 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distcp >Affects Versions: 0.20.1 >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, > MAPREDUCE-972.4.patch, MAPREDUCE-972.5.patch, MAPREDUCE-972.patch > > > rename() in S3 is implemented as copy + delete. The S3 copy operation can > perform very slowly, which may cause task timeout. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1077) When rumen reads a truncated job tracker log, it produces a job whose outcome is SUCCESS. Should be null.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767126#action_12767126 ] Hudson commented on MAPREDUCE-1077: --- Integrated in Hadoop-Mapreduce-trunk-Commit #82 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/82/]) . Fix Rumen so that truncated tasks do not mark the job as successful. Contributed by Dick King > When rumen reads a truncated job tracker log, it produces a job whose outcome > is SUCCESS. Should be null. > -- > > Key: MAPREDUCE-1077 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1077 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.21.0 >Reporter: Dick King >Assignee: Dick King > Fix For: 0.21.0 > > Attachments: mapreduce-1077--2009-10-14.patch, > mapreduce-1077--2009-10-16.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-931) rumen should use its own interpolation classes to create runtimes for simulated tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767121#action_12767121 ] Hudson commented on MAPREDUCE-931: -- Integrated in Hadoop-Mapreduce-trunk-Commit #81 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/81/]) . Use built-in interpolation classes for making up task runtimes in Rumen. Contributed by Dick King > rumen should use its own interpolation classes to create runtimes for > simulated tasks > - > > Key: MAPREDUCE-931 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-931 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Dick King >Assignee: Dick King >Priority: Minor > Fix For: 0.21.0 > > Attachments: MAPREDUCE-931--2009-09-16--1633.patch, patch-931-b.patch > > > Currently, when a simulator or benchmark is running and simulating hadoop > jobs using rumen data, and rumen's runtime system is used to get execution > times for the tasks in the simulated jobs, rumen would use some ad hoc code, > despite the fact that rumen has a perfectly good interpolation framework to > generate random variables that fit discrete CDFs. > We should use the interpolation framework. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1077) When rumen reads a truncated job tracker log, it produces a job whose outcome is SUCCESS. Should be null.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1077: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed this. Thanks, Dick! > When rumen reads a truncated job tracker log, it produces a job whose outcome > is SUCCESS. Should be null. > -- > > Key: MAPREDUCE-1077 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1077 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.21.0 >Reporter: Dick King >Assignee: Dick King > Fix For: 0.21.0 > > Attachments: mapreduce-1077--2009-10-14.patch, > mapreduce-1077--2009-10-16.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-931) rumen should use its own interpolation classes to create runtimes for simulated tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-931: Resolution: Fixed Fix Version/s: 0.21.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed this. Thanks, Dick! > rumen should use its own interpolation classes to create runtimes for > simulated tasks > - > > Key: MAPREDUCE-931 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-931 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Dick King >Assignee: Dick King >Priority: Minor > Fix For: 0.21.0 > > Attachments: MAPREDUCE-931--2009-09-16--1633.patch, patch-931-b.patch > > > Currently, when a simulator or benchmark is running and simulating hadoop > jobs using rumen data, and rumen's runtime system is used to get execution > times for the tasks in the simulated jobs, rumen would use some ad hoc code, > despite the fact that rumen has a perfectly good interpolation framework to > generate random variables that fit discrete CDFs. > We should use the interpolation framework. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-990) Making distributed cache getters in JobContext never return null
[ https://issues.apache.org/jira/browse/MAPREDUCE-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-990: Status: Open (was: Patch Available) The patch is stale > Making distributed cache getters in JobContext never return null > > > Key: MAPREDUCE-990 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-990 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Philip Zeyliger >Assignee: Philip Zeyliger >Priority: Minor > Attachments: MAPREDUCE-990.patch, MAPREDUCE-990.patch.txt > > > MAPREDUCE-898 moved distributed cache setters and getters into Job and > JobContext. Since the API is new, I'd like to propose that those getters > never return null, but instead always return an array, even if it's empty. > If people don't like this change, I can instead merely update the javadoc to > reflect the fact that null may be returned. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-906) Updated Sqoop documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767079#action_12767079 ] Hudson commented on MAPREDUCE-906: -- Integrated in Hadoop-Mapreduce-trunk #116 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/116/]) . Update Sqoop documentation. Contributed by Aaron Kimball > Updated Sqoop documentation > --- > > Key: MAPREDUCE-906 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-906 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/sqoop >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Fix For: 0.22.0 > > Attachments: MAPREDUCE-906.2.patch, MAPREDUCE-906.3.patch, > MAPREDUCE-906.4.patch, MAPREDUCE-906.patch > > > Here's the latest documentation for Sqoop, in both user-guide and manpage > form. Built with asciidoc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1104) RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty server
[ https://issues.apache.org/jira/browse/MAPREDUCE-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767080#action_12767080 ] Hudson commented on MAPREDUCE-1104: --- Integrated in Hadoop-Mapreduce-trunk #116 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/116/]) . Initialize RecoveryManager in JobTracker cstr called by Mumak. Contributed by Hong Tang > RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty > server > > > Key: MAPREDUCE-1104 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1104 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/mumak >Affects Versions: 0.21.0, 0.22.0 >Reporter: Hong Tang >Assignee: Hong Tang > Fix For: 0.21.0 > > Attachments: mapreduce-1104-20091014.patch, mapreduce-1104.patch > > > RecoveryManager initialization is not copied to the JobTracker constructor > Mumak depends on. This leads to NPE in JT Jetty server. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1061) Gridmix unit test should validate input/output bytes
[ https://issues.apache.org/jira/browse/MAPREDUCE-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767081#action_12767081 ] Hudson commented on MAPREDUCE-1061: --- Integrated in Hadoop-Mapreduce-trunk #116 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/116/]) . Add unit test validating byte specifications for gridmix jobs. > Gridmix unit test should validate input/output bytes > > > Key: MAPREDUCE-1061 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1061 > Project: Hadoop Map/Reduce > Issue Type: Test >Affects Versions: 0.21.0 >Reporter: Chris Douglas >Assignee: Chris Douglas > Fix For: 0.21.0 > > Attachments: 1061-0.patch, M1061-1.patch, M1061-2.patch > > > TestGridmixSubmission currently verifies only that the correct number of jobs > have been run. The test should validate the I/O parameters it claims to > satisfy. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1061) Gridmix unit test should validate input/output bytes
[ https://issues.apache.org/jira/browse/MAPREDUCE-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767044#action_12767044 ] Hudson commented on MAPREDUCE-1061: --- Integrated in Hadoop-Mapreduce-trunk-Commit #80 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/80/]) . Add unit test validating byte specifications for gridmix jobs. > Gridmix unit test should validate input/output bytes > > > Key: MAPREDUCE-1061 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1061 > Project: Hadoop Map/Reduce > Issue Type: Test >Affects Versions: 0.21.0 >Reporter: Chris Douglas >Assignee: Chris Douglas > Fix For: 0.21.0 > > Attachments: 1061-0.patch, M1061-1.patch, M1061-2.patch > > > TestGridmixSubmission currently verifies only that the correct number of jobs > have been run. The test should validate the I/O parameters it claims to > satisfy. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1104) RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty server
[ https://issues.apache.org/jira/browse/MAPREDUCE-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767040#action_12767040 ] Hudson commented on MAPREDUCE-1104: --- Integrated in Hadoop-Mapreduce-trunk-Commit #79 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/79/]) . Initialize RecoveryManager in JobTracker cstr called by Mumak. Contributed by Hong Tang > RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty > server > > > Key: MAPREDUCE-1104 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1104 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/mumak >Affects Versions: 0.21.0, 0.22.0 >Reporter: Hong Tang >Assignee: Hong Tang > Fix For: 0.21.0 > > Attachments: mapreduce-1104-20091014.patch, mapreduce-1104.patch > > > RecoveryManager initialization is not copied to the JobTracker constructor > Mumak depends on. This leads to NPE in JT Jetty server. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1061) Gridmix unit test should validate input/output bytes
[ https://issues.apache.org/jira/browse/MAPREDUCE-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1061: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed this. > Gridmix unit test should validate input/output bytes > > > Key: MAPREDUCE-1061 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1061 > Project: Hadoop Map/Reduce > Issue Type: Test >Affects Versions: 0.21.0 >Reporter: Chris Douglas >Assignee: Chris Douglas > Fix For: 0.21.0 > > Attachments: 1061-0.patch, M1061-1.patch, M1061-2.patch > > > TestGridmixSubmission currently verifies only that the correct number of jobs > have been run. The test should validate the I/O parameters it claims to > satisfy. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-906) Updated Sqoop documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767035#action_12767035 ] Hudson commented on MAPREDUCE-906: -- Integrated in Hadoop-Mapreduce-trunk-Commit #78 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/78/]) . Update Sqoop documentation. Contributed by Aaron Kimball > Updated Sqoop documentation > --- > > Key: MAPREDUCE-906 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-906 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/sqoop >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Fix For: 0.22.0 > > Attachments: MAPREDUCE-906.2.patch, MAPREDUCE-906.3.patch, > MAPREDUCE-906.4.patch, MAPREDUCE-906.patch > > > Here's the latest documentation for Sqoop, in both user-guide and manpage > form. Built with asciidoc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1104) RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty server
[ https://issues.apache.org/jira/browse/MAPREDUCE-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1104: - Resolution: Fixed Fix Version/s: (was: 0.22.0) Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) +1 I committed this. Thanks, Hong! > RecoveryManager not initialized in SimulatorJobTracker led to NPE in JT Jetty > server > > > Key: MAPREDUCE-1104 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1104 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/mumak >Affects Versions: 0.21.0, 0.22.0 >Reporter: Hong Tang >Assignee: Hong Tang > Fix For: 0.21.0 > > Attachments: mapreduce-1104-20091014.patch, mapreduce-1104.patch > > > RecoveryManager initialization is not copied to the JobTracker constructor > Mumak depends on. This leads to NPE in JT Jetty server. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-906) Updated Sqoop documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-906: Resolution: Fixed Fix Version/s: 0.22.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) +1 I committed this. Thanks, Aaron! > Updated Sqoop documentation > --- > > Key: MAPREDUCE-906 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-906 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: contrib/sqoop >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Fix For: 0.22.0 > > Attachments: MAPREDUCE-906.2.patch, MAPREDUCE-906.3.patch, > MAPREDUCE-906.4.patch, MAPREDUCE-906.patch > > > Here's the latest documentation for Sqoop, in both user-guide and manpage > form. Built with asciidoc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-972) distcp can timeout during rename operation to s3
[ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-972: Status: Open (was: Patch Available) Really sorry to find this issue so late, but a progress thread that forbids tasks from timing out is not a good solution, particularly for distcp, where task timeouts are both legal and useful. If s3 requires a more elaborate rename mechanism, is there a way to push this into its implementation? While distcp may be a heavier user than most user jobs, the latter would also appreciate a more robust solution. Starting and joining a thread for every rename is also not an ideal design; the current implementation polls {{isComplete}} only every three seconds, slowing every rename. > distcp can timeout during rename operation to s3 > > > Key: MAPREDUCE-972 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-972 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: distcp >Affects Versions: 0.20.1 >Reporter: Aaron Kimball >Assignee: Aaron Kimball > Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, > MAPREDUCE-972.4.patch, MAPREDUCE-972.5.patch, MAPREDUCE-972.patch > > > rename() in S3 is implemented as copy + delete. The S3 copy operation can > perform very slowly, which may cause task timeout. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-64) Map-side sort is hampered by io.sort.record.percent
[ https://issues.apache.org/jira/browse/MAPREDUCE-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767022#action_12767022 ] Chris Douglas commented on MAPREDUCE-64: bq. one simple way might be to simply add TRACE level log messages at every collect() call with the current values of every index plus the spill number [...] That could be an interesting visualization. I'd already made up the diagrams, but anything that helps the analysis and validation would be welcome. I'd rather not add a trace to the committed code, but data from it sounds great. bq. I ran a simple test where I was running a sort of 10 byte records, and it turned out that the "optimal" io.sort.record.percent caused my job to be significantly slower. It was the case then that a small number of large spills actually ran slower than a large number of small spills. Did we ever determine what that issue was? I think we should try to understand why the theory isn't agreeing with observations here. IIRC those tests used a non-RawComparator, right? Runping reported similar results, where hits to concurrent collection were more expensive than small spills. The current theory is that keeping the map thread unblocked is usually better for performance. Based on this observation, I'm hoping that the spill.percent can also be eliminated at some point in the future, though the performance we're leaving on the table there is probably not as severe and is more difficult to generalize. Microbenchmarks may also not capture the expense of merging many small spills in a busy, shared cluster, where HDFS and other tasks are competing for disk bandwidth. I'll be very interested in metrics from MAPREDUCE-1115, as they would help to flesh out this hypothesis. The documentation (such as it is) in HADOOP-2919 describes the existing code.
The metadata are tracked using a set of indices marking the start and end of a spill ({{kvstart}}, {{kvend}}) and the current position ({{kvindex}}), while the serialization data are described by similar markers ({{bufstart}}, {{bufend}}, {{bufindex}}). There are two other indices carried over from the existing design. {{bufmark}} is the position in the serialized record data of the end of the last fully serialized record. {{bufvoid}} is necessary for the RawComparator interface, which requires contiguous ranges for key compares; if a serialized key crosses the end of the buffer, it must be copied to the front to satisfy the aforementioned API spec. All of these are retained; the role of each is largely unchanged. The proposed design adds another parameter, the {{equator}} (while {{kvstart}} and {{bufstart}} could be replaced with a single variable similar to {{equator}}, the effort seemed misspent). The record serialization moves "forward" in the buffer, while the metadata are allocated in 16-byte blocks in the opposite direction. This is illustrated in the following diagram: !M64-0i.png|thumbnail! The role played by kvoffsets and kvindices is preserved; logically, particularly in the spill, each is interpreted in roughly the same way. In the new code, the allocation is not static, but will instead expand with the serialized records. This avoids degenerate cases for combiners and multilevel merges (though not necessarily optimal performance). Spills are triggered under two conditions: either the soft limit is reached (collection proceeds concurrently with the spill) or a record is large enough to require a spill before it can be written to the buffer (collection is blocked). The former case is illustrated here: !M64-1i.png|thumbnail! 
The {{equator}} is moved to an offset proportional to the average record size (caveats [above|https://issues.apache.org/jira/browse/MAPREDUCE-64?focusedCommentId=12765984&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12765984]), {{kvindex}} is moved off the equator, aligned with the end of the array (int alignment, also so no metadata block will span the end of the array). Collection proceeds again from the equator, growing toward the ends of the spill. Should either run out of space, collection will block until the spill completes. Note that there is no partially written data when the soft limit is reached; it can only be triggered in collect, not in the blocking buffer. The other case to consider is when record data are partially written into the collection buffer, but the available space is exhausted: !M64-2i.png|thumbnail! Here, the equator is moved to the beginning of the partial record and collection blocks. When the spill completes, the metadata are written off the equator and serialization of the record can continue. During collection, indices are adjusted only when holding a lock. As in the current code, the lock is only obtained in collect when one of the possible conditions for spilling is met.
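A toy model of the equator-based index arithmetic described above (the 16-byte metadata block size and index names follow the description, but the class itself is illustrative, not the actual MapTask code):

```java
// Toy model only: serialized record bytes advance "forward" from the equator
// while 16-byte metadata blocks are allocated "backward", both modulo the
// buffer length. Names mirror the description; the class is hypothetical.
public class EquatorSketch {
    static final int METASIZE = 16;   // one metadata block per record

    final int bufLen;
    int equator;    // boundary between record data and metadata regions
    int bufindex;   // next write position for serialized record bytes
    int kvindex;    // next metadata block, moving opposite to bufindex

    EquatorSketch(int bufLen, int equator) {
        this.bufLen = bufLen;
        this.equator = equator;
        this.bufindex = equator;
        // first metadata block sits immediately "behind" the equator
        this.kvindex = ((equator - METASIZE) % bufLen + bufLen) % bufLen;
    }

    void writeRecord(int serializedBytes) {
        bufindex = (bufindex + serializedBytes) % bufLen;             // data forward
        kvindex = ((kvindex - METASIZE) % bufLen + bufLen) % bufLen;  // meta backward
    }

    public static void main(String[] args) {
        EquatorSketch b = new EquatorSketch(1 << 10, 512);
        b.writeRecord(100);
        // prints "612 480": data advanced to 612, metadata retreated to 480
        System.out.println(b.bufindex + " " + b.kvindex);
    }
}
```

The modular arithmetic is why neither region has a fixed size: the two fronts simply grow toward each other until the soft limit or an exhausted buffer triggers a spill.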
[jira] Updated: (MAPREDUCE-64) Map-side sort is hampered by io.sort.record.percent
[ https://issues.apache.org/jira/browse/MAPREDUCE-64?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-64: --- Attachment: M64-0i.png M64-1i.png M64-2i.png > Map-side sort is hampered by io.sort.record.percent > --- > > Key: MAPREDUCE-64 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-64 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Arun C Murthy >Assignee: Chris Douglas > Attachments: M64-0.patch, M64-0i.png, M64-1.patch, M64-1i.png, > M64-2.patch, M64-2i.png, M64-3.patch > > > Currently io.sort.record.percent is a fairly obscure, per-job configurable, > expert-level parameter which controls how much accounting space is available > for records in the map-side sort buffer (io.sort.mb). Typically values for > io.sort.mb (100) and io.sort.record.percent (0.05) imply that we can store > ~350,000 records in the buffer before necessitating a sort/combine/spill. > However for many applications which deal with small records e.g. the > world-famous wordcount and it's family this implies we can only use 5-10% of > io.sort.mb i.e. (5-10M) before we spill inspite of having _much_ more memory > available in the sort-buffer. The word-count for e.g. results in ~12 spills > (given hdfs block size of 64M). The presence of a combiner exacerbates the > problem by piling serialization/deserialization of records too... > Sure, jobs can configure io.sort.record.percent, but it's tedious and > obscure; we really can do better by getting the framework to automagically > pick it by using all available memory (upto io.sort.mb) for either the data > or accounting. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
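The ~350,000-record figure in the description above can be checked directly; the 16-byte-per-record accounting size is taken from the existing map-side sort design discussed earlier in this thread, and the class name is illustrative:

```java
// Worked arithmetic for the io.sort.record.percent example in the description.
public class SortBufferMath {
    public static void main(String[] args) {
        long ioSortBytes = 100L << 20;            // io.sort.mb = 100
        double recordPercent = 0.05;              // io.sort.record.percent
        long accountingBytes = (long) (ioSortBytes * recordPercent);
        long perRecord = 16;                      // accounting bytes per record
        long maxRecords = accountingBytes / perRecord;
        // With 10-byte records, the data region only ever holds ~3.3 MB of
        // the ~95 MB available before accounting space forces a spill.
        long dataBytesUsed = maxRecords * 10;
        // prints "327680 3276800"
        System.out.println(maxRecords + " " + dataBytesUsed);
    }
}
```

That is, roughly 327,680 records ("~350,000" in the description) and about 3.3 MB of record data per spill, which is why small-record jobs like wordcount spill so often despite a mostly empty buffer.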