[jira] [Commented] (MAPREDUCE-2187) map tasks timeout during sorting

2011-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084086#comment-13084086
 ] 

Hudson commented on MAPREDUCE-2187:
---

Integrated in Hadoop-Mapreduce-trunk #753 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/753/])
MAPREDUCE-2187 - Missed adding the file

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1156962
Files : 
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestCombineOutputCollector.java


> map tasks timeout during sorting
> 
>
> Key: MAPREDUCE-2187
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2187
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.20.2, 0.20.205.0
>Reporter: Gianmarco De Francisci Morales
>Assignee: Anupam Seth
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2187-20-security-v2.patch, 
> MAPREDUCE-2187-20-security.patch, MAPREDUCE-2187-22.patch, 
> MAPREDUCE-2187-MR-279-v2.patch, MAPREDUCE-2187-branch-MR-279.patch, 
> MAPREDUCE-2187-trunk-v2.patch, MAPREDUCE-2187-trunk-v3.patch, 
> MAPREDUCE-2187-trunk.patch
>
>
> During the execution of a large job, the map tasks time out:
> {code}
> INFO mapred.JobClient: Task Id : attempt_201010290414_60974_m_57_1, 
> Status : FAILED
> Task attempt_201010290414_60974_m_57_1 failed to report status for 609 
> seconds. Killing!
> {code}
> The mapper has already finished and, according to the logs, the timeout 
> occurs during the merge-sort phase.
> The intermediate data generated by the map task is quite large, which I think 
> is the cause of the problem.
> The logs show that the merge-sort had been running for 10 minutes when the task 
> was killed.
> I think the mapred.Merger should call Reporter.progress() somewhere.
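
The suggestion above can be illustrated with a minimal sketch in plain Java. It does not use the real Hadoop classes and is not the fix from the attached patches: ProgressCallback, ProgressReportingMerge and the recordsPerReport parameter are hypothetical names chosen for illustration. The idea is only that a long-running k-way merge pings a liveness callback every so many records, so a task-timeout monitor does not mistake a slow merge for a hung task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical stand-in for a liveness callback such as Reporter.progress(). */
interface ProgressCallback {
    void progress();
}

final class ProgressReportingMerge {
    /**
     * K-way merges already-sorted runs of integers and calls the callback
     * once every recordsPerReport merged records.
     */
    static List<Integer> merge(List<List<Integer>> sortedRuns,
                               ProgressCallback reporter,
                               int recordsPerReport) {
        if (recordsPerReport <= 0) {
            throw new IllegalArgumentException("recordsPerReport must be positive");
        }
        // Heap entries are {value, runIndex, positionInRun}, ordered by value.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int r = 0; r < sortedRuns.size(); r++) {
            if (!sortedRuns.get(r).isEmpty()) {
                heap.add(new int[] {sortedRuns.get(r).get(0), r, 0});
            }
        }
        List<Integer> merged = new ArrayList<>();
        int sinceLastReport = 0;
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            merged.add(head[0]);
            // Refill the heap from the run the smallest record came from.
            List<Integer> run = sortedRuns.get(head[1]);
            int next = head[2] + 1;
            if (next < run.size()) {
                heap.add(new int[] {run.get(next), head[1], next});
            }
            // Tell the framework the task is still alive.
            if (++sinceLastReport >= recordsPerReport) {
                reporter.progress();
                sinceLastReport = 0;
            }
        }
        return merged;
    }
}
```

Reporting every N records rather than on every record keeps the callback overhead small; a time-based interval would work equally well.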

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-2765) DistCp Rewrite

2011-08-12 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated MAPREDUCE-2765:


Attachment: distcpv2_trunk.patch

Patch for hadoop-trunk.

> DistCp Rewrite
> --
>
> Key: MAPREDUCE-2765
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2765
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: distcp
>Affects Versions: 0.20.203.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: distcpv2.20.203.patch, distcpv2_trunk.patch
>
>
> This is a slightly modified version of the DistCp rewrite that Yahoo uses in 
> production today. The rewrite was ground-up, with specific focus on:
> 1. improved startup time (postponing as much work as possible to the MR job)
> 2. support for multiple copy-strategies
> 3. new features (e.g. -atomic, -async, -bandwidth.)
> 4. improved programmatic use
> Some effort has gone into refactoring what used to be achieved by a single 
> large (1.7 KLOC) source file, into a design that (hopefully) reads better too.
> The proposed DistCpV2 preserves command-line-compatibility with the old 
> version, and should be a drop-in replacement.
> New to v2:
> 1. Copy-strategies and the DynamicInputFormat:
>   A copy-strategy determines the policy by which source-file-paths are 
> distributed between map-tasks. (These boil down to the choice of the 
> input-format.) 
>   If no strategy is explicitly specified on the command-line, the policy 
> chosen is "uniform size", where v2 behaves identically to old-DistCp. (The 
> number of bytes transferred by each map-task is roughly equal, at a per-file 
> granularity.) 
>   Alternatively, v2 ships with a "dynamic" copy-strategy (in the 
> DynamicInputFormat). This policy acknowledges that 
>   (a)  dividing files based only on file-size might not be an 
> even distribution (E.g. if some datanodes are slower than others, or if some 
> files are skipped.)
>   (b) a "static" association of a source-path to a map increases 
> the likelihood of long-tails during copy.
>   The "dynamic" strategy divides the list-of-source-paths into a number 
> (> nMaps) of smaller parts. When each map completes its current list of 
> paths, it picks up a new list to process, if available. So if a map-task is 
> stuck on a slow (and not necessarily large) file, other maps can pick up the 
> slack. The thinner the file-list is sliced, the greater the parallelism (and 
> the lower the chances of long-tails). Within reason, of course: the number of 
> these short-lived list-files is capped at an overridable maximum.
>   Internal benchmarks against source/target clusters with some slow(ish) 
> datanodes have indicated significant performance gains when using the 
> dynamic-strategy. Gains are most pronounced when nFiles greatly exceeds nMaps.
>   Please note that the DynamicInputFormat might prove useful outside of 
> DistCp. It is hence available as a mapred/lib, unfettered to DistCpV2. Also 
> note that the copy-strategies have no bearing on the CopyMapper.map() 
> implementation.
>   
> 2. Improved startup-time and programmatic use:
>   When the old-DistCp runs with -update, and creates the 
> list-of-source-paths, it attempts to filter out files that might be skipped 
> (by comparing file-sizes, checksums, etc.) This significantly increases the 
> startup time (or the time spent in serial processing till the MR job is 
> launched), blocking the calling-thread. This becomes pronounced as nFiles 
> increases. (Internal benchmarks have seen situations where more time is spent 
> setting up the job than on the actual transfer.)
>   DistCpV2 postpones as much work as possible to the MR job. The 
> file-listing isn't filtered until the map-task runs (at which time, identical 
> files are skipped). DistCpV2 can now be run "asynchronously". The program 
> quits at job-launch, logging the job-id for tracking. Programmatically, the 
> DistCp.execute() returns a Job instance for progress-tracking.
>   
> 3. New features:
>   (a)   -async: As described in #2.
>   (b)   -atomic: Data is copied to a (user-specifiable) tmp-location, and 
> then moved atomically to destination.
>   (c)   -bandwidth: Enforces a limit on the bandwidth consumed per map.
>   (d)   -strategy: As above.
>   
> A more comprehensive description of the newer features, how the dynamic-strategy 
> works, etc. is available in src/site/xdoc/, and in the PDF that's generated 
> from it during the build.
> High on the list of things to do is support to parallelize copies on a 
> per-block level. (i.e. Incorporation of HDFS-222.)
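
The "dynamic" copy-strategy described above (slice the list of source paths into more chunks than there are maps, and let each map claim the next unclaimed chunk when it finishes its current one) can be sketched roughly as follows. This is a plain-Java illustration using threads in place of map-tasks, not the DynamicInputFormat code from the patch; DynamicChunkSketch, split and run are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class DynamicChunkSketch {
    /** Splits the path list into consecutive chunks of at most chunkSize paths. */
    static List<List<String>> split(List<String> paths, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < paths.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, paths.size());
            chunks.add(new ArrayList<>(paths.subList(i, end)));
        }
        return chunks;
    }

    /**
     * Starts nWorkers threads that each keep claiming the next unclaimed chunk
     * until none are left. Returns how many paths each worker handled.
     */
    static int[] run(List<List<String>> chunks, int nWorkers) {
        Queue<List<String>> pending = new ConcurrentLinkedQueue<>(chunks);
        int[] handled = new int[nWorkers];
        Thread[] workers = new Thread[nWorkers];
        for (int w = 0; w < nWorkers; w++) {
            final int id = w;
            workers[w] = new Thread(() -> {
                List<String> chunk;
                // A worker stuck on a slow chunk simply claims fewer chunks;
                // the other workers drain the rest of the queue.
                while ((chunk = pending.poll()) != null) {
                    handled[id] += chunk.size(); // a real mapper would copy each path here
                }
            });
            workers[w].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return handled;
    }
}
```

Smaller chunks give finer-grained balancing at the cost of more list-files, which is the trade-off the description mentions when it caps the number of chunks.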
> I look forward to comments, suggestions and discussion that will hopefully 
> ensue. I have this running agains

[jira] [Updated] (MAPREDUCE-2649) MR279: Fate of finished Applications on RM

2011-08-12 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-2649:
-

Attachment: MAPREDUCE-2649-v5.patch

Removed some unneeded code.

> MR279: Fate of finished Applications on RM
> --
>
> Key: MAPREDUCE-2649
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2649
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-2649-patch-mr279.txt, MAPREDUCE-2649-v2.patch, 
> MAPREDUCE-2649-v3.patch, MAPREDUCE-2649-v4.patch, MAPREDUCE-2649-v5.patch
>
>
> Today the RM keeps references to finished applications forever. Though this 
> is not sustainable in the long term, it keeps
> the user experience saner: users can revisit the RM UI and check the status of 
> their apps.
> We need to think about purging old references while keeping the UX sane.





[jira] [Commented] (MAPREDUCE-2765) DistCp Rewrite

2011-08-12 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084108#comment-13084108
 ] 

Mithun Radhakrishnan commented on MAPREDUCE-2765:
-

I've now attached a patch that applies to hadoop-trunk. This takes care of 
nearly all the review comments so far, save the following:
1. Preserve-status on the root-directory: I've left it out, because this can't 
really be handled. (For instance, if there is more than one source-path, the 
target-root's permissions are ambiguous.)
2. The count in deleteMissing(): It turns out that this is accurate.

I've also modified -atomic so that it no longer deletes the target-path if it 
exists. The target-path will need to be removed before an atomic commit can be 
made. As per Amareshwari's advice, this is checked before the copy is initiated. 
(There's also a check just before the commit is done, as a corner case. The 
copied data is retained in the temporary workspace.) Tests have been altered 
accordingly.

(We briefly toyed with the idea of using -overwrite along with -atomic, to 
forcibly overwrite the target, but this mixes disjoint concerns, while only 
offering the dubious convenience of deleting pre-existing targets. We're 
leaving -atomic and -overwrite mutually exclusive.)
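
To make the -atomic behaviour described above concrete, here is a minimal sketch on a local filesystem using java.nio.file. It is an analogy under stated assumptions, not the DistCpV2 code: AtomicCommitSketch and copyAtomically are hypothetical names, and the real tool works against Hadoop's FileSystem API rather than java.nio. The sketch checks for a pre-existing target before copying, stages the data in a work directory, checks again just before the commit, and leaves the staged copy in place if the commit is refused.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicCommitSketch {
    /**
     * Copies source into workDir, then moves the staged copy to target in one
     * atomic step. Refuses to run if target already exists.
     */
    static void copyAtomically(Path source, Path workDir, Path target) throws IOException {
        // Check before the copy is initiated, so no work is wasted.
        if (Files.exists(target)) {
            throw new FileAlreadyExistsException(target.toString());
        }
        Path staged = workDir.resolve(source.getFileName());
        Files.copy(source, staged, StandardCopyOption.REPLACE_EXISTING);

        // Corner case: the target may have appeared while the copy was running.
        // The staged data is deliberately left in workDir if this check fails.
        if (Files.exists(target)) {
            throw new FileAlreadyExistsException(target.toString());
        }
        // ATOMIC_MOVE requires workDir and target to be on the same file store.
        Files.move(staged, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Note that the second check narrows the race window but does not close it; whether an atomic move replaces an existing target is implementation-specific in java.nio.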

I'm leaving it mavenized, so that it may be built from 
mapreduce/src/contrib/distcpv2/ by "mvn package". The jar may then be used with 
"bin/hadoop jar hadoop-distcp.jar [OPTIONS]". The usage is described at greater 
length in the pdf generated in the build.

Thank you, reviewers.

> DistCp Rewrite
> --
>
> Key: MAPREDUCE-2765
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2765
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: distcp
>Affects Versions: 0.20.203.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: distcpv2.20.203.patch, distcpv2_trunk.patch
>
>
> This is a slightly modified version of the DistCp rewrite that Yahoo uses in 
> production today. The rewrite was ground-up, with specific focus on:
> 1. improved startup time (postponing as much work as possible to the MR job)
> 2. support for multiple copy-strategies
> 3. new features (e.g. -atomic, -async, -bandwidth.)
> 4. improved programmatic use
> Some effort has gone into refactoring what used to be achieved by a single 
> large (1.7 KLOC) source file, into a design that (hopefully) reads better too.
> The proposed DistCpV2 preserves command-line-compatibility with the old 
> version, and should be a drop-in replacement.
> New to v2:
> 1. Copy-strategies and the DynamicInputFormat:
>   A copy-strategy determines the policy by which source-file-paths are 
> distributed between map-tasks. (These boil down to the choice of the 
> input-format.) 
>   If no strategy is explicitly specified on the command-line, the policy 
> chosen is "uniform size", where v2 behaves identically to old-DistCp. (The 
> number of bytes transferred by each map-task is roughly equal, at a per-file 
> granularity.) 
>   Alternatively, v2 ships with a "dynamic" copy-strategy (in the 
> DynamicInputFormat). This policy acknowledges that 
>   (a)  dividing files based only on file-size might not be an 
> even distribution (E.g. if some datanodes are slower than others, or if some 
> files are skipped.)
>   (b) a "static" association of a source-path to a map increases 
> the likelihood of long-tails during copy.
>   The "dynamic" strategy divides the list-of-source-paths into a number 
> (> nMaps) of smaller parts. When each map completes its current list of 
> paths, it picks up a new list to process, if available. So if a map-task is 
> stuck on a slow (and not necessarily large) file, other maps can pick up the 
> slack. The thinner the file-list is sliced, the greater the parallelism (and 
> the lower the chances of long-tails). Within reason, of course: the number of 
> these short-lived list-files is capped at an overridable maximum.
>   Internal benchmarks against source/target clusters with some slow(ish) 
> datanodes have indicated significant performance gains when using the 
> dynamic-strategy. Gains are most pronounced when nFiles greatly exceeds nMaps.
>   Please note that the DynamicInputFormat might prove useful outside of 
> DistCp. It is hence available as a mapred/lib, unfettered to DistCpV2. Also 
> note that the copy-strategies have no bearing on the CopyMapper.map() 
> implementation.
>   
> 2. Improved startup-time and programmatic use:
>   When the old-DistCp runs with -update, and creates the 
> list-of-source-paths, it attempts to filter out files that might be skipped 
> (by comparing file-sizes, checksums, etc.) This significantly increases the 
> startup time (or the time spent in serial p

[jira] [Updated] (MAPREDUCE-2649) MR279: Fate of finished Applications on RM

2011-08-12 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-2649:
-

Status: Patch Available  (was: Open)

> MR279: Fate of finished Applications on RM
> --
>
> Key: MAPREDUCE-2649
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2649
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-2649-patch-mr279.txt, MAPREDUCE-2649-v2.patch, 
> MAPREDUCE-2649-v3.patch, MAPREDUCE-2649-v4.patch, MAPREDUCE-2649-v5.patch
>
>
> Today the RM keeps references to finished applications forever. Though this 
> is not sustainable in the long term, it keeps
> the user experience saner: users can revisit the RM UI and check the status of 
> their apps.
> We need to think about purging old references while keeping the UX sane.





[jira] [Commented] (MAPREDUCE-2655) MR279: Audit logs for YARN

2011-08-12 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084118#comment-13084118
 ] 

Thomas Graves commented on MAPREDUCE-2655:
--

I am going to add an RMAuditLogger class that has methods to log success/failure, 
similar to the JobTracker's AuditLogger class.

Some of these don't make sense to log, since authentication will throw 
away unauthorized requests.

Right now I plan on logging the following things:

RM:
- all refresh commands - success and failure
- Application submissions
- Application kills
- AM registrations - failure and success
- illegal container allocations from AM - invalid container id or container not 
owned by App
- successful container allocations and deallocations from AM

NM:
- Container allocation successes
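
A rough sketch of what such a success/failure audit helper could look like is below. It is not the RMAuditLogger from the eventual patch: the class name AuditLogSketch, the method signatures and the tab-separated key=value field names (USER, OPERATION, TARGET, RESULT, DESCRIPTION) are all assumptions made for illustration, and it uses java.util.logging only so that the example is self-contained.

```java
import java.util.logging.Logger;

final class AuditLogSketch {
    private static final Logger LOG = Logger.getLogger("AuditLogSketch");

    /** Builds one tab-separated key=value audit line; detail may be null. */
    static String format(String user, String operation, String target,
                         boolean success, String detail) {
        StringBuilder sb = new StringBuilder();
        sb.append("USER=").append(user)
          .append("\tOPERATION=").append(operation)
          .append("\tTARGET=").append(target)
          .append("\tRESULT=").append(success ? "SUCCESS" : "FAILURE");
        if (detail != null) {
            sb.append("\tDESCRIPTION=").append(detail);
        }
        return sb.toString();
    }

    /** Records a successful operation at INFO level. */
    static void logSuccess(String user, String operation, String target) {
        LOG.info(format(user, operation, target, true, null));
    }

    /** Records a failed operation, with the reason, at WARNING level. */
    static void logFailure(String user, String operation, String target, String reason) {
        LOG.warning(format(user, operation, target, false, reason));
    }
}
```

For example, an application submission might be recorded with logSuccess("alice", "Submit Application", "ResourceManager"), and a rejected container request with logFailure(..., "container not owned by App").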

> MR279: Audit logs for YARN 
> ---
>
> Key: MAPREDUCE-2655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>
> We need audit logs for YARN components:
> ResourceManager:
>  - All the refresh* protocol access points - refreshQueues, refreshNodes, 
> refreshProxyUsers,
> refreshUserToGroupMappings.
>  - All app-submissions, app-kills to RM.
>  - Illegal and successful(?) AM registrations.
>  - Illegal container allocations/deallocations from AMs
>  - Successful container allocations/deallocations from AMs too?
> NodeManager:
>  - Illegal container launches from AMs
>  - Successful container launches from AMs too?
> Not sure if we need audit logs from MR AMs.





[jira] [Commented] (MAPREDUCE-2655) MR279: Audit logs for YARN

2011-08-12 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084191#comment-13084191
 ] 

Thomas Graves commented on MAPREDUCE-2655:
--

Example of how you would configure the RM and NM audit logs:

rm.audit.logger=INFO,RMAUDIT
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger=${rm.audit.logger}
log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger=false
log4j.appender.RMAUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RMAUDIT.File=${hadoop.log.dir}/rm-audit.log
log4j.appender.RMAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RMAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.RMAUDIT.DatePattern=.yyyy-MM-dd

nm.audit.logger=INFO,NMAUDIT
log4j.logger.org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger=${nm.audit.logger}
log4j.additivity.org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger=false
log4j.appender.NMAUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.NMAUDIT.File=${hadoop.log.dir}/nm-audit.log
log4j.appender.NMAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.NMAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.NMAUDIT.DatePattern=.yyyy-MM-dd

> MR279: Audit logs for YARN 
> ---
>
> Key: MAPREDUCE-2655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>
> We need audit logs for YARN components:
> ResourceManager:
>  - All the refresh* protocol access points - refreshQueues, refreshNodes, 
> refreshProxyUsers,
> refreshUserToGroupMappings.
>  - All app-submissions, app-kills to RM.
>  - Illegal and successful(?) AM registrations.
>  - Illegal container allocations/deallocations from AMs
>  - Successful container allocations/deallocations from AMs too?
> NodeManager:
>  - Illegal container launches from AMs
>  - Successful container launches from AMs too?
> Not sure if we need audit logs from MR AMs.





[jira] [Commented] (MAPREDUCE-2765) DistCp Rewrite

2011-08-12 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084215#comment-13084215
 ] 

Koji Noguchi commented on MAPREDUCE-2765:
-

bq. Would the reviewers/watchers kindly comment on whether it's alright to 
deprecate the "-filelimit" and "-sizelimit" options, in DistCpV2?
bq.
+1. I think we (Yahoo) requested it but ended up not using it at all.

Just to be clear:
bq. The file-listing isn't filtered until the map-task runs 
bq.
This used to be the case in old old distcp.  We changed that when we added this 
-filelimit feature (that we never used).
 

> DistCp Rewrite
> --
>
> Key: MAPREDUCE-2765
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2765
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: distcp
>Affects Versions: 0.20.203.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: distcpv2.20.203.patch, distcpv2_trunk.patch
>
>
> This is a slightly modified version of the DistCp rewrite that Yahoo uses in 
> production today. The rewrite was ground-up, with specific focus on:
> 1. improved startup time (postponing as much work as possible to the MR job)
> 2. support for multiple copy-strategies
> 3. new features (e.g. -atomic, -async, -bandwidth.)
> 4. improved programmatic use
> Some effort has gone into refactoring what used to be achieved by a single 
> large (1.7 KLOC) source file, into a design that (hopefully) reads better too.
> The proposed DistCpV2 preserves command-line-compatibility with the old 
> version, and should be a drop-in replacement.
> New to v2:
> 1. Copy-strategies and the DynamicInputFormat:
>   A copy-strategy determines the policy by which source-file-paths are 
> distributed between map-tasks. (These boil down to the choice of the 
> input-format.) 
>   If no strategy is explicitly specified on the command-line, the policy 
> chosen is "uniform size", where v2 behaves identically to old-DistCp. (The 
> number of bytes transferred by each map-task is roughly equal, at a per-file 
> granularity.) 
>   Alternatively, v2 ships with a "dynamic" copy-strategy (in the 
> DynamicInputFormat). This policy acknowledges that 
>   (a)  dividing files based only on file-size might not be an 
> even distribution (E.g. if some datanodes are slower than others, or if some 
> files are skipped.)
>   (b) a "static" association of a source-path to a map increases 
> the likelihood of long-tails during copy.
>   The "dynamic" strategy divides the list-of-source-paths into a number 
> (> nMaps) of smaller parts. When each map completes its current list of 
> paths, it picks up a new list to process, if available. So if a map-task is 
> stuck on a slow (and not necessarily large) file, other maps can pick up the 
> slack. The thinner the file-list is sliced, the greater the parallelism (and 
> the lower the chances of long-tails). Within reason, of course: the number of 
> these short-lived list-files is capped at an overridable maximum.
>   Internal benchmarks against source/target clusters with some slow(ish) 
> datanodes have indicated significant performance gains when using the 
> dynamic-strategy. Gains are most pronounced when nFiles greatly exceeds nMaps.
>   Please note that the DynamicInputFormat might prove useful outside of 
> DistCp. It is hence available as a mapred/lib, unfettered to DistCpV2. Also 
> note that the copy-strategies have no bearing on the CopyMapper.map() 
> implementation.
>   
> 2. Improved startup-time and programmatic use:
>   When the old-DistCp runs with -update, and creates the 
> list-of-source-paths, it attempts to filter out files that might be skipped 
> (by comparing file-sizes, checksums, etc.) This significantly increases the 
> startup time (or the time spent in serial processing till the MR job is 
> launched), blocking the calling-thread. This becomes pronounced as nFiles 
> increases. (Internal benchmarks have seen situations where more time is spent 
> setting up the job than on the actual transfer.)
>   DistCpV2 postpones as much work as possible to the MR job. The 
> file-listing isn't filtered until the map-task runs (at which time, identical 
> files are skipped). DistCpV2 can now be run "asynchronously". The program 
> quits at job-launch, logging the job-id for tracking. Programmatically, the 
> DistCp.execute() returns a Job instance for progress-tracking.
>   
> 3. New features:
>   (a)   -async: As described in #2.
>   (b)   -atomic: Data is copied to a (user-specifiable) tmp-location, and 
> then moved atomically to destination.
>   (c)   -bandwidth: Enforces a limit on the bandwidth consumed per map.
>   (d)   -strategy: As above.
>   
> A more comprehensive descri

[jira] [Created] (MAPREDUCE-2840) mr279 TestUberAM.testSleepJob test fails

2011-08-12 Thread Thomas Graves (JIRA)
mr279 TestUberAM.testSleepJob test fails


 Key: MAPREDUCE-2840
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2840
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves
Priority: Minor
 Fix For: 0.23.0


Currently, TestUberAM.testSleepJob is failing on the mr279 branch.

Snippet of the failure:
junit.framework.AssertionFailedError: null
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.assertTrue(Assert.java:20)
at junit.framework.Assert.assertTrue(Assert.java:27)
at org.apache.hadoop.mapreduce.v2.TestMRJobs.testSleepJob(TestMRJobs.java:150)
at org.apache.hadoop.mapreduce.v2.TestUberAM.testSleepJob(TestUberAM.java:58)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)





[jira] [Updated] (MAPREDUCE-2655) MR279: Audit logs for YARN

2011-08-12 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-2655:
-

Attachment: rm-audit.log

Sample of what the RM audit log would look like. This is from a successful 
wordcount run on a small input file.

> MR279: Audit logs for YARN 
> ---
>
> Key: MAPREDUCE-2655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Attachments: rm-audit.log
>
>
> We need audit logs for YARN components:
> ResourceManager:
>  - All the refresh* protocol access points - refreshQueues, refreshNodes, 
> refreshProxyUsers,
> refreshUserToGroupMappings.
>  - All app-submissions, app-kills to RM.
>  - Illegal and successful(?) AM registrations.
>  - Illegal container allocations/deallocations from AMs
>  - Successful container allocations/deallocations from AMs too?
> NodeManager:
>  - Illegal container launches from AMs
>  - Successful container launches from AMs too?
> Not sure if we need audit logs from MR AMs.





[jira] [Commented] (MAPREDUCE-2655) MR279: Audit logs for YARN

2011-08-12 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084226#comment-13084226
 ] 

Thomas Graves commented on MAPREDUCE-2655:
--

I attached 2 sample files showing what the audit logs would look like for a 
simple wordcount on a small file. Note: I noticed a cut-and-paste error in 
rm-audit.log. The APPID in a couple of the entries should be either CONTAINERID 
or APPATTEMPTID.

> MR279: Audit logs for YARN 
> ---
>
> Key: MAPREDUCE-2655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Attachments: nm-audit.log, rm-audit.log
>
>
> We need audit logs for YARN components:
> ResourceManager:
>  - All the refresh* protocol access points - refreshQueues, refreshNodes, 
> refreshProxyUsers,
> refreshUserToGroupMappings.
>  - All app-submissions, app-kills to RM.
>  - Illegal and successful(?) AM registrations.
>  - Illegal container allocations/deallocations from AMs
>  - Successful container allocations/deallocations from AMs too?
> NodeManager:
>  - Illegal container launches from AMs
>  - Successful container launches from AMs too?
> Not sure if we need audit logs from MR AMs.





[jira] [Updated] (MAPREDUCE-2655) MR279: Audit logs for YARN

2011-08-12 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-2655:
-

Attachment: nm-audit.log

Sample NodeManager audit log. It corresponds to the attached RM audit log and 
is from a successful run of a simple wordcount.

> MR279: Audit logs for YARN 
> ---
>
> Key: MAPREDUCE-2655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Attachments: nm-audit.log, rm-audit.log
>
>
> We need audit logs for YARN components:
> ResourceManager:
>  - All the refresh* protocol access points - refreshQueues, refreshNodes, 
> refreshProxyUsers,
> refreshUserToGroupMappings.
>  - All app-submissions, app-kills to RM.
>  - Illegal and successful(?) AM registrations.
>  - Illegal container allocations/deallocations from AMs
>  - Successful container allocations/deallocations from AMs too?
> NodeManager:
>  - Illegal container launches from AMs
>  - Successful container launches from AMs too?
> Not sure if we need audit logs from MR AMs.





[jira] [Updated] (MAPREDUCE-2655) MR279: Audit logs for YARN

2011-08-12 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-2655:
-

Fix Version/s: 0.23.0

> MR279: Audit logs for YARN 
> ---
>
> Key: MAPREDUCE-2655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 0.23.0
>
> Attachments: nm-audit.log, rm-audit.log
>
>
> We need audit logs for YARN components:
> ResourceManager:
>  - All the refresh* protocol access points - refreshQueues, refreshNodes, 
> refreshProxyUsers,
> refreshUserToGroupMappings.
>  - All app-submissions, app-kills to RM.
>  - Illegal and successful(?) AM registrations.
>  - Illegal container allocations/deallocations from AMs
>  - Successful container allocations/deallocations from AMs too?
> NodeManager:
>  - Illegal container launches from AMs
>  - Successful container launches from AMs too?
> Not sure if we need audit logs from MR AMs.





[jira] [Commented] (MAPREDUCE-2655) MR279: Audit logs for YARN

2011-08-12 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084267#comment-13084267
 ] 

Mahadev konar commented on MAPREDUCE-2655:
--

Thomas,
  Are the NM audit logs for container start only?

> MR279: Audit logs for YARN 
> ---
>
> Key: MAPREDUCE-2655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 0.23.0
>
> Attachments: nm-audit.log, rm-audit.log
>
>
> We need audit logs for YARN components:
> ResourceManager:
>  - All the refresh* protocol access points - refreshQueues, refreshNodes, 
> refreshProxyUsers,
> refreshUserToGroupMappings.
>  - All app-submissions, app-kills to RM.
>  - Illegal and successful(?) AM registrations.
>  - Illegal container allocations/deallocations from AMs
>  - Successful container allocations/deallocations from AMs too?
> NodeManager:
>  - Illegal container launches from AMs
>  - Successful container launches from AMs too?
> Not sure if we need audit logs from MR AMs.





[jira] [Commented] (MAPREDUCE-2655) MR279: Audit logs for YARN

2011-08-12 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084272#comment-13084272
 ] 

Thomas Graves commented on MAPREDUCE-2655:
--

Yes, currently that is all it's doing. My understanding was that authentication 
would block illegal allocations. Are there others we want to log?





[jira] [Commented] (MAPREDUCE-2655) MR279: Audit logs for YARN

2011-08-12 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084304#comment-13084304
 ] 

Mahadev konar commented on MAPREDUCE-2655:
--

What about done containers? Killed containers? Failed containers?





[jira] [Updated] (MAPREDUCE-2835) Make per-job counter limits configurable

2011-08-12 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-2835:
-

Attachment: MAPREDUCE-2835.patch

I've updated the patch to use the same parameter names as MAPREDUCE-901, which 
will be in 0.23.

I successfully ran test-patch and unit tests.

> Make per-job counter limits configurable
> 
>
> Key: MAPREDUCE-2835
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2835
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.20.204.0
>Reporter: Tom White
>Assignee: Tom White
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2835.patch, MAPREDUCE-2835.patch
>
>
> The per-job counter limits introduced in MAPREDUCE-1943 are fixed, except for 
> the total number allowed per job (mapreduce.job.counters.limit). It would be 
> useful to make them all configurable.





[jira] [Commented] (MAPREDUCE-2669) Some new examples and test cases for them.

2011-08-12 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084362#comment-13084362
 ] 

Konstantin Shvachko commented on MAPREDUCE-2669:


Good examples! Some comments:
# You need to address the findbugs and javac warnings if they are related to the 
patch.
# The test failures do not seem to be related to the new examples.
# I recommend using forward slashes in INPUT and OUTPUT constants
{code}
+  private final static String INPUT = "src/test/mapred/org/apache/hadoop/examples/input";
+  private final static String OUTPUT = "src/test/mapred/org/apache/hadoop/examples/robust_stddev_output";
{code}
# For the INPUT directory, I'd propose using just {{INPUT = 
"src/test/mapred/org/apache/hadoop/examples/"}}.
As much as I love Shakespeare, committing "A Midsummer Night's Dream" under 
Apache License doesn't seem to be required to exemplify MR statistics.
Having said that, it was interesting to learn that Shakespeare used pretty 
short words - around 4.42 on average - compared to modern texts from the Web.
# OUTPUT directory should point to a directory in {{build/data}}, so that it is 
cleaned up with the clean target.
# The tests should be converted to JUnit 4 format.
# Also, it is better to combine all three tests under a common roof, 
{{TestWordStats}} or similar. That will mean less code and will save test execution 
time.
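
A minimal sketch of the path suggestions above, assuming the build exposes its scratch directory through a system property. The class name, the helpers, and the use of test.build.data with a build/data fallback are illustrative assumptions, not part of the patch under review.

```java
public class ExamplePathsSketch {
    // Input lives next to the example tests; forward slashes work on all platforms.
    static final String INPUT = "src/test/mapred/org/apache/hadoop/examples/";

    // Joins a base directory and an output name with a single forward slash.
    static String outputDir(String base, String name) {
        return base.endsWith("/") ? base + name : base + "/" + name;
    }

    // Resolves the output under the build's data directory so that the
    // clean target removes it. The property name is an assumption.
    static String defaultOutputDir(String name) {
        return outputDir(System.getProperty("test.build.data", "build/data"), name);
    }

    public static void main(String[] args) {
        System.out.println(INPUT);
        System.out.println(defaultOutputDir("robust_stddev_output"));
    }
}
```

Keeping the join logic in one helper means a single test class can derive every output directory from the same base.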


> Some new examples and test cases for them.
> --
>
> Key: MAPREDUCE-2669
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2669
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>  Components: examples
>Affects Versions: 0.22.0
>Reporter: Plamen Jeliazkov
>Priority: Minor
> Attachments: MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, 
> MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, 
> MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, mapreduce-new-examples-0.22.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Looking to add some more examples such as Mean, Median, and Standard 
> Deviation to the examples.
> I have some generic JUnit testcases as well.





[jira] [Commented] (MAPREDUCE-2097) java.lang.NullPointerException in reduce task

2011-08-12 Thread Suraj Menon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084371#comment-13084371
 ] 

Suraj Menon commented on MAPREDUCE-2097:


Hello, 

I faced a similar error. I found that, in my case, it is caused by an '=' 
sign in my hostname.
As an example, consider the following snippet:

URI u = URI.create("http://abcd--proto=dhcp:5006");
String host = u.getHost();
System.out.println("u.getHost = " + host);

It prints - 

u.getHost = null

From http://download.oracle.com/javase/1.4.2/docs/api/java/net/URI.html, I 
found out that the '=' sign is a valid character.
Correct me if I am wrong. Is there a way out of this? I can't change my 
hostname. I hard-coded the address and built the jar for now. :)
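
The behaviour described above can be reproduced with the standard library alone. The sketch below is not taken from ReduceTask; the class and method names are hypothetical. It shows that java.net.URI still accepts the address, but leaves the host unset, while the raw authority string remains available as a possible fallback.

```java
import java.net.URI;

public class UriHostCheck {
    // Returns the parsed host, or null when java.net.URI cannot parse the
    // authority as a server-based "host:port" pair.
    static String hostOf(String address) {
        return URI.create(address).getHost();
    }

    // The raw authority is still available when getHost() returns null.
    static String authorityOf(String address) {
        return URI.create(address).getAuthority();
    }

    public static void main(String[] args) {
        String odd = "http://abcd--proto=dhcp:5006";
        System.out.println("host      = " + hostOf(odd));
        System.out.println("authority = " + authorityOf(odd));
        System.out.println("host      = " + hostOf("http://abcd-proto-dhcp:5006"));
    }
}
```

Code that uses the host as a map key would need a null check or an authority-based fallback of this kind before the lookup, since ConcurrentHashMap.get rejects null keys.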

Thanks,
Suraj





> java.lang.NullPointerException in reduce task
> -
>
> Key: MAPREDUCE-2097
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2097
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 0.20.2
> Environment: RHEL 5.5 
>Reporter: vaibhav negi
>
> While executing hive query "select count(*) from table " , i got this error 
> message 
> Ended Job = job_201009291356_0003 with errors
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.MapRedTask
> On investigating task tracker logs , i found out
> 2010-09-29 14:31:46,839 WARN org.apache.hadoop.conf.Configuration: 
> /home/hadoop/mapred/local/taskTracker/jobcache/job_201009291356_0003/attempt_201009291356_0003_r_00_3/job.xml:a
>  attempt to override final parameter: mapred.local.dir;  Ignoring.
> 2010-09-29 14:31:46,862 WARN org.apache.hadoop.conf.Configuration: 
> /home/hadoop/mapred/local/taskTracker/jobcache/job_201009291356_0003/attempt_201009291356_0003_r_00_3/job.xml:a
>  attempt to override final parameter: mapred.job.tracker;  Ignoring.
> 2010-09-29 14:31:46,943 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: 
> Initializing JVM Metrics with processName=SHUFFLE, sessionId=
> 2010-09-29 14:31:47,133 WARN org.apache.hadoop.conf.Configuration: 
> /home/hadoop/mapred/local/taskTracker/jobcache/job_201009291356_0003/attempt_201009291356_0003_r_00_3/job.xml:a
>  attempt to override final parameter: mapred.local.dir;  Ignoring.
> 2010-09-29 14:31:47,133 WARN org.apache.hadoop.conf.Configuration: 
> /home/hadoop/mapred/local/taskTracker/jobcache/job_201009291356_0003/attempt_201009291356_0003_r_00_3/job.xml:a
>  attempt to override final parameter: dfs.data.dir;  Ignoring.
> 2010-09-29 14:31:47,143 WARN org.apache.hadoop.conf.Configuration: 
> /home/hadoop/mapred/local/taskTracker/jobcache/job_201009291356_0003/attempt_201009291356_0003_r_00_3/job.xml:a
>  attempt to override final parameter: mapred.job.tracker;  Ignoring.
> 2010-09-29 14:31:47,143 WARN org.apache.hadoop.conf.Configuration: 
> /home/hadoop/mapred/local/taskTracker/jobcache/job_201009291356_0003/attempt_201009291356_0003_r_00_3/job.xml:a
>  attempt to override final parameter: dfs.name.dir;  Ignoring.
> 2010-09-29 14:31:47,209 INFO org.apache.hadoop.mapred.ReduceTask: 
> ShuffleRamManager: MemoryLimit=141387360, MaxSingleShuffleLimit=35346840
> 2010-09-29 14:31:47,219 INFO org.apache.hadoop.mapred.ReduceTask: 
> attempt_201009291356_0003_r_00_3 Thread started: Thread for merging 
> on-disk files
> 2010-09-29 14:31:47,220 INFO org.apache.hadoop.mapred.ReduceTask: 
> attempt_201009291356_0003_r_00_3 Thread waiting: Thread for merging 
> on-disk files
> 2010-09-29 14:31:47,222 INFO org.apache.hadoop.mapred.ReduceTask: 
> attempt_201009291356_0003_r_00_3 Thread started: Thread for merging in 
> memory files
> 2010-09-29 14:31:47,223 INFO org.apache.hadoop.mapred.ReduceTask: 
> attempt_201009291356_0003_r_00_3 Need another 17 map output(s) where 0 is 
> already in progress
> 2010-09-29 14:31:47,224 INFO org.apache.hadoop.mapred.ReduceTask: 
> attempt_201009291356_0003_r_00_3 Scheduled 0 outputs (0 slow hosts and0 
> dup hosts)
> 2010-09-29 14:31:47,224 INFO org.apache.hadoop.mapred.ReduceTask: 
> attempt_201009291356_0003_r_00_3 Thread started: Thread for polling Map 
> Completion Events
> 2010-09-29 14:31:47,233 FATAL org.apache.hadoop.mapred.TaskRunner: 
> attempt_201009291356_0003_r_00_3 GetMapEventsThread Ignoring exception : 
> java.lang.NullPointerException
>   at 
> java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
>   at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
>   at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)
> ReduceTask.java:2683 is :  List loc = 
> mapLocations.get(host)
> where host is:-URI u = URI.create(event.getTaskTracke

[jira] [Updated] (MAPREDUCE-2837) MR-279: Bug fixes ported from y-merge

2011-08-12 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-2837:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I committed this after running tests manually.

> MR-279: Bug fixes ported from y-merge
> -
>
> Key: MAPREDUCE-2837
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2837
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arun C Murthy
> Attachments: rest.patch, rest.patch
>
>
> Similar to MAPREDUCE-2679.





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084375#comment-13084375
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #763 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/763/])
MAPREDUCE-2837. Ported bug fixes from y-merge to prepare for MAPREDUCE-279 
merge.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157249
Files : 
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestTokenCache.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ACLsManager.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestBinaryTokenFile.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapTask.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/Job.java
* /hadoop/common/trunk/mapreduce/src/test/mapred-site.xml
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/security/TokenCache.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapOutputFile.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ReduceTask.java
* /hadoop/common/trunk/mapreduce/src/webapps/job/jobdetailshistory.jsp
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/security/TestMapredGroupMappingServiceRefresh.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskTracker.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobACLsManager.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskMemoryManagerThread.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MROutputFiles.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Task.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/MRConfig.java
* /hadoop/common/trunk/mapreduce/src/examples/org/apache/hadoop/examples/terasort/TeraInputFormat.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/LocalJobRunner.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestMapRed.java


> Map-Reduce 2.0
> --
>
> Key: MAPREDUCE-279
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Fix For: 0.23.0
>
> Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
> MR-279_MR_files_to_move.txt, MapReduce_NextGen_Architecture.pdf, 
> capacity-scheduler-dark-theme.png, hadoop_contributors_meet_07_01_2011.pdf, 
> multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, 
> yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, 
> yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, 
> yarn-state-machine.task.png
>
>
> Re-factor MapReduce into a generic resource scheduler and a per-job, 
> user-defined component that manages the application execution.
> Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-2837) MR-279: Bug fixes ported from y-merge

2011-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084376#comment-13084376
 ] 

Hudson commented on MAPREDUCE-2837:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #763 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/763/])
MAPREDUCE-2837. Ported bug fixes from y-merge to prepare for MAPREDUCE-279 
merge.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157249
Files : 
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestTokenCache.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ACLsManager.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestBinaryTokenFile.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapTask.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/Job.java
* /hadoop/common/trunk/mapreduce/src/test/mapred-site.xml
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/security/TokenCache.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapOutputFile.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ReduceTask.java
* /hadoop/common/trunk/mapreduce/src/webapps/job/jobdetailshistory.jsp
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/security/TestMapredGroupMappingServiceRefresh.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskTracker.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobACLsManager.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskMemoryManagerThread.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MROutputFiles.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Task.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/MRConfig.java
* /hadoop/common/trunk/mapreduce/src/examples/org/apache/hadoop/examples/terasort/TeraInputFormat.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/LocalJobRunner.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestMapRed.java






[jira] [Updated] (MAPREDUCE-2037) Capturing interim progress times, CPU usage, and memory usage, when tasks reach certain progress thresholds

2011-08-12 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-2037:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I just committed this after manually running tests. Thanks Dick!

> Capturing interim progress times, CPU usage, and memory usage, when tasks 
> reach certain progress thresholds
> ---
>
> Key: MAPREDUCE-2037
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2037
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Dick King
>Assignee: Dick King
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-2037.patch, MAPREDUCE-2037.patch
>
>
> We would like to capture the following information at certain progress 
> thresholds as a task runs:
>* Time taken so far
>* CPU load [either at the time the data are taken, or exponentially 
> smoothed]
>* Memory load [also either at the time the data are taken, or 
> exponentially smoothed]
> This would be taken at intervals that depend on the task progress plateaus.  
> For example, reducers have three progress ranges -- [0-1/3], (1/3-2/3], and 
> (2/3-3/3] -- where fundamentally different activities happen.  Mappers have 
> different boundaries, I understand, that are not symmetrically placed.  Data 
> capture boundaries should coincide with activity boundaries.  For the state 
> information capture [CPU and memory] we should average over the covered 
> interval.
> This data would flow in with the heartbeats.  It would be placed in the job 
> history as part of the task attempt completion event, so it could be 
> processed by rumen or some similar tool and could drive a benchmark engine.
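
One way to read the range boundaries and the smoothing option in the description above, as a hedged sketch. The class name, the method names, and the smoothing weight are hypothetical; the committed patch may compute these values differently.

```java
public class ProgressRangesSketch {
    // Maps a reducer progress value in [0, 1] to one of the three ranges
    // [0, 1/3], (1/3, 2/3], (2/3, 1], returned as index 0, 1, or 2.
    static int reducerRange(double progress) {
        if (progress <= 1.0 / 3.0) return 0;
        if (progress <= 2.0 / 3.0) return 1;
        return 2;
    }

    // Exponential smoothing: blends a new sample into the running value.
    // alpha close to 1 favours the newest sample, close to 0 the history.
    static double smooth(double previous, double sample, double alpha) {
        return alpha * sample + (1.0 - alpha) * previous;
    }

    public static void main(String[] args) {
        System.out.println(reducerRange(0.5));
        System.out.println(smooth(10.0, 20.0, 0.5));
    }
}
```

A capture point would then fire whenever the range index of the latest progress report differs from the previous one.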





[jira] [Commented] (MAPREDUCE-2037) Capturing interim progress times, CPU usage, and memory usage, when tasks reach certain progress thresholds

2011-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084382#comment-13084382
 ] 

Hudson commented on MAPREDUCE-2037:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #764 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/764/])
MAPREDUCE-2037. Capture intermediate progress, CPU and memory usage for 
tasks. Contributed by Dick King.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157253
Files : 
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/AvroArrayUtils.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/MapTaskAttemptInfo.java
* /hadoop/common/trunk/mapreduce/src/java/mapred-default.xml
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskInProgress.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Counters.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/Events.avpr
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskAttemptUnsuccessfulCompletionEvent.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/StatePeriodicStats.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/tools/rumen/TestRumenJobTraces.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/ReduceAttemptFinishedEvent.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/server/jobtracker/JTConfig.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestTaskPerformanceSplits.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/ZombieJob.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/ReduceAttempt20LineHistoryEventEmitter.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/TaskAttemptInfo.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEvents.java
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/CumulativePeriodicStats.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/ReduceTaskAttemptInfo.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/TaskAttempt20LineEventEmitter.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobInProgress.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/PeriodicStatsAccumulator.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/JobBuilder.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ProgressSplitsBlock.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/LoggedTaskAttempt.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/MapAttemptFinishedEvent.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/MapAttempt20LineHistoryEventEmitter.java






[jira] [Commented] (MAPREDUCE-2764) Fix renewal of dfs delegation tokens

2011-08-12 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084404#comment-13084404
 ] 

Owen O'Malley commented on MAPREDUCE-2764:
--

I think it is far cleaner to fix the hftp tokens to properly reflect that they 
are for hftp instead of hdfs. It removes all of the guesswork that is 
currently done.

hdfs tokens -> always use nn:rpc-port, only used for rpc
hftp tokens -> always use nn:http-port, only used for hftp

By having a renewer for each kind of token, each renewer is simpler.

The only hiccup is needing to have the hftp client translate the token to hdfs 
to provide wire compatibility.

Of course the long term solution has to be to move to a serialization library 
that allows us to add/remove fields. Writable doesn't. :(
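
The per-kind renewer idea above can be sketched as a lookup keyed by token kind, which replaces the try-hdfs-then-fall-back logic. Everything here is hypothetical: the interface, the kind strings, and the port numbers are placeholders chosen for illustration, not Hadoop's actual token API.

```java
import java.util.HashMap;
import java.util.Map;

public class TokenRenewerSketch {
    // A renewer knows which NameNode endpoint its kind of token is renewed against.
    interface Renewer {
        String endpoint(String namenode, int rpcPort, int httpPort);
    }

    static final Map<String, Renewer> RENEWERS = new HashMap<>();
    static {
        // hdfs tokens -> always nn:rpc-port
        RENEWERS.put("hdfs", (nn, rpc, http) -> nn + ":" + rpc);
        // hftp tokens -> always nn:http-port
        RENEWERS.put("hftp", (nn, rpc, http) -> nn + ":" + http);
    }

    // Picks the renewer by token kind; no guessing of the scheme is needed.
    static String renewalEndpoint(String kind, String nn, int rpcPort, int httpPort) {
        Renewer renewer = RENEWERS.get(kind);
        if (renewer == null) {
            throw new IllegalArgumentException("no renewer for token kind: " + kind);
        }
        return renewer.endpoint(nn, rpcPort, httpPort);
    }

    public static void main(String[] args) {
        // Port values are placeholders for illustration only.
        System.out.println(renewalEndpoint("hdfs", "nn", 8020, 50070));
        System.out.println(renewalEndpoint("hftp", "nn", 8020, 50070));
    }
}
```

An unknown kind fails immediately instead of triggering a second attempt against a different scheme.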



> Fix renewal of dfs delegation tokens
> 
>
> Key: MAPREDUCE-2764
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2764
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2764.patch
>
>
> The JT may have issues renewing hftp tokens which disrupt long distcp jobs.  
> The problem is the JT's delegation token renewal code is built on brittle 
> assumptions.  The token's service field contains only the "ip:port" pair.  
> The renewal process assumes that the scheme must be hdfs.  If that fails due 
> to a {{VersionMismatchException}}, it tries https based on another assumption 
> that it must be hftp if it's not hdfs.  A number of other exceptions, most 
> commonly {{IOExceptions}}, can be generated which fouls up the renewal since 
> it won't fallback to https.





[jira] [Commented] (MAPREDUCE-2765) DistCp Rewrite

2011-08-12 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084409#comment-13084409
 ] 

Luke Lu commented on MAPREDUCE-2765:


Does the trunk patch incorporate MAPREDUCE-2257 (parallel block copy)? 

> DistCp Rewrite
> --
>
> Key: MAPREDUCE-2765
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2765
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: distcp
>Affects Versions: 0.20.203.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: distcpv2.20.203.patch, distcpv2_trunk.patch
>
>
> This is a slightly modified version of the DistCp rewrite that Yahoo uses in 
> production today. The rewrite was ground-up, with specific focus on:
> 1. improved startup time (postponing as much work as possible to the MR job)
> 2. support for multiple copy-strategies
> 3. new features (e.g. -atomic, -async, -bandwidth.)
> 4. improved programmatic use
> Some effort has gone into refactoring what used to be achieved by a single 
> large (1.7 KLOC) source file, into a design that (hopefully) reads better too.
> The proposed DistCpV2 preserves command-line-compatibility with the old 
> version, and should be a drop-in replacement.
> New to v2:
> 1. Copy-strategies and the DynamicInputFormat:
>   A copy-strategy determines the policy by which source-file-paths are 
> distributed between map-tasks. (These boil down to the choice of the 
> input-format.) 
>   If no strategy is explicitly specified on the command-line, the policy 
> chosen is "uniform size", where v2 behaves identically to old-DistCp. (The 
> number of bytes transferred by each map-task is roughly equal, at a per-file 
> granularity.) 
>   Alternatively, v2 ships with a "dynamic" copy-strategy (in the 
> DynamicInputFormat). This policy acknowledges that 
>   (a)  dividing files based only on file-size might not be an 
> even distribution (E.g. if some datanodes are slower than others, or if some 
> files are skipped.)
>   (b) a "static" association of a source-path to a map increases 
> the likelihood of long-tails during copy.
>   The "dynamic" strategy divides the list-of-source-paths into a number 
> (> nMaps) of smaller parts. When each map completes its current list of 
> paths, it picks up a new list to process, if available. So if a map-task is 
> stuck on a slow (and not necessarily large) file, other maps can pick up the 
> slack. The thinner the file-list is sliced, the greater the parallelism (and 
> the lower the chances of long-tails). Within reason, of course: the number of 
> these short-lived list-files is capped at an overridable maximum.
>   Internal benchmarks against source/target clusters with some slow(ish) 
> datanodes have indicated significant performance gains when using the 
> dynamic-strategy. Gains are most pronounced when nFiles greatly exceeds nMaps.
>   Please note that the DynamicInputFormat might prove useful outside of 
> DistCp. It is hence available as a mapred/lib, unfettered to DistCpV2. Also 
> note that the copy-strategies have no bearing on the CopyMapper.map() 
> implementation.
>   
> 2. Improved startup-time and programmatic use:
>   When the old-DistCp runs with -update, and creates the 
> list-of-source-paths, it attempts to filter out files that might be skipped 
> (by comparing file-sizes, checksums, etc.) This significantly increases the 
> startup time (or the time spent in serial processing till the MR job is 
> launched), blocking the calling-thread. This becomes pronounced as nFiles 
> increases. (Internal benchmarks have seen situations where more time is spent 
> setting up the job than on the actual transfer.)
>   DistCpV2 postpones as much work as possible to the MR job. The 
> file-listing isn't filtered until the map-task runs (at which time, identical 
> files are skipped). DistCpV2 can now be run "asynchronously". The program 
> quits at job-launch, logging the job-id for tracking. Programmatically, the 
> DistCp.execute() returns a Job instance for progress-tracking.
>   
> 3. New features:
>   (a)   -async: As described in #2.
>   (b)   -atomic: Data is copied to a (user-specifiable) tmp-location, and 
> then moved atomically to destination.
>   (c)   -bandwidth: Enforces a limit on the bandwidth consumed per map.
>   (d)   -strategy: As above.
>   
> A more comprehensive description of the newer features, how the dynamic-strategy 
> works, etc. is available in src/site/xdoc/, and in the pdf that's generated 
> therefrom, during the build.
> High on the list of things to do is support for parallelizing copies on a 
> per-block level. (i.e. Incorporation of HDFS-222.)
> I look forward to comments, suggestions and discussion that will hopef

[jira] [Updated] (MAPREDUCE-901) Move Framework Counters into a TaskMetric structure

2011-08-12 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-901:


Attachment: MAPREDUCE-901.patch_2

Updated patch to fix a couple of unit tests, which now pass.

I think with MAPREDUCE-279, the counter limits are less interesting given that 
they do not affect other jobs - unlike MRv1 where this would affect the 
JobTracker. For now, I propose we commit this and re-visit Counter limits via a 
follow-on (blocker) jira. Thoughts?

> Move Framework Counters into a TaskMetric structure
> ---
>
> Key: MAPREDUCE-901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.21.0
>Reporter: Owen O'Malley
>Assignee: Luke Lu
> Fix For: 0.23.0
>
> Attachments: 901_1.patch, 901_1.patch, FrameworkCounterGroup.java, 
> MAPREDUCE-901.patch, MAPREDUCE-901.patch, MAPREDUCE-901.patch, 
> MAPREDUCE-901.patch_2, mr-901-trunk-v1.patch
>
>
> I think we should move all of the Counters that the framework updates into a 
> single class called TaskMetrics. TaskMetrics would have specific fields for 
> each of the metrics like input records, input bytes, output records, etc.
> It would both reduce the serialized size of the heartbeats (by shrinking the 
> Counters down to just the user's counters) and decrease the latency for 
> updates to the JobTracker (since Counters are sent at most 1/minute instead 
> of 1/heartbeat).
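
As a rough illustration of the idea in the description above (this is not the committed patch), framework counters can live in a fixed array indexed by an enum, so that only the values need to be serialized, while user counters stay in a name-keyed map. Every name below (FrameworkMetric, TaskMetricsSketch) is hypothetical and does not come from the Hadoop source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical set of framework-maintained metrics; names are for illustration only.
enum FrameworkMetric { MAP_INPUT_RECORDS, MAP_INPUT_BYTES, MAP_OUTPUT_RECORDS }

// Sketch: fixed slots for framework metrics, a name-keyed map for user counters.
final class TaskMetricsSketch {
    // One long per framework metric, addressed by enum ordinal (no names stored).
    private final long[] framework = new long[FrameworkMetric.values().length];
    // User counters still need their names, so they remain in a map.
    private final Map<String, Long> user = new HashMap<>();

    void increment(FrameworkMetric metric, long delta) {
        framework[metric.ordinal()] += delta;
    }

    long get(FrameworkMetric metric) {
        return framework[metric.ordinal()];
    }

    void incrementUser(String name, long delta) {
        user.merge(name, delta, Long::sum);
    }

    long getUser(String name) {
        return user.getOrDefault(name, 0L);
    }

    // Bytes needed for the framework values alone: 8 bytes per metric, no strings.
    int frameworkSerializedBytes() {
        return framework.length * Long.BYTES;
    }

    public static void main(String[] args) {
        TaskMetricsSketch metrics = new TaskMetricsSketch();
        metrics.increment(FrameworkMetric.MAP_INPUT_RECORDS, 5);
        metrics.incrementUser("my.counter", 1);
        System.out.println(metrics.get(FrameworkMetric.MAP_INPUT_RECORDS));
    }
}
```

The point of the fixed layout is that both sides already know which slot means what, so the heartbeat payload for framework metrics does not have to carry group or counter names.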

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2765) DistCp Rewrite

2011-08-12 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084446#comment-13084446
 ] 

Mithun Radhakrishnan commented on MAPREDUCE-2765:
-

No, Luke. Not at this time. What we have now is a port of the code that works 
on 0.20.203.

Parallel block copy is at the top of the TODO list, though.

> DistCp Rewrite
> --
>
> Key: MAPREDUCE-2765
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2765
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: distcp
>Affects Versions: 0.20.203.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: distcpv2.20.203.patch, distcpv2_trunk.patch
>
>
> This is a slightly modified version of the DistCp rewrite that Yahoo uses in 
> production today. The rewrite was ground-up, with specific focus on:
> 1. improved startup time (postponing as much work as possible to the MR job)
> 2. support for multiple copy-strategies
> 3. new features (e.g. -atomic, -async, -bandwidth.)
> 4. improved programmatic use
> Some effort has gone into refactoring what used to be achieved by a single 
> large (1.7 KLOC) source file, into a design that (hopefully) reads better too.
> The proposed DistCpV2 preserves command-line-compatibility with the old 
> version, and should be a drop-in replacement.
> New to v2:
> 1. Copy-strategies and the DynamicInputFormat:
>   A copy-strategy determines the policy by which source-file-paths are 
> distributed between map-tasks. (These boil down to the choice of the 
> input-format.) 
>   If no strategy is explicitly specified on the command-line, the policy 
> chosen is "uniform size", where v2 behaves identically to old-DistCp. (The 
> number of bytes transferred by each map-task is roughly equal, at a per-file 
> granularity.) 
>   Alternatively, v2 ships with a "dynamic" copy-strategy (in the 
> DynamicInputFormat). This policy acknowledges that 
>   (a)  dividing files based only on file-size might not be an 
> even distribution (E.g. if some datanodes are slower than others, or if some 
> files are skipped.)
>   (b) a "static" association of a source-path to a map increases 
> the likelihood of long-tails during copy.
>   The "dynamic" strategy divides the list-of-source-paths into a number 
> (> nMaps) of smaller parts. When each map completes its current list of 
> paths, it picks up a new list to process, if available. So if a map-task is 
> stuck on a slow (and not necessarily large) file, other maps can pick up the 
> slack. The thinner the file-list is sliced, the greater the parallelism (and 
> the lower the chances of long-tails). Within reason, of course: the number of 
> these short-lived list-files is capped at an overridable maximum.
>   Internal benchmarks against source/target clusters with some slow(ish) 
> datanodes have indicated significant performance gains when using the 
> dynamic-strategy. Gains are most pronounced when nFiles greatly exceeds nMaps.
>   Please note that the DynamicInputFormat might prove useful outside of 
> DistCp. It is hence available as a mapred/lib, independent of DistCpV2. Also 
> note that the copy-strategies have no bearing on the CopyMapper.map() 
> implementation.
>   
> 2. Improved startup-time and programmatic use:
>   When the old-DistCp runs with -update, and creates the 
> list-of-source-paths, it attempts to filter out files that might be skipped 
> (by comparing file-sizes, checksums, etc.). This significantly increases the 
> startup time (or the time spent in serial processing till the MR job is 
> launched), blocking the calling-thread. This becomes pronounced as nFiles 
> increases. (Internal benchmarks have seen situations where more time is spent 
> setting up the job than on the actual transfer.)
>   DistCpV2 postpones as much work as possible to the MR job. The 
> file-listing isn't filtered until the map-task runs (at which time, identical 
> files are skipped). DistCpV2 can now be run "asynchronously". The program 
> quits at job-launch, logging the job-id for tracking. Programmatically, 
> DistCp.execute() returns a Job instance for progress-tracking.
>   
> 3. New features:
>   (a)   -async: As described in #2.
>   (b)   -atomic: Data is copied to a (user-specifiable) tmp-location, and 
> then moved atomically to destination.
>   (c)   -bandwidth: Enforces a limit on the bandwidth consumed per map.
>   (d)   -strategy: As above.
>   
> A more comprehensive description of the newer features, how the dynamic-strategy 
> works, etc. is available in src/site/xdoc/, and in the pdf that's generated 
> therefrom, during the build.
> High on the list of things to do is support for parallelizing copies on a 
> per-block level.
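
The "dynamic" copy-strategy described in the quoted text can be sketched roughly as follows. This is a minimal in-process simulation, assuming a shared queue stands in for the short-lived list-files and a thread pool stands in for the map-tasks; the class name, method names and parameters (DynamicChunkingSketch, split, run, chunksPerWorker, maxChunks) are hypothetical and are not taken from the DistCpV2 patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class DynamicChunkingSketch {

    // Slice the path list into more chunks than workers, capped at maxChunks.
    static List<List<String>> split(List<String> paths, int nWorkers,
                                    int chunksPerWorker, int maxChunks) {
        int wanted = Math.max(1, Math.min(nWorkers * chunksPerWorker, maxChunks));
        int chunkSize = Math.max(1, (paths.size() + wanted - 1) / wanted); // ceiling division
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < paths.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, paths.size());
            chunks.add(new ArrayList<>(paths.subList(i, end)));
        }
        return chunks;
    }

    // Each worker repeatedly takes the next unclaimed chunk until none remain,
    // so a worker stuck on a slow file does not hold back the other chunks.
    static int run(List<List<String>> chunks, int nWorkers) {
        Queue<List<String>> pending = new ConcurrentLinkedQueue<>(chunks);
        AtomicInteger copied = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(nWorkers);
        for (int w = 0; w < nWorkers; w++) {
            pool.execute(() -> {
                List<String> chunk;
                while ((chunk = pending.poll()) != null) {
                    for (int i = 0; i < chunk.size(); i++) {
                        copied.incrementAndGet(); // stand-in for copying one file
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return copied.get();
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            paths.add("/src/file-" + i);
        }
        List<List<String>> chunks = split(paths, 2, 3, 4);
        System.out.println(chunks.size() + " chunks, " + run(chunks, 2) + " files processed");
    }
}
```

The maxChunks argument mirrors the "overridable maximum" mentioned in the description: slicing thinner improves load balancing, but each chunk has a bookkeeping cost, so the count is bounded.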

[jira] [Commented] (MAPREDUCE-2835) Make per-job counter limits configurable

2011-08-12 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084447#comment-13084447
 ] 

Arun C Murthy commented on MAPREDUCE-2835:
--

Tom, we purposely kept these not configurable per job (by the user), since 
users would start to work around these limits.

We saw lots of cases where users took down jobs due to this, and hence it's only 
configurable by admins. OTOH, other limits like #groups, counter-name-len, etc. 
are generous enough that we never had issues; they were added to be super-safe. 

> Make per-job counter limits configurable
> 
>
> Key: MAPREDUCE-2835
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2835
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.20.204.0
>Reporter: Tom White
>Assignee: Tom White
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2835.patch, MAPREDUCE-2835.patch
>
>
> The per-job counter limits introduced in MAPREDUCE-1943 are fixed, except for 
> the total number allowed per job (mapreduce.job.counters.limit). It would be 
> useful to make them all configurable.
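
For illustration only, a configurable counter limit could be read and enforced along these lines. The property name mapreduce.job.counters.limit comes from the description above; everything else (the class CounterLimitSketch, the use of java.util.Properties in place of a Hadoop configuration object, and the placeholder default) is an assumption made for this sketch and does not reflect the attached patch.

```java
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

final class CounterLimitSketch {
    // Property name quoted in the issue description.
    static final String LIMIT_KEY = "mapreduce.job.counters.limit";
    // Placeholder default chosen for this sketch; not asserted to be Hadoop's value.
    static final int PLACEHOLDER_DEFAULT = 100;

    private final int maxCounters;
    private final Set<String> names = new HashSet<>();

    CounterLimitSketch(Properties conf) {
        String raw = conf.getProperty(LIMIT_KEY, String.valueOf(PLACEHOLDER_DEFAULT));
        this.maxCounters = Integer.parseInt(raw);
    }

    // Registers a counter name; a new name beyond the limit is rejected,
    // while re-registering an existing name is always allowed.
    void register(String name) {
        if (!names.contains(name) && names.size() >= maxCounters) {
            throw new IllegalStateException(
                "Too many counters: " + (names.size() + 1) + " max=" + maxCounters);
        }
        names.add(name);
    }

    int size() {
        return names.size();
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(LIMIT_KEY, "2");
        CounterLimitSketch limits = new CounterLimitSketch(conf);
        limits.register("a");
        limits.register("b");
        System.out.println(limits.size());
    }
}
```

Whether such a property should be honored from the job's own configuration or only from the cluster-side configuration is exactly the point under discussion in this thread; the sketch takes no position on that.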





[jira] [Commented] (MAPREDUCE-901) Move Framework Counters into a TaskMetric structure

2011-08-12 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084448#comment-13084448
 ] 

Tom White commented on MAPREDUCE-901:
-

+1 to addressing the points I raised in a follow-on JIRA.






[jira] [Updated] (MAPREDUCE-901) Move Framework Counters into a TaskMetric structure

2011-08-12 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-901:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks Tom.

I just committed this. Thanks Luke!






[jira] [Commented] (MAPREDUCE-901) Move Framework Counters into a TaskMetric structure

2011-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084465#comment-13084465
 ] 

Hudson commented on MAPREDUCE-901:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #765 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/765/])
MAPREDUCE-901. Efficient framework counters. Contributed by Luke Lu.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157290
Files : 
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/FileSystemCounter.properties
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskStatus.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskInProgress.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/Counter.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/Counters.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/InterTrackerProtocol.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters/LimitExceededException.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/TestCounters.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/JobCounter.properties
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters/AbstractCounters.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters/Limits.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters/FrameworkCounterGroup.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters/package-info.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters/CounterGroupBase.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Task.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobInProgress.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobInProgress.java.orig
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventReader.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/CounterGroup.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters/FileSystemCounterGroup.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/util/CountersStrings.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestMiniMRWithDFS.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestCombineOutputCollector.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Counters.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters/GenericCounter.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters/AbstractCounterGroup.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/util/ResourceBundles.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/protocol/ClientProtocol.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters/CounterGroupFactory.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/FileSystemCounter.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskTracker.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestSeveral.java
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestMiniMRDFSSort.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/TaskCounter.properties
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/counters/AbstractCounter.java


> Move Framework Counters into a TaskMetric structure
> ---
>
> Key: MAPREDUCE-901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.21.0
>Reporter: Owen O'Malley
>Assignee: Luke Lu
> Fix For: 0.23.0
>
> Attachments: 901_1.patch, 901_1.patch, FrameworkCounterGroup.java, 
> MAPREDUCE-901.patch, MAPREDUCE-901.patch, MAPREDUCE-901.patch, 
> MAPREDUCE-901.patch_2, mr-901-trunk-v1.patch
>
>
> I think we should move all of the Counters that the framework updates into a 
> single class called TaskMetrics. TaskMetrics would have specific fields for