[jira] [Commented] (MAPREDUCE-4374) Fix child task environment variable config and add support for Windows

2012-08-03 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427851#comment-13427851
 ] 

Bikas Saha commented on MAPREDUCE-4374:
---

bq.Notice this is tmp directory. On Linux /tmp exists by default which is not 
the case on Windows.
I think this is not limited to /tmp. It could be any directory set in the 
config.
btw, in the if condition, if tmpDir exists then the check for it being a dir is 
skipped, I think.

bq. Let’s not push this to Shell as I think the code is simple enough to be 
understood and we can work towards a better abstraction for this (handling 
‘set’ vs ‘export’) in the future.
I see what the issue is. At the same time, this code is duplicated in 4 
different places. So IMO, it makes sense to put it in a helper method. 
When we make a better abstraction, we will know the single place to look for 
the old code instead of multiple places. Thoughts?
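
To make the helper-method suggestion concrete, here is a minimal sketch of what such a helper might look like. The class name EnvCommandSketch, the method setEnvLine, and the variable HADOOP_WORK_DIR are hypothetical and are not taken from the attached patches; the quoting rules are deliberately simplistic.

```java
// Hypothetical helper: names are illustrative and are NOT taken from the
// MAPREDUCE-4374 patches. It only shows the idea of keeping the
// 'set' vs 'export' decision in a single place.
final class EnvCommandSketch {
    private EnvCommandSketch() {}

    /** Returns the shell line that sets one environment variable. */
    static String setEnvLine(String name, String value, boolean windows) {
        if (windows) {
            // cmd.exe syntax: set NAME=value
            return "set " + name + "=" + value;
        }
        // POSIX shell syntax: export NAME="value"
        return "export " + name + "=\"" + value + "\"";
    }

    public static void main(String[] args) {
        // Assumption: detecting Windows via the os.name system property.
        boolean windows = System.getProperty("os.name").startsWith("Windows");
        System.out.println(setEnvLine("HADOOP_WORK_DIR", "/data/work", windows));
    }
}
```

With something like this, the four call sites would each call the helper, and a later abstraction would have one place to replace.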

 Fix child task environment variable config and add support for Windows
 --

 Key: MAPREDUCE-4374
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4374
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-4374-branch-1-win-2.patch, 
 MAPREDUCE-4374-branch-1-win.patch


 In HADOOP-2838, a new feature was introduced to set environment variables via 
 the Hadoop config 'mapred.child.env' for child tasks. There have been some further 
 fixes and improvements around this feature, e.g. HADOOP-5981 was a bug fix; 
 MAPREDUCE-478 broke the config into 'mapred.map.child.env' and 
 'mapred.reduce.child.env'. However, the current implementation is still not 
 complete. It does not match its documentation or original intent, I 
 believe. Also, by using ‘:’ (colon) and ‘;’ (semicolon) in the configuration 
 syntax, we will have problems using them on Windows because ‘:’ appears very 
 often in Windows paths, as in “C:\”, and environment variables are very 
 often used to hold path names. This Jira is created to fix the problem and provide 
 support on Windows.
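
 The colon problem described above can be seen in a small, self-contained sketch. It does not use Hadoop's configuration parser; PathSeparatorSketch and splitPathList are hypothetical names, and the sample paths are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration, not Hadoop code: shows why ':' is a poor
// list separator for values that hold Windows paths.
final class PathSeparatorSketch {
    private PathSeparatorSketch() {}

    /** Splits a path-list value on a literal separator string. */
    static List<String> splitPathList(String value, String separator) {
        return Arrays.asList(value.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        // Two Windows paths joined with ';' (made-up sample value).
        String windowsValue = "C:\\hadoop\\bin;D:\\tools";
        // Splitting on ':' cuts through the drive letters.
        System.out.println(splitPathList(windowsValue, ":"));
        // Splitting on ';' keeps each path whole.
        System.out.println(splitPathList(windowsValue, ";"));
    }
}
```

 On Unix-like systems the roles are reversed (':' separates path entries), which is why a syntax that hard-codes either character needs platform-specific handling.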

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-4393) PaaS on YARN: an YARN application to demonstrate that YARN can be used as a PaaS

2012-08-03 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4393:
-

Status: Open  (was: Patch Available)

Jaigak - I spent some more time thinking about this in light of MAPREDUCE-4495.

Unfortunately, it seems that we are running the risk of turning YARN into an 
'umbrella' project by accepting applications built on top of YARN into the 
project itself...

Essentially, as folks like Chris Mattman have pointed out in MAPREDUCE-4495, 
the PaaS prototype is better off being a standalone project in Apache Incubator 
since the Apache Software Foundation frowns upon one 'umbrella' project housing 
several smaller projects i.e. YARN vis-a-vis PaaS, Workflow AM etc.

If you are interested, I'm more than happy to help you through the Apache 
Incubator process, and we can collaborate via the Incubator. Do you mind doing that? 
Thanks!

 PaaS on YARN: an YARN application to demonstrate that YARN can be used as a 
 PaaS
 

 Key: MAPREDUCE-4393
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4393
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: examples
Affects Versions: 0.23.1
Reporter: Jaigak Song
Assignee: Jaigak Song
 Fix For: 3.0.0

 Attachments: HADOOPasPAAS_Architecture.pdf, MAPREDUCE-4393.patch, 
 MAPREDUCE-4393.patch, MAPREDUCE-4393.patch, MAPREDUCE4393.patch, 
 MAPREDUCE4393.patch

   Original Estimate: 336h
  Time Spent: 336h
  Remaining Estimate: 0h

 This application is to demonstrate that YARN can be used for non-MapReduce 
 applications. As Hadoop has already been widely adopted and deployed, and its 
 deployment is expected to grow considerably, we think it has good 
 potential to be used as a PaaS.  
 I have implemented a proof of concept to demonstrate that YARN can be used as 
 a PaaS (Platform as a Service). I have done a gap analysis against VMware's 
 Cloud Foundry and tried to achieve as many PaaS functionalities as possible 
 on YARN.
 I'd like to check in this POC as a YARN example application.





[jira] [Commented] (MAPREDUCE-4393) PaaS on YARN: an YARN application to demonstrate that YARN can be used as a PaaS

2012-08-03 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427854#comment-13427854
 ] 

Arun C Murthy commented on MAPREDUCE-4393:
--

Here is more information about proposing this via the incubator: 
http://incubator.apache.org/guides/proposal.html

I do apologize for not seeing the danger of this (i.e. turning YARN into an 
umbrella project) earlier - I'm willing to make up for it by helping you 
through the Incubator. However, it is something the ASF cares deeply about and 
is something I have to follow as part of the responsibility of the Hadoop PMC.

Again, apologies - but I do hope we can collaborate through the Incubator and 
my offer of help stands. Thanks!

 PaaS on YARN: an YARN application to demonstrate that YARN can be used as a 
 PaaS
 

 Key: MAPREDUCE-4393
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4393
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: examples
Affects Versions: 0.23.1
Reporter: Jaigak Song
Assignee: Jaigak Song
 Fix For: 3.0.0

 Attachments: HADOOPasPAAS_Architecture.pdf, MAPREDUCE-4393.patch, 
 MAPREDUCE-4393.patch, MAPREDUCE-4393.patch, MAPREDUCE4393.patch, 
 MAPREDUCE4393.patch

   Original Estimate: 336h
  Time Spent: 336h
  Remaining Estimate: 0h

 This application is to demonstrate that YARN can be used for non-MapReduce 
 applications. As Hadoop has already been widely adopted and deployed, and its 
 deployment is expected to grow considerably, we think it has good 
 potential to be used as a PaaS.  
 I have implemented a proof of concept to demonstrate that YARN can be used as 
 a PaaS (Platform as a Service). I have done a gap analysis against VMware's 
 Cloud Foundry and tried to achieve as many PaaS functionalities as possible 
 on YARN.
 I'd like to check in this POC as a YARN example application.





[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427856#comment-13427856
 ] 

Arun C Murthy commented on MAPREDUCE-4495:
--

Santosh - I agree that both PaaS and Workflow-AM are similar. 

I think both show that we could easily turn YARN into an umbrella project with 
a proliferation of YARN applications. 

Hence, I have re-considered my opinion on MAPREDUCE-4393 and asked them to go 
the Incubator route too: http://s.apache.org/1K5

I look forward to collaborating on both projects in the Incubator. Thanks.

 Workflow Application Master in YARN
 ---

 Key: MAPREDUCE-4495
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

 It is useful to have a workflow application master, which will be capable of 
 running a DAG of jobs. The workflow client submits a DAG request to the AM 
 and then the AM will manage the life cycle of this application in terms of 
 requesting the needed resources from the RM, and starting, monitoring and 
 retrying the application's individual tasks.
 Compared to running Oozie with the current MapReduce Application Master, 
 these are some of the advantages:
  - Fewer resources consumed, since only one application master will 
 be spawned for the whole workflow.
  - Reuse of resources, since the same resources can be used by multiple 
 consecutive jobs in the workflow (no need to request/wait for resources for 
 every individual job from the central RM).
  - More optimization opportunities in terms of collective resource requests.
  - Optimization opportunities in terms of rewriting and composing jobs in the 
 workflow (e.g. pushing down Mappers).
  - This Application Master can be reused/extended by higher-level systems like Pig 
 and Hive to provide an optimized way of running their workflows.
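
 As a rough illustration of what "running a DAG of jobs" involves, the sketch below orders jobs so that each one runs only after the jobs it depends on. It is not part of the proposed Application Master; WorkflowOrderSketch, runOrder, and the job names are hypothetical, and the ordering uses Kahn's topological sort as an assumed technique.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the proposed Workflow AM: orders the jobs of a
// DAG with Kahn's algorithm so dependencies always run first.
final class WorkflowOrderSketch {
    private WorkflowOrderSketch() {}

    /**
     * deps maps each job name to the jobs it depends on.
     * Assumption: every job, including those without dependencies,
     * appears as a key, and dependency lists contain no duplicates.
     */
    static List<String> runOrder(Map<String, List<String>> deps) {
        Map<String, Integer> pending = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new LinkedHashMap<>();
        for (String job : deps.keySet()) {
            pending.put(job, deps.get(job).size());
            dependents.put(job, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            for (String upstream : e.getValue()) {
                dependents.get(upstream).add(e.getKey());
            }
        }
        // Jobs with no unfinished dependencies are ready to run.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.remove();
            order.add(job);
            // Finishing a job may unblock the jobs that depend on it.
            for (String next : dependents.get(job)) {
                if (pending.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != deps.size()) {
            throw new IllegalArgumentException("workflow contains a cycle");
        }
        return order;
    }
}
```

 A real application master would also have to request containers, monitor tasks, and retry failures; the sketch covers only the ordering step.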





[jira] [Commented] (MAPREDUCE-4068) Jars in lib subdirectory of the submittable JAR are not added to the classpath

2012-08-03 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427876#comment-13427876
 ] 

Harsh J commented on MAPREDUCE-4068:


This is a major regression if it's true. Are there no tests covering this 
unpacking feature?

 Jars in lib subdirectory of the submittable JAR are not added to the classpath
 --

 Key: MAPREDUCE-4068
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4068
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Ahmed Radwan
 Fix For: 0.23.2


 Prior to hadoop 0.23, users could add third party jars to the lib 
 subdirectory of the submitted job jar and they become available in the task's 
 classpath. I see this functionality was in TaskRunner.java, but I can't see 
 similar functionality in hadoop 0.23 (neither in MapReduceChildJVM.java nor 
 other places).





[jira] [Commented] (MAPREDUCE-4501) couldn't compile hadoop-2.0 successfully because of errors in build files

2012-08-03 Thread Yan Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427902#comment-13427902
 ] 

Yan Liu commented on MAPREDUCE-4501:


This error happens after merging MAPREDUCE-4438, in the pom.xml for 
hadoop-yarn-applications. Now it's ok in current trunk version.

 couldn't compile hadoop-2.0 successfully because of errors in build files
 -

 Key: MAPREDUCE-4501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Yan Liu

 The version that hadoop-yarn-applications relies on is 2.0.1-SNAPSHOT; however, the commit 
 makes it 3.0.0-SNAPSHOT. This makes the compile fail.





[jira] [Created] (MAPREDUCE-4511) Add IFile readahead

2012-08-03 Thread Ahmed Radwan (JIRA)
Ahmed Radwan created MAPREDUCE-4511:
---

 Summary: Add IFile readahead
 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan


This ticket is to add IFile readahead as part of HADOOP-7714.





[jira] [Commented] (MAPREDUCE-4511) Add IFile readahead

2012-08-03 Thread Ahmed Radwan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427968#comment-13427968
 ] 

Ahmed Radwan commented on MAPREDUCE-4511:
-

Here is the updated branch-1 patch based on Todd's HADOOP-7714 patches. Note 
that this patch requires HADOOP-7754 patch.

 Add IFile readahead
 ---

 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan

 This ticket is to add IFile readahead as part of HADOOP-7714.





[jira] [Updated] (MAPREDUCE-4511) Add IFile readahead

2012-08-03 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4511:


Attachment: MAPREDUCE-4511_branch1.patch

 Add IFile readahead
 ---

 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4511_branch1.patch


 This ticket is to add IFile readahead as part of HADOOP-7714.





[jira] [Updated] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4275:
---

Attachment: plugable-pstree-3.txt

Do not create the process tree instance from the resource calculator plugin; keep 
them separate.

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204





[jira] [Commented] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427979#comment-13427979
 ] 

Hadoop QA commented on MAPREDUCE-4275:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12539020/plugable-pstree-3.txt
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2702//console

This message is automatically generated.

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204





[jira] [Commented] (MAPREDUCE-4431) killing already completed job gives ambiguous message as Killed job job id

2012-08-03 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427989#comment-13427989
 ] 

Devaraj K commented on MAPREDUCE-4431:
--

Hi Harsh, can you have a look into the updated patch when you find some time?

 killing already completed job gives ambiguous message as Killed job job id
 --

 Key: MAPREDUCE-4431
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4431
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
Priority: Minor
 Attachments: MAPREDUCE-4431-1.patch, MAPREDUCE-4431.patch


 If we try to kill an already completed job with the following command, it gives 
 an ambiguous message, "Killed job <job id>":
 ./mapred job -kill <already completed job id>





[jira] [Commented] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir

2012-08-03 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427990#comment-13427990
 ] 

Devaraj K commented on MAPREDUCE-3193:
--

Hi Harsh, can you have a look into the updated patch when you find some time?

 FileInputFormat doesn't read files recursively in the input path dir
 

 Key: MAPREDUCE-3193
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.0.2, 0.23.2, 2.0.0-alpha, 3.0.0
Reporter: Ramgopal N
Assignee: Devaraj K
 Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, 
 MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193.patch, 
 MAPREDUCE-3193.security.patch


 A java.io.FileNotFoundException is thrown if the input file is more than one folder 
 level deep, and the job fails.
 Example: the input file is /r1/r2/input.txt
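
 For readers unfamiliar with the requested behaviour, the following sketch shows recursive file listing on a local file system. It uses only java.nio.file, not Hadoop's FileInputFormat or FileSystem API; RecursiveListingSketch and listFilesRecursively are hypothetical names, and how the actual patch implements recursion is not shown here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-file-system illustration; this is NOT how
// FileInputFormat lists its input paths.
final class RecursiveListingSketch {
    private RecursiveListingSketch() {}

    /** Returns every regular file under root, at any depth, in sorted order. */
    static List<Path> listFilesRecursively(Path root) throws IOException {
        // Files.walk visits root and all nested directories; the stream
        // must be closed, hence try-with-resources.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile) // skip directories such as r1/r2
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

 A non-recursive listing would stop at the first level and hand the directory r2 to the reader as if it were a file, which is consistent with the exception reported above.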





[jira] [Commented] (MAPREDUCE-4507) IdentityMapper is being triggered when the type of the Input Key at class level and method level has a conflict

2012-08-03 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427993#comment-13427993
 ] 

Harsh J commented on MAPREDUCE-4507:


The {{map()}} function must be properly overridden when using the new API. 
Using @Override annotations on map() (and for that matter, reduce() too) will 
help you catch your mistake here.

As discussed on http://search-hadoop.com/m/hSxqz1vsQPc, this is a user-side 
mistake and in no way a bug. See 
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Mapper.html#map(KEYIN,%20VALUEIN,%20org.apache.hadoop.mapreduce.Mapper.Context).

We can add a javadoc improvement (and a tutorial improvement) to state the 
right way to avoid this issue: Always use @Override annotations when 
overriding methods. (Any IDE today provides support for this).
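
The overload-versus-override mistake can be reproduced without Hadoop. In the sketch below, BaseMapper is a hypothetical stand-in for a generic mapper base class whose default map() acts as an identity function; it is not org.apache.hadoop.mapreduce.Mapper, and the key types are plain Integer and Long rather than Writable types.

```java
// Hypothetical stand-in classes; BaseMapper is NOT Hadoop's Mapper.
final class OverrideSketch {
    private OverrideSketch() {}

    static class BaseMapper<K, V> {
        // Default behaviour: pass the key through (identity).
        String map(K key, V value) {
            return "identity:" + key;
        }
    }

    // Class-level key type is Integer, but the method takes Long, so this
    // map() is a new overload and does not override BaseMapper.map().
    // Putting @Override on it would turn the mistake into a compile error.
    static class MistakenMapper extends BaseMapper<Integer, String> {
        String map(Long key, String value) {
            return "custom:" + key;
        }
    }

    // Types agree with the class-level parameters, so @Override compiles
    // and the framework-style call below reaches this method.
    static class CorrectMapper extends BaseMapper<Integer, String> {
        @Override
        String map(Integer key, String value) {
            return "custom:" + key;
        }
    }

    // Mimics a framework that only knows the base-class signature.
    static <K, V> String run(BaseMapper<K, V> mapper, K key, V value) {
        return mapper.map(key, value);
    }
}
```

Calling run() with a MistakenMapper silently falls back to the base-class identity behaviour, which matches the symptom described in this issue.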

 IdentityMapper is being triggered when the type of the Input Key at class 
 level and method level has a conflict
 ---

 Key: MAPREDUCE-4507
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4507
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.0.3
 Environment: linux ubuntu
Reporter: Bejoy KS

 If we use the default InputFormat (TextInputFormat) but specify the key type 
 in the mapper as IntWritable instead of LongWritable, the framework is supposed 
 to throw a class cast exception. Such an exception is thrown only if the key 
 types at class level and method level are the same (IntWritable). But if we 
 provide the input key type as IntWritable at the class level but LongWritable 
 at the method level (map method), instead of throwing a compile-time error, 
 the code compiles fine. In addition, on execution the framework 
 triggers IdentityMapper instead of the custom mapper provided with the 
 configuration. In this case the 'mapreduce.map.class' in job.xml shows the mapper 
 as the custom mapper itself; it should show IdentityMapper in cases where 
 IdentityMapper is triggered, to avoid confusion and ease debugging.





[jira] [Updated] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4275:
---

Attachment: plugable-pstree-4.txt

check if ProcessTree is available before enabling monitoring

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204





[jira] [Commented] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427998#comment-13427998
 ] 

Hadoop QA commented on MAPREDUCE-4275:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12539022/plugable-pstree-4.txt
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2703//console

This message is automatically generated.

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204





[jira] [Updated] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4275:
---

Attachment: plugable-pstree-4-with-whitespace.txt

now without removed whitespace lines

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4-with-whitespace.txt, 
 plugable-pstree-4.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204





[jira] [Commented] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428037#comment-13428037
 ] 

Hadoop QA commented on MAPREDUCE-4275:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12539036/plugable-pstree-4-with-whitespace.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService
  org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor
  org.apache.hadoop.yarn.server.nodemanager.TestEventFlow
  org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
  org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater
  org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2704//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2704//console

This message is automatically generated.

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4-with-whitespace.txt, 
 plugable-pstree-4.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204





[jira] [Commented] (MAPREDUCE-3289) Make use of fadvise in the NM's shuffle handler

2012-08-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428052#comment-13428052
 ] 

Hudson commented on MAPREDUCE-3289:
---

Integrated in Hadoop-Hdfs-trunk #1124 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1124/])
MAPREDUCE-3289. Make use of fadvise in the NM's shuffle handler. 
(Contributed by Todd Lipcon and Siddharth Seth) (Revision 1368718)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368718
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedChunkedFile.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedFileRegion.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java


 Make use of fadvise in the NM's shuffle handler
 ---

 Key: MAPREDUCE-3289
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3289
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2, nodemanager, performance
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 1.2.0, 2.2.0-alpha

 Attachments: 3289-1.txt, 3289-2.txt, MAPREDUCE-3289.branch-1.patch, 
 MAPREDUCE-3289.branch-1.patch, MR3289_trunk.txt, MR3289_trunk_2.txt, 
 MR3289_trunk_3.txt, mr-3289.txt


 Using the new NativeIO fadvise functions, we can make the NodeManager 
 prefetch map output before it's sent over the socket, and drop it out of the 
 fs cache once it's been sent (since it's very rare for an output to have to 
 be re-sent). This improves IO efficiency and reduces cache pollution.





[jira] [Updated] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4275:
---

Attachment: plugable-pstree-5-with-whitespace.txt

avoid null pointer dereference in init()

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4-with-whitespace.txt, 
 plugable-pstree-4.txt, plugable-pstree-5-with-whitespace.txt, 
 plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204





[jira] [Commented] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428094#comment-13428094
 ] 

Hadoop QA commented on MAPREDUCE-4275:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12539050/plugable-pstree-5-with-whitespace.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2705//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2705//console

This message is automatically generated.





[jira] [Commented] (MAPREDUCE-3289) Make use of fadvise in the NM's shuffle handler

2012-08-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428100#comment-13428100
 ] 

Hudson commented on MAPREDUCE-3289:
---

Integrated in Hadoop-Mapreduce-trunk #1156 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1156/])
MAPREDUCE-3289. Make use of fadvise in the NM's shuffle handler. 
(Contributed by Todd Lipcon and Siddharth Seth) (Revision 1368718)

 Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368718
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedChunkedFile.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedFileRegion.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java






[jira] [Commented] (MAPREDUCE-4488) Port MAPREDUCE-463 (The job setup and cleanup tasks should be optional) to branch-1

2012-08-03 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428188#comment-13428188
 ] 

Tom White commented on MAPREDUCE-4488:
--

Alejandro - the code is from MAPREDUCE-463. Can I make the changes you suggest 
in another JIRA so that branches 1 and 2 are kept the same?

 Port MAPREDUCE-463 (The job setup and cleanup tasks should be optional) to 
 branch-1
 ---

 Key: MAPREDUCE-4488
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4488
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, performance
Affects Versions: 1.0.3
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-4488.patch








[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428194#comment-13428194
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4495:
---

I don't think PaaS and Workflow-AM are similar. 

Workflow-AM aims to provide an AM that can run multiple MR jobs and do 
intra-AM processing, all from the same AM. This would be enough for projects 
that typically run multiple MR jobs as a single unit of processing, like 
Pig/Hive/Sqoop/Oozie. Workflow-AM will need to tap into the MapReduce AM 
private classes, as the intention is to fully leverage what has been done 
already. It will most likely also require changes in the MapReduce AM, such as 
making it thread-safe and multi-MR-job safe (which I believe is not the case 
today). 

Because of this, I think it belongs in MapReduce. Having it outside, at 
least during its inception, would make its development much more difficult.

That said, I have no objection (quite the opposite) to seeing how it can be 
generalized and moved out once we finalize the initial implementation.




 Workflow Application Master in YARN
 ---

 Key: MAPREDUCE-4495
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

 It is useful to have a workflow application master, which will be capable of 
 running a DAG of jobs. The workflow client submits a DAG request to the AM 
 and then the AM will manage the life cycle of this application in terms of 
 requesting the needed resources from the RM, and starting, monitoring and 
 retrying the application's individual tasks.
 Compared to running Oozie with the current MapReduce Application Master, 
 these are some of the advantages:
  - Fewer consumed resources, since only one application master will 
 be spawned for the whole workflow.
  - Reuse of resources, since the same resources can be used by multiple 
 consecutive jobs in the workflow (no need to request/wait for resources for 
 every individual job from the central RM).
  - More optimization opportunities in terms of collective resource requests.
  - Optimization opportunities in terms of rewriting and composing jobs in the 
 workflow (e.g. pushing down Mappers).
  - This Application Master can be reused/extended by higher-level systems like 
 Pig and Hive to provide an optimized way of running their workflows.
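
 As a rough illustration of what "running a DAG of jobs" implies, the sketch 
 below orders jobs so that each one starts only after the jobs it depends on. 
 It is a minimal sketch under assumptions: the class name, the method name and 
 the job names are hypothetical, and nothing here is taken from the proposed 
 AM or from Oozie.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: not part of Hadoop, YARN or Oozie.
class WorkflowDagSketch {

    /** Returns the jobs in an order where every job follows all jobs it depends on. */
    static List<String> executionOrder(Map<String, List<String>> dependsOn) {
        // pending.get(job) = number of dependencies of job that have not run yet
        Map<String, Integer> pending = new TreeMap<>();
        // dependents.get(job) = jobs that are waiting for job
        Map<String, List<String>> dependents = new HashMap<>();
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            pending.putIfAbsent(e.getKey(), 0);
            for (String dep : e.getValue()) {
                pending.putIfAbsent(dep, 0);
                pending.merge(e.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        // Jobs with no unmet dependencies can be started right away.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.remove();
            order.add(job);
            // Finishing a job may unblock the jobs that were waiting for it.
            for (String next : dependents.getOrDefault(job, Collections.<String>emptyList())) {
                if (pending.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != pending.size()) {
            throw new IllegalArgumentException("workflow contains a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("extract", Collections.<String>emptyList());
        deps.put("clean", Arrays.asList("extract"));
        deps.put("aggregate", Arrays.asList("extract"));
        deps.put("join", Arrays.asList("clean", "aggregate"));
        // "extract" comes first and "join" comes last.
        System.out.println(executionOrder(deps));
    }
}
```

 A real workflow AM would launch each ready job asynchronously and reuse 
 containers between them; the sketch only shows the ordering constraint.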





[jira] [Commented] (MAPREDUCE-4488) Port MAPREDUCE-463 (The job setup and cleanup tasks should be optional) to branch-1

2012-08-03 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428195#comment-13428195
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4488:
---

+1





[jira] [Created] (MAPREDUCE-4512) TextInputFormat delimiter bug:- Input Text portion ends with Delimiter starts with same char/char sequence

2012-08-03 Thread Gelesh (JIRA)
Gelesh created MAPREDUCE-4512:
-

 Summary: TextInputFormat delimiter  bug:- Input Text portion ends 
with  Delimiter starts with same char/char sequence
 Key: MAPREDUCE-4512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4512
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/mumak, mr-am, mrv1, mrv2, task
Affects Versions: 2.0.0-alpha
 Environment: Linux
Reporter: Gelesh
 Fix For: 0.20.204.0


TextInputFormat delimiter bug scenario: the input text contains a character 
sequence whose first character matches the first character of the delimiter 
and whose remaining characters match the entire delimiter from its starting 
position.

e.g. delimiter = record;
and Text = record 1:- name = Gelesh e mail = gelesh.had...@gmail.com Location 
Bangalore record 2: name = sdf  ..  location =Bangalorrecord 3: name  

Here the string "=Bangalorrecord 3:" satisfies two conditions:
1) It contains the delimiter "record".
2) The character (or character sequence) immediately before the delimiter 
('r') matches the first character (or character sequence) of the delimiter; 
that is, "=Bangalor" ends with the same character 'r' that the delimiter 
starts with.

In this case the delimiter is skipped.
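
The scenario can be reproduced with a deliberately simplified matcher. This is 
a sketch under assumptions, not Hadoop's LineReader code: the class and method 
names are hypothetical. splitNaive resets its match position on a mismatch 
without re-examining the current character, which is one plausible way to get 
the behaviour described. splitByIndexOf is only a reference for the expected 
result and is not the proposed fix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: not the actual LineReader implementation.
class DelimiterSkipSketch {

    /** Streaming matcher that loses a delimiter preceded by its own first character. */
    static List<String> splitNaive(String text, String delim) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int matched = 0; // number of delimiter characters matched so far
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            current.append(c);
            if (c == delim.charAt(matched)) {
                matched++;
                if (matched == delim.length()) {
                    // Full delimiter seen: emit the record without the delimiter.
                    current.setLength(current.length() - delim.length());
                    records.add(current.toString());
                    current.setLength(0);
                    matched = 0;
                }
            } else {
                // Flaw: c itself might start a new match, but it is never re-checked.
                matched = 0;
            }
        }
        records.add(current.toString());
        return records;
    }

    /** Reference behaviour using String.indexOf, which finds every occurrence. */
    static List<String> splitByIndexOf(String text, String delim) {
        List<String> records = new ArrayList<>();
        int from = 0;
        int at;
        while ((at = text.indexOf(delim, from)) >= 0) {
            records.add(text.substring(from, at));
            from = at + delim.length();
        }
        records.add(text.substring(from));
        return records;
    }

    public static void main(String[] args) {
        String text = "location =Bangalorrecord 3: name";
        // The naive matcher returns the text as one record (delimiter skipped).
        System.out.println(splitNaive(text, "record"));
        // The reference splits it into "location =Bangalor" and " 3: name".
        System.out.println(splitByIndexOf(text, "record"));
    }
}
```

In the naive version the 'r' that ends "Bangalor" starts a partial match, the 
following 'r' breaks it, and the rest of "record" is then compared against the 
wrong position, so the delimiter is never recognised.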





[jira] [Updated] (MAPREDUCE-4512) TextInputFormat delimiter bug:- Input Text portion ends with Delimiter starts with same char/char sequence

2012-08-03 Thread Gelesh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gelesh updated MAPREDUCE-4512:
--

Status: Patch Available  (was: Open)

A one-line code change in LineReader should do it. Tested.
If there are any issues, please let me know so I can help further.
gelesh.had...@gmail.com

 TextInputFormat delimiter  bug:- Input Text portion ends with  Delimiter 
 starts with same char/char sequence
 -

 Key: MAPREDUCE-4512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4512
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/mumak, mr-am, mrv1, mrv2, task
Affects Versions: 2.0.0-alpha
 Environment: Linux
Reporter: Gelesh
  Labels: patch
 Fix For: 0.20.204.0

   Original Estimate: 1m
  Remaining Estimate: 1m





[jira] [Updated] (MAPREDUCE-4512) TextInputFormat delimiter bug:- Input Text portion ends with Delimiter starts with same char/char sequence

2012-08-03 Thread Gelesh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gelesh updated MAPREDUCE-4512:
--

Attachment: MAPREDUCE-4512.txt

Just a one-line code change at LineRecord. Tested. In case there is any issue, 
please mail me at gelesh.had...@gmail.com

 TextInputFormat delimiter  bug:- Input Text portion ends with  Delimiter 
 starts with same char/char sequence
 -

 Key: MAPREDUCE-4512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4512
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/mumak, mr-am, mrv1, mrv2, task
Affects Versions: 2.0.0-alpha
 Environment: Linux
Reporter: Gelesh
  Labels: patch
 Fix For: 0.20.204.0

 Attachments: MAPREDUCE-4512.txt

   Original Estimate: 1m
  Remaining Estimate: 1m





[jira] [Updated] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

2012-08-03 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-3902:
--

Attachment: MAPREDUCE-3902.2.patch

As a first step, I updated Arun's patch so that it compiles against the 
current source code.

 MR AM should reuse containers for map tasks, there-by allowing fine-grained 
 control on num-maps for users without need for CombineFileInputFormat etc.
 --

 Key: MAPREDUCE-3902
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Siddharth Seth
 Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch


 The MR AM is now in a great position to reuse containers across (map) tasks. 
 This is something similar to JVM re-use we had in 0.20.x, but in a 
 significantly better manner:
 # Consider data-locality when re-using containers
 # Consider the new shuffle - ensure that reduces fetch output of the whole 
 container at once (i.e. all maps) 





[jira] [Commented] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

2012-08-03 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428211#comment-13428211
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-3902:
---

IMHO, the 2nd topic (combining per container) should be moved, because the 
change seems to be too big.
If there are no objections, I'm going to create a new ticket to deal with 
the 2nd topic as a sub-task of MAPREDUCE-3902.





[jira] [Commented] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

2012-08-03 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428212#comment-13428212
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-3902:
---

s/should be moved/should be moved to the new ticket/





[jira] [Commented] (MAPREDUCE-4512) TextInputFormat delimiter bug:- Input Text portion ends with Delimiter starts with same char/char sequence

2012-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428214#comment-13428214
 ] 

Hadoop QA commented on MAPREDUCE-4512:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12539059/MAPREDUCE-4512.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2706//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2706//console

This message is automatically generated.





[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Bo Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428237#comment-13428237
 ] 

Bo Wang commented on MAPREDUCE-4495:


I agree with Alejandro. The goals of workflow-AM are beyond job scheduling and 
include local resource management and optimization. These goals require a tight 
interaction of workflow AM and MR AM. It can be regarded as an extension to MR 
AM. I noticed MAPREDUCE-3902 on reusing containers in MR AM. Workflow AM can 
reuse containers across jobs, which is a more general case.





[jira] [Resolved] (MAPREDUCE-3600) Add Minimal Fair Scheduler to MR2

2012-08-03 Thread Patrick Wendell (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell resolved MAPREDUCE-3600.


Resolution: Fixed

Fixed by parent ticket.

 Add Minimal Fair Scheduler to MR2
 -

 Key: MAPREDUCE-3600
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3600
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, scheduler
Reporter: Patrick Wendell
Assignee: Patrick Wendell
 Attachments: MAPREDUCE-3600.v1.patch, MAPREDUCE-3600.v2.patch


 This covers the addition of the Fair Scheduler to the MR2 infrastructure. 
 This patch will represent the minimum functional FairScheduler in MR2. It 
 will be limited to a configuration file reader, functionality to calculate 
 fair shares, and hooks into the actual MR2 scheduling code. 
 It will not include delay scheduling, preemption, or a web UI, which will be 
 handled in separate JIRAs. 
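
 For readers unfamiliar with the idea of a fair share, the sketch below 
 computes a plain max-min fair split of a fixed capacity among jobs with known 
 demands. This is an assumption-laden illustration only: the class and method 
 names are hypothetical, and the actual FairScheduler calculation (weights, 
 minimum shares, pools) is not described in this ticket and may differ.

```java
import java.util.Arrays;

// Hypothetical illustration of max-min fairness: not the FairScheduler code.
class FairShareSketch {

    /**
     * Splits capacity so that no job receives more than its demand and the
     * leftover is divided equally among jobs that still want more.
     */
    static double[] fairShares(double capacity, double[] demands) {
        int n = demands.length;
        double[] shares = new double[n];
        boolean[] satisfied = new boolean[n];
        double remaining = capacity;
        int unsatisfied = n;
        while (unsatisfied > 0 && remaining > 0) {
            double equalSplit = remaining / unsatisfied;
            boolean progressed = false;
            // Jobs that want no more than an equal split get their full demand.
            for (int i = 0; i < n; i++) {
                if (!satisfied[i] && demands[i] <= equalSplit) {
                    shares[i] = demands[i];
                    remaining -= demands[i];
                    satisfied[i] = true;
                    unsatisfied--;
                    progressed = true;
                }
            }
            // Nobody fits under the equal split: divide what is left evenly.
            if (!progressed) {
                for (int i = 0; i < n; i++) {
                    if (!satisfied[i]) {
                        shares[i] = equalSplit;
                    }
                }
                remaining = 0;
            }
        }
        return shares;
    }

    public static void main(String[] args) {
        // Capacity 10 with demands 2, 8 and 8: the small job is fully served
        // and the two large jobs split the remaining 8 equally.
        System.out.println(Arrays.toString(fairShares(10, new double[] {2, 8, 8})));
    }
}
```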





[jira] [Resolved] (MAPREDUCE-3602) Add Preemption to MR2 Fair Scheduler

2012-08-03 Thread Patrick Wendell (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell resolved MAPREDUCE-3602.


Resolution: Fixed

Solved with parent ticket.

 Add Preemption to MR2 Fair Scheduler
 

 Key: MAPREDUCE-3602
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3602
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: scheduler
Reporter: Patrick Wendell
Assignee: Patrick Wendell
 Attachments: MAPREDUCE-3602.v1.patch








[jira] [Resolved] (MAPREDUCE-3601) Add Delay Scheduling to MR2 Fair Scheduler

2012-08-03 Thread Patrick Wendell (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell resolved MAPREDUCE-3601.


Resolution: Fixed

Fixed with parent ticket.

 Add Delay Scheduling to MR2 Fair Scheduler
 --

 Key: MAPREDUCE-3601
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3601
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: scheduler
Reporter: Patrick Wendell
Assignee: Patrick Wendell
 Attachments: MAPREDUCE-3601.v1.patch


 JIRA for delay scheduling component.





[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428257#comment-13428257
 ] 

Arun C Murthy commented on MAPREDUCE-4495:
--

Alejandro, making MR AM thread-safe is a good goal. We can do that 
independently of the new AM. I have opened MAPREDUCE-4513 for the same.

I don't know which other 'private' classes you need; frankly, that concerns 
me. It means you are adding new requirements on MR-AM, which isn't an 'api' of 
that nature.

Also, if we are going that route I strongly suggest we do not import code from 
Oozie and merely take JobControl api and support it. That should be a trivial 
exercise without adding any new 'interfaces' to MapReduce.

So, I see two options:
# Enhance JobControl api to work in AM by making MR-AM, specifically 
MRAppMaster, thread-safe. This will allow for multiple objects of MRAppMaster 
to be created. This means there are no new interfaces to MapReduce.
# Go the full distance, make it generic, import code from Oozie, come up with a 
new set of interfaces etc. etc. and do it in a separate Incubator project.

As I indicated previously, my preference is option #2 and I have already 
offered help to deal with the specifics so you and Bo can concentrate on 
getting the code out.

Thoughts?

 Workflow Application Master in YARN
 ---

 Key: MAPREDUCE-4495
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

 It is useful to have a workflow application master, which will be capable of 
 running a DAG of jobs. The workflow client submits a DAG request to the AM 
 and then the AM will manage the life cycle of this application in terms of 
 requesting the needed resources from the RM, and starting, monitoring and 
 retrying the application's individual tasks.
 Compared to running Oozie with the current MapReduce Application Master, 
 these are some of the advantages:
  - Fewer consumed resources, since only one application master will 
 be spawned for the whole workflow.
  - Reuse of resources, since the same resources can be used by multiple 
 consecutive jobs in the workflow (no need to request/wait for resources for 
 every individual job from the central RM).
  - More optimization opportunities in terms of collective resource requests.
  - Optimization opportunities in terms of rewriting and composing jobs in the 
 workflow (e.g. pushing down Mappers).
  - This Application Master can be reused/extended by higher-level systems like Pig 
 and Hive to provide an optimized way of running their workflows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (MAPREDUCE-4513) Make MR AM thread-safe

2012-08-03 Thread Arun C Murthy (JIRA)
Arun C Murthy created MAPREDUCE-4513:


 Summary: Make MR AM thread-safe
 Key: MAPREDUCE-4513
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4513
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Arun C Murthy


Currently MR-AM has a bunch of statics making it thread unsafe. We should fix 
that.
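
As a generic illustration of why static state is a problem here (this is not the actual MRAppMaster code; the class and field names below are hypothetical), compare a class that keeps per-job state in a static field with one that keeps it per instance:

```java
public class StaticStateDemo {
    // Hypothetical "master" whose per-job state lives in a static field:
    // every instance in the JVM shares (and overwrites) the same value.
    static class StaticMaster {
        private static String jobId;
        StaticMaster(String id) { jobId = id; }
        String jobId() { return jobId; }
    }

    // Same idea with instance state: each object owns its own value,
    // so several masters can coexist in one JVM.
    static class InstanceMaster {
        private final String jobId;
        InstanceMaster(String id) { this.jobId = id; }
        String jobId() { return jobId; }
    }

    public static void main(String[] args) {
        StaticMaster a = new StaticMaster("job_1");
        StaticMaster b = new StaticMaster("job_2");
        System.out.println(a.jobId()); // prints "job_2": a's state was clobbered by b

        InstanceMaster c = new InstanceMaster("job_1");
        InstanceMaster d = new InstanceMaster("job_2");
        System.out.println(c.jobId()); // prints "job_1"
    }
}
```

Moving such fields to instance scope is only a first step; any remaining shared mutable state would still need synchronization.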





[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428270#comment-13428270
 ] 

Owen O'Malley commented on MAPREDUCE-4495:
--

The Hadoop project has gone down the path of having large contrib components 
before and it created substantial difficulties for the Hadoop community. Hadoop 
should be about creating a platform for other projects to build on rather than 
bundling all components within itself. Since many of the people interested in 
working on this are in the Oozie project, it might make sense to host it there. 
Otherwise incubator would be a great place to go while you build the project 
and community. 

Any work that you can do to help YARN become a better platform is appreciated, 
but I expect there to be a lot of YARN-based frameworks and they will all need 
to be managed from outside of Hadoop.





[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428272#comment-13428272
 ] 

Arun C Murthy commented on MAPREDUCE-4495:
--

{quote}
So, I see two options:
# Enhance JobControl api to work in AM by making MR-AM, specifically MRAppMaster, 
thread-safe. This will allow for multiple objects of MRAppMaster to be created. 
This means there are no new interfaces to MapReduce.
# Go the full distance, make it generic, import code from oozie, come up with a 
new set of interfaces for generic DAG mgmt infrastructure etc. etc. and do it 
in a separate Incubator project.
{quote}

I think this is coming to a point where we are arguing too much in the 
abstract. Frankly, this is really not how I want to spend my time.

Maybe we can wait for a detailed proposal from Bo or Alejandro and then revisit 
this discussion. I believe I have laid my thoughts out clearly with respect to 
the options etc. Let's discuss when we actually have something concrete (design 
or code).

OTOH, if we can agree on the Incubator proposal I'm happy to do the legwork for 
Alejandro right away. At least that is tractable and not merely abstract.





[jira] [Commented] (MAPREDUCE-4431) killing already completed job gives ambiguous message as Killed job job id

2012-08-03 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428273#comment-13428273
 ] 

Mayank Bansal commented on MAPREDUCE-4431:
--

+1 Looks good.

Thanks,
Mayank

 killing already completed job gives ambiguous message as Killed job job id
 --

 Key: MAPREDUCE-4431
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4431
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
Priority: Minor
 Attachments: MAPREDUCE-4431-1.patch, MAPREDUCE-4431.patch


 If we try to kill the already completed job by the following command it gives 
 ambiguous message as Killed job job id
 ./mapred job -kill already completed job id





[jira] [Commented] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428279#comment-13428279
 ] 

Bikas Saha commented on MAPREDUCE-4275:
---

Thanks for incorporating my comments. +1.

Minor typo: "unavaiable" should be "unavailable".
{code}
+if (resourceCalculatorPlugin == null) {
+LOG.info("ResourceCalculatorPlugin is unavaiable on this system. "
++ this.getClass().getName() + " is disabled.");
+return false;
+}
+if (ResourceCalculatorProcessTree.getResourceCalculatorProcessTree(0, processTreeClass, conf) == null) {
+LOG.info("ResourceCalculatorProcessTree is unavaiable on this system. "
++ this.getClass().getName() + " is disabled.");
{code}

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4-with-whitespace.txt, 
 plugable-pstree-4.txt, plugable-pstree-5-with-whitespace.txt, 
 plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204





[jira] [Commented] (MAPREDUCE-4508) YARN needs to properly check the NM,AM memory properties in yarn-site.xml and mapred.xml and report errors accordingly.

2012-08-03 Thread Anil Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428280#comment-13428280
 ] 

Anil Gupta commented on MAPREDUCE-4508:
---

Hi Hitesh,

If you think that MAPREDUCE-3796 will cover the test case of checking 
yarn.nodemanager.resource.memory-mb against yarn.app.mapreduce.am.resource.mb and 
taking appropriate action accordingly, then you can close this as a dup of 
MAPREDUCE-3796.

Thanks,
Anil Gupta

 YARN needs to properly check the NM,AM memory properties in yarn-site.xml and 
 mapred.xml and report errors accordingly.
 ---

 Key: MAPREDUCE-4508
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4508
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.0-alpha
 Environment: CentOs6.0, Hadoop2.0.0 Alpha
Reporter: Anil Gupta
  Labels: Map, Reduce, YARN

 Please refer to this discussion on the Hadoop Mailing list:
 http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/33110
 Summary:
 I was running YARN (Hadoop 2.0.0 Alpha) on an 8-datanode, 4-admin-node 
 Hadoop/HBase cluster. My datanodes had only 3.2GB of memory, so I 
 configured the yarn.nodemanager.resource.memory-mb property in yarn-site.xml 
 to 1200. After setting the property, if I run any YARN job the 
 NodeManager won't be able to start any Map task, since by default the 
 yarn.app.mapreduce.am.resource.mb property is set to 1500 MB in 
 mapred-site.xml.
 Expected Behavior: NodeManager should give an error if 
 yarn.app.mapreduce.am.resource.mb >= yarn.nodemanager.resource.memory-mb.
 Please let me know if more information is required.
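
 The requested check could look roughly like the sketch below. It is plain Java over a property map, not Hadoop's Configuration API; the class, the method, and the choice to reject the equal case are assumptions made for illustration. The 1500 MB AM default is the value quoted in this report.

```java
import java.util.HashMap;
import java.util.Map;

public class MemoryConfigCheck {
    static final String NM_MEMORY_MB = "yarn.nodemanager.resource.memory-mb";
    static final String AM_MEMORY_MB = "yarn.app.mapreduce.am.resource.mb";
    // AM default as quoted in the report above.
    static final int AM_DEFAULT_MB = 1500;

    // Returns an error message if the AM request cannot fit on a node,
    // or null if the two settings look consistent.
    // Assumption: the NM value is always set explicitly by the caller.
    static String validate(Map<String, String> conf) {
        int nmMb = Integer.parseInt(conf.get(NM_MEMORY_MB));
        int amMb = Integer.parseInt(
            conf.getOrDefault(AM_MEMORY_MB, String.valueOf(AM_DEFAULT_MB)));
        if (amMb >= nmMb) {
            return AM_MEMORY_MB + "=" + amMb + " does not fit within "
                + NM_MEMORY_MB + "=" + nmMb;
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(NM_MEMORY_MB, "1200"); // the setting from the report
        System.out.println(validate(conf)); // non-null: 1500 >= 1200
    }
}
```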





[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Patrick Wendell (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428293#comment-13428293
 ] 

Patrick Wendell commented on MAPREDUCE-4495:


Just caught up with this - there are several issues being debated here 
simultaneously.

It is really pointless to start arguing about them until we have a clear and 
thorough design doc along with a preliminary discussion of technical merit. 
This description needs a lot more color given the scope of the proposal.

I agree with Arun - we should wait until that happens to continue discussion.





[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428297#comment-13428297
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4495:
---

bq. Maybe we can wait for a detailed proposal from Bo or Alejandro and then 
revisit this discussion. I believe I have laid my thoughts out clearly with 
respect to the options etc. Let's discuss when we actually have something 
concrete (design or code).

Sounds like a plan.





[jira] [Commented] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles

2012-08-03 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428352#comment-13428352
 ] 

Jonathan Eagles commented on MAPREDUCE-4503:


+1

 Should throw InvalidJobConfException if duplicates found in cacheArchives or 
 cacheFiles
 ---

 Key: MAPREDUCE-4503
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4503
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4503.txt, MR-4503.txt


 In 1.0, if a file was in both a job's cache archives and cache files, an 
 InvalidJobConfException was thrown.  We should replicate this behavior on 
 mrv2.  We should also extend it so that if a cache archive or cache file is 
 not going to be downloaded at all because of conflicts in the names of the 
 symlinks, a similar exception is thrown.
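
 A minimal sketch of such a conflict check follows. It is not the attached patch: the class and method names are hypothetical, IllegalArgumentException stands in for InvalidJobConfException, and the link name is assumed to be the URI fragment when present, otherwise the last path component.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CacheConflictCheck {
    // Assumed naming rule: fragment if present, else last path component.
    // Only hierarchical URIs (non-null path) are handled in this sketch.
    static String linkName(URI uri) {
        if (uri.getFragment() != null) {
            return uri.getFragment();
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Throws if any two entries across both lists would claim the same link name.
    static void check(List<URI> cacheFiles, List<URI> cacheArchives) {
        Map<String, URI> seen = new HashMap<>();
        for (List<URI> group : Arrays.asList(cacheFiles, cacheArchives)) {
            for (URI uri : group) {
                URI previous = seen.putIfAbsent(linkName(uri), uri);
                if (previous != null) {
                    throw new IllegalArgumentException("Link name '" + linkName(uri)
                        + "' is claimed by both " + previous + " and " + uri);
                }
            }
        }
    }

    public static void main(String[] args) {
        List<URI> files = Arrays.asList(URI.create("hdfs://nn/a/lib.jar"));
        List<URI> archives = Arrays.asList(URI.create("hdfs://nn/b/lib.jar"));
        check(files, archives); // throws: both entries would be linked as lib.jar
    }
}
```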





[jira] [Updated] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles

2012-08-03 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated MAPREDUCE-4503:
---

   Resolution: Fixed
Fix Version/s: 2.2.0-alpha
   3.0.0
   0.23.3
   Status: Resolved  (was: Patch Available)

Looks great. Thanks, Bobby.

 Should throw InvalidJobConfException if duplicates found in cacheArchives or 
 cacheFiles
 ---

 Key: MAPREDUCE-4503
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4503
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Fix For: 0.23.3, 3.0.0, 2.2.0-alpha

 Attachments: MR-4503.txt, MR-4503.txt


 In 1.0, if a file was in both a job's cache archives and cache files, an 
 InvalidJobConfException was thrown.  We should replicate this behavior on 
 mrv2.  We should also extend it so that if a cache archive or cache file is 
 not going to be downloaded at all because of conflicts in the names of the 
 symlinks, a similar exception is thrown.





[jira] [Commented] (MAPREDUCE-4323) NM leaks sockets

2012-08-03 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428392#comment-13428392
 ] 

Daryn Sharp commented on MAPREDUCE-4323:


{{FileSystem.closeAllForUGI}} is actually a reasonable approach.  Each request 
is creating a new ugi so there's no issue with pulling the rug out from 
underneath other fs users.

 NM leaks sockets
 

 Key: MAPREDUCE-4323
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha
Reporter: Daryn Sharp
Priority: Critical

 The NM is exhausting its fds because it's not closing fs instances when the 
 app is finished.





[jira] [Assigned] (MAPREDUCE-4323) NM leaks sockets

2012-08-03 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp reassigned MAPREDUCE-4323:
--

Assignee: Daryn Sharp

 NM leaks sockets
 

 Key: MAPREDUCE-4323
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical

 The NM is exhausting its fds because it's not closing fs instances when the 
 app is finished.





[jira] [Updated] (MAPREDUCE-4466) Using URI for yarn.nodemanager log dirs fails

2012-08-03 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4466:
--

Fix Version/s: (was: trunk)
 Target Version/s: 2.2.0-alpha
Affects Version/s: 0.23.3
   Status: Open  (was: Patch Available)

Looks like actual log rendering will also be broken - further up in 
ContainerLogsPage {{new File(this.dirsHandler.getLogPathToRead(}}. Also, 
changing {{getContainerLogDirs}} may be a cleaner fix.

If testNMWebServer.testNMWebApp is modified to use file:// - it ends up 
creating a dir structure with file:// being the top level directory under the 
current working dir. That could be modified to verify the patch.

All access to the local-dirs and log-dirs happens via the 
LocalDirsHandlerService - maybe we should have this convert URIs to simple 
strings. file:// works in other places - since {{Path}} is used instead of 
{{File}}.
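
To illustrate the failure mode with plain JDK classes (a sketch only: `toLocalPath` is a hypothetical helper, not a method of LocalDirsHandlerService, and the results noted in comments assume a POSIX file system):

```java
import java.io.File;
import java.net.URI;

public class LogDirUriDemo {
    // Hypothetical helper: turn a configured dir that may be a file: URI
    // into a plain local path string; other values pass through unchanged.
    // Assumes the input is a syntactically valid URI (no spaces or backslashes).
    static String toLocalPath(String dir) {
        URI uri = URI.create(dir);
        if ("file".equalsIgnoreCase(uri.getScheme())) {
            return new File(uri).getPath();
        }
        return dir;
    }

    public static void main(String[] args) {
        String configured = "file:///home/eli/hadoop/dirs";

        // java.io.File treats the whole string as a relative path name, so
        // "file:" becomes a directory under the current working dir.
        System.out.println(new File(configured).isAbsolute()); // false on POSIX

        // Converting through URI first yields the intended absolute path.
        System.out.println(toLocalPath(configured));
        System.out.println(toLocalPath("/home/eli/hadoop/dirs"));
    }
}
```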

 Using URI for yarn.nodemanager log dirs fails
 -

 Key: MAPREDUCE-4466
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4466
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.3
Reporter: Eli Collins
Assignee: Mayank Bansal
Priority: Minor
 Attachments: MAPREDUCE-4466-trunk-v1.patch


 If I use URIs (eg file:///home/eli/hadoop/dirs) for yarn.nodemanager.log-dirs 
 or yarn.nodemanager.remote-app-log-dir the container log servlet fails with 
 an NPE (works if I remove the file scheme). Using a URI for 
 yarn.nodemanager.local-dirs works.





[jira] [Commented] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles

2012-08-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428398#comment-13428398
 ] 

Hudson commented on MAPREDUCE-4503:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2572 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2572/])
MAPREDUCE-4503. Should throw InvalidJobConfException if duplicates found in 
cacheArchives or cacheFiles (Robert Evans via jeagles) (Revision 1369197)

 Result = FAILURE
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1369197
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java






[jira] [Created] (MAPREDUCE-4514) Symlinks to peer distributed cache files no longer work

2012-08-03 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-4514:
-

 Summary: Symlinks to peer distributed cache files no longer work
 Key: MAPREDUCE-4514
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4514
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, mrv2
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe


Trying to create a symlink to another file that is specified for the 
distributed cache will fail to create the link.  For example:

hadoop jar ... -files x,y,x#z

will localize the files x and y as x and y, but the z symlink for x will not be 
created.  This is a regression from 1.x behavior.
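
As a sketch of the expected mapping (hypothetical helper, not the MRApps code; it assumes the `#fragment` of each entry names the symlink and entries without a fragment are linked under their own file name):

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class FilesOptionDemo {
    // Splits a comma-separated -files value and returns "source -> link" pairs.
    static List<String> links(String filesArg) {
        List<String> result = new ArrayList<>();
        for (String entry : filesArg.split(",")) {
            URI uri = URI.create(entry);
            String source = uri.getPath();
            String link = uri.getFragment() != null
                ? uri.getFragment()
                : source.substring(source.lastIndexOf('/') + 1);
            result.add(source + " -> " + link);
        }
        return result;
    }

    public static void main(String[] args) {
        // The same source file x is expected under two names: x and z.
        System.out.println(links("x,y,x#z")); // [x -> x, y -> y, x -> z]
    }
}
```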





[jira] [Commented] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

2012-08-03 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428412#comment-13428412
 ] 

Siddharth Seth commented on MAPREDUCE-3902:
---

@Tsuyoshi: I'd spoken with Vinod and others about this a while ago. Should have 
posted this earlier. Adding the functionality to the AM in the current state 
is possible - but will further complicate some components which are already 
quite complicated - and tough to change.

The TaskAttempt state machine is currently really a mix of TaskAttempt 
transitions as well as Container transitions. The RMContainerAllocator is also 
dealing with more than it should - Nodes, Containers as well as scheduling.

The idea was to split the functionality into a separate TaskAttempt, Container 
and Node state machine, along with reduced functionality in the scheduler (also 
decoupling the RM request and AM scheduling). This would make the code cleaner 
and make re-use (as well as other improvements like handling retired nodes) 
easier to implement.
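
A toy sketch of that separation, for discussion only: the states, classes and transitions below are hypothetical and much simpler than anything in the AM or in the linked repository. The point is that the container's life cycle is tracked independently of any single attempt, which is what makes reuse straightforward.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class ContainerReuseSketch {
    enum ContainerState { IDLE, RUNNING, RELEASED }
    enum AttemptState { PENDING, RUNNING, SUCCEEDED }

    static class Attempt {
        final String id;
        AttemptState state = AttemptState.PENDING;
        Attempt(String id) { this.id = id; }
    }

    static class Container {
        ContainerState state = ContainerState.IDLE;
        int attemptsRun = 0;

        // Runs one attempt to completion, then returns to IDLE for reuse.
        void run(Attempt attempt) {
            if (state != ContainerState.IDLE) {
                throw new IllegalStateException("container is " + state);
            }
            state = ContainerState.RUNNING;
            attempt.state = AttemptState.RUNNING;
            // (the task's actual work would happen here)
            attempt.state = AttemptState.SUCCEEDED;
            attemptsRun++;
            state = ContainerState.IDLE;
        }

        void release() { state = ContainerState.RELEASED; }
    }

    public static void main(String[] args) {
        Container container = new Container();
        Queue<Attempt> pending = new ArrayDeque<>();
        for (int i = 0; i < 3; i++) {
            pending.add(new Attempt("attempt_" + i));
        }
        while (!pending.isEmpty()) {
            container.run(pending.poll()); // one container serves all attempts
        }
        container.release();
        System.out.println(container.attemptsRun); // 3
    }
}
```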

Had worked with Vinod on the state transitions, and have been working on the 
implementation in bits and pieces to see how feasible it is. The code is at 
https://github.com/sidseth/h2-container-reuse . It's a little bit of a mess at 
the moment, with lots of TODOs, etc splattered all over, but is just about 
functional. There's no explicit re-use scheduling yet - but re-use can be 
tested by running a job which requires more containers than available on the 
cluster (and some config changes).

bq. the 2nd topic(combining per container) should be moved, because the change 
seems to be too big.
I believe this was, at least initially, meant to ensure that output from all 
taskAttempts in one container, would be fetched only once by a reducer (without 
a common combiner). Either way, that could be a separate jira.

 MR AM should reuse containers for map tasks, there-by allowing fine-grained 
 control on num-maps for users without need for CombineFileInputFormat etc.
 --

 Key: MAPREDUCE-3902
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Siddharth Seth
 Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch


 The MR AM is now in a great position to reuse containers across (map) tasks. 
 This is something similar to JVM re-use we had in 0.20.x, but in a 
 significantly better manner:
 # Consider data-locality when re-using containers
 # Consider the new shuffle - ensure that reduces fetch output of the whole 
 container at once (i.e. all maps) 





[jira] [Commented] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles

2012-08-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428422#comment-13428422
 ] 

Hudson commented on MAPREDUCE-4503:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2618 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2618/])
MAPREDUCE-4503. Should throw InvalidJobConfException if duplicates found in 
cacheArchives or cacheFiles (Robert Evans via jeagles) (Revision 1369197)

 Result = SUCCESS
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1369197
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java


 Should throw InvalidJobConfException if duplicates found in cacheArchives or 
 cacheFiles
 ---

 Key: MAPREDUCE-4503
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4503
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Fix For: 0.23.3, 3.0.0, 2.2.0-alpha

 Attachments: MR-4503.txt, MR-4503.txt


 In 1.0, if a file was in both a job's cache archives and cache files, an 
 InvalidJobConfException was thrown.  We should replicate this behavior on 
 mrv2.  We should also extend it so that a similar exception is thrown if a 
 cache archive or cache file is not going to be downloaded at all because of 
 conflicts in the names of the symlinks.
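
 As a rough illustration of the check described above (not the code from the 
 attached patch), the sketch below rejects a job whose cache files and cache 
 archives would end up with the same link name. The class names, the 
 stand-in exception type, and the rule that the link name is the URI fragment 
 after '#' or else the last path component are all assumptions made for this 
 example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for Hadoop's InvalidJobConfException. */
class DuplicateCacheEntryException extends Exception {
  DuplicateCacheEntryException(String message) {
    super(message);
  }
}

/** Hypothetical helper; not part of MRApps. */
final class CacheLinkCheck {
  /** Assumed rule: the link name is the fragment after '#', else the last path component. */
  static String linkName(String entry) {
    int hash = entry.indexOf('#');
    if (hash >= 0) {
      return entry.substring(hash + 1);
    }
    return entry.substring(entry.lastIndexOf('/') + 1);
  }

  /** Throws if any two entries across both lists would use the same link name. */
  static void check(List<String> cacheFiles, List<String> cacheArchives)
      throws DuplicateCacheEntryException {
    Map<String, String> seen = new HashMap<>();
    List<String> all = new ArrayList<>(cacheFiles);
    all.addAll(cacheArchives);
    for (String entry : all) {
      // put() returns the entry that previously claimed this link name, if any.
      String previous = seen.put(linkName(entry), entry);
      if (previous != null) {
        throw new DuplicateCacheEntryException(
            "Link name conflict between " + previous + " and " + entry);
      }
    }
  }
}
```

 Failing at submission time, rather than silently skipping one of the 
 resources, surfaces the conflict before any task runs.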





[jira] [Commented] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles

2012-08-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428425#comment-13428425
 ] 

Hudson commented on MAPREDUCE-4503:
---

Integrated in Hadoop-Common-trunk-Commit #2553 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2553/])
MAPREDUCE-4503. Should throw InvalidJobConfException if duplicates found in 
cacheArchives or cacheFiles (Robert Evans via jeagles) (Revision 1369197)

 Result = SUCCESS
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1369197
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java


 Should throw InvalidJobConfException if duplicates found in cacheArchives or 
 cacheFiles
 ---

 Key: MAPREDUCE-4503
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4503
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Fix For: 0.23.3, 3.0.0, 2.2.0-alpha

 Attachments: MR-4503.txt, MR-4503.txt


 In 1.0, if a file was in both a job's cache archives and cache files, an 
 InvalidJobConfException was thrown.  We should replicate this behavior on 
 mrv2.  We should also extend it so that a similar exception is thrown if a 
 cache archive or cache file is not going to be downloaded at all because of 
 conflicts in the names of the symlinks.





[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428429#comment-13428429
 ] 

eric baldeschwieler commented on MAPREDUCE-4495:


Agree with discussing a particular proposal.

I want to point out that the whole point of YARN is to open up the ability to 
try lots of different changes to MR and to implement lots of alternatives to it 
in parallel.  As a community, we need to be clear that to move fast we need to 
let lots of different people try lots of different things on top of a stable 
platform.  Pig and Hive folks want to radically change what MR is.  There are 
lots of different ideas for how to do this. 

With open APIs everyone is empowered to try new things without asking to get 
their code into the core project.  If we don't embrace the principle of new AMs 
starting outside the core, we are going to have a huge number of arguments like 
this without making anyone happy.  That's not the best way for us to spend our 
time.  I'm not trying to stop anyone from trying anything, I'm trying to reduce 
friction.

My last point is the overhead argument.  Arguing that one doesn't want to go to 
incubator because that adds cost to your project really doesn't look at the 
whole picture.  Adding a new module or sub-project to an existing Apache 
project creates as much work as doing it in the incubator.  It just tosses that 
work into the lap of the folks maintaining the existing project.  When one 
talks about Apache being about community before code, that doesn't mean one has 
a right to do anything in the code.  One needs to first build consensus that 
your coding idea is aligned with the community.  Any time you add something to 
a project, you are implicitly asking the others in the community to do a lot of 
work to support you.  That only makes sense if you are working in a direction 
that the community sees as aligned with the larger goals of the project.

Going full circle, YARN's open APIs have as a goal allowing people to try a lot 
more things much less expensively.  They don't need to get permission to merge 
their work into MR, which is good for experimenters.  Hadoop committers are not 
burdened with vetting and supporting many different experiments in Hadoop.  The 
experimenters carry the burden of building community and supporting / selling 
their ideas.  This should save us a lot of time arguing on this list!  ;-)



 Workflow Application Master in YARN
 ---

 Key: MAPREDUCE-4495
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

 It is useful to have a workflow application master, which will be capable of 
 running a DAG of jobs. The workflow client submits a DAG request to the AM 
 and then the AM will manage the life cycle of this application in terms of 
 requesting the needed resources from the RM, and starting, monitoring and 
 retrying the application's individual tasks.
 Compared to running Oozie with the current MapReduce Application Master, 
 these are some of the advantages:
  - Fewer consumed resources, since only one application master will 
 be spawned for the whole workflow.
  - Reuse of resources, since the same resources can be used by multiple 
 consecutive jobs in the workflow (no need to request/wait for resources for 
 every individual job from the central RM).
  - More optimization opportunities in terms of collective resource requests.
  - Optimization opportunities in terms of rewriting and composing jobs in the 
 workflow (e.g. pushing down Mappers).
  - This Application Master can be reused/extended by higher-level systems like 
 Pig and Hive to provide an optimized way of running their workflows.





[jira] [Updated] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4275:
---

Attachment: plugable-pstree-6-typofix.txt

typo fixed

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4-with-whitespace.txt, 
 plugable-pstree-4.txt, plugable-pstree-5-with-whitespace.txt, 
 plugable-pstree-6-typofix.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204





[jira] [Commented] (MAPREDUCE-4514) Symlinks to peer distributed cache files no longer work

2012-08-03 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428472#comment-13428472
 ] 

Jason Lowe commented on MAPREDUCE-4514:
---

This also breaks when trying to create multiple symlinks to the same file, 
e.g.: {{x#a,x#b,x#c}} only creates the symlink for {{a}} instead of all three.

The problem is Container holds a map from resource Path to symlink String, but 
there could be multiple symlinks to the same source Path.

 Symlinks to peer distributed cache files no longer work
 ---

 Key: MAPREDUCE-4514
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4514
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, mrv2
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe

 Trying to create a symlink to another file that is specified for the 
 distributed cache will fail to create the link.  For example:
 hadoop jar ... -files x,y,x#z
 will localize the files x and y as x and y, but the z symlink for x will not 
 be created.  This is a regression from 1.x behavior.





[jira] [Created] (MAPREDUCE-4515) Add test to check if userlogs are retained across TaskTracker restarts

2012-08-03 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created MAPREDUCE-4515:
---

 Summary: Add test to check if userlogs are retained across 
TaskTracker restarts
 Key: MAPREDUCE-4515
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4515
 Project: Hadoop Map/Reduce
  Issue Type: Test
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla








[jira] [Updated] (MAPREDUCE-4514) Symlinks to peer distributed cache files no longer work

2012-08-03 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4514:
--

Attachment: MAPREDUCE-4514.patch

Patch that changes Container to map pending and localized resources to 
List<String> instead of String so resources can have multiple symlink 
destinations.
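
To illustrate the idea (this is a sketch, not the code in the attached patch), 
the example below parses a -files style argument into a map from each resource 
to a list of link names, so a resource listed several times keeps every link. 
The class name SymlinkTable, the plain String keys, and the default link name 
(last path component when no '#' fragment is given) are assumptions made for 
this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; the real Container keys on resource Path objects. */
final class SymlinkTable {
  /** Maps each resource to every link name requested for it, in order of appearance. */
  static Map<String, List<String>> parse(String filesArg) {
    Map<String, List<String>> links = new LinkedHashMap<>();
    for (String entry : filesArg.split(",")) {
      int hash = entry.indexOf('#');
      String resource = hash >= 0 ? entry.substring(0, hash) : entry;
      // Assumed default: with no '#name' fragment, link under the file's own name.
      String link = hash >= 0
          ? entry.substring(hash + 1)
          : resource.substring(resource.lastIndexOf('/') + 1);
      // A single-valued map could hold only one link per resource;
      // a list per resource keeps all of them.
      links.computeIfAbsent(resource, k -> new ArrayList<>()).add(link);
    }
    return links;
  }

  public static void main(String[] args) {
    System.out.println(parse("x#a,x#b,x#c")); // {x=[a, b, c]}
    System.out.println(parse("x,y,x#z"));     // {x=[x, z], y=[y]}
  }
}
```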

 Symlinks to peer distributed cache files no longer work
 ---

 Key: MAPREDUCE-4514
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4514
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, mrv2
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4514.patch


 Trying to create a symlink to another file that is specified for the 
 distributed cache will fail to create the link.  For example:
 hadoop jar ... -files x,y,x#z
 will localize the files x and y as x and y, but the z symlink for x will not 
 be created.  This is a regression from 1.x behavior.





[jira] [Commented] (MAPREDUCE-4367) mapred job -kill tries to connect to history server

2012-08-03 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428482#comment-13428482
 ] 

Mayank Bansal commented on MAPREDUCE-4367:
--

I don't see this in trunk. Is it still an issue?

Thanks,
Mayank

 mapred job -kill tries to connect to history server
 ---

 Key: MAPREDUCE-4367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Priority: Minor

 The {{mapred job -kill}} command attempts to connect to the history server, 
 even though it is unrelated to the process of killing a job.





[jira] [Updated] (MAPREDUCE-4514) Symlinks to peer distributed cache files no longer work

2012-08-03 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4514:
--

Target Version/s: 0.23.3, 2.2.0-alpha
  Status: Patch Available  (was: Open)

 Symlinks to peer distributed cache files no longer work
 ---

 Key: MAPREDUCE-4514
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4514
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, mrv2
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4514.patch


 Trying to create a symlink to another file that is specified for the 
 distributed cache will fail to create the link.  For example:
 hadoop jar ... -files x,y,x#z
 will localize the files x and y as x and y, but the z symlink for x will not 
 be created.  This is a regression from 1.x behavior.





[jira] [Commented] (MAPREDUCE-4367) mapred job -kill tries to connect to history server

2012-08-03 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428492#comment-13428492
 ] 

Jason Lowe commented on MAPREDUCE-4367:
---

Yes, it's still happening for me.  From a recent trunk pull on a single-node 
cluster where the history server isn't running yet:

{noformat}
$ mapred job -kill job_1344038428359_0002
2012-08-04 00:09:56,871 INFO  mapred.ClientServiceDelegate 
(ClientServiceDelegate.java:getProxy(255)) - Application state is completed. 
FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-08-04 00:09:57,886 INFO  ipc.Client 
(Client.java:handleConnectionFailure(715)) - Retrying connect to server: 
includespoke.champ.corp.yahoo.com/10.74.91.112:10020. Already tried 0 time(s); 
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
SECONDS)
2012-08-04 00:09:58,887 INFO  ipc.Client 
(Client.java:handleConnectionFailure(715)) - Retrying connect to server: 
includespoke.champ.corp.yahoo.com/10.74.91.112:10020. Already tried 1 time(s); 
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
SECONDS)
2012-08-04 00:09:59,890 INFO  ipc.Client 
(Client.java:handleConnectionFailure(715)) - Retrying connect to server: 
includespoke.champ.corp.yahoo.com/10.74.91.112:10020. Already tried 2 time(s); 
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
SECONDS)
2012-08-04 00:10:00,891 INFO  ipc.Client 
(Client.java:handleConnectionFailure(715)) - Retrying connect to server: 
includespoke.champ.corp.yahoo.com/10.74.91.112:10020. Already tried 3 time(s); 
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
SECONDS)
...
{noformat}

And here's what it says after I start the history server:

{noformat}
$ mapred job -kill job_1344038428359_0002
2012-08-04 00:12:52,226 INFO  mapred.ClientServiceDelegate 
(ClientServiceDelegate.java:getProxy(255)) - Application state is completed. 
FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-08-04 00:12:53,195 INFO  mapred.ResourceMgrDelegate 
(ResourceMgrDelegate.java:killApplication(329)) - Killing application 
application_1344038428359_0002
Killed job job_1344038428359_0002
{noformat}

Note that in both cases it says the application state is completed and is 
redirecting.  If the application state is completed, there's no point in 
redirecting to the history server if we're trying to kill the application.  
Knowing the application state is completed means we can short-circuit the kill 
attempt before the redirect.
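
A minimal sketch of that short-circuit, assuming a simplified set of 
application states: the enums and the decide method below are hypothetical 
and are not the actual ClientServiceDelegate or YARN API. The point is only 
that the state check happens before any proxy to the history server is 
created.

```java
/** Hypothetical sketch; names do not correspond to real Hadoop classes. */
final class KillShortCircuit {
  /** Simplified application states assumed for this example. */
  enum AppState { RUNNING, FINISHED, FAILED, KILLED }

  /** What the client does in response to a kill request. */
  enum KillAction { SEND_KILL, NOTHING_TO_DO }

  /** Decides the action from the application state alone, with no history server lookup. */
  static KillAction decide(AppState state) {
    switch (state) {
      case FINISHED:
      case FAILED:
      case KILLED:
        // The application is already done, so there is nothing left to kill
        // and no reason to redirect anywhere.
        return KillAction.NOTHING_TO_DO;
      default:
        return KillAction.SEND_KILL;
    }
  }
}
```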

 mapred job -kill tries to connect to history server
 ---

 Key: MAPREDUCE-4367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Priority: Minor

 The {{mapred job -kill}} command attempts to connect to the history server, 
 even though it is unrelated to the process of killing a job.





[jira] [Commented] (MAPREDUCE-4514) Symlinks to peer distributed cache files no longer work

2012-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428522#comment-13428522
 ] 

Hadoop QA commented on MAPREDUCE-4514:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12539121/MAPREDUCE-4514.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2707//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2707//console

This message is automatically generated.

 Symlinks to peer distributed cache files no longer work
 ---

 Key: MAPREDUCE-4514
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4514
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, mrv2
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4514.patch


 Trying to create a symlink to another file that is specified for the 
 distributed cache will fail to create the link.  For example:
 hadoop jar ... -files x,y,x#z
 will localize the files x and y as x and y, but the z symlink for x will not 
 be created.  This is a regression from 1.x behavior.





[jira] [Commented] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428526#comment-13428526
 ] 

Hadoop QA commented on MAPREDUCE-4275:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12539110/plugable-pstree-6-typofix.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2708//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2708//console

This message is automatically generated.

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4-with-whitespace.txt, 
 plugable-pstree-4.txt, plugable-pstree-5-with-whitespace.txt, 
 plugable-pstree-6-typofix.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204





[jira] [Commented] (MAPREDUCE-4431) killing already completed job gives ambiguous message as "Killed job <job id>"

2012-08-03 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428541#comment-13428541
 ] 

Harsh J commented on MAPREDUCE-4431:


+1, looks good to me too.

One comment though (just want your thought):

{code}
+System.out.println("The job " + jobid + " has already been killed.");
+exitCode = -1;
{code}

In case the job was already killed, should we perhaps return an exitCode of 0, 
since the kill was (already) successful?
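
To make the question concrete, here is a sketch of the alternative being 
suggested, in which a kill request against an already-killed job is treated 
as a success. The JobState enum and the exitCodeForKill method are 
hypothetical names for this example and are not taken from the patch.

```java
/** Hypothetical sketch of the exit-code choice under discussion. */
final class KillExitCode {
  /** Simplified job states assumed for this example. */
  enum JobState { RUNNING, SUCCEEDED, FAILED, KILLED }

  /** Exit code for "mapred job -kill", given the job state seen before the kill. */
  static int exitCodeForKill(JobState stateBeforeKill) {
    switch (stateBeforeKill) {
      case RUNNING:
        return 0;   // the kill request can be issued
      case KILLED:
        return 0;   // suggested alternative: the job is already in the requested end state
      default:
        return -1;  // the job finished some other way, so the kill cannot take effect
    }
  }
}
```

Returning 0 for an already-killed job would make the command idempotent for 
scripts that retry it; returning -1, as the patch currently does, instead 
flags that this particular invocation did nothing.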

 killing already completed job gives ambiguous message as "Killed job <job id>"
 --

 Key: MAPREDUCE-4431
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4431
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
Priority: Minor
 Attachments: MAPREDUCE-4431-1.patch, MAPREDUCE-4431.patch


 If we try to kill an already completed job with the following command, it 
 gives the ambiguous message "Killed job <job id>":
 ./mapred job -kill <already completed job id>





[jira] [Created] (MAPREDUCE-4516) Error reading task output Server returned HTTP response code: 400 for URL: http://hadoop03:8080/tasklog?plaintext=true&attemptid=attempt_1344047400780_0002_m_000000_0

2012-08-03 Thread jiafeng.zhang (JIRA)
jiafeng.zhang created MAPREDUCE-4516:


 Summary: Error reading task output Server returned HTTP response 
code: 400 for URL: 
http://hadoop03:8080/tasklog?plaintext=true&attemptid=attempt_1344047400780_0002_m_000000_0&filter=stdout
 Key: MAPREDUCE-4516
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4516
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.1
 Environment: hadoop-0.23.1 JDK_1.6.0_31
Centos-6.0
Reporter: jiafeng.zhang
 Fix For: 0.23.1


bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-0.23.1.jar 
teragen 100 /in_test
12/08/04 11:01:47 WARN conf.Configuration: fs.default.name is deprecated. 
Instead, use fs.defaultFS
12/08/04 11:01:47 WARN conf.Configuration: mapred.used.genericoptionsparser is 
deprecated. Instead, use mapreduce.client.genericoptionsparser.used
12/08/04 11:01:49 INFO terasort.TeraSort: Generating 100 using 2
12/08/04 11:01:50 INFO mapreduce.JobSubmitter: number of splits:2
12/08/04 11:01:52 INFO mapred.ResourceMgrDelegate: Submitted application 
application_1344047400780_0002 to ResourceManager at 
hadoop01/192.168.37.101:8032
12/08/04 11:01:52 INFO mapreduce.Job: The url to track the job: 
http://hadoop01:50030/proxy/application_1344047400780_0002/
12/08/04 11:01:52 INFO mapreduce.Job: Running job: job_1344047400780_0002
12/08/04 11:02:11 INFO mapreduce.Job: Job job_1344047400780_0002 running in 
uber mode : false
12/08/04 11:02:11 INFO mapreduce.Job:  map 0% reduce 0%
12/08/04 11:02:19 INFO mapreduce.Job: Task Id : 
attempt_1344047400780_0002_m_000000_0, Status : FAILED
12/08/04 11:02:20 WARN mapreduce.Job: Error reading task output Server returned 
HTTP response code: 400 for URL: 
http://hadoop03:8080/tasklog?plaintext=true&attemptid=attempt_1344047400780_0002_m_000000_0&filter=stdout
12/08/04 11:02:20 WARN mapreduce.Job: Error reading task output Server returned 
HTTP response code: 400 for URL: 
http://hadoop03:8080/tasklog?plaintext=true&attemptid=attempt_1344047400780_0002_m_000000_0&filter=stderr
12/08/04 11:02:25 INFO mapreduce.Job:  map 9% reduce 0%
12/08/04 11:02:30 INFO mapreduce.Job:  map 13% reduce 0%
12/08/04 11:02:33 INFO mapreduce.Job:  map 15% reduce 0%
12/08/04 11:02:40 INFO mapreduce.Job:  map 17% reduce 0%
12/08/04 11:02:46 INFO mapreduce.Job:  map 18% reduce 0%
12/08/04 11:02:52 INFO mapreduce.Job:  map 25% reduce 0%
12/08/04 11:02:56 INFO mapreduce.Job:  map 29% reduce 0%
12/08/04 11:03:01 INFO mapreduce.Job:  map 31% reduce 0%
12/08/04 11:03:08 INFO mapreduce.Job:  map 34% reduce 0%
12/08/04 11:03:11 INFO mapreduce.Job:  map 38% reduce 0%
12/08/04 11:03:14 INFO mapreduce.Job:  map 42% reduce 0%
12/08/04 11:03:15 INFO mapreduce.Job:  map 46% reduce 0%
12/08/04 11:03:17 INFO mapreduce.Job:  map 51% reduce 0%
12/08/04 11:03:18 INFO mapreduce.Job:  map 55% reduce 0%
12/08/04 11:03:20 INFO mapreduce.Job:  map 56% reduce 0%
12/08/04 11:03:24 INFO mapreduce.Job:  map 58% reduce 0%
12/08/04 11:03:25 INFO mapreduce.Job:  map 59% reduce 0%
12/08/04 11:03:26 INFO mapreduce.Job:  map 62% reduce 0%
12/08/04 11:03:28 INFO mapreduce.Job:  map 67% reduce 0%
12/08/04 11:03:29 INFO mapreduce.Job:  map 71% reduce 0%
12/08/04 11:03:32 INFO mapreduce.Job:  map 73% reduce 0%
12/08/04 11:03:33 INFO mapreduce.Job:  map 74% reduce 0%
12/08/04 11:03:35 INFO mapreduce.Job:  map 76% reduce 0%
12/08/04 11:03:36 INFO mapreduce.Job:  map 78% reduce 0%
12/08/04 11:03:38 INFO mapreduce.Job:  map 79% reduce 0%
12/08/04 11:03:39 INFO mapreduce.Job:  map 81% reduce 0%
12/08/04 11:03:41 INFO mapreduce.Job:  map 84% reduce 0%
12/08/04 11:03:44 INFO mapreduce.Job:  map 87% reduce 0%
12/08/04 11:03:48 INFO mapreduce.Job:  map 90% reduce 0%
12/08/04 11:03:51 INFO mapreduce.Job:  map 100% reduce 0%
12/08/04 11:03:52 INFO mapreduce.Job: Job job_1344047400780_0002 completed 
successfully
12/08/04 11:03:52 INFO mapreduce.Job: Counters: 28
File System Counters
FILE: Number of bytes read=240
FILE: Number of bytes written=118412
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=167
HDFS: Number of bytes written=1
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Job Counters 
Failed map tasks=1
Launched map tasks=3
Other local map tasks=3
Total time spent by all maps in occupied slots (ms)=193607
Map-Reduce Framework
Map input records=100
Map output records=100
Input split bytes=167
Spilled Records=0
Failed Shuffles=0
Merged