date:20120803

[
https://issues.apache.org/jira/browse/MAPREDUCE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Arun C Murthy updated MAPREDUCE-4393:
-

Status: Open (was: Patch Available)

Jaigak - I spent some more time thinking about this in light of MAPREDUCE-4495.

Unfortunately, it seems that we are running the risk of turning YARN into an
'umbrella' project by accepting applications built on top of YARN into the
project itself...

Essentially, as folks like Chris Mattmann have pointed out in MAPREDUCE-4495,
the PaaS prototype is better off being a standalone project in the Apache
Incubator, since the Apache Software Foundation frowns upon one 'umbrella'
project housing several smaller projects, i.e. YARN vis-a-vis PaaS, Workflow AM, etc.

If you are interested, I'm more than happy to help you through the Apache
Incubator process so that we can collaborate via the Incubator. Do you mind
doing that? Thanks!

PaaS on YARN: an YARN application to demonstrate that YARN can be used as a
PaaS

Key: MAPREDUCE-4393
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4393
Project: Hadoop Map/Reduce
Issue Type: Task
Components: examples
Affects Versions: 0.23.1
Reporter: Jaigak Song
Assignee: Jaigak Song
Fix For: 3.0.0

Attachments: HADOOPasPAAS_Architecture.pdf, MAPREDUCE-4393.patch,
MAPREDUCE-4393.patch, MAPREDUCE-4393.patch, MAPREDUCE4393.patch,
MAPREDUCE4393.patch

Original Estimate: 336h
Time Spent: 336h
Remaining Estimate: 0h

This application demonstrates that YARN can be used for non-MapReduce
applications. Since Hadoop is already widely adopted and deployed, and its
deployment is expected to keep growing, we think it has good potential to be
used as a PaaS.
I have implemented a proof of concept to demonstrate that YARN can be used as
a PaaS (Platform as a Service). I did a gap analysis against VMware's
Cloud Foundry and tried to cover as much PaaS functionality as possible
on YARN.
I'd like to check in this POC as a YARN example application.

[jira] [Commented] (MAPREDUCE-4393) PaaS on YARN: an YARN application to demonstrate that YARN can be used as a PaaS


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427854#comment-13427854
 ] 

Arun C Murthy commented on MAPREDUCE-4393:
--

Here is more information about proposing this via the incubator: 
http://incubator.apache.org/guides/proposal.html

I do apologize for not seeing the danger of this (i.e. turning YARN into an 
umbrella project) earlier - I'm willing to make up for it by helping you 
through the Incubator. However, it is something the ASF cares deeply about and 
is something I have to follow as part of the responsibility of the Hadoop PMC.

Again, apologies - but I do hope we can collaborate through the Incubator and 
my offer of help stands. Thanks!

 PaaS on YARN: an YARN application to demonstrate that YARN can be used as a 
 PaaS
 

 Key: MAPREDUCE-4393
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4393
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: examples
Affects Versions: 0.23.1
Reporter: Jaigak Song
Assignee: Jaigak Song
 Fix For: 3.0.0

 Attachments: HADOOPasPAAS_Architecture.pdf, MAPREDUCE-4393.patch, 
 MAPREDUCE-4393.patch, MAPREDUCE-4393.patch, MAPREDUCE4393.patch, 
 MAPREDUCE4393.patch

   Original Estimate: 336h
  Time Spent: 336h
  Remaining Estimate: 0h

 This application demonstrates that YARN can be used for non-MapReduce
 applications. Since Hadoop is already widely adopted and deployed, and its
 deployment is expected to keep growing, we think it has good potential to be
 used as a PaaS.
 I have implemented a proof of concept to demonstrate that YARN can be used as
 a PaaS (Platform as a Service). I did a gap analysis against VMware's
 Cloud Foundry and tried to cover as much PaaS functionality as possible
 on YARN.
 I'd like to check in this POC as a YARN example application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427856#comment-13427856
]

Arun C Murthy commented on MAPREDUCE-4495:
--

Santosh - I agree that both PaaS and Workflow-AM are similar.

I think both show that we could easily turn YARN into an umbrella project with
a proliferation of YARN applications.

Hence, I have re-considered my opinion on MAPREDUCE-4393 and asked them to go
the Incubator route too: http://s.apache.org/1K5

I look forward to collaborating on both projects in the Incubator. Thanks.

Workflow Application Master in YARN
---

Key: MAPREDUCE-4495
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

It is useful to have a workflow application master, which will be capable of
running a DAG of jobs. The workflow client submits a DAG request to the AM
and then the AM will manage the life cycle of this application in terms of
requesting the needed resources from the RM, and starting, monitoring and
retrying the application's individual tasks.
Compared to running Oozie with the current MapReduce Application Master,
these are some of the advantages:
- Fewer consumed resources, since only one application master will
be spawned for the whole workflow.
- Reuse of resources, since the same resources can be used by multiple
consecutive jobs in the workflow (no need to request/wait for resources for
every individual job from the central RM).
- More optimization opportunities in terms of collective resource requests.
- Optimization opportunities in terms of rewriting and composing jobs in the
workflow (e.g. pushing down Mappers).
- This Application Master can be reused/extended by higher-level systems like
Pig and Hive to provide an optimized way of running their workflows.
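
As a rough illustration of the "DAG of jobs" idea described above, the following sketch shows how a single application master could derive an execution order in which every job runs only after its prerequisites. It is not taken from the MAPREDUCE-4495 patch: the class `WorkflowDag`, its methods, and the job names are all hypothetical, and resource requests, monitoring and retries are left out entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not code from the MAPREDUCE-4495 patch.
class WorkflowDag {
    // job -> jobs that depend on it
    private final Map<String, List<String>> dependents = new LinkedHashMap<>();
    // job -> number of prerequisites
    private final Map<String, Integer> pending = new LinkedHashMap<>();

    void addJob(String job) {
        dependents.putIfAbsent(job, new ArrayList<>());
        pending.putIfAbsent(job, 0);
    }

    // Declare that 'job' may only start after 'prerequisite' has finished.
    void addDependency(String prerequisite, String job) {
        addJob(prerequisite);
        addJob(job);
        dependents.get(prerequisite).add(job);
        pending.merge(job, 1, Integer::sum);
    }

    // Kahn's algorithm: repeatedly pick jobs whose prerequisites are all done.
    List<String> executionOrder() {
        Map<String, Integer> remaining = new LinkedHashMap<>(pending);
        Deque<String> ready = new ArrayDeque<>();
        remaining.forEach((job, count) -> {
            if (count == 0) {
                ready.add(job);
            }
        });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.remove();
            order.add(job);
            for (String next : dependents.get(job)) {
                if (remaining.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != remaining.size()) {
            throw new IllegalStateException("workflow contains a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        WorkflowDag dag = new WorkflowDag();
        dag.addDependency("extract", "join");
        dag.addDependency("clean", "join");
        dag.addDependency("join", "report");
        System.out.println(dag.executionOrder());
    }
}
```

Because the whole graph is visible to one AM in this model, jobs that become ready together (here "extract" and "clean") could in principle share or reuse containers, which is the kind of optimization the list above refers to.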

[jira] [Commented] (MAPREDUCE-4068) Jars in lib subdirectory of the submittable JAR are not added to the classpath

2012-08-03 Thread Harsh J (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427876#comment-13427876
 ] 

Harsh J commented on MAPREDUCE-4068:


This is a major regression if it's true. Are there no tests covering this 
unpacking feature?

 Jars in lib subdirectory of the submittable JAR are not added to the classpath
 --

 Key: MAPREDUCE-4068
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4068
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1
Reporter: Ahmed Radwan
 Fix For: 0.23.2


 Prior to Hadoop 0.23, users could add third-party jars to the lib 
 subdirectory of the submitted job jar and they became available on the task's 
 classpath. I see this functionality was in TaskRunner.java, but I can't see 
 similar functionality in Hadoop 0.23 (neither in MapReduceChildJVM.java nor 
 elsewhere).
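
For readers unfamiliar with the convention, the sketch below lists the `lib/*.jar` entries nested inside a job jar, which is the set of jars the description expects to end up on the task classpath. It uses only `java.util.jar` from the JDK; the class `JobJarLibs` and its method are hypothetical and do not reproduce what TaskRunner.java actually did.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not Hadoop code.
class JobJarLibs {
    // Returns the names of entries directly under lib/ that end in .jar.
    static List<String> nestedLibJars(String jobJarPath) throws IOException {
        List<String> libs = new ArrayList<>();
        try (JarFile jar = new JarFile(jobJarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                // Accept lib/x.jar but not lib/sub/x.jar.
                boolean directlyUnderLib = name.startsWith("lib/")
                        && name.indexOf('/', "lib/".length()) < 0;
                if (!entry.isDirectory() && directlyUnderLib && name.endsWith(".jar")) {
                    libs.add(name);
                }
            }
        }
        return libs;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: JobJarLibs <job.jar>");
            return;
        }
        for (String lib : nestedLibJars(args[0])) {
            System.out.println(lib);
        }
    }
}
```

A regression test for the behaviour discussed in this issue could build a small job jar with one nested library and check that a class from that library is loadable inside a task.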

[jira] [Commented] (MAPREDUCE-4501) couldn't compile hadoop-2.0 successfully because of errors in build files

2012-08-03 Thread Yan Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427902#comment-13427902
 ] 

Yan Liu commented on MAPREDUCE-4501:


This error appeared after MAPREDUCE-4438 was merged, in the pom.xml for 
hadoop-yarn-applications. It is OK now in the current trunk version.

 couldn't compile hadoop-2.0 successfully because of errors in build files
 -

 Key: MAPREDUCE-4501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Yan Liu

 The version hadoop-yarn-applications relies on is 2.0.1-SNAPSHOT; however, the 
 commit makes it 3.0.0-SNAPSHOT. This makes the compile fail.

[jira] [Created] (MAPREDUCE-4511) Add IFile readahead

2012-08-03 Thread Ahmed Radwan (JIRA)

Ahmed Radwan created MAPREDUCE-4511:
---

 Summary: Add IFile readahead
 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan


This ticket is to add IFile readahead as part of HADOOP-7714.

[jira] [Commented] (MAPREDUCE-4511) Add IFile readahead

2012-08-03 Thread Ahmed Radwan (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427968#comment-13427968
 ] 

Ahmed Radwan commented on MAPREDUCE-4511:
-

Here is the updated branch-1 patch based on Todd's HADOOP-7714 patches. Note 
that this patch requires the HADOOP-7754 patch.

 Add IFile readahead
 ---

 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan

 This ticket is to add IFile readahead as part of HADOOP-7714.

[jira] [Updated] (MAPREDUCE-4511) Add IFile readahead

2012-08-03 Thread Ahmed Radwan (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4511:


Attachment: MAPREDUCE-4511_branch1.patch

 Add IFile readahead
 ---

 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4511_branch1.patch


 This ticket is to add IFile readahead as part of HADOOP-7714.

[jira] [Updated] (MAPREDUCE-4275) Plugable process tree


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4275:
---

Attachment: plugable-pstree-3.txt

Do not create the process tree instance from the resource calculator plugin. 
Keep them separate.

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204

[jira] [Commented] (MAPREDUCE-4275) Plugable process tree


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427979#comment-13427979
 ] 

Hadoop QA commented on MAPREDUCE-4275:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12539020/plugable-pstree-3.txt
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2702//console

This message is automatically generated.

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204

[jira] [Commented] (MAPREDUCE-4431) killing already completed job gives ambiguous message as Killed job job id

2012-08-03 Thread Devaraj K (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427989#comment-13427989
 ] 

Devaraj K commented on MAPREDUCE-4431:
--

Hi Harsh, can you have a look into the updated patch when you find some time?

 killing already completed job gives ambiguous message as Killed job job id
 --

 Key: MAPREDUCE-4431
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4431
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
Priority: Minor
 Attachments: MAPREDUCE-4431-1.patch, MAPREDUCE-4431.patch


 If we try to kill an already completed job with the following command, it 
 gives the ambiguous message "Killed job <job id>":
 ./mapred job -kill <already completed job id>

[jira] [Commented] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir

2012-08-03 Thread Devaraj K (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427990#comment-13427990
 ] 

Devaraj K commented on MAPREDUCE-3193:
--

Hi Harsh, can you have a look into the updated patch when you find some time?

 FileInputFormat doesn't read files recursively in the input path dir
 

 Key: MAPREDUCE-3193
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.0.2, 0.23.2, 2.0.0-alpha, 3.0.0
Reporter: Ramgopal N
Assignee: Devaraj K
 Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, 
 MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193.patch, 
 MAPREDUCE-3193.security.patch


 java.io.FileNotFoundException is thrown if the input file is more than one 
 folder level deep, and the job fails.
 Example: the input file is /r1/r2/input.txt
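
As a local-filesystem analogy only (this is not FileInputFormat code, and it makes no claim about the root cause inside Hadoop), the sketch below contrasts a one-level listing with a recursive one for a layout like /r1/r2/input.txt. The class `InputLister` and its method names are hypothetical; only `java.nio.file` from the JDK is used.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Local-filesystem analogy for illustration; not FileInputFormat.
class InputLister {
    // One level deep: a nested directory is returned as an entry,
    // and the files beneath it are never seen.
    static List<Path> listFlat(Path inputDir) throws IOException {
        try (Stream<Path> entries = Files.list(inputDir)) {
            return entries.sorted().collect(Collectors.toList());
        }
    }

    // Recursive: descend into subdirectories and keep only regular files.
    static List<Path> listRecursive(Path inputDir) throws IOException {
        try (Stream<Path> entries = Files.walk(inputDir)) {
            return entries.filter(Files::isRegularFile)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path r1 = Files.createTempDirectory("r1");
        Path r2 = Files.createDirectory(r1.resolve("r2"));
        Files.createFile(r2.resolve("input.txt"));
        System.out.println("flat:      " + listFlat(r1));
        System.out.println("recursive: " + listRecursive(r1));
    }
}
```

With the flat listing, a consumer that treats every returned entry as a readable file would be handed the directory r2 instead of input.txt, which is the kind of mismatch a recursive listing avoids.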

[jira] [Commented] (MAPREDUCE-4507) IdentityMapper is being triggered when the type of the Input Key at class level and method level has a conflict

2012-08-03 Thread Harsh J (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427993#comment-13427993
 ] 

Harsh J commented on MAPREDUCE-4507:


The {{map()}} function must be properly overridden when using the new API. 
Using @Override annotations on map() (and, for that matter, reduce() too) will 
help you catch your mistake here.

As discussed on http://search-hadoop.com/m/hSxqz1vsQPc, this is a user-side 
mistake, but in no way a bug. See 
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Mapper.html#map(KEYIN,%20VALUEIN,%20org.apache.hadoop.mapreduce.Mapper.Context).

We can add a javadoc improvement (and a tutorial improvement) that states how 
to avoid this issue: always use @Override annotations when overriding 
methods. (Any IDE today provides support for this.)
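
The pitfall can be reproduced without Hadoop. The sketch below uses a hypothetical stand-in base class, `BaseMapper`, whose default `map()` simply passes the value through; it is not the real org.apache.hadoop.mapreduce.Mapper, and all class names here are made up for illustration. A subclass whose `map()` parameter type disagrees with the class-level key type declares an overload rather than an override, so the compiler accepts it and the base implementation keeps running.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a framework mapper base class; NOT the real Hadoop Mapper.
class BaseMapper<KEYIN, VALUEIN> {
    // Default behaviour: identity, pass the value through unchanged.
    protected void map(KEYIN key, VALUEIN value, List<String> out) {
        out.add("identity:" + value);
    }

    // The "framework" only ever calls map() through this entry point.
    final List<String> run(KEYIN key, VALUEIN value) {
        List<String> out = new ArrayList<>();
        map(key, value, out);
        return out;
    }
}

// Class-level key type is Integer, but the method takes Long: this is an
// overload, not an override, so it compiles and is never called by run().
class MismatchedMapper extends BaseMapper<Integer, String> {
    protected void map(Long key, String value, List<String> out) {
        out.add("custom:" + value);
    }
}

// Types agree, so @Override compiles and the custom logic is used.
// Adding @Override to MismatchedMapper.map() would be a compile-time error.
class CorrectMapper extends BaseMapper<Long, String> {
    @Override
    protected void map(Long key, String value, List<String> out) {
        out.add("custom:" + value);
    }
}

class OverrideDemo {
    public static void main(String[] args) {
        System.out.println(new MismatchedMapper().run(1, "a")); // identity path
        System.out.println(new CorrectMapper().run(1L, "a"));   // custom path
    }
}
```

This is why the annotation is worth recommending in the javadoc: it turns a silent fallback to the default mapper into an error at compile time.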

 IdentityMapper is being triggered when the type of the Input Key at class 
 level and method level has a conflict
 ---

 Key: MAPREDUCE-4507
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4507
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.0.3
 Environment: linux ubuntu
Reporter: Bejoy KS

 If we use the default InputFormat (TextInputFormat) but specify the key type 
 in the mapper as IntWritable instead of LongWritable, the framework is 
 supposed to throw a class cast exception. Such an exception is thrown only if 
 the key types at class level and method level are the same (IntWritable). But 
 if we provide the input key type as IntWritable at the class level but 
 LongWritable at the method level (map method), the code compiles fine instead 
 of producing a compile-time error. In addition, on execution the framework 
 triggers IdentityMapper instead of the custom mapper provided with the 
 configuration. In this case 'mapreduce.map.class' in job.xml still shows the 
 custom mapper; it should show IdentityMapper whenever IdentityMapper is 
 triggered, to avoid confusion and ease debugging.

[jira] [Updated] (MAPREDUCE-4275) Plugable process tree


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4275:
---

Attachment: plugable-pstree-4.txt

check if ProcessTree is available before enabling monitoring

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204

[jira] [Commented] (MAPREDUCE-4275) Plugable process tree


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427998#comment-13427998
 ] 

Hadoop QA commented on MAPREDUCE-4275:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12539022/plugable-pstree-4.txt
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2703//console

This message is automatically generated.

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204

[jira] [Updated] (MAPREDUCE-4275) Plugable process tree


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4275:
---

Attachment: plugable-pstree-4-with-whitespace.txt

now without removed whitespace lines

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4-with-whitespace.txt, 
 plugable-pstree-4.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204

[jira] [Commented] (MAPREDUCE-4275) Plugable process tree

[
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428037#comment-13428037
]

Hadoop QA commented on MAPREDUCE-4275:
--

-1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12539036/plugable-pstree-4-with-whitespace.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 2 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests in
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor
org.apache.hadoop.yarn.server.nodemanager.TestEventFlow
org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2704//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2704//console

This message is automatically generated.

Plugable process tree
-

Key: MAPREDUCE-4275
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: nodemanager
Affects Versions: 3.0.0
Environment: FreeBSD 64 bit
Reporter: Radim Kolar
Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt,
plugable-pstree-3.txt, plugable-pstree-4-with-whitespace.txt,
plugable-pstree-4.txt, plugable-pstree.txt

Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204

[jira] [Commented] (MAPREDUCE-3289) Make use of fadvise in the NM's shuffle handler


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428052#comment-13428052
 ] 

Hudson commented on MAPREDUCE-3289:
---

Integrated in Hadoop-Hdfs-trunk #1124 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1124/])
MAPREDUCE-3289. Make use of fadvise in the NM's shuffle handler. 
(Contributed by Todd Lipcon and Siddharth Seth) (Revision 1368718)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368718
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedChunkedFile.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedFileRegion.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java


 Make use of fadvise in the NM's shuffle handler
 ---

 Key: MAPREDUCE-3289
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3289
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2, nodemanager, performance
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 1.2.0, 2.2.0-alpha

 Attachments: 3289-1.txt, 3289-2.txt, MAPREDUCE-3289.branch-1.patch, 
 MAPREDUCE-3289.branch-1.patch, MR3289_trunk.txt, MR3289_trunk_2.txt, 
 MR3289_trunk_3.txt, mr-3289.txt


 Using the new NativeIO fadvise functions, we can make the NodeManager 
 prefetch map output before it's sent over the socket, and drop it out of the 
 fs cache once it's been sent (since it's very rare for an output to have to 
 be re-sent). This improves IO efficiency and reduces cache pollution.

[jira] [Updated] (MAPREDUCE-4275) Plugable process tree


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4275:
---

Attachment: plugable-pstree-5-with-whitespace.txt

avoid null pointer dereference in init()

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4-with-whitespace.txt, 
 plugable-pstree-4.txt, plugable-pstree-5-with-whitespace.txt, 
 plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204

[jira] [Commented] (MAPREDUCE-4275) Plugable process tree

[
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428094#comment-13428094
]

Hadoop QA commented on MAPREDUCE-4275:
--

+1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12539050/plugable-pstree-5-with-whitespace.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 2 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2705//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2705//console

This message is automatically generated.

Plugable process tree
-

Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204

[jira] [Commented] (MAPREDUCE-3289) Make use of fadvise in the NM's shuffle handler


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428100#comment-13428100
 ] 

Hudson commented on MAPREDUCE-3289:
---

Integrated in Hadoop-Mapreduce-trunk #1156 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1156/])
MAPREDUCE-3289. Make use of fadvise in the NM's shuffle handler. 
(Contributed by Todd Lipcon and Siddharth Seth) (Revision 1368718)

 Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1368718
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedChunkedFile.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedFileRegion.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java


 Make use of fadvise in the NM's shuffle handler
 ---

 Key: MAPREDUCE-3289
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3289
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2, nodemanager, performance
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 1.2.0, 2.2.0-alpha

 Attachments: 3289-1.txt, 3289-2.txt, MAPREDUCE-3289.branch-1.patch, 
 MAPREDUCE-3289.branch-1.patch, MR3289_trunk.txt, MR3289_trunk_2.txt, 
 MR3289_trunk_3.txt, mr-3289.txt


 Using the new NativeIO fadvise functions, we can make the NodeManager 
 prefetch map output before it's sent over the socket, and drop it out of the 
 fs cache once it's been sent (since it's very rare for an output to have to 
 be re-sent). This improves IO efficiency and reduces cache pollution.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4488) Port MAPREDUCE-463 (The job setup and cleanup tasks should be optional) to branch-1

2012-08-03 Thread Tom White (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428188#comment-13428188
 ] 

Tom White commented on MAPREDUCE-4488:
--

Alejandro - the code is from MAPREDUCE-463. Can I make the changes you suggest 
in another JIRA so that branches 1 and 2 are kept the same?

 Port MAPREDUCE-463 (The job setup and cleanup tasks should be optional) to 
 branch-1
 ---

 Key: MAPREDUCE-4488
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4488
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, performance
Affects Versions: 1.0.3
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-4488.patch





[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Alejandro Abdelnur (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428194#comment-13428194
]

Alejandro Abdelnur commented on MAPREDUCE-4495:
---

I don't think PaaS and Workflow-AM are similar.

Workflow-AM aims to provide an AM that can run multiple MR jobs and do
intra-AM processing, all from the same AM. This would be enough for projects
that typically run multiple MR jobs as a single unit of processing, like
Pig/Hive/Sqoop/Oozie. Workflow-AM will need to tap into the MapReduce AM
private classes, as the intention is to fully leverage what has been done
already. It will most likely also require changes in the MapReduce AM, such as
making it thread-safe and multi-MR-job safe (which I believe is not the case
today).

Because of this, I think that it belongs in MapReduce. Having it outside, at
least during its inception, would make its development much more difficult.

That said, once we finalize the initial implementation I have no objection,
quite the opposite, to seeing how it can be generalized and moved out.

Workflow Application Master in YARN
---

Key: MAPREDUCE-4495
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

[jira] [Commented] (MAPREDUCE-4488) Port MAPREDUCE-463 (The job setup and cleanup tasks should be optional) to branch-1

2012-08-03 Thread Alejandro Abdelnur (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428195#comment-13428195
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4488:
---

+1

 Port MAPREDUCE-463 (The job setup and cleanup tasks should be optional) to 
 branch-1
 ---

 Key: MAPREDUCE-4488
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4488
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, performance
Affects Versions: 1.0.3
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-4488.patch





[jira] [Created] (MAPREDUCE-4512) TextInputFormat delimiter bug:- Input Text portion ends with Delimiter starts with same char/char sequence

2012-08-03 Thread Gelesh (JIRA)

Gelesh created MAPREDUCE-4512:
-

 Summary: TextInputFormat delimiter  bug:- Input Text portion ends 
with  Delimiter starts with same char/char sequence
 Key: MAPREDUCE-4512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4512
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/mumak, mr-am, mrv1, mrv2, task
Affects Versions: 2.0.0-alpha
 Environment: Linux
Reporter: Gelesh
 Fix For: 0.20.204.0


TextInputFormat delimiter bug scenario: the input text contains a character 
sequence whose first character matches the first character of the delimiter, 
and the remaining characters match the entire delimiter from its starting 
position.

e.g. delimiter =record;
and Text = record 1:- name = Gelesh e mail = gelesh.had...@gmail.com Location 
Bangalore record 2: name = sdf  ..  location =Bangalorrecord 3: name  

Here the string =Bangalorrecord 3: satisfies two conditions: 
1) It contains the delimiter record.
2) The character (or character sequence) immediately before the delimiter 
(i.e. 'r') matches the first character (or character sequence) of the 
delimiter; that is, =Bangalor ends with, and the delimiter starts with, the 
same character 'r'.

Here the delimiter is skipped.
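
One way such a skip can arise is sketched below. This is an illustration
only, not the LineReader code or the attached patch: DelimiterScanSketch,
splitNaive and splitReference are hypothetical names, and both methods
assume a non-empty delimiter. The naive scan drops its partial match on a
mismatch without comparing the current character against the start of the
delimiter, so the second 'r' in Bangalorrecord is lost.

```java
import java.util.ArrayList;
import java.util.List;

public class DelimiterScanSketch {

    /**
     * Naive streaming scan. On a mismatch the partial match is discarded and
     * the current character is never compared with the start of the delimiter.
     */
    public static List<String> splitNaive(String text, String delim) {
        List<String> records = new ArrayList<>();
        int start = 0;    // index where the current record begins
        int matched = 0;  // delimiter characters matched so far
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delim.charAt(matched)) {
                matched++;
                if (matched == delim.length()) {
                    records.add(text.substring(start, i + 1 - matched));
                    start = i + 1;
                    matched = 0;
                }
            } else {
                matched = 0;  // text.charAt(i) may itself start the delimiter
            }
        }
        records.add(text.substring(start));
        return records;
    }

    /** Reference split that finds every occurrence via String.indexOf. */
    public static List<String> splitReference(String text, String delim) {
        List<String> records = new ArrayList<>();
        int start = 0;
        int hit;
        while ((hit = text.indexOf(delim, start)) >= 0) {
            records.add(text.substring(start, hit));
            start = hit + delim.length();
        }
        records.add(text.substring(start));
        return records;
    }

    public static void main(String[] args) {
        String text = "location =Bangalorrecord 3: name";
        System.out.println(splitNaive(text, "record").size());
        System.out.println(splitReference(text, "record").size());
    }
}
```

On the sample text the naive scan returns a single record while the
reference split returns two. Re-checking only the current character after a
mismatch would handle this input, but a delimiter with longer self-overlaps
needs a proper fallback (for example a KMP-style failure table).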


[jira] [Updated] (MAPREDUCE-4512) TextInputFormat delimiter bug:- Input Text portion ends with Delimiter starts with same char/char sequence

2012-08-03 Thread Gelesh (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gelesh updated MAPREDUCE-4512:
--

Status: Patch Available  (was: Open)

A one-line code change in LineReader would do. Tested.
If there are any issues, please let me know so that I can help further.
gelesh.had...@gmail.com

 TextInputFormat delimiter  bug:- Input Text portion ends with  Delimiter 
 starts with same char/char sequence
 -

 Key: MAPREDUCE-4512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4512
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/mumak, mr-am, mrv1, mrv2, task
Affects Versions: 2.0.0-alpha
 Environment: Linux
Reporter: Gelesh
  Labels: patch
 Fix For: 0.20.204.0

   Original Estimate: 1m
  Remaining Estimate: 1m

 TextInputFormat delimiter bug scenario: the input text contains a character 
 sequence whose first character matches the first character of the delimiter, 
 and the remaining characters match the entire delimiter from its starting 
 position.
 e.g. delimiter =record;
 and Text = record 1:- name = Gelesh e mail = gelesh.had...@gmail.com 
 Location Bangalore record 2: name = sdf  ..  location =Bangalorrecord 3: name 
  
 Here the string =Bangalorrecord 3: satisfies two conditions: 
 1) It contains the delimiter record.
 2) The character (or character sequence) immediately before the delimiter 
 (i.e. 'r') matches the first character (or character sequence) of the 
 delimiter; that is, =Bangalor ends with, and the delimiter starts with, the 
 same character 'r'.
 Here the delimiter is skipped.


[jira] [Updated] (MAPREDUCE-4512) TextInputFormat delimiter bug:- Input Text portion ends with Delimiter starts with same char/char sequence

2012-08-03 Thread Gelesh (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gelesh updated MAPREDUCE-4512:
--

Attachment: MAPREDUCE-4512.txt

Just a one-line code change at LineRecord. Tested. In case there is any issue, 
please mail me at gelesh.had...@gmail.com

 TextInputFormat delimiter  bug:- Input Text portion ends with  Delimiter 
 starts with same char/char sequence
 -

 Key: MAPREDUCE-4512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4512
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/mumak, mr-am, mrv1, mrv2, task
Affects Versions: 2.0.0-alpha
 Environment: Linux
Reporter: Gelesh
  Labels: patch
 Fix For: 0.20.204.0

 Attachments: MAPREDUCE-4512.txt

   Original Estimate: 1m
  Remaining Estimate: 1m

 TextInputFormat delimiter bug scenario: the input text contains a character 
 sequence whose first character matches the first character of the delimiter, 
 and the remaining characters match the entire delimiter from its starting 
 position.
 e.g. delimiter =record;
 and Text = record 1:- name = Gelesh e mail = gelesh.had...@gmail.com 
 Location Bangalore record 2: name = sdf  ..  location =Bangalorrecord 3: name 
  
 Here the string =Bangalorrecord 3: satisfies two conditions: 
 1) It contains the delimiter record.
 2) The character (or character sequence) immediately before the delimiter 
 (i.e. 'r') matches the first character (or character sequence) of the 
 delimiter; that is, =Bangalor ends with, and the delimiter starts with, the 
 same character 'r'.
 Here the delimiter is skipped.


[jira] [Updated] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

2012-08-03 Thread Tsuyoshi OZAWA (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-3902:
--

Attachment: MAPREDUCE-3902.2.patch

As a first step, I fixed Arun's patch so that it compiles against the current 
source code.

 MR AM should reuse containers for map tasks, there-by allowing fine-grained 
 control on num-maps for users without need for CombineFileInputFormat etc.
 --

 Key: MAPREDUCE-3902
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Siddharth Seth
 Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch


 The MR AM is now in a great position to reuse containers across (map) tasks. 
 This is something similar to JVM re-use we had in 0.20.x, but in a 
 significantly better manner:
 # Consider data-locality when re-using containers
 # Consider the new shuffle - ensure that reduces fetch output of the whole 
 container at once (i.e. all maps) 


[jira] [Commented] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

2012-08-03 Thread Tsuyoshi OZAWA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428211#comment-13428211
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-3902:
---

IMHO, the 2nd topic (combining per container) should be moved, because the 
change seems to be too big.
If there is no counter-opinion, I'm going to create a new ticket to deal with 
the 2nd topic as a sub-task of MAPREDUCE-3902.

 MR AM should reuse containers for map tasks, there-by allowing fine-grained 
 control on num-maps for users without need for CombineFileInputFormat etc.
 --

 Key: MAPREDUCE-3902
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Siddharth Seth
 Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch


 The MR AM is now in a great position to reuse containers across (map) tasks. 
 This is something similar to JVM re-use we had in 0.20.x, but in a 
 significantly better manner:
 # Consider data-locality when re-using containers
 # Consider the new shuffle - ensure that reduces fetch output of the whole 
 container at once (i.e. all maps) 


[jira] [Commented] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

2012-08-03 Thread Tsuyoshi OZAWA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428212#comment-13428212
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-3902:
---

s/should be moved/should be moved to the new ticket/

 MR AM should reuse containers for map tasks, there-by allowing fine-grained 
 control on num-maps for users without need for CombineFileInputFormat etc.
 --

 Key: MAPREDUCE-3902
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Siddharth Seth
 Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch


 The MR AM is now in a great position to reuse containers across (map) tasks. 
 This is something similar to JVM re-use we had in 0.20.x, but in a 
 significantly better manner:
 # Consider data-locality when re-using containers
 # Consider the new shuffle - ensure that reduces fetch output of the whole 
 container at once (i.e. all maps) 


[jira] [Commented] (MAPREDUCE-4512) TextInputFormat delimiter bug:- Input Text portion ends with Delimiter starts with same char/char sequence

[
https://issues.apache.org/jira/browse/MAPREDUCE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428214#comment-13428214
]

Hadoop QA commented on MAPREDUCE-4512:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12539059/MAPREDUCE-4512.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified
tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-common-project/hadoop-common.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2706//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2706//console

This message is automatically generated.

TextInputFormat delimiter bug:- Input Text portion ends with Delimiter
starts with same char/char sequence
-

Key: MAPREDUCE-4512
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4512
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/mumak, mr-am, mrv1, mrv2, task
Affects Versions: 2.0.0-alpha
Environment: Linux
Reporter: Gelesh
Labels: patch
Fix For: 0.20.204.0

Attachments: MAPREDUCE-4512.txt

Original Estimate: 1m
Remaining Estimate: 1m

TextInputFormat delimiter bug scenario: the input text contains a character
sequence whose first character matches the first character of the delimiter,
and the remaining characters match the entire delimiter from its starting
position.
e.g. delimiter =record;
and Text = record 1:- name = Gelesh e mail = gelesh.had...@gmail.com
Location Bangalore record 2: name = sdf .. location =Bangalorrecord 3: name

Here the string =Bangalorrecord 3: satisfies two conditions:
1) It contains the delimiter record.
2) The character (or character sequence) immediately before the delimiter
(i.e. 'r') matches the first character (or character sequence) of the
delimiter; that is, =Bangalor ends with, and the delimiter starts with, the
same character 'r'.
Here the delimiter is skipped.

[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Bo Wang (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428237#comment-13428237
]

Bo Wang commented on MAPREDUCE-4495:

I agree with Alejandro. The goals of workflow-AM are beyond job scheduling and
include local resource management and optimization. These goals require a tight
interaction of workflow AM and MR AM. It can be regarded as an extension to MR
AM. I noticed MAPREDUCE-3902 on reusing containers in MR AM. Workflow AM can
reuse containers across jobs, which is a more general case.

Workflow Application Master in YARN
---

Key: MAPREDUCE-4495
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

[jira] [Resolved] (MAPREDUCE-3600) Add Minimal Fair Scheduler to MR2


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell resolved MAPREDUCE-3600.


Resolution: Fixed

Fixed by parent ticket.

 Add Minimal Fair Scheduler to MR2
 -

 Key: MAPREDUCE-3600
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3600
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, scheduler
Reporter: Patrick Wendell
Assignee: Patrick Wendell
 Attachments: MAPREDUCE-3600.v1.patch, MAPREDUCE-3600.v2.patch


 This covers the addition of the Fair Scheduler to the MR2 infrastructure. 
 This patch will represent the minimum functional FairScheduler in MR2. It 
 will be limited to a configuration file reader, functionality to calculate 
 fair shares, and hooks into the actual MR2 scheduling code. 
 It will not include delay scheduling, preemption, or a web UI, which will be 
 handled in separate JIRAs. 


[jira] [Resolved] (MAPREDUCE-3602) Add Preemption to MR2 Fair Scheduler


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell resolved MAPREDUCE-3602.


Resolution: Fixed

Solved with parent ticket.

 Add Preemption to MR2 Fair Scheduler
 

 Key: MAPREDUCE-3602
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3602
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: scheduler
Reporter: Patrick Wendell
Assignee: Patrick Wendell
 Attachments: MAPREDUCE-3602.v1.patch





[jira] [Resolved] (MAPREDUCE-3601) Add Delay Scheduling to MR2 Fair Scheduler


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell resolved MAPREDUCE-3601.


Resolution: Fixed

Fixed with parent ticket.

 Add Delay Scheduling to MR2 Fair Scheduler
 --

 Key: MAPREDUCE-3601
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3601
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: scheduler
Reporter: Patrick Wendell
Assignee: Patrick Wendell
 Attachments: MAPREDUCE-3601.v1.patch


 JIRA for delay scheduling component.


[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428257#comment-13428257
]

Arun C Murthy commented on MAPREDUCE-4495:
--

Alejandro, making MR AM thread-safe is a good goal. We can do that
independently of the new AM. I have opened MAPREDUCE-4513 for the same.

I don't know which other 'private' classes you need - frankly that concerns
me. It means you are adding new requirements on MR-AM, which isn't an 'api'
of that nature.

Also, if we are going that route I strongly suggest we do not import code from
Oozie and merely take JobControl api and support it. That should be a trivial
exercise without adding any new 'interfaces' to MapReduce.

So, I see two options:
# Enhance JobControl api to work in AM by making MR-AM, specifically
MRAppMaster, thread-safe. This will allow for multiple objects of MRAppMaster
to be created.
This means there are no new interfaces to MapReduce.
# Go the full distance, make it generic, import code from Oozie, come up with a
new set of interfaces etc. etc. and do it in a separate Incubator project.

As I indicated previously, my preference is option #2 and I have already
offered help to deal with the specifics so you and Bo can concentrate on
getting the code out.

Thoughts?

Workflow Application Master in YARN
---

Key: MAPREDUCE-4495
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

[jira] [Created] (MAPREDUCE-4513) Make MR AM thread-safe

Arun C Murthy created MAPREDUCE-4513:


 Summary: Make MR AM thread-safe
 Key: MAPREDUCE-4513
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4513
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Arun C Murthy


Currently MR-AM has a bunch of statics making it thread unsafe. We should fix 
that.
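
The kind of problem static state causes can be illustrated with a small
sketch. The classes below (StaticStateSketch, StaticMaster, InstanceMaster)
are hypothetical and are not taken from MRAppMaster; they only show why
several masters cannot safely share one JVM, on one thread or many, while
their state lives in static fields.

```java
public class StaticStateSketch {

    /** Hypothetical master that keeps per-job state in a static field. */
    public static class StaticMaster {
        private static String currentJob;  // shared by every instance in the JVM

        public StaticMaster(String jobId) {
            currentJob = jobId;
        }

        public String jobId() {
            return currentJob;
        }
    }

    /** The same master with the state moved into the instance. */
    public static class InstanceMaster {
        private final String currentJob;

        public InstanceMaster(String jobId) {
            this.currentJob = jobId;
        }

        public String jobId() {
            return currentJob;
        }
    }

    public static void main(String[] args) {
        StaticMaster a = new StaticMaster("job_1");
        StaticMaster b = new StaticMaster("job_2");
        // Creating b overwrote the static field that a also reads.
        System.out.println(a.jobId().equals(b.jobId()));

        InstanceMaster c = new InstanceMaster("job_1");
        InstanceMaster d = new InstanceMaster("job_2");
        // Each instance keeps its own job id.
        System.out.println(c.jobId().equals(d.jobId()));
    }
}
```

Moving such fields into the instance is only a first step; state that is
still shared between threads would need its own synchronization.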


[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Owen O'Malley (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428270#comment-13428270
]

Owen O'Malley commented on MAPREDUCE-4495:
--

The Hadoop project has gone down the path of having large contrib components
before and it created substantial difficulties for the Hadoop community. Hadoop
should be about creating a platform for other projects to build on rather than
bundling all components within itself. Since many of the people interested in
working on this are in the Oozie project, it might make sense to host it there.
Otherwise incubator would be a great place to go while you build the project
and community.

Any work that you can do to help YARN become a better platform is appreciated,
but I expect there to be a lot of YARN-based frameworks and they will all need
to be managed from outside of Hadoop.

Workflow Application Master in YARN
---

Key: MAPREDUCE-4495
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428272#comment-13428272
]

Arun C Murthy commented on MAPREDUCE-4495:
--

{quote}
So, I see two options:
# Enhance JobControl api to work in AM by making MR-AM, specifically
MRAppMaster, thread-safe. This will allow for multiple objects of MRAppMaster
to be created.
This means there are no new interfaces to MapReduce.
# Go the full distance, make it generic, import code from oozie, come up with a
new set of interfaces for generic DAG mgmt infrastructure etc. etc. and do it
in a separate Incubator project.
{quote}

I think this is coming to a point where we are arguing too much in the
abstract. Frankly, this is really not how I want to spend my time.

Maybe we can wait for a detailed proposal from Bo or Alejandro and then revisit
this discussion. I believe I have laid my thoughts out clearly with respect to
the options etc. Let's discuss when we actually have something concrete (design
or code).

OTOH, if we can agree on the Incubator proposal I'm happy to do the legwork for
Alejandro right away. At least that is tractable and not merely abstract.

Workflow Application Master in YARN
---

Key: MAPREDUCE-4495
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

[jira] [Commented] (MAPREDUCE-4431) killing already completed job gives ambiguous message as Killed job job id

2012-08-03 Thread Mayank Bansal (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428273#comment-13428273
 ] 

Mayank Bansal commented on MAPREDUCE-4431:
--

+1 Looks good.

Thanks,
Mayank

 killing already completed job gives ambiguous message as Killed job job id
 --

 Key: MAPREDUCE-4431
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4431
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
Priority: Minor
 Attachments: MAPREDUCE-4431-1.patch, MAPREDUCE-4431.patch


 If we try to kill an already completed job with the following command, it 
 gives the ambiguous message "Killed job <job id>":
 ./mapred job -kill <already completed job id>


[jira] [Commented] (MAPREDUCE-4275) Plugable process tree

2012-08-03 Thread Bikas Saha (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428279#comment-13428279
 ] 

Bikas Saha commented on MAPREDUCE-4275:
---

Thanks for incorporating my comments. +1.

Minor typo in unavailable
{code}
+if (resourceCalculatorPlugin == null) {
+LOG.info("ResourceCalculatorPlugin is unavaiable on this system. "
++ this.getClass().getName() + " is disabled.");
+return false;
+}
+if (ResourceCalculatorProcessTree.getResourceCalculatorProcessTree(0, 
processTreeClass, conf) == null) {
+LOG.info("ResourceCalculatorProcessTree is unavaiable on this system. "
++ this.getClass().getName() + " is disabled.");
{code}

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4-with-whitespace.txt, 
 plugable-pstree-4.txt, plugable-pstree-5-with-whitespace.txt, 
 plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204


[jira] [Commented] (MAPREDUCE-4508) YARN needs to properly check the NM,AM memory properties in yarn-site.xml and mapred.xml and report errors accordingly.

2012-08-03 Thread Anil Gupta (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428280#comment-13428280
 ] 

Anil Gupta commented on MAPREDUCE-4508:
---

Hi Hitesh,

If you think that MAPREDUCE-3796 will cover the test case of checking that 
yarn.nodemanager.resource.memory-mb < yarn.app.mapreduce.am.resource.mb and 
taking appropriate action accordingly, then you can close this as a dup of 
MAPREDUCE-3796.

Thanks,
Anil Gupta

 YARN needs to properly check the NM,AM memory properties in yarn-site.xml and 
 mapred.xml and report errors accordingly.
 ---

 Key: MAPREDUCE-4508
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4508
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.0-alpha
 Environment: CentOs6.0, Hadoop2.0.0 Alpha
Reporter: Anil Gupta
  Labels: Map, Reduce, YARN

 Please refer to this discussion on the Hadoop Mailing list:
 http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/33110
 Summary:
 I was running YARN (Hadoop 2.0.0 Alpha) on an 8-datanode, 4-admin-node 
 Hadoop/HBase cluster. My datanodes had only 3.2 GB of memory, so I 
 configured the yarn.nodemanager.resource.memory-mb property in yarn-site.xml 
 to 1200. After setting the property, if I run any YARN job the 
 NodeManager won't be able to start any Map task, since by default the 
 yarn.app.mapreduce.am.resource.mb property is set to 1500 MB in 
 mapred-site.xml. 
 Expected Behavior: NodeManager should give an error if 
 yarn.app.mapreduce.am.resource.mb >= yarn.nodemanager.resource.memory-mb.
 Please let me know if more information is required.
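
 A minimal sketch of the requested check follows. It is not YARN code:
 MemoryConfigCheckSketch and check are hypothetical names, the configuration
 is modelled as a plain java.util.Properties object, and both properties are
 assumed to be set explicitly. The sketch rejects only the case hit in this
 report, where the AM request is larger than the NodeManager's total memory.

```java
import java.util.Properties;

public class MemoryConfigCheckSketch {

    public static final String NM_MEMORY = "yarn.nodemanager.resource.memory-mb";
    public static final String AM_MEMORY = "yarn.app.mapreduce.am.resource.mb";

    /**
     * Returns an error message, or null when the AM request fits on a node.
     * Assumes both properties are present and hold integer values.
     */
    public static String check(Properties conf) {
        int nmMb = Integer.parseInt(conf.getProperty(NM_MEMORY));
        int amMb = Integer.parseInt(conf.getProperty(AM_MEMORY));
        if (amMb > nmMb) {
            return AM_MEMORY + "=" + amMb + " exceeds " + NM_MEMORY + "=" + nmMb
                + "; no NodeManager can host the ApplicationMaster";
        }
        return null;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(NM_MEMORY, "1200");  // value from the report
        conf.setProperty(AM_MEMORY, "1500");  // value from the report
        System.out.println(check(conf));
    }
}
```

 Reporting this at startup or job submission would surface the mismatch
 instead of leaving the job waiting for a container that can never be
 allocated.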


[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428293#comment-13428293
]

Patrick Wendell commented on MAPREDUCE-4495:

Just caught up with this - there are several issues being debated here
simultaneously.

It is really pointless to start arguing about them until we have a clear and
thorough design doc along with a preliminary discussion of technical merit.
This description needs a lot more color given the scope of the proposal.

I agree with Arun - we should wait until that happens to continue discussion.

Workflow Application Master in YARN
---

Key: MAPREDUCE-4495
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread Alejandro Abdelnur (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428297#comment-13428297
]

Alejandro Abdelnur commented on MAPREDUCE-4495:
---

bq. Maybe we can wait for a detailed proposal from Bo or Alejandro and then
revisit this discussion. I believe I have laid my thoughts out clearly with
respect to the options etc. Let's discuss when we actually have something
concrete (design or code).

Sounds like a plan.

Workflow Application Master in YARN
---

Key: MAPREDUCE-4495
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

[jira] [Commented] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles

2012-08-03 Thread Jonathan Eagles (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428352#comment-13428352
 ] 

Jonathan Eagles commented on MAPREDUCE-4503:


+1

 Should throw InvalidJobConfException if duplicates found in cacheArchives or 
 cacheFiles
 ---

 Key: MAPREDUCE-4503
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4503
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4503.txt, MR-4503.txt


 In 1.0, if a file was in both a job's cache archives and cache files, an 
 InvalidJobConfException was thrown.  We should replicate this behavior on 
 mrv2.  We should also extend it so that if a cache archive or cache file is 
 not going to be downloaded at all because of conflicts in the names of the 
 symlinks, a similar exception is thrown.
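
 A minimal sketch of the kind of check described above, under stated 
 assumptions: the class CacheConflictCheck is hypothetical, 
 IllegalArgumentException stands in for InvalidJobConfException so the 
 example needs no Hadoop classes, and the symlink name is taken to be the URI 
 fragment if present, otherwise the last path component. It is not the code 
 in the attached patch.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not the MRApps implementation. */
final class CacheConflictCheck {
    /** Link name: the URI fragment if present, otherwise the last path component. */
    static String linkName(URI uri) {
        if (uri.getFragment() != null) {
            return uri.getFragment();
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Throws if any two entries across files and archives share a link name. */
    static void check(List<URI> cacheFiles, List<URI> cacheArchives) {
        List<URI> all = new ArrayList<>(cacheFiles);
        all.addAll(cacheArchives);
        Map<String, URI> seen = new HashMap<>();
        for (URI uri : all) {
            URI previous = seen.putIfAbsent(linkName(uri), uri);
            if (previous != null) {
                // Stand-in for InvalidJobConfException in this sketch.
                throw new IllegalArgumentException("Distributed cache entries "
                    + previous + " and " + uri + " conflict on link name "
                    + linkName(uri));
            }
        }
    }
}
```

 Failing at submission time in this way makes the conflict visible to the 
 user, rather than silently skipping the download of one of the entries.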


[jira] [Updated] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles

2012-08-03 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated MAPREDUCE-4503:
---

   Resolution: Fixed
Fix Version/s: 2.2.0-alpha
   3.0.0
   0.23.3
   Status: Resolved  (was: Patch Available)

Looks great. Thanks, Bobby.

 Should throw InvalidJobConfException if duplicates found in cacheArchives or 
 cacheFiles
 ---

 Key: MAPREDUCE-4503
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4503
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Fix For: 0.23.3, 3.0.0, 2.2.0-alpha

 Attachments: MR-4503.txt, MR-4503.txt


 In 1.0, if a file was in both a job's cache archives and cache files, an 
 InvalidJobConfException was thrown.  We should replicate this behavior on 
 mrv2.  We should also extend it so that if a cache archive or cache file is 
 not going to be downloaded at all because of conflicts in the names of the 
 symlinks, a similar exception is thrown.


[jira] [Commented] (MAPREDUCE-4323) NM leaks sockets

2012-08-03 Thread Daryn Sharp (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428392#comment-13428392
 ] 

Daryn Sharp commented on MAPREDUCE-4323:


{{FileSystem.closeAllForUGI}} is actually a reasonable approach.  Each request 
is creating a new ugi so there's no issue with pulling the rug out from 
underneath other fs users.
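
The reasoning can be illustrated with a toy model. The class below is a 
hypothetical stand-in for a per-user handle cache, not Hadoop's FileSystem 
cache; it assumes owners are compared by identity, so handles registered 
under one request's owner object are never shared with another request.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a per-user filesystem cache; hypothetical, not Hadoop code. */
final class PerUserHandleCache {
    // Keyed by identity: two distinct owner objects never share handles,
    // which models "each request is creating a new ugi".
    private final Map<Object, List<Closeable>> handles = new IdentityHashMap<>();

    synchronized void register(Object owner, Closeable handle) {
        handles.computeIfAbsent(owner, k -> new ArrayList<>()).add(handle);
    }

    /** Closes and forgets every handle registered for this owner only. */
    synchronized int closeAllFor(Object owner) {
        List<Closeable> owned = handles.remove(owner);
        if (owned == null) {
            return 0;
        }
        for (Closeable handle : owned) {
            try {
                handle.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return owned.size();
    }

    synchronized int openOwners() {
        return handles.size();
    }
}
```

Calling closeAllFor with one request's owner object releases only that 
request's handles; handles registered under any other owner stay open.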

 NM leaks sockets
 

 Key: MAPREDUCE-4323
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha
Reporter: Daryn Sharp
Priority: Critical

 The NM is exhausting its fds because it's not closing fs instances when the 
 app is finished.


[jira] [Assigned] (MAPREDUCE-4323) NM leaks sockets

2012-08-03 Thread Daryn Sharp (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp reassigned MAPREDUCE-4323:
--

Assignee: Daryn Sharp

 NM leaks sockets
 

 Key: MAPREDUCE-4323
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4323
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical

 The NM is exhausting its fds because it's not closing fs instances when the 
 app is finished.


[jira] [Updated] (MAPREDUCE-4466) Using URI for yarn.nodemanager log dirs fails

2012-08-03 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4466:
--

Fix Version/s: (was: trunk)
 Target Version/s: 2.2.0-alpha
Affects Version/s: 0.23.3
   Status: Open  (was: Patch Available)

Looks like actual log rendering will also be broken - further up in 
ContainerLogsPage {{new File(this.dirsHandler.getLogPathToRead(}}. Also, 
changing {{getContainerLogDirs}} may be a cleaner fix.

If testNMWebServer.testNMWebApp is modified to use file:// - it ends up 
creating a dir structure with file:// being the top level directory under the 
current working dir. That could be modified to verify the patch.

All access to the local-dirs and log-dirs happens via the 
LocalDirsHandlerService - maybe we should have this convert URIs to simple 
strings. file:// works in other places - since {{Path}} is used instead of 
{{File}}.
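
A minimal sketch of the "convert URIs to simple strings" idea, using only 
java.net.URI. The class and method names are hypothetical and this is not 
the LocalDirsHandlerService code; it only shows one way a configured 
directory could be normalized before being handed to java.io.File.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper for illustration; not part of the NodeManager. */
final class LocalDirStrings {
    /** Returns a plain local path for "file:" URIs and scheme-less strings. */
    static String toLocalPath(String dir) {
        URI uri;
        try {
            uri = new URI(dir);
        } catch (URISyntaxException e) {
            return dir; // not a parseable URI; treat it as a plain path
        }
        if (uri.getScheme() == null) {
            return dir; // already a plain path
        }
        if ("file".equalsIgnoreCase(uri.getScheme()) && uri.getPath() != null) {
            return uri.getPath(); // drop the scheme, keep the path
        }
        throw new IllegalArgumentException("Not a local directory: " + dir);
    }
}
```

For example, toLocalPath("file:///home/eli/hadoop/dirs") yields 
/home/eli/hadoop/dirs, and a value without a scheme is returned unchanged.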

 Using URI for yarn.nodemanager log dirs fails
 -

 Key: MAPREDUCE-4466
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4466
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.3
Reporter: Eli Collins
Assignee: Mayank Bansal
Priority: Minor
 Attachments: MAPREDUCE-4466-trunk-v1.patch


 If I use URIs (eg file:///home/eli/hadoop/dirs) for yarn.nodemanager.log-dirs 
 or yarn.nodemanager.remote-app-log-dir the container log servlet fails with 
 an NPE (works if I remove the file scheme). Using a URI for 
 yarn.nodemanager.local-dirs works.


[jira] [Commented] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428398#comment-13428398
 ] 

Hudson commented on MAPREDUCE-4503:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2572 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2572/])
MAPREDUCE-4503. Should throw InvalidJobConfException if duplicates found in 
cacheArchives or cacheFiles (Robert Evans via jeagles) (Revision 1369197)

 Result = FAILURE
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1369197
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java


 Should throw InvalidJobConfException if duplicates found in cacheArchives or 
 cacheFiles
 ---

 Key: MAPREDUCE-4503
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4503
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Fix For: 0.23.3, 3.0.0, 2.2.0-alpha

 Attachments: MR-4503.txt, MR-4503.txt


 In 1.0, if a file was in both a job's cache archives and cache files, an 
 InvalidJobConfException was thrown.  We should replicate this behavior on 
 mrv2.  We should also extend it so that if a cache archive or cache file is 
 not going to be downloaded at all because of conflicts in the names of the 
 symlinks, a similar exception is thrown.


[jira] [Created] (MAPREDUCE-4514) Symlinks to peer distributed cache files no longer work

Jason Lowe created MAPREDUCE-4514:
-

 Summary: Symlinks to peer distributed cache files no longer work
 Key: MAPREDUCE-4514
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4514
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, mrv2
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe


Trying to create a symlink to another file that is specified for the 
distributed cache will fail to create the link.  For example:

hadoop jar ... -files x,y,x#z

will localize the files x and y as x and y, but the z symlink for x will not be 
created.  This is a regression from 1.x behavior.


[jira] [Commented] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

2012-08-03 Thread Siddharth Seth (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428412#comment-13428412
]

Siddharth Seth commented on MAPREDUCE-3902:
---

@Tsuyoshi; I'd spoken with Vinod and others about this a while ago. Should have
posted this earlier.. Adding the functionality to the AM in the current state
is possible - but will further complicate some components which are already
quite complicated - and tough to change.

The TaskAttempt state machine is currently really a mix of TaskAttempt
transitions as well as Container transitions. The RMContainerAllocator is also
dealing with more than it should - Nodes, Containers as well as scheduling.

The idea was to split the functionality into a separate TaskAttempt, Container
and Node state machine, along with reduced functionality in the scheduler (also
decoupling the RM request and AM scheduling). This would make the code cleaner
and make re-use (as well as other improvements like handling retired nodes)
easier to implement.

Had worked with Vinod on the state transitions, and have been working on the
implementation in bits and pieces to see how feasible it is. The code is at
https://github.com/sidseth/h2-container-reuse . It's a little bit of a mess at
the moment, with lots of TODOs, etc splattered all over, but is just about
functional. There's no explicit re-use scheduling yet - but re-use can be
tested by running a job which requires more containers than available on the
cluster (and some config changes).

bq. the 2nd topic(combining per container) should be moved, because the change
seems to be too big.
I believe this was, at least initially, meant to ensure that output from all
taskAttempts in one container, would be fetched only once by a reducer (without
a common combiner). Either way, that could be a separate jira.

MR AM should reuse containers for map tasks, there-by allowing fine-grained
control on num-maps for users without need for CombineFileInputFormat etc.
--

Key: MAPREDUCE-3902
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Siddharth Seth
Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch

The MR AM is now in a great position to reuse containers across (map) tasks.
This is something similar to JVM re-use we had in 0.20.x, but in a
significantly better manner:
# Consider data-locality when re-using containers
# Consider the new shuffle - ensure that reduces fetch output of the whole
container at once (i.e. all maps)
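
The first point, considering data-locality when re-using a container, can be 
sketched as follows. Everything here is hypothetical (the ReuseSketch and 
PendingMap names, the string host identifiers, and the "node-local first, 
otherwise first pending" policy); it is not taken from the attached patches 
or from the MR AM scheduler.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical illustration of locality-aware container reuse. */
final class ReuseSketch {
    /** A map task waiting for a container, with the hosts holding its input. */
    static final class PendingMap {
        final String taskId;
        final Set<String> preferredHosts;

        PendingMap(String taskId, Set<String> preferredHosts) {
            this.taskId = taskId;
            this.preferredHosts = preferredHosts;
        }
    }

    /**
     * Picks the next map for a container that just became free on the given
     * host: a node-local task if one exists, otherwise the first pending task.
     */
    static Optional<PendingMap> pick(String containerHost, List<PendingMap> pending) {
        for (PendingMap task : pending) {
            if (task.preferredHosts.contains(containerHost)) {
                return Optional.of(task);
            }
        }
        return pending.isEmpty() ? Optional.empty() : Optional.of(pending.get(0));
    }
}
```

A real policy would likely also weigh rack locality and how long a container 
may wait for a local task; the sketch leaves both out.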

[jira] [Commented] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428422#comment-13428422
 ] 

Hudson commented on MAPREDUCE-4503:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2618 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2618/])
MAPREDUCE-4503. Should throw InvalidJobConfException if duplicates found in 
cacheArchives or cacheFiles (Robert Evans via jeagles) (Revision 1369197)

 Result = SUCCESS
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1369197
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java


 Should throw InvalidJobConfException if duplicates found in cacheArchives or 
 cacheFiles
 ---

 Key: MAPREDUCE-4503
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4503
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Fix For: 0.23.3, 3.0.0, 2.2.0-alpha

 Attachments: MR-4503.txt, MR-4503.txt


 In 1.0, if a file was in both a job's cache archives and cache files, an 
 InvalidJobConfException was thrown.  We should replicate this behavior on 
 mrv2.  We should also extend it so that if a cache archive or cache file is 
 not going to be downloaded at all because of conflicts in the names of the 
 symlinks, a similar exception is thrown.


[jira] [Commented] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428425#comment-13428425
 ] 

Hudson commented on MAPREDUCE-4503:
---

Integrated in Hadoop-Common-trunk-Commit #2553 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2553/])
MAPREDUCE-4503. Should throw InvalidJobConfException if duplicates found in 
cacheArchives or cacheFiles (Robert Evans via jeagles) (Revision 1369197)

 Result = SUCCESS
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1369197
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java


 Should throw InvalidJobConfException if duplicates found in cacheArchives or 
 cacheFiles
 ---

 Key: MAPREDUCE-4503
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4503
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Fix For: 0.23.3, 3.0.0, 2.2.0-alpha

 Attachments: MR-4503.txt, MR-4503.txt


 In 1.0, if a file was in both a job's cache archives and cache files, an 
 InvalidJobConfException was thrown.  We should replicate this behavior on 
 mrv2.  We should also extend it so that if a cache archive or cache file is 
 not going to be downloaded at all because of conflicts in the names of the 
 symlinks, a similar exception is thrown.


[jira] [Commented] (MAPREDUCE-4495) Workflow Application Master in YARN

2012-08-03 Thread eric baldeschwieler (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428429#comment-13428429
]

eric baldeschwieler commented on MAPREDUCE-4495:

Agree with discussing a particular proposal.

I want to point out that the whole point of YARN is to open up the ability to
try lots of different changes to MR and to implement lots of alternatives to it
in parallel. As a community, we need to be clear that to move fast we need to
let lots of different people try lots of different things on top of a stable
platform. Pig and Hive folks want to radically change what MR is. There are
lots of different ideas for how to do this.

With open APIs everyone is empowered to try new things without asking to get
their code into the core project. If we don't embrace the principle of new AMs
starting outside the core, we are going to have a huge number of arguments like
this without making anyone happy. That's not the best way for us to spend our
time. I'm not trying to stop anyone from trying anything, I'm trying to reduce
friction.

My last point is the overhead argument. Arguing that one doesn't want to go to
incubator because that adds cost to your project really doesn't look at the
whole picture. Adding a new module or sub-project to an existing Apache
project creates as much work as doing it in the incubator. It just tosses that
work into the lap of the folks maintaining the existing project. When one
talks about Apache being about community before code, that doesn't mean one has
a right to do anything in the code. One needs to first build consensus that
your coding idea is aligned with the community. Any time you add something to
a project, you are implicitly asking the others in the community to do a lot of
work to support you. That only makes sense if you are working in a direction
that the community sees as aligned with the larger goals of the project.

Going full circle, Yarn's open APIs have as a goal allowing people to try a lot
more things much less expensively. They don't need to get permission to merge
their work into MR, which is good for experimenters. Hadoop committers are not
burdened with vetting and supporting many different experiments in Hadoop. The
experimenters carry the burden of building community and supporting / selling
their ideas. This should save us a lot of time arguing on this list! ;-)

Workflow Application Master in YARN
---

Key: MAPREDUCE-4495
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4495
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 2.0.0-alpha
Reporter: Bo Wang
Assignee: Bo Wang

[jira] [Updated] (MAPREDUCE-4275) Plugable process tree


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4275:
---

Attachment: plugable-pstree-6-typofix.txt

typo fixed

 Plugable process tree
 -

 Key: MAPREDUCE-4275
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4275
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0
 Environment: FreeBSD 64 bit
Reporter: Radim Kolar
 Attachments: plugable-pstree-1.txt, plugable-pstree-2.txt, 
 plugable-pstree-3.txt, plugable-pstree-4-with-whitespace.txt, 
 plugable-pstree-4.txt, plugable-pstree-5-with-whitespace.txt, 
 plugable-pstree-6-typofix.txt, plugable-pstree.txt


 Trunk version of Pluggable process tree. Work based on MAPREDUCE-4204


[jira] [Commented] (MAPREDUCE-4514) Symlinks to peer distributed cache files no longer work


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428472#comment-13428472
 ] 

Jason Lowe commented on MAPREDUCE-4514:
---

This also breaks when trying to create multiple symlinks to the same file, 
e.g.: {{x#a,x#b,x#c}} only creates the symlink for {{a}} instead of all three.

The problem is Container holds a map from resource Path to symlink String, but 
there could be multiple symlinks to the same source Path.
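
The data-structure problem can be shown with a small sketch. It is only an 
illustration: the CacheLinkSketch class is hypothetical, sources are plain 
strings instead of Path objects, a missing "#name" is assumed to default to 
the source name, and the single-valued variant keeps the first link name, 
which matches the behavior observed above rather than the actual Container 
code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of single-valued vs multi-valued link maps. */
final class CacheLinkSketch {
    /** One link name per source: later names for the same source are lost. */
    static Map<String, String> singleLink(String filesArg) {
        Map<String, String> links = new LinkedHashMap<>();
        for (String entry : filesArg.split(",")) {
            String[] parts = entry.split("#", 2);
            String link = parts.length == 2 ? parts[1] : parts[0];
            links.putIfAbsent(parts[0], link);
        }
        return links;
    }

    /** A list of link names per source: every requested symlink is kept. */
    static Map<String, List<String>> multiLink(String filesArg) {
        Map<String, List<String>> links = new LinkedHashMap<>();
        for (String entry : filesArg.split(",")) {
            String[] parts = entry.split("#", 2);
            String link = parts.length == 2 ? parts[1] : parts[0];
            links.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(link);
        }
        return links;
    }
}
```

For the argument x#a,x#b,x#c the single-valued map holds only a for x, while 
the multi-valued map holds a, b and c.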

 Symlinks to peer distributed cache files no longer work
 ---

 Key: MAPREDUCE-4514
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4514
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, mrv2
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe

 Trying to create a symlink to another file that is specified for the 
 distributed cache will fail to create the link.  For example:
 hadoop jar ... -files x,y,x#z
 will localize the files x and y as x and y, but the z symlink for x will not 
 be created.  This is a regression from 1.x behavior.


[jira] [Created] (MAPREDUCE-4515) Add test to check if userlogs are retained across TaskTracker restarts

2012-08-03 Thread Karthik Kambatla (JIRA)

Karthik Kambatla created MAPREDUCE-4515:
---

 Summary: Add test to check if userlogs are retained across 
TaskTracker restarts
 Key: MAPREDUCE-4515
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4515
 Project: Hadoop Map/Reduce
  Issue Type: Test
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla





[jira] [Updated] (MAPREDUCE-4514) Symlinks to peer distributed cache files no longer work


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4514:
--

Attachment: MAPREDUCE-4514.patch

Patch that changes Container to map pending and localized resources to 
List<String> instead of String so resources can have multiple symlink 
destinations.

 Symlinks to peer distributed cache files no longer work
 ---

 Key: MAPREDUCE-4514
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4514
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, mrv2
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4514.patch


 Trying to create a symlink to another file that is specified for the 
 distributed cache will fail to create the link.  For example:
 hadoop jar ... -files x,y,x#z
 will localize the files x and y as x and y, but the z symlink for x will not 
 be created.  This is a regression from 1.x behavior.


[jira] [Commented] (MAPREDUCE-4367) mapred job -kill tries to connect to history server

2012-08-03 Thread Mayank Bansal (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428482#comment-13428482
 ] 

Mayank Bansal commented on MAPREDUCE-4367:
--

I don't see this in trunk. Is it still an issue?

Thanks,
Mayank

 mapred job -kill tries to connect to history server
 ---

 Key: MAPREDUCE-4367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Priority: Minor

 The {{mapred job -kill}} command attempts to connect to the history server, 
 even though it is unrelated to the process of killing a job.


[jira] [Updated] (MAPREDUCE-4514) Symlinks to peer distributed cache files no longer work


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-4514:
--

Target Version/s: 0.23.3, 2.2.0-alpha
  Status: Patch Available  (was: Open)

 Symlinks to peer distributed cache files no longer work
 ---

 Key: MAPREDUCE-4514
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4514
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, mrv2
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4514.patch


 Trying to create a symlink to another file that is specified for the 
 distributed cache will fail to create the link.  For example:
 hadoop jar ... -files x,y,x#z
 will localize the files x and y as x and y, but the z symlink for x will not 
 be created.  This is a regression from 1.x behavior.


[jira] [Commented] (MAPREDUCE-4367) mapred job -kill tries to connect to history server


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428492#comment-13428492
 ] 

Jason Lowe commented on MAPREDUCE-4367:
---

Yes, it's still happening for me.  From a recent trunk pull on a single-node 
cluster where the history server isn't running yet:

{noformat}
$ mapred job -kill job_1344038428359_0002
2012-08-04 00:09:56,871 INFO  mapred.ClientServiceDelegate 
(ClientServiceDelegate.java:getProxy(255)) - Application state is completed. 
FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-08-04 00:09:57,886 INFO  ipc.Client 
(Client.java:handleConnectionFailure(715)) - Retrying connect to server: 
includespoke.champ.corp.yahoo.com/10.74.91.112:10020. Already tried 0 time(s); 
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
SECONDS)
2012-08-04 00:09:58,887 INFO  ipc.Client 
(Client.java:handleConnectionFailure(715)) - Retrying connect to server: 
includespoke.champ.corp.yahoo.com/10.74.91.112:10020. Already tried 1 time(s); 
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
SECONDS)
2012-08-04 00:09:59,890 INFO  ipc.Client 
(Client.java:handleConnectionFailure(715)) - Retrying connect to server: 
includespoke.champ.corp.yahoo.com/10.74.91.112:10020. Already tried 2 time(s); 
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
SECONDS)
2012-08-04 00:10:00,891 INFO  ipc.Client 
(Client.java:handleConnectionFailure(715)) - Retrying connect to server: 
includespoke.champ.corp.yahoo.com/10.74.91.112:10020. Already tried 3 time(s); 
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 
SECONDS)
...
{noformat}

And here's what it says after I start the history server:

{noformat}
$ mapred job -kill job_1344038428359_0002
2012-08-04 00:12:52,226 INFO  mapred.ClientServiceDelegate 
(ClientServiceDelegate.java:getProxy(255)) - Application state is completed. 
FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2012-08-04 00:12:53,195 INFO  mapred.ResourceMgrDelegate 
(ResourceMgrDelegate.java:killApplication(329)) - Killing application 
application_1344038428359_0002
Killed job job_1344038428359_0002
{noformat}

Note that in both cases it says the application state is completed and is 
redirecting.  If the application state is completed, there's no point in 
redirecting to the history server if we're trying to kill the application.  
Knowing the application state is completed means we can short-circuit the kill 
attempt before the redirect.
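
A rough sketch of that short-circuit, with everything in it hypothetical: 
the AppState enum is a local stand-in for the application states, the Killer 
interface stands in for the ResourceManager-side kill call, and the class is 
not the ClientServiceDelegate or ResourceMgrDelegate code.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical illustration of skipping the kill for completed applications. */
final class KillShortCircuit {
    /** Local stand-in for the application states; not the YARN enum itself. */
    enum AppState { NEW, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    private static final Set<AppState> COMPLETED =
        EnumSet.of(AppState.FINISHED, AppState.FAILED, AppState.KILLED);

    /** Stand-in for the call that asks the ResourceManager to kill an app. */
    interface Killer {
        void kill(String appId);
    }

    /** Returns true if a kill request was sent, false if the app had completed. */
    static boolean killIfActive(String appId, AppState state, Killer killer) {
        if (COMPLETED.contains(state)) {
            // Nothing left to kill, so no history-server connection is needed.
            return false;
        }
        killer.kill(appId);
        return true;
    }
}
```

With a check of this shape placed before the redirect, a kill request for an 
already completed job would return immediately even when the history server 
is down.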

 mapred job -kill tries to connect to history server
 ---

 Key: MAPREDUCE-4367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Priority: Minor

 The {{mapred job -kill}} command attempts to connect to the history server, 
 even though it is unrelated to the process of killing a job.


[jira] [Commented] (MAPREDUCE-4514) Symlinks to peer distributed cache files no longer work

[
https://issues.apache.org/jira/browse/MAPREDUCE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428522#comment-13428522
]

Hadoop QA commented on MAPREDUCE-4514:
--

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12539121/MAPREDUCE-4514.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2707//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2707//console

This message is automatically generated.

Symlinks to peer distributed cache files no longer work
---

Key: MAPREDUCE-4514
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4514
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: distributed-cache, mrv2
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
Attachments: MAPREDUCE-4514.patch

Creating a symlink to another file that is also specified for the
distributed cache fails: the link is never created. For example:
hadoop jar ... -files x,y,x#z
will localize the files x and y as x and y, but the z symlink to x will not
be created. This is a regression from 1.x behavior.
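
The expected mapping from the -files argument to link names can be sketched
as follows. This is an illustration in Python, not Hadoop's parsing code;
expected_links is a hypothetical helper, and it assumes that the text after
# names the symlink while entries without # are linked under their base name.

```python
import posixpath


def expected_links(files_arg):
    """Return (source_path, link_name) pairs for a comma-separated -files value.

    Hypothetical helper for illustration: an entry "path#name" is expected to
    be linked as "name"; an entry without "#" is linked under its base name.
    """
    pairs = []
    for entry in files_arg.split(","):
        path, sep, fragment = entry.partition("#")
        if sep and fragment:
            link_name = fragment
        else:
            link_name = posixpath.basename(path)
        pairs.append((path, link_name))
    return pairs


print(expected_links("x,y,x#z"))
# [('x', 'x'), ('y', 'y'), ('x', 'z')]
```

The third pair, x linked as z, is the one that the report above says is no
longer created.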

[jira] [Commented] (MAPREDUCE-4275) Plugable process tree