[jira] [Commented] (MAPREDUCE-4460) Refresh queue throws IO exception after configuring wrong queue capacity

2012-08-13 Thread nemon lou (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432985#comment-13432985
 ] 

nemon lou commented on MAPREDUCE-4460:
--

This is the same as MAPREDUCE-3763. Any updates here?

 Refresh queue throws IO exception after configuring wrong queue capacity
 

 Key: MAPREDUCE-4460
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4460
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 2.1.0-alpha
Reporter: Nishan Shetty
Assignee: Arun C Murthy
Priority: Critical

 Scenario:
 1. My setup has queues a and b (each with a capacity of, say, 50%) under the root queue.
 2. Start the process.
 3. Add one more queue, 'c', under root.
 4. Configure some capacity for 'c' such that the total capacity of a, b, and c is not 
 equal to 100.
 5. Now refresh the queues; this throws an exception about the wrong capacity (this is 
 expected, as the total capacity was not equal to 100).
 6. Now reconfigure the queue capacities of a, b, and c such that the total capacity is 100.
 7. Now refresh the queues again.
 Observed that it throws an IOException:
 {noformat}
 java.io.IOException: Failed to re-init queues
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:216)
 at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:174)
 at org.apache.hadoop.yarn.server.resourcemanager.api.impl.pb.service.RMAdminProtocolPBServiceImpl.refreshQueues(RMAdminProtocolPBServiceImpl.java:62)
 at org.apache.hadoop.yarn.proto.RMAdminProtocol$RMAdminProtocolService$2.callBlockingMethod(RMAdminProtocol.java:122)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
 Caused by: org.apache.hadoop.metrics2.MetricsException: Metrics source QueueMetrics,q0=root,q1=c already exists!
 at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:126)
 at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:107)
 at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:216)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.forQueue(QueueMetrics.java:129)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.forQueue(QueueMetrics.java:119)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.init(LeafQueue.java:136)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:313)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:328)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:246)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:213)
 ... 11 more
  at LocalTrace:
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Failed to re-init queues
 at org.apache.hadoop.yarn.factories.impl.pb.YarnRemoteExceptionFactoryPBImpl.createYarnRemoteException(YarnRemoteExceptionFactoryPBImpl.java:50)
 at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:40)
 at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:184)
 at org.apache.hadoop.yarn.server.resourcemanager.api.impl.pb.service.RMAdminProtocolPBServiceImpl.refreshQueues(RMAdminProtocolPBServiceImpl.java:62)
 at org.apache.hadoop.yarn.proto.RMAdminProtocol$RMAdminProtocolService$2.callBlockingMethod(RMAdminProtocol.java:122)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
 at 

[jira] [Commented] (MAPREDUCE-3542) Support FileSystemCounter legacy counter group name for compatibility

2012-08-13 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432987#comment-13432987
 ] 

Jarek Jarcec Cecho commented on MAPREDUCE-3542:
---

Hi Guys,
I was investigating a related issue in the Sqoop project (http://sqoop.apache.org/). 
Basically, we report the number of filesystem bytes written back to the user, 
and on Hadoop 0.23/2.x we always get 0. I noticed that there was some 
refactoring in the FileSystem counter-related code and found this issue 
requesting backward compatibility. 

The included patch seems to add the legacy group name FileSystemCounter:

{code:title=hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/AbstractCounters.java:84}
legacyMap.put("FileSystemCounter", FileSystemCounter.class.getName());
{code}

But it appears that the original name is FileSystemCounters (notice the plural 
"s" at the end of the name):

{code:title=src/mapred/org/apache/hadoop/mapred/Task.java:91 (0.20.2)}
protected static final String FILESYSTEM_COUNTER_GROUP = "FileSystemCounters";
{code}

{code:title=src/mapred/org/apache/hadoop/mapred/Task.java:109 (1.0.3)}
protected static final String FILESYSTEM_COUNTER_GROUP = "FileSystemCounters";
{code}

I therefore believe that this counter group should be renamed in order to provide 
backward compatibility. I might work around this discrepancy in Sqoop, but I believe 
that other projects/users might also be affected, and therefore it would be 
better to fix it upstream. I wanted to reopen this ticket, but apparently I 
do not have enough privileges to do so. Could I ask anyone with proper 
privileges to do that or should I create new JIRA instead?
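
To illustrate the mismatch described above, here is a minimal, self-contained sketch of a legacy-name lookup. The class `LegacyGroupNames` and its `register`/`resolve` methods are hypothetical stand-ins for illustration only, not Hadoop's actual code; only the three group-name strings come from this thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a legacy-name table; not Hadoop's actual class. */
final class LegacyGroupNames {
    // New-style group name, as quoted in the issue description.
    static final String NEW_FS_GROUP = "org.apache.hadoop.mapreduce.FileSystemCounter";

    private final Map<String, String> legacyMap = new HashMap<>();

    /** Registers an old group name as an alias for a new one. */
    void register(String legacyName, String newName) {
        legacyMap.put(legacyName, newName);
    }

    /** Returns the new name for a legacy alias, or the input unchanged if no alias exists. */
    String resolve(String groupName) {
        return legacyMap.getOrDefault(groupName, groupName);
    }

    public static void main(String[] args) {
        LegacyGroupNames names = new LegacyGroupNames();
        names.register("FileSystemCounter", NEW_FS_GROUP); // singular, as in the patch
        // A client written against 0.20/1.0 asks for the plural name and is not redirected.
        System.out.println(names.resolve("FileSystemCounters")); // prints FileSystemCounters
        names.register("FileSystemCounters", NEW_FS_GROUP); // plural, as in Task.java
        System.out.println(names.resolve("FileSystemCounters")); // prints the new group name
    }
}
```

With only the singular alias registered, a lookup of the plural name falls through unchanged, which would be consistent with a client such as Sqoop reading 0 from an empty group.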


 Support FileSystemCounter legacy counter group name for compatibility
 ---

 Key: MAPREDUCE-3542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3542.patch


 The group name changed from FileSystemCounter to 
 org.apache.hadoop.mapreduce.FileSystemCounter, but we should support the 
 old one for compatibility's sake. This came up in PIG-2347. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-3542) Support FileSystemCounter legacy counter group name for compatibility

2012-08-13 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated MAPREDUCE-3542:
--

Attachment: MAPREDUCE-3542-name-fix.patch

I've provided a fix that renames the backward-compatible name to the original name 
that is present in Hadoop 0.20 and Hadoop 1.0.3.

I've also checked the linked Pig sources to make sure this change won't break them. 
They seem to have special classes for handling differences between Hadoop 
versions:

{code:title=shims/src/hadoop23/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java:83}
static public String getFsCounterGroupName() {
  return "org.apache.hadoop.mapreduce.FileSystemCounter";
}
{code}

{code:title=shims/src/hadoop20/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java:85}
static public String getFsCounterGroupName() {
  return "FileSystemCounters";
}
{code}

I therefore believe that this change will not affect Pig.

 Support FileSystemCounter legacy counter group name for compatibility
 ---

 Key: MAPREDUCE-3542
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3542
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.23.1

 Attachments: MAPREDUCE-3542-name-fix.patch, MAPREDUCE-3542.patch


 The group name changed from FileSystemCounter to 
 org.apache.hadoop.mapreduce.FileSystemCounter, but we should support the 
 old one for compatibility's sake. This came up in PIG-2347. 





[jira] [Created] (MAPREDUCE-4549) Distributed cache conflicts breaks backwards compatability

2012-08-13 Thread Robert Joseph Evans (JIRA)
Robert Joseph Evans created MAPREDUCE-4549:
--

 Summary: Distributed cache conflicts breaks backwards compatability
 Key: MAPREDUCE-4549
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4549
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Critical


I recently put in MAPREDUCE-4503, which went a bit too far and broke backwards 
compatibility with 1.0 in distributed cache entries.  This issue is to change the 
behavior of the distributed cache to more closely match that of 1.0.

In 1.0, when adding a cache archive link, the first link would win (be the one 
that was created), not the last one as is the current behavior. When there were 
conflicts, all of the others were ignored and just did not get a symlink 
created. Finally, no symlink was created for archives that did not have 
a fragment in the URL.  

To simulate this behavior, after we parse the cache files and cache archives 
configuration we should walk through all conflicting links and pick the first 
link that has a fragment to win.  If no link has a fragment, then the 
first link wins.  All other conflicting links will produce a warning, and the name 
of the link will be changed to include a UUID.  If the same file is both in the 
distributed cache as a cache file and a cache archive we will throw an 
exception, for backwards compatibility.
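
The selection rule proposed above could be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the eventual patch: `CacheLinkResolver`, `linkName`, and `resolve` are hypothetical names, the link name is assumed to be the URI fragment (or the last path component when there is no fragment), and the file-versus-archive exception case is left out.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of the proposed conflict rule; not the actual Hadoop code. */
final class CacheLinkResolver {
    /** Assumption: the link name is the fragment if present, else the last path component. */
    static String linkName(URI uri) {
        if (uri.getFragment() != null) {
            return uri.getFragment();
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Maps each final link name to its URI; losers of a conflict get a UUID suffix. */
    static Map<String, URI> resolve(List<URI> entries) {
        // Group entries by link name, preserving configuration order.
        Map<String, List<URI>> byName = new LinkedHashMap<>();
        for (URI uri : entries) {
            byName.computeIfAbsent(linkName(uri), k -> new ArrayList<>()).add(uri);
        }
        Map<String, URI> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<URI>> group : byName.entrySet()) {
            List<URI> candidates = group.getValue();
            // Winner: the first entry with a fragment, else simply the first entry.
            int winner = 0;
            for (int i = 0; i < candidates.size(); i++) {
                if (candidates.get(i).getFragment() != null) {
                    winner = i;
                    break;
                }
            }
            result.put(group.getKey(), candidates.get(winner));
            for (int i = 0; i < candidates.size(); i++) {
                if (i != winner) {
                    // Losers are kept, but under a name that cannot collide.
                    System.err.println("WARN: link name conflict for " + candidates.get(i));
                    result.put(group.getKey() + "-" + UUID.randomUUID(), candidates.get(i));
                }
            }
        }
        return result;
    }
}
```

Grouping by link name first keeps the "first with a fragment, else first" choice local to each set of conflicting entries, so non-conflicting entries pass through untouched.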





[jira] [Commented] (MAPREDUCE-4538) add Legacy Counter support to getGroupNames

2012-08-13 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433174#comment-13433174
 ] 

Robert Joseph Evans commented on MAPREDUCE-4538:


Correct, and since this JIRA is still messed up, I will probably just switch 
over to MAPREDUCE-4053 and post my patch there.

 add Legacy Counter support to getGroupNames
 ---

 Key: MAPREDUCE-4538
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4538
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 2.1.0-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4538.txt


 Oozie loops through counters using getGroupNames().  The result does not 
 include legacy counter names, so they get missed, which can result in a 
 backwards-compatibility issue in the Oozie counter API.





[jira] [Commented] (MAPREDUCE-4332) Add a yarn-client module

2012-08-13 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433179#comment-13433179
 ] 

Jason Lowe commented on MAPREDUCE-4332:
---

Took a quick look; in general it looks great.  A couple of comments:

* Need to update patch after yarn moved out of mapreduce in YARN-1.
* We should probably mark YarnClient as Public and Unstable or Evolving.
* Would also be nice to add javadocs to the interface methods since most other 
public interfaces have them.

 Add a yarn-client module
 

 Key: MAPREDUCE-4332
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4332
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 2.0.0-alpha
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 2.1.0-alpha

 Attachments: MAPREDUCE-4332-20120621.txt, 
 MAPREDUCE-4332-20120621-with-common-changes.txt, MAPREDUCE-4332-20120622.txt, 
 MAPREDUCE-4332-20120625.txt


 I see that we are duplicating (some) code for talking to RM via client API. 
 In this light, a yarn-client module will be useful so that clients of all 
 frameworks can use/extend it.
 That same module can also be the destination for all of YARN's command-line 
 tools.





[jira] [Assigned] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up

2012-08-13 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans reassigned MAPREDUCE-4053:
--

Assignee: Robert Joseph Evans

 Counters group names deprecation is wrong, iterating over group names 
 deprecated names don't show up
 

 Key: MAPREDUCE-4053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans

 This is similar to the deprecation of Configuration properties bug 
 HADOOP-8167: iterator() retrieval of counter names only returns new names.
 Oozie breaks here because it is using the deprecated name and iterating over 
 values (OOZIE-777). While it can be worked around easily in Oozie, this is 
 breaking backwards compatibility.





[jira] [Updated] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up

2012-08-13 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4053:
---

Attachment: MR-4053.txt

This adds in deprecated names to the getCounterNames.
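
As a rough illustration of the behaviour this issue asks for (this is not the attached patch), the sketch below returns the current group names together with any legacy alias whose target group is present. `GroupNameView` and `groupNamesWithLegacy` are hypothetical names chosen for this example.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; shows one way iteration could expose deprecated names. */
final class GroupNameView {
    /** Returns the current group names plus every legacy alias whose target group exists. */
    static Set<String> groupNamesWithLegacy(Set<String> currentGroups,
                                            Map<String, String> legacyToNew) {
        Set<String> names = new LinkedHashSet<>(currentGroups);
        for (Map.Entry<String, String> alias : legacyToNew.entrySet()) {
            // Only advertise an alias when the group it points to is actually present.
            if (currentGroups.contains(alias.getValue())) {
                names.add(alias.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String newName = "org.apache.hadoop.mapreduce.FileSystemCounter";
        Set<String> current = Collections.singleton(newName);
        Map<String, String> legacy = Collections.singletonMap("FileSystemCounters", newName);
        System.out.println(groupNamesWithLegacy(current, legacy));
    }
}
```

A caller that iterates over the returned set and compares against an old name would then find it, which is the pattern the description attributes to Oozie.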

 Counters group names deprecation is wrong, iterating over group names 
 deprecated names don't show up
 

 Key: MAPREDUCE-4053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans
 Attachments: MR-4053.txt


 This is similar to the deprecation of Configuration properties bug 
 HADOOP-8167: iterator() retrieval of counter names only returns new names.
 Oozie breaks here because it is using the deprecated name and iterating over 
 values (OOZIE-777). While it can be worked around easily in Oozie, this is 
 breaking backwards compatibility.





[jira] [Updated] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up

2012-08-13 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4053:
---

Target Version/s: 0.23.3, 2.1.0-alpha, 3.0.0
  Status: Patch Available  (was: Open)

 Counters group names deprecation is wrong, iterating over group names 
 deprecated names don't show up
 

 Key: MAPREDUCE-4053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans
 Attachments: MR-4053.txt


 This is similar to the deprecation of Configuration properties bug 
 HADOOP-8167: iterator() retrieval of counter names only returns new names.
 Oozie breaks here because it is using the deprecated name and iterating over 
 values (OOZIE-777). While it can be worked around easily in Oozie, this is 
 breaking backwards compatibility.





[jira] [Commented] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up

2012-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433202#comment-13433202
 ] 

Hadoop QA commented on MAPREDUCE-4053:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12540665/MR-4053.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2724//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2724//console

This message is automatically generated.

 Counters group names deprecation is wrong, iterating over group names 
 deprecated names don't show up
 

 Key: MAPREDUCE-4053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans
 Attachments: MR-4053.txt


 This is similar to the deprecation of Configuration properties bug 
 HADOOP-8167: iterator() retrieval of counter names only returns new names.
 Oozie breaks here because it is using the deprecated name and iterating over 
 values (OOZIE-777). While it can be worked around easily in Oozie, this is 
 breaking backwards compatibility.





[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-13 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433237#comment-13433237
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-

To make reviewing this patch easier, I am dividing it into smaller 
patches. I am opening sub-tasks under this JIRA issue and attaching the patches 
to those JIRAs.

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf, 
 MR_4491_1.1.patch, MR_4491_trunk.patch


 When dealing with sensitive data, it is required to keep the data encrypted 
 wherever it is stored. A common use case is to pull encrypted data out of a 
 data source and store it in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement. 





[jira] [Created] (MAPREDUCE-4550) Key Protection : Define Encryption and Key Protection interfaces and default implementations

2012-08-13 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-4550:
---

 Summary: Key Protection : Define Encryption and Key Protection 
interfaces and default implementations
 Key: MAPREDUCE-4550
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4550
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony


A secret key is read from a Key Store and then encrypted during transport 
between JobClient and Task. The tasktrackers/nodemanagers decrypt the secrets 
and provide the secrets to child tasks that are part of the job.

This jira defines the interfaces to accomplish the above :

1) KeyProvider - to read keys from a KeyStore

2) Encrypter and Decrypter - to encrypt and decrypt secrets/data.

The default/dummy implementations will also be added. This includes a 
KeyProvider implementation to read keys from a Java KeyStore.
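
To make the shape of these interfaces concrete, here is a minimal sketch under stated assumptions. Only the names KeyProvider, Encrypter, and Decrypter come from the description; every method signature, plus the `NullCrypter` and `JavaKeyStoreProvider` classes, is hypothetical and may differ from the attached patches. The KeyStore calls are the standard `java.security.KeyStore` API.

```java
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;

/** Reads keys from a key store. The method signature is an assumption. */
interface KeyProvider {
    byte[] getKey(String alias) throws GeneralSecurityException;
}

/** Encrypts secrets/data. The method signature is an assumption. */
interface Encrypter {
    byte[] encrypt(byte[] plain) throws GeneralSecurityException;
}

/** Decrypts secrets/data. The method signature is an assumption. */
interface Decrypter {
    byte[] decrypt(byte[] cipherText) throws GeneralSecurityException;
}

/** Dummy implementation that passes data through unchanged; for wiring tests only. */
final class NullCrypter implements Encrypter, Decrypter {
    public byte[] encrypt(byte[] plain) {
        return plain.clone();
    }

    public byte[] decrypt(byte[] cipherText) {
        return cipherText.clone();
    }
}

/** Reads keys from an already loaded Java KeyStore. */
final class JavaKeyStoreProvider implements KeyProvider {
    private final KeyStore store;
    private final char[] keyPassword;

    JavaKeyStoreProvider(KeyStore store, char[] keyPassword) {
        this.store = store;
        this.keyPassword = keyPassword.clone();
    }

    public byte[] getKey(String alias) throws GeneralSecurityException {
        // KeyStore.getKey returns null when the alias is missing or is not a key entry.
        Key key = store.getKey(alias, keyPassword);
        if (key == null) {
            throw new GeneralSecurityException("No key for alias " + alias);
        }
        return key.getEncoded();
    }
}
```

Keeping the provider separate from the encrypter/decrypter pair is what would let a different key store mechanism be plugged in without touching the code that protects secrets in transit.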





[jira] [Updated] (MAPREDUCE-4550) Key Protection : Define Encryption and Key Protection interfaces and default implementations

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4550:


Attachment: MR_4550_1_1.patch
MR_4550_trunk.patch

 Key Protection : Define Encryption and Key Protection interfaces and default 
 implementations
 

 Key: MAPREDUCE-4550
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4550
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4550_1_1.patch, MR_4550_trunk.patch


 A secret key is read from a Key Store and then encrypted during transport 
 between JobClient and Task. The tasktrackers/nodemanagers decrypt the secrets 
 and provide the secrets to child tasks that are part of the job.
 This jira defines the interfaces to accomplish the above :
 1) KeyProvider - to read keys from a KeyStore
 2) Encrypter and Decrypter - to encrypt and decrypt secrets/data.
 The default/dummy implementations will also be added. This includes a 
 KeyProvider implementation to read keys from a Java KeyStore.





[jira] [Created] (MAPREDUCE-4551) Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers

2012-08-13 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-4551:
---

 Summary: Key Protection :  Add ability to read keys and protect 
keys  in  JobClient and TTS/NodeManagers
 Key: MAPREDUCE-4551
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4551
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony


The following requirements are addressed.

•   Plug in different key store mechanisms.
•   Retrieve specified keys from a configured keystore as part of job 
submission
•   Protect keys during their transport through the cluster.
•   Make sure that keys are handed over only to the tasks of the correct 
job.

Based on Cluster configuration, NodeManager/TaskTrackers set up Decrypters  to 
decrypt the job's secrets.
Based on Job configuration, JobClient reads secrets from a KeyStore using a 
KeyProvider implementation and encrypts them using the cluster's public key.

The encrypted secrets are stored in Job Credentials.
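
The flow described above (the client encrypts with the cluster's public key, the node decrypts with the matching private key) can be sketched with the JDK's own crypto API. This is an illustration only: the description does not name a cipher, so RSA with OAEP padding is an assumption, and `JobSecretTransport`, `encryptForCluster`, and `decryptOnNode` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

/** Hypothetical sketch of protecting a job secret in transit; not the attached patch. */
final class JobSecretTransport {
    // Assumption: RSA with OAEP padding. The issue does not specify an algorithm.
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Client side: encrypt a small secret with the cluster's public key. */
    static byte[] encryptForCluster(byte[] secret, PublicKey clusterKey)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, clusterKey);
        return cipher.doFinal(secret);
    }

    /** Node side: decrypt with the cluster's private key before handing the secret to a task. */
    static byte[] decryptOnNode(byte[] encrypted, PrivateKey clusterKey)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, clusterKey);
        return cipher.doFinal(encrypted);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // A throwaway key pair stands in for the cluster's configured keys.
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair cluster = gen.generateKeyPair();

        byte[] secret = "job-secret".getBytes(StandardCharsets.UTF_8);
        byte[] wire = encryptForCluster(secret, cluster.getPublic());
        byte[] back = decryptOnNode(wire, cluster.getPrivate());
        System.out.println(new String(back, StandardCharsets.UTF_8)); // prints job-secret
    }
}
```

Direct RSA encryption only fits small payloads such as keys; larger secrets would normally be wrapped with a symmetric key, which this sketch does not attempt.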





[jira] [Updated] (MAPREDUCE-4551) Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4551:


Description: 
Based on Cluster configuration, NodeManager/TaskTrackers set up Decrypters  to 
decrypt the job's secrets.
Based on Job configuration, JobClient reads secrets from a KeyStore using a 
KeyProvider implementation and encrypts them using the cluster's public key.

The encrypted secrets are stored in Job Credentials.

The task addresses the following requirements:


•   Plug in different key store mechanisms.
•   Retrieve specified keys from a configured keystore as part of job 
submission
•   Protect keys during their transport through the cluster.
•   Make sure that keys are handed over only to the tasks of the correct 
job.


  was:
The following requirements are addressed.

•   Plug in different key store mechanisms.
•   Retrieve specified keys from a configured keystore as part of job 
submission
•   Protect keys during its transport through the cluster.
•   Make sure that keys are handed over only to the tasks of the correct 
job.

Based on Cluster configuration, NodeManager/TaskTrackers set up Decrypters  to 
decrypt the job's secrets.
Based on Job configuration, JobClient reads secrets from a KeyStore using a 
Keyprovider implementation and encrypts them using the cluster's public key.

The encrypted secrets are stored in Job Credentials.


 Key Protection :  Add ability to read keys and protect keys  in  JobClient 
 and TTS/NodeManagers
 ---

 Key: MAPREDUCE-4551
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4551
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4551_1_1.patch, MR_4551_trunk.patch


 Based on Cluster configuration, NodeManager/TaskTrackers set up Decrypters  
 to decrypt the job's secrets.
 Based on Job configuration, JobClient reads secrets from a KeyStore using a 
 KeyProvider implementation and encrypts them using the cluster's public key.
 The encrypted secrets are stored in Job Credentials.
 The task addresses the following requirements:
 • Plug in different key store mechanisms.
 • Retrieve specified keys from a configured keystore as part of job 
 submission
 • Protect keys during their transport through the cluster.
 • Make sure that keys are handed over only to the tasks of the correct 
 job.





[jira] [Updated] (MAPREDUCE-4551) Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4551:


Attachment: MR_4551_trunk.patch
MR_4551_1_1.patch

 Key Protection :  Add ability to read keys and protect keys  in  JobClient 
 and TTS/NodeManagers
 ---

 Key: MAPREDUCE-4551
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4551
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4551_1_1.patch, MR_4551_trunk.patch


 Based on Cluster configuration, NodeManager/TaskTrackers set up Decrypters  
 to decrypt the job's secrets.
 Based on Job configuration, JobClient reads secrets from a KeyStore using a 
 KeyProvider implementation and encrypts them using the cluster's public key.
 The encrypted secrets are stored in Job Credentials.
 The task addresses the following requirements:
 • Plug in different key store mechanisms.
 • Retrieve specified keys from a configured keystore as part of job 
 submission
 • Protect keys during their transport through the cluster.
 • Make sure that keys are handed over only to the tasks of the correct 
 job.





[jira] [Created] (MAPREDUCE-4552) Encryption: Add support for PGP Encryption

2012-08-13 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-4552:
---

 Summary: Encryption:  Add support for PGP Encryption
 Key: MAPREDUCE-4552
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4552
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony


Provide support for PGP encryption by implementing the Encrypter and Decrypter
interfaces defined in MAPREDUCE-4450. This can be used by the cluster to
protect the job secrets. It can also be used by MapReduce jobs to
encrypt/decrypt data.

Add PGPCodec as a CompressionCodec so that encrypted data can be processed
transparently, like compressed data. The aliases of the keys can be specified
as part of the job.

Based on PGPCodec, a number of utilities are provided to encrypt and decrypt
data in the cluster. They include:

1.  DistributedSplitter - split an encrypted file into smaller files.
2.  DistributedEncrypter - encrypt files in a cluster.
3.  DistributedDecrypter - decrypt encrypted files in a cluster.
4.  DistributedRecrypter - decrypt an encrypted file and encrypt it with
another key.

Utilities are also added to encrypt/decrypt files in the local file system:

1.  Genkey - generate an asymmetric key pair (public and private keys) of a
specified strength.
2.  Encrypt - encrypt a file.
3.  Decrypt - decrypt a file.

Added as a contrib project: hadoop-crypto.
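
The idea of handling encrypted data "like compressed data" amounts to wrapping the raw streams, much as a codec does. The sketch below shows that shape with JDK classes only. It is not PGPCodec: AES in CBC mode is swapped in because the JDK has no PGP support, and all class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

/** Hypothetical codec-style stream wrappers; AES stands in for PGP. */
final class CryptoStreamSketch {

  private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

  /** Wraps a raw sink so that everything written to it is encrypted. */
  static OutputStream createOutputStream(OutputStream raw, SecretKey key, byte[] iv)
      throws Exception {
    Cipher cipher = Cipher.getInstance(TRANSFORMATION);
    cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
    return new CipherOutputStream(raw, cipher);
  }

  /** Wraps a raw source so that everything read from it is decrypted. */
  static InputStream createInputStream(InputStream raw, SecretKey key, byte[] iv)
      throws Exception {
    Cipher cipher = Cipher.getInstance(TRANSFORMATION);
    cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
    return new CipherInputStream(raw, cipher);
  }

  static byte[] encrypt(byte[] plain, SecretKey key, byte[] iv) throws Exception {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (OutputStream out = createOutputStream(sink, key, iv)) {
      out.write(plain);
    } // closing the stream flushes the final padded block
    return sink.toByteArray();
  }

  static byte[] decrypt(byte[] encrypted, SecretKey key, byte[] iv) throws Exception {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (InputStream in = createInputStream(new ByteArrayInputStream(encrypted), key, iv)) {
      byte[] buffer = new byte[4096];
      int n;
      while ((n = in.read(buffer)) != -1) {
        sink.write(buffer, 0, n);
      }
    }
    return sink.toByteArray();
  }

  public static void main(String[] args) throws Exception {
    KeyGenerator keyGen = KeyGenerator.getInstance("AES");
    keyGen.init(128);
    SecretKey key = keyGen.generateKey();
    byte[] iv = new byte[16];
    new SecureRandom().nextBytes(iv);

    byte[] encrypted = encrypt("record-1\nrecord-2\n".getBytes(StandardCharsets.UTF_8), key, iv);
    System.out.print(new String(decrypt(encrypted, key, iv), StandardCharsets.UTF_8));
  }
}
```

A "recrypt" step in this model would simply be decrypt with one key followed by encrypt with another.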






[jira] [Updated] (MAPREDUCE-4552) Encryption: Add support for PGP Encryption

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4552:


Attachment: MR_4552_1_1.patch
MR_4552_trunk.patch

 Encryption:  Add support for PGP Encryption
 ---

 Key: MAPREDUCE-4552
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4552
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4552_1_1.patch, MR_4552_trunk.patch


 Provide support for PGP encryption by implementing the Encrypter and Decrypter
 interfaces defined in MAPREDUCE-4450. This can be used by the cluster to
 protect the job secrets. It can also be used by MapReduce jobs to
 encrypt/decrypt data.
 Add PGPCodec as a CompressionCodec so that encrypted data can be processed
 transparently, like compressed data. The aliases of the keys can be specified
 as part of the job.
 Based on PGPCodec, a number of utilities are provided to encrypt and decrypt
 data in the cluster. They include:
 1. DistributedSplitter - split an encrypted file into smaller files.
 2. DistributedEncrypter - encrypt files in a cluster.
 3. DistributedDecrypter - decrypt encrypted files in a cluster.
 4. DistributedRecrypter - decrypt an encrypted file and encrypt it with
 another key.
 Utilities are also added to encrypt/decrypt files in the local file system:
 1. Genkey - generate an asymmetric key pair (public and private keys) of a
 specified strength.
 2. Encrypt - encrypt a file.
 3. Decrypt - decrypt a file.
 Added as a contrib project: hadoop-crypto.





[jira] [Created] (MAPREDUCE-4553) Key Protection : Implement KeyProvider to read key from a WebService Based KeyStore

2012-08-13 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-4553:
---

 Summary: Key Protection :  Implement KeyProvider to read key from 
a WebService Based KeyStore
 Key: MAPREDUCE-4553
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4553
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony


Normally, keys have to be stored in a central location using a custom key
management system. Organizations can implement KeyProvider to integrate their
custom key management system with Hadoop. This interface is specified in
MAPREDUCE-4550.

Optionally, developers can use Safe to integrate a custom key management system
with Hadoop.
Safe is an open source, web-service-based keystore for securely storing secret
keys and passwords.
Safe authenticates the user using SPNEGO, checks whether the user is authorized
to read the secret, and returns the secret.
It is easy to plug in different mechanisms for authentication, authorization,
and key storage.
Safe is kept as a separate open source project at
(http://benoyantony.github.com/safe/).

The Hadoop proxy to Safe is added as a contrib project: hadoop-safe.






[jira] [Updated] (MAPREDUCE-4553) Key Protection : Implement KeyProvider to read key from a WebService Based KeyStore

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4553:


Attachment: MR_4553_trunk.patch
MR_4553_1_1.patch

 Key Protection :  Implement KeyProvider to read key from a WebService Based 
 KeyStore
 

 Key: MAPREDUCE-4553
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4553
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4553_1_1.patch, MR_4553_trunk.patch


 Normally, keys have to be stored in a central location using a custom key
 management system. Organizations can implement KeyProvider to integrate
 their custom key management system with Hadoop. This interface is specified in
 MAPREDUCE-4550.
 Optionally, developers can use Safe to integrate a custom key management
 system with Hadoop.
 Safe is an open source, web-service-based keystore for securely storing secret
 keys and passwords.
 Safe authenticates the user using SPNEGO, checks whether the user is
 authorized to read the secret, and returns the secret.
 It is easy to plug in different mechanisms for authentication, authorization,
 and key storage.
 Safe is kept as a separate open source project at
 (http://benoyantony.github.com/safe/).
 The Hadoop proxy to Safe is added as a contrib project: hadoop-safe.





[jira] [Updated] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4491:


Attachment: (was: MR_4491_1.1.patch)

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf


 When dealing with sensitive data, it is required to keep the data encrypted 
 wherever it is stored. Common use case is to pull encrypted data out of a 
 datasource and store in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement. 





[jira] [Updated] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4491:


Attachment: (was: MR_4491_trunk.patch)

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf


 When dealing with sensitive data, it is required to keep the data encrypted 
 wherever it is stored. Common use case is to pull encrypted data out of a 
 datasource and store in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement. 





[jira] [Updated] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4491:


Description: 
When dealing with sensitive data, it is required to keep the data encrypted 
wherever it is stored. Common use case is to pull encrypted data out of a 
datasource and store in HDFS for analysis. The keys are stored in an external 
keystore. 

The feature adds a customizable framework to integrate different types of 
keystores, support for Java KeyStore, read keys from keystores, and transport 
keys from JobClient to Tasks.
The feature adds PGP encryption as a codec and additional utilities to perform 
encryption related steps.


The design document is attached. It explains the requirement, design and use 
cases.
Kindly review and comment. Collaboration is very much welcome.

I have a tested patch for this for 1.1 and will upload it soon as an initial 
work for further refinement.

Update: The patches are uploaded to subtasks. 

  was:
When dealing with sensitive data, it is required to keep the data encrypted 
wherever it is stored. Common use case is to pull encrypted data out of a 
datasource and store in HDFS for analysis. The keys are stored in an external 
keystore. 

The feature adds a customizable framework to integrate different types of 
keystores, support for Java KeyStore, read keys from keystores, and transport 
keys from JobClient to Tasks.
The feature adds PGP encryption as a codec and additional utilities to perform 
encryption related steps.


The design document is attached. It explains the requirement, design and use 
cases.
Kindly review and comment. Collaboration is very much welcome.

I have a tested patch for this for 1.1 and will upload it soon as an initial 
work for further refinement. 

 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf


 When dealing with sensitive data, it is required to keep the data encrypted 
 wherever it is stored. Common use case is to pull encrypted data out of a 
 datasource and store in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement.
 Update: The patches are uploaded to subtasks. 





[jira] [Created] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-13 Thread Benoy Antony (JIRA)
Benoy Antony created MAPREDUCE-4554:
---

 Summary: Job Credentials are not transmitted if security is turned 
off
 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony


Credentials (secret keys) can be passed to a job via
mapreduce.job.credentials.json or mapreduce.job.credentials.binary.

These credentials are submitted during job submission and are made available to
the task processes.

In Hadoop 1, these credentials are submitted and routed to task processes even
if security is off.
In Hadoop 2, these credentials are transmitted only when security is
turned on.

This should be changed for two reasons:
1) It is not backward compatible.

2) Credentials should be passed even if security is turned off.
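
The difference between the two behaviours can be modelled in a few lines. This is a toy sketch, not Hadoop code: the class, the method names, and the use of a plain map in place of the job Credentials object are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the two behaviours; a Map stands in for the job Credentials. */
final class CredentialPropagationSketch {

  /** Behaviour reported for Hadoop 2: secrets reach tasks only when security is on. */
  static Map<String, byte[]> gatedOnSecurity(Map<String, byte[]> jobSecrets,
                                             boolean securityEnabled) {
    Map<String, byte[]> forTasks = new HashMap<>();
    if (securityEnabled) {
      forTasks.putAll(jobSecrets);
    }
    return forTasks;
  }

  /** Behaviour requested by this issue: secrets are always passed along. */
  static Map<String, byte[]> unconditional(Map<String, byte[]> jobSecrets) {
    return new HashMap<>(jobSecrets);
  }

  public static void main(String[] args) {
    Map<String, byte[]> secrets = new HashMap<>();
    secrets.put("example.alias", new byte[] {1, 2, 3});

    System.out.println("gated, security off: " + gatedOnSecurity(secrets, false).keySet());
    System.out.println("unconditional:       " + unconditional(secrets).keySet());
  }
}
```

With security off, the gated variant hands the tasks an empty set, which is the regression relative to Hadoop 1 that the issue describes.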






[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Attachment: MR_4554_1_1.patch
MR_4554_trunk.patch

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted during job submission and are made available
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to task processes
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.

 2) Credentials should be passed even if security is turned off.





[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-13 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated MAPREDUCE-4554:


Affects Version/s: 2.0.0-alpha

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted during job submission and are made available
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to task processes
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.

 2) Credentials should be passed even if security is turned off.





[jira] [Commented] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off

2012-08-13 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433301#comment-13433301
 ] 

Benoy Antony commented on MAPREDUCE-4554:
-

The patch adds a test case for 1.1 and 2.0.
It also removes the security on/off checks when transmitting credentials.

 Job Credentials are not transmitted if security is turned off
 -

 Key: MAPREDUCE-4554
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission, security
Affects Versions: 2.0.0-alpha
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch


 Credentials (secret keys) can be passed to a job via
 mapreduce.job.credentials.json or mapreduce.job.credentials.binary.
 These credentials are submitted during job submission and are made available
 to the task processes.
 In Hadoop 1, these credentials are submitted and routed to task processes
 even if security is off.
 In Hadoop 2, these credentials are transmitted only when security is
 turned on.
 This should be changed for two reasons:
 1) It is not backward compatible.

 2) Credentials should be passed even if security is turned off.





[jira] [Commented] (MAPREDUCE-4511) Add IFile readahead

2012-08-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1340#comment-1340
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4511:
---

We need HADOOP-7754 in order for this patch to apply and be test-patch-ed. In
the meantime, a better name for the method {{determineFd}} would be
{{getFileDescriptorIfAvail}}.
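
For readers without the patch at hand, a method with this name would plausibly look something like the following. This is a guess at the shape using only JDK types, not the code from the patch: it returns the descriptor when the stream is backed by a file and null otherwise.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileDescriptor;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch; the real method in the patch may differ. */
final class FdLookupSketch {

  /**
   * Returns the underlying file descriptor if the stream exposes one,
   * or null when no descriptor is available (for example, an in-memory stream).
   */
  static FileDescriptor getFileDescriptorIfAvail(InputStream in) throws IOException {
    if (in instanceof FileInputStream) {
      return ((FileInputStream) in).getFD();
    }
    return null;
  }

  public static void main(String[] args) throws IOException {
    File temp = File.createTempFile("ifile-sketch", ".tmp");
    temp.deleteOnExit();
    try (FileInputStream fileStream = new FileInputStream(temp)) {
      System.out.println("file stream has fd: "
          + (getFileDescriptorIfAvail(fileStream) != null));
    }
    System.out.println("memory stream has fd: "
        + (getFileDescriptorIfAvail(new ByteArrayInputStream(new byte[0])) != null));
  }
}
```

The "IfAvail" suffix makes the null return explicit in the name, so a caller knows that readahead has to be skipped when no descriptor comes back.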

 Add IFile readahead
 ---

 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4511_branch1.patch, 
 MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, 
 MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_trunk.patch, 
 MAPREDUCE-4511_trunk_rev2.patch, MAPREDUCE-4511_trunk_rev3.patch, 
 MAPREDUCE-4511_trunk_rev4.patch


 This ticket is to add IFile readahead as part of HADOOP-7714.





[jira] [Commented] (MAPREDUCE-4469) Resource calculation in child tasks is CPU-heavy

2012-08-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1348#comment-1348
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4469:
---

Ahmed, your suggested approach means that in the case of streaming jobs we may 
lose the info for up to 9 updates (default), right?
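
One reading of the approach being questioned is that the expensive /proc scan runs only on every Nth progress update and the cached value is reused in between. The sketch below assumes that reading. The class name and the interval of 10 are illustrative assumptions (10 is chosen only because it yields the "up to 9 updates" mentioned above), and the probe is a stand-in for the real resource calculation.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical throttle around an expensive resource probe. */
final class ThrottledUsageSketch {

  private final LongSupplier expensiveProbe;
  private final int interval;
  private int position = 0; // 0 means "probe on this call"
  private long cached;

  ThrottledUsageSketch(LongSupplier expensiveProbe, int interval) {
    if (interval <= 0) {
      throw new IllegalArgumentException("interval must be positive");
    }
    this.expensiveProbe = expensiveProbe;
    this.interval = interval;
  }

  /** Returns a fresh reading on every interval-th call and the cached one otherwise. */
  long currentUsage() {
    if (position == 0) {
      cached = expensiveProbe.getAsLong();
    }
    position = (position + 1) % interval;
    return cached;
  }

  public static void main(String[] args) {
    AtomicLong probeCount = new AtomicLong();
    ThrottledUsageSketch usage = new ThrottledUsageSketch(probeCount::incrementAndGet, 10);
    for (int update = 0; update < 25; update++) {
      usage.currentUsage();
    }
    System.out.println("updates: 25, probes: " + probeCount.get());
  }
}
```

Between two probes the caller sees up to interval - 1 stale readings, which is the information loss the question is about.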

 Resource calculation in child tasks is CPU-heavy
 

 Key: MAPREDUCE-4469
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4469
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: performance, task
Affects Versions: 1.0.3
Reporter: Todd Lipcon
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4469.patch, MAPREDUCE-4469_rev2.patch


 In doing some benchmarking on a hadoop-1 derived codebase, I noticed that 
 each of the child tasks was doing a ton of syscalls. Upon stracing, I noticed 
 that it's spending a lot of time looping through all the files in /proc to 
 calculate resource usage.
 As a test, I added a flag to disable use of the ResourceCalculatorPlugin 
 within the tasks. On a CPU-bound 500G-sort workload, this improved total job 
 runtime by about 10% (map slot-seconds by 14%, reduce slot-seconds by 8%).





[jira] [Commented] (MAPREDUCE-4068) Jars in lib subdirectory of the submittable JAR are not added to the classpath

2012-08-13 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433363#comment-13433363
 ] 

Robert Kanter commented on MAPREDUCE-4068:
--

The TestTaskTrackerLocalization file looks like it tests this, but only for MR1.

 Jars in lib subdirectory of the submittable JAR are not added to the classpath
 --

 Key: MAPREDUCE-4068
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4068
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Ahmed Radwan
Priority: Blocker
 Fix For: 2.2.0-alpha


 Prior to hadoop 0.23, users could add third party jars to the lib 
 subdirectory of the submitted job jar and they become available in the task's 
 classpath. I see this functionality was in TaskRunner.java, but I can't see 
 similar functionality in hadoop 0.23 (neither in MapReduceChildJVM.java nor 
 other places).
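
The missing behaviour can be sketched independently of Hadoop: given the directory where the job jar was unpacked, put the directory itself and every jar under its lib subdirectory on the classpath. The class and method names below are hypothetical, and the entry ordering is an assumption rather than what TaskRunner.java did.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; not taken from TaskRunner or MapReduceChildJVM. */
final class JobJarClasspathSketch {

  /** Returns the unpacked job jar directory followed by the jars found in its lib/ folder. */
  static List<String> classpathFor(Path unpackedJobJar) throws IOException {
    List<String> entries = new ArrayList<>();
    entries.add(unpackedJobJar.toString());

    Path lib = unpackedJobJar.resolve("lib");
    if (Files.isDirectory(lib)) {
      List<String> jars = new ArrayList<>();
      try (DirectoryStream<Path> stream = Files.newDirectoryStream(lib, "*.jar")) {
        for (Path jar : stream) {
          jars.add(jar.toString());
        }
      }
      Collections.sort(jars); // directory listing order is unspecified, so sort for stability
      entries.addAll(jars);
    }
    return entries;
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("jobjar-sketch");
    Path lib = Files.createDirectory(root.resolve("lib"));
    Files.createFile(lib.resolve("third-party.jar"));
    System.out.println(String.join(System.getProperty("path.separator"), classpathFor(root)));
  }
}
```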





[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection

2012-08-13 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433401#comment-13433401
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-

One of the goals of this feature is to achieve encryption of files in transit
and at rest (when stored on disk). One way to achieve this goal is to depend on
software or hardware that provides encryption in the local file system, and to
rely on HDFS-3637 and MR shuffle encryption.

This JIRA explores an alternative approach to the problem that does not depend
on special software to do local file system encryption.

The key advantages of this approach over the local file system encryption
approach are:

1) A file can be decrypted only if the user provides the correct key. Even
if someone manages to read the file, they cannot read its contents without the
key. The user must possess the key in addition to having read permission,
so there are two levels of protection.

There could be cases where a user accidentally sets read permissions for
everyone, or where a superuser reads the file. This scheme still protects the
data in those cases.

2) No dependency on local file system encryption software. This approach
allows encryption without such special setup.

3) A file is decrypted/encrypted only during processing and not when it is
read. This results in fewer encryption/decryption operations.


Other key points:

1) Encrypted and plain text files can coexist in a normal file system.

2) Developers can plug in other encryption algorithms/standards (CMS, AES,
custom encryption) and thus have more flexibility.

3) It allows transporting keys/passwords/tokens from JobClient to tasks for use
cases other than encryption, such as connecting to a web service. MAPREDUCE-4491
adds key protection, and encryption uses it.

4) Keys can be managed in one central location. JobClient gets them on behalf
of the user, like any other application.

If we look at these two approaches from a higher level, we can see that the
local file system approach is an internal approach to encryption and the
MAPREDUCE-4491 approach is an external approach. These two choices are
also available in normal (non-distributed) application development, where
developers can rely on the file system to provide encryption or do encryption
themselves. There are tradeoffs and flexibilities in both approaches,
and we choose between them based on our use cases and needs. So I believe we
should provide both alternatives in Hadoop.

In addition, this feature allows key protection in general, which can be used
for purposes other than encryption. The keys will also be encrypted when stored
on disk and decrypted only in memory.


 Encryption and Key Protection
 -

 Key: MAPREDUCE-4491
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf


 When dealing with sensitive data, it is required to keep the data encrypted 
 wherever it is stored. Common use case is to pull encrypted data out of a 
 datasource and store in HDFS for analysis. The keys are stored in an external 
 keystore. 
 The feature adds a customizable framework to integrate different types of 
 keystores, support for Java KeyStore, read keys from keystores, and transport 
 keys from JobClient to Tasks.
 The feature adds PGP encryption as a codec and additional utilities to 
 perform encryption related steps.
 The design document is attached. It explains the requirement, design and use 
 cases.
 Kindly review and comment. Collaboration is very much welcome.
 I have a tested patch for this for 1.1 and will upload it soon as an initial 
 work for further refinement.
 Update: The patches are uploaded to subtasks. 





[jira] [Updated] (MAPREDUCE-4518) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation

2012-08-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4518:


Attachment: trunk-MR-4518.patch

Updated the patch for trunk
- Added constructor to FSQueueSchedulable for testing purposes
- Test checks if the demand is less than or equal to maxResources
- Verified the right number of iterations via the logs in the loop in updateDemand()

 FairScheduler: PoolSchedulable#updateDemand() - potential redundant 
 aggregation
 ---

 Key: MAPREDUCE-4518
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4518
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4518_branch1.patch, trunk-MR-4518.patch, 
 trunk-MR-4518.patch


 In FS, PoolSchedulable#updateDemand() limits the demand to maxTasks only
 after iterating through all the pools and computing the final demand.
 By checking whether the demand has reached maxTasks in every iteration, we can
 avoid redundant work, at the expense of one condition check per iteration.
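
The proposed change is essentially an early exit from the aggregation loop. The sketch below shows the idea on plain integers. It is not the FairScheduler code: the class, the Result holder, and the int array of per-child demands are stand-ins chosen for illustration.

```java
/** Hypothetical model of the early-exit aggregation; not the FairScheduler code. */
final class DemandSketch {

  /** Final demand plus the number of loop iterations that were needed. */
  static final class Result {
    final int demand;
    final int iterations;

    Result(int demand, int iterations) {
      this.demand = demand;
      this.iterations = iterations;
    }
  }

  /** Sums child demands but stops as soon as the cap is reached. */
  static Result cappedDemand(int[] childDemands, int maxTasks) {
    int demand = 0;
    int iterations = 0;
    for (int childDemand : childDemands) {
      iterations++;
      demand += childDemand;
      if (demand >= maxTasks) {
        // Further children cannot change the result once the cap is hit.
        return new Result(maxTasks, iterations);
      }
    }
    return new Result(demand, iterations);
  }

  public static void main(String[] args) {
    Result result = cappedDemand(new int[] {4, 5, 6, 7}, 8);
    System.out.println(result.demand + " after " + result.iterations + " iterations");
  }
}
```

The cost is one comparison per iteration; the saving grows with the number of children that come after the point where the cap is reached.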





[jira] [Commented] (MAPREDUCE-4511) Add IFile readahead

2012-08-13 Thread Ahmed Radwan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433465#comment-13433465
 ] 

Ahmed Radwan commented on MAPREDUCE-4511:
-

Here are the updated patches with the new method name.

 Add IFile readahead
 ---

 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4511_branch1.patch, 
 MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, 
 MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_branch-1_rev5.patch, 
 MAPREDUCE-4511_trunk.patch, MAPREDUCE-4511_trunk_rev2.patch, 
 MAPREDUCE-4511_trunk_rev3.patch, MAPREDUCE-4511_trunk_rev4.patch, 
 MAPREDUCE-4511_trunk_rev5.patch


 This ticket is to add IFile readahead as part of HADOOP-7714.





[jira] [Updated] (MAPREDUCE-4511) Add IFile readahead

2012-08-13 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4511:


Attachment: MAPREDUCE-4511_trunk_rev5.patch

 Add IFile readahead
 ---

 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4511_branch1.patch, 
 MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, 
 MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_branch-1_rev5.patch, 
 MAPREDUCE-4511_trunk.patch, MAPREDUCE-4511_trunk_rev2.patch, 
 MAPREDUCE-4511_trunk_rev3.patch, MAPREDUCE-4511_trunk_rev4.patch, 
 MAPREDUCE-4511_trunk_rev5.patch


 This ticket is to add IFile readahead as part of HADOOP-7714.





[jira] [Updated] (MAPREDUCE-4511) Add IFile readahead

2012-08-13 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4511:


Attachment: MAPREDUCE-4511_branch-1_rev5.patch

 Add IFile readahead
 ---

 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4511_branch1.patch, 
 MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, 
 MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_branch-1_rev5.patch, 
 MAPREDUCE-4511_trunk.patch, MAPREDUCE-4511_trunk_rev2.patch, 
 MAPREDUCE-4511_trunk_rev3.patch, MAPREDUCE-4511_trunk_rev4.patch, 
 MAPREDUCE-4511_trunk_rev5.patch


 This ticket is to add IFile readahead as part of HADOOP-7714.





[jira] [Commented] (MAPREDUCE-4511) Add IFile readahead

2012-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433470#comment-13433470
 ] 

Hadoop QA commented on MAPREDUCE-4511:
--

-1 overall.  Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12540738/MAPREDUCE-4511_trunk_rev5.patch
against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

-1 javac.  The patch appears to cause the build to fail.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2725//console

This message is automatically generated.

 Add IFile readahead
 ---

 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4511_branch1.patch, 
 MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, 
 MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_branch-1_rev5.patch, 
 MAPREDUCE-4511_trunk.patch, MAPREDUCE-4511_trunk_rev2.patch, 
 MAPREDUCE-4511_trunk_rev3.patch, MAPREDUCE-4511_trunk_rev4.patch, 
 MAPREDUCE-4511_trunk_rev5.patch


 This ticket is to add IFile readahead as part of HADOOP-7714.





[jira] [Updated] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles

2012-08-13 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4503:
---

Fix Version/s: (was: 0.23.3)

I just pulled this out of 0.23.3.  We may add it back in later once we 
determine how MAPREDUCE-4549 is to be addressed.

 Should throw InvalidJobConfException if duplicates found in cacheArchives or 
 cacheFiles
 ---

 Key: MAPREDUCE-4503
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4503
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: MR-4503.txt, MR-4503.txt


 In 1.0, if a file was in both a job's cache archives and cache files, an
 InvalidJobConfException was thrown.  We should replicate this behavior in
 mrv2.  We should also extend it so that a similar exception is thrown if a
 cache archive or cache file is not going to be downloaded at all because of
 conflicts in the names of the symlinks.
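
The check described above can be sketched roughly as follows. This is only an illustration, not the attached patch: the class `CacheConflictCheck`, the method `findConflicts`, and the rule that a symlink name is the URI fragment (falling back to the last path segment) are all assumptions, and a real implementation would presumably throw InvalidJobConfException instead of returning a list.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CacheConflictCheck {
    // Assumed naming rule: use the URI fragment if present, else the last path segment.
    // Only hierarchical URIs (with a path) are handled in this sketch.
    public static String linkName(URI uri) {
        String fragment = uri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Returns every symlink name that is claimed more than once across both lists.
    public static List<String> findConflicts(List<URI> cacheFiles, List<URI> cacheArchives) {
        Set<String> seen = new HashSet<>();
        List<String> conflicts = new ArrayList<>();
        List<URI> all = new ArrayList<>(cacheFiles);
        all.addAll(cacheArchives);
        for (URI uri : all) {
            String name = linkName(uri);
            if (!seen.add(name)) {   // add() returns false when the name was already present
                conflicts.add(name);
            }
        }
        return conflicts;
    }
}
```

A caller would reject the job configuration whenever `findConflicts` returns a non-empty list.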





[jira] [Assigned] (MAPREDUCE-4288) ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running

2012-08-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reassigned MAPREDUCE-4288:
---

Assignee: (was: Karthik Kambatla)

Couldn't get around to this; marking it as unassigned in case anyone is
interested.

I will pick it up again later if it is still available.

 ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one 
 when no job is running
 ---

 Key: MAPREDUCE-4288
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4288
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Nishan Shetty

 When no job is running in the cluster, invoke the ClusterStatus.getMapTasks()
 and ClusterStatus.getReduceTasks() APIs.
 Observed that these APIs return one instead of zero (even though no job is
 running).





[jira] [Assigned] (MAPREDUCE-4289) JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's not giving any values

2012-08-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reassigned MAPREDUCE-4289:
---

Assignee: (was: Karthik Kambatla)

Marking as unassigned in case anyone is interested in working on it.

I will come back to this when time permits, if it is still available.

 JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's not giving 
 any values
 

 Key: MAPREDUCE-4289
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4289
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Nishan Shetty

 1. Run a simple job.
 2. Invoke the JobStatus.getReduceProgress() and JobStatus.getMapProgress() APIs.
 Observe that these APIs return zeros instead of the map/reduce progress.





[jira] [Commented] (MAPREDUCE-4518) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation

2012-08-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433510#comment-13433510
 ] 

Karthik Kambatla commented on MAPREDUCE-4518:
-

Given that this concerns YARN, should I convert this into a YARN issue?

 FairScheduler: PoolSchedulable#updateDemand() - potential redundant 
 aggregation
 ---

 Key: MAPREDUCE-4518
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4518
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4518_branch1.patch, trunk-MR-4518.patch, 
 trunk-MR-4518.patch


 In FS, PoolSchedulable#updateDemand() limits the demand to maxTasks only 
 after iterating through all the pools and computing the final demand. 
 By checking if the demand has reached maxTasks in every iteration, we can 
 avoid redundant work, at the expense of one condition check every iteration.
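
A minimal sketch of the two shapes being compared, assuming non-negative per-item demands. The class and method names are hypothetical and this is not the FairScheduler code itself; it only shows that clamping at the end and exiting early give the same answer while the second can skip the rest of the loop.

```java
import java.util.List;

public class DemandSketch {
    // Current shape described in the issue: aggregate everything, then clamp once.
    public static int demandClampAtEnd(List<Integer> demands, int maxTasks) {
        int demand = 0;
        for (int d : demands) {
            demand += d;
        }
        return Math.min(demand, maxTasks);
    }

    // Proposed shape: one extra comparison per iteration, but stop once the cap is hit.
    public static int demandEarlyExit(List<Integer> demands, int maxTasks) {
        int demand = 0;
        for (int d : demands) {
            demand += d;
            if (demand >= maxTasks) {
                return maxTasks;   // remaining items cannot change the clamped result
            }
        }
        return demand;
    }
}
```

The early return is only safe because demands are assumed never to be negative; with negative contributions the running sum could drop back below the cap.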





[jira] [Updated] (MAPREDUCE-4455) RMAppImpl state machine does not handle event ATTEMPT_KILLED at ACCEPTED

2012-08-13 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4455:
--

   Fix Version/s: (was: trunk)
Target Version/s: 2.2.0-alpha
  Status: Open  (was: Patch Available)

Mayank, these transitions don't seem to be valid. I got this wrong while 
talking to you about this earlier... sorry about that.

The unit test is broken in this case. The InlineDispatcher, which is used for 
this test and others, behaves differently from the regular dispatcher. It 
handles all events inline, as opposed to handling one event to completion
before processing the next one.
For this test, the app should have transitioned to the KILLED state before
the ATTEMPT_KILLED event came in.
The DrainDispatcher seems like a much better option for unit tests.
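
The ordering difference can be illustrated with a toy dispatcher. This is a sketch under stated assumptions, not Hadoop's InlineDispatcher, AsyncDispatcher, or DrainDispatcher: the class `DispatchOrderSketch`, the string event names, and the rule that handling "KILL" emits "ATTEMPT_KILLED" are all made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class DispatchOrderSketch {
    private final boolean inline;
    private final Queue<String> queue = new ArrayDeque<>();
    private final List<String> log = new ArrayList<>();

    public DispatchOrderSketch(boolean inline) {
        this.inline = inline;
    }

    public void dispatch(String event) {
        if (inline) {
            handle(event);      // nested call: runs before the caller's handler has returned
        } else {
            queue.add(event);   // deferred until drain() reaches it
        }
    }

    private void handle(String event) {
        log.add("start " + event);
        if (event.equals("KILL")) {
            dispatch("ATTEMPT_KILLED");   // follow-up event raised while handling KILL
        }
        log.add("end " + event);
    }

    // Processes queued events one at a time, each to completion, and returns the log.
    public List<String> drain() {
        while (!queue.isEmpty()) {
            handle(queue.remove());
        }
        return log;
    }
}
```

With `inline` set to true, dispatching "KILL" logs start KILL, start ATTEMPT_KILLED, end ATTEMPT_KILLED, end KILL, so the follow-up event is handled before the first handler has finished. With `inline` set to false, `drain()` logs start KILL, end KILL, start ATTEMPT_KILLED, end ATTEMPT_KILLED.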

 RMAppImpl state machine does not handle event ATTEMPT_KILLED at ACCEPTED
 

 Key: MAPREDUCE-4455
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4455
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Reporter: Jason Lowe
Assignee: Mayank Bansal
Priority: Minor
 Attachments: MAPREDUCE-4455-trunk-v1.patch


 TestRMAppTransitions#testAppSubmittedKilled causes an invalid event exception 
 but the test doesn't catch the error since the final app state is still 
 killed.  Killed for the wrong reason, but the final state is the same.





[jira] [Updated] (MAPREDUCE-4511) Add IFile readahead

2012-08-13 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated MAPREDUCE-4511:
---

Component/s: performance

 Add IFile readahead
 ---

 Key: MAPREDUCE-4511
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2, performance
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4511_branch1.patch, 
 MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, 
 MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_branch-1_rev5.patch, 
 MAPREDUCE-4511_trunk.patch, MAPREDUCE-4511_trunk_rev2.patch, 
 MAPREDUCE-4511_trunk_rev3.patch, MAPREDUCE-4511_trunk_rev4.patch, 
 MAPREDUCE-4511_trunk_rev5.patch


 This ticket is to add IFile readahead as part of HADOOP-7714.





[jira] [Commented] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up

2012-08-13 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433599#comment-13433599
 ] 

Thomas Graves commented on MAPREDUCE-4053:
--

+1 Thanks Bobby!

 Counters group names deprecation is wrong, iterating over group names 
 deprecated names don't show up
 

 Key: MAPREDUCE-4053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans
 Attachments: MR-4053.txt


 This is similar to the deprecation of Configuration properties bug 
 HADOOP-8167: iterator() retrieval of counter names only returns new names.
 Oozie breaks here because it is using the deprecated name and iterating over 
 values (OOZIE-777). While it can be worked around easily in Oozie, this is 
 breaking backwards compatibility.
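
The symptom can be reproduced with a small stand-in for a name-aliasing store. This is a sketch only: `LegacyGroupNames` and the group names "old.Group" and "new.Group" are hypothetical, and nothing here is taken from AbstractCounters or from the attached fix.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class LegacyGroupNames {
    private final Map<String, String> legacyToNew = new HashMap<>();
    private final Map<String, Long> groups = new LinkedHashMap<>();

    public void deprecate(String oldName, String newName) {
        legacyToNew.put(oldName, newName);
    }

    // Values are always stored under the new name, whichever name the caller used.
    public void put(String name, long value) {
        groups.put(resolve(name), value);
    }

    // Direct lookup resolves deprecated names, so old callers keep working here.
    public Long get(String name) {
        return groups.get(resolve(name));
    }

    // Iteration exposes only the stored (new) names; deprecated names never appear.
    public Set<String> names() {
        return groups.keySet();
    }

    private String resolve(String name) {
        return legacyToNew.getOrDefault(name, name);
    }
}
```

A client that calls `get("old.Group")` finds its value, but a client that walks `names()` looking for "old.Group" finds nothing, which is the pattern the issue describes for Oozie.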





[jira] [Updated] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up

2012-08-13 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-4053:
-

   Resolution: Fixed
Fix Version/s: 2.2.0-alpha
   3.0.0
   2.1.0-alpha
   0.23.3
   Status: Resolved  (was: Patch Available)

 Counters group names deprecation is wrong, iterating over group names 
 deprecated names don't show up
 

 Key: MAPREDUCE-4053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans
 Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha

 Attachments: MR-4053.txt


 This is similar to the deprecation of Configuration properties bug 
 HADOOP-8167: iterator() retrieval of counter names only returns new names.
 Oozie breaks here because it is using the deprecated name and iterating over 
 values (OOZIE-777). While it can be worked around easily in Oozie, this is 
 breaking backwards compatibility.





[jira] [Commented] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433611#comment-13433611
 ] 

Hudson commented on MAPREDUCE-4053:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2639 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2639/])
MAPREDUCE-4053. Counters group names deprecation is wrong, iterating over 
group names deprecated names don't show up  (Robert Evans via tgraves) 
(Revision 1372636)

 Result = SUCCESS
tgraves : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372636
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/AbstractCounters.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestCounters.java


 Counters group names deprecation is wrong, iterating over group names 
 deprecated names don't show up
 

 Key: MAPREDUCE-4053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans
 Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha

 Attachments: MR-4053.txt


 This is similar to the deprecation of Configuration properties bug 
 HADOOP-8167: iterator() retrieval of counter names only returns new names.
 Oozie breaks here because it is using the deprecated name and iterating over 
 values (OOZIE-777). While it can be worked around easily in Oozie, this is 
 breaking backwards compatibility.





[jira] [Commented] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433622#comment-13433622
 ] 

Hudson commented on MAPREDUCE-4053:
---

Integrated in Hadoop-Common-trunk-Commit #2574 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2574/])
MAPREDUCE-4053. Counters group names deprecation is wrong, iterating over 
group names deprecated names don't show up  (Robert Evans via tgraves) 
(Revision 1372636)

 Result = SUCCESS
tgraves : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372636
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/AbstractCounters.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestCounters.java


 Counters group names deprecation is wrong, iterating over group names 
 deprecated names don't show up
 

 Key: MAPREDUCE-4053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans
 Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha

 Attachments: MR-4053.txt


 This is similar to the deprecation of Configuration properties bug 
 HADOOP-8167: iterator() retrieval of counter names only returns new names.
 Oozie breaks here because it is using the deprecated name and iterating over 
 values (OOZIE-777). While it can be worked around easily in Oozie, this is 
 breaking backwards compatibility.





[jira] [Created] (MAPREDUCE-4555) make user's mapred .staging area permissions configurable

2012-08-13 Thread Alexander Alten-Lorenz (JIRA)
Alexander Alten-Lorenz created MAPREDUCE-4555:
-

 Summary: make user's mapred .staging area permissions configurable
 Key: MAPREDUCE-4555
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4555
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client, job submission
Affects Versions: 1.0.3
Reporter: Alexander Alten-Lorenz


The directories are created in JobTracker and LocalRunner, but they are
currently forced to be 0700. There is even a segment of the source code that
checks whether the permissions are 0700 and, if not, changes them to 0700.
For monitoring purposes the permissions should be configurable.

Please note:
1. We can make the hard-coded 700 configurable at clients (it's the client
that creates it), but there are two issues here:
1.1. It violates security principles (as it's client-side and overridable).
1.2. It can't be consistent, since some users may ignore the configs provided
to them and create it with 0700.
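
For illustration, a configurable mode could be read as an octal string and turned into a permission set along these lines. This is a sketch using only the JDK: the class `StagingPermSketch`, the `DEFAULT_MODE` constant, and the idea of passing the mode as a string are assumptions, and it does not address the client-side concerns listed in the note above.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class StagingPermSketch {
    // Hypothetical default, matching the currently hard-coded value.
    public static final String DEFAULT_MODE = "700";

    // Converts an octal mode such as "700" or "0750" into a permission set.
    // Only the nine rwx bits are considered; setuid/setgid/sticky bits are ignored.
    public static Set<PosixFilePermission> parseMode(String octal) {
        int mode = Integer.parseInt(octal, 8);
        String template = "rwxrwxrwx";
        StringBuilder symbolic = new StringBuilder();
        for (int i = 0; i < 9; i++) {
            boolean set = (mode & (1 << (8 - i))) != 0;   // bit 8 is owner-read
            symbolic.append(set ? template.charAt(i) : '-');
        }
        return PosixFilePermissions.fromString(symbolic.toString());
    }
}
```

For example, `parseMode("0750")` yields the set that `PosixFilePermissions.toString` renders as `rwxr-x---`, which would let a monitoring group read the directory.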







[jira] [Updated] (MAPREDUCE-4466) Using URI for yarn.nodemanager log dirs fails

2012-08-13 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4466:
-

Attachment: MAPREDUCE-4466-trunk-v4.patch

Thanks Sid for your comments.

Incorporated all of those.

Thanks,
Mayank

 Using URI for yarn.nodemanager log dirs fails
 -

 Key: MAPREDUCE-4466
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4466
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.3
Reporter: Eli Collins
Assignee: Mayank Bansal
Priority: Minor
 Attachments: MAPREDUCE-4466-trunk-v1.patch, 
 MAPREDUCE-4466-trunk-v2.patch, MAPREDUCE-4466-trunk-v3.patch, 
 MAPREDUCE-4466-trunk-v4.patch


 If I use URIs (e.g. file:///home/eli/hadoop/dirs) for yarn.nodemanager.log-dirs 
 or yarn.nodemanager.remote-app-log-dir the container log servlet fails with 
 an NPE (works if I remove the file scheme). Using a URI for 
 yarn.nodemanager.local-dirs works.
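
The ticket does not show the failing code, so the following is only a sketch of one way to accept both forms of the setting. The class `LogDirPath` and the method `toLocalFile` are hypothetical; the code relies only on `java.net.URI` and `java.io.File`, and assumes the configured value contains no characters that are illegal in a URI (such as spaces).

```java
import java.io.File;
import java.net.URI;

public class LogDirPath {
    // Accepts either a plain path or a file: URI and returns the local file it names.
    public static File toLocalFile(String configured) {
        URI uri = URI.create(configured);
        if ("file".equals(uri.getScheme())) {
            return new File(uri.getPath());   // strip the scheme, keep only the path part
        }
        return new File(configured);          // no file scheme: treat the value as a plain path
    }
}
```

With this normalization, `file:///home/eli/hadoop/dirs` and `/home/eli/hadoop/dirs` resolve to the same `File`.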





[jira] [Updated] (MAPREDUCE-4466) Using URI for yarn.nodemanager log dirs fails

2012-08-13 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4466:
-

Status: Patch Available  (was: Open)

 Using URI for yarn.nodemanager log dirs fails
 -

 Key: MAPREDUCE-4466
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4466
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.3
Reporter: Eli Collins
Assignee: Mayank Bansal
Priority: Minor
 Attachments: MAPREDUCE-4466-trunk-v1.patch, 
 MAPREDUCE-4466-trunk-v2.patch, MAPREDUCE-4466-trunk-v3.patch, 
 MAPREDUCE-4466-trunk-v4.patch


 If I use URIs (e.g. file:///home/eli/hadoop/dirs) for yarn.nodemanager.log-dirs 
 or yarn.nodemanager.remote-app-log-dir the container log servlet fails with 
 an NPE (works if I remove the file scheme). Using a URI for 
 yarn.nodemanager.local-dirs works.





[jira] [Commented] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433694#comment-13433694
 ] 

Hudson commented on MAPREDUCE-4053:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2597 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2597/])
MAPREDUCE-4053. Counters group names deprecation is wrong, iterating over 
group names deprecated names don't show up  (Robert Evans via tgraves) 
(Revision 1372636)

 Result = FAILURE
tgraves : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372636
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/AbstractCounters.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestCounters.java


 Counters group names deprecation is wrong, iterating over group names 
 deprecated names don't show up
 

 Key: MAPREDUCE-4053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans
 Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha

 Attachments: MR-4053.txt


 This is similar to the deprecation of Configuration properties bug 
 HADOOP-8167: iterator() retrieval of counter names only returns new names.
 Oozie breaks here because it is using the deprecated name and iterating over 
 values (OOZIE-777). While it can be worked around easily in Oozie, this is 
 breaking backwards compatibility.





[jira] [Updated] (MAPREDUCE-4367) mapred job -kill tries to connect to history server

2012-08-13 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated MAPREDUCE-4367:
-

Attachment: MAPREDUCE-4367-trunk-v2.patch

Fixing test

Thanks,
Mayank

 mapred job -kill tries to connect to history server
 ---

 Key: MAPREDUCE-4367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Assignee: Mayank Bansal
Priority: Minor
 Fix For: trunk

 Attachments: MAPREDUCE-4367-trunk-v1.patch, 
 MAPREDUCE-4367-trunk-v2.patch


 The {{mapred job -kill}} command attempts to connect to the history server, 
 even though it is unrelated to the process of killing a job.





[jira] [Updated] (MAPREDUCE-4518) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation

2012-08-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4518:


Attachment: (was: trunk-MR-4518.patch)

 FairScheduler: PoolSchedulable#updateDemand() - potential redundant 
 aggregation
 ---

 Key: MAPREDUCE-4518
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4518
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4518_branch1.patch, trunk-MR-4518.patch, 
 trunk-MR-4518.patch


 In FS, PoolSchedulable#updateDemand() limits the demand to maxTasks only 
 after iterating through all the pools and computing the final demand. 
 By checking if the demand has reached maxTasks in every iteration, we can 
 avoid redundant work, at the expense of one condition check every iteration.





[jira] [Updated] (MAPREDUCE-4518) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation

2012-08-13 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4518:


Attachment: trunk-MR-4518.patch

 FairScheduler: PoolSchedulable#updateDemand() - potential redundant 
 aggregation
 ---

 Key: MAPREDUCE-4518
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4518
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4518_branch1.patch, trunk-MR-4518.patch, 
 trunk-MR-4518.patch


 In FS, PoolSchedulable#updateDemand() limits the demand to maxTasks only 
 after iterating through all the pools and computing the final demand. 
 By checking if the demand has reached maxTasks in every iteration, we can 
 avoid redundant work, at the expense of one condition check every iteration.





[jira] [Updated] (MAPREDUCE-3202) Integrating Hadoop Vaidya with JobHistory Server

2012-08-13 Thread vitthal (Suhas) Gogate (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vitthal (Suhas) Gogate updated MAPREDUCE-3202:
--

Affects Version/s: 0.20.205.0

 Integrating Hadoop Vaidya with JobHistory Server
 

 Key: MAPREDUCE-3202
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3202
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: jobhistoryserver
Affects Versions: 0.20.205.0, 1.0.0
Reporter: vitthal (Suhas) Gogate

 At present, the jobdetailshistory page served by the JobHistory Server provides
 elementary job analysis through the link "Analyze This Job". Hadoop Vaidya
 provides a detailed analysis of the M/R job in terms of various execution
 inefficiencies and the associated remedies that users can easily understand
 and fix. Integrating Hadoop Vaidya with the JobHistory Server would really
 improve the usability of this tool and also help many novice users
 understand the performance problems and/or best-practice violations
 associated with their jobs.
 Integration would also aim at providing users a convenient interface where
 they can manage the existing rules as well as write their own new rules.
 During my tenure at Yahoo, the Vaidya tool was successfully deployed in
 production, analyzing tens of thousands of jobs every day with many more
 useful rules than the sample ones present in the contrib project. Many of
 these rules are open sourced already (big thanks to Yahoo! MAPREDUCE-1530)
 but have yet to be integrated with the tool.
 I will add more design details for this feature in the near future as I work
 towards getting a prototype running. Any thoughts/comments are welcome.





[jira] [Commented] (MAPREDUCE-4367) mapred job -kill tries to connect to history server

2012-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433785#comment-13433785
 ] 

Hadoop QA commented on MAPREDUCE-4367:
--

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12540794/MAPREDUCE-4367-trunk-v2.patch
against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2726//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2726//console

This message is automatically generated.

 mapred job -kill tries to connect to history server
 ---

 Key: MAPREDUCE-4367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Assignee: Mayank Bansal
Priority: Minor
 Fix For: trunk

 Attachments: MAPREDUCE-4367-trunk-v1.patch, 
 MAPREDUCE-4367-trunk-v2.patch


 The {{mapred job -kill}} command attempts to connect to the history server, 
 even though it is unrelated to the process of killing a job.
