[jira] [Commented] (MAPREDUCE-4460) Refresh queue throws IO exception after configuring wrong queue capacity
[ https://issues.apache.org/jira/browse/MAPREDUCE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432985#comment-13432985 ]

nemon lou commented on MAPREDUCE-4460:

Same as MAPREDUCE-3763. Any updates here?

Refresh queue throws IO exception after configuring wrong queue capacity
Key: MAPREDUCE-4460
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4460
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.3, 2.1.0-alpha
Reporter: Nishan Shetty
Assignee: Arun C Murthy
Priority: Critical

Scenario:
1. My setup has queues a and b (each with a capacity of, say, 50%) under the root queue.
2. Start the process.
3. Add one more queue 'c' under root.
4. Configure some capacity for 'c' such that the total capacity of a, b, c is not equal to 100.
5. Now do refresh queues; it throws an exception for the wrong capacity (this is expected, as the total capacity was not equal to 100).
6. Now reconfigure the queue capacities of a, b, c such that the total capacity is 100.
7. Now do refresh queues again.

Observed that it throws an IO exception:
{noformat}
java.io.IOException: Failed to re-init queues
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:216)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:174)
    at org.apache.hadoop.yarn.server.resourcemanager.api.impl.pb.service.RMAdminProtocolPBServiceImpl.refreshQueues(RMAdminProtocolPBServiceImpl.java:62)
    at org.apache.hadoop.yarn.proto.RMAdminProtocol$RMAdminProtocolService$2.callBlockingMethod(RMAdminProtocol.java:122)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
Caused by: org.apache.hadoop.metrics2.MetricsException: Metrics source QueueMetrics,q0=root,q1=c already exists!
    at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:126)
    at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:107)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:216)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.forQueue(QueueMetrics.java:129)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.forQueue(QueueMetrics.java:119)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.init(LeafQueue.java:136)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:313)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:328)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:246)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:213)
    ... 11 more
at LocalTrace:
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Failed to re-init queues
    at org.apache.hadoop.yarn.factories.impl.pb.YarnRemoteExceptionFactoryPBImpl.createYarnRemoteException(YarnRemoteExceptionFactoryPBImpl.java:50)
    at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:40)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:184)
    at org.apache.hadoop.yarn.server.resourcemanager.api.impl.pb.service.RMAdminProtocolPBServiceImpl.refreshQueues(RMAdminProtocolPBServiceImpl.java:62)
    at org.apache.hadoop.yarn.proto.RMAdminProtocol$RMAdminProtocolService$2.callBlockingMethod(RMAdminProtocol.java:122)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
    at
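The "already exists!" cause in the MAPREDUCE-4460 trace suggests that the first, failed refresh left a metrics source for queue c registered, so the second refresh collides with it. The following is only a minimal sketch of that failure mode and of one tolerant alternative (get-or-register). SourceRegistry and its methods are hypothetical stand-ins, not Hadoop's DefaultMetricsSystem, and this is not necessarily how the issue was resolved.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a metrics registry; not Hadoop's DefaultMetricsSystem. */
final class SourceRegistry {
    private final Map<String, Object> sources = new ConcurrentHashMap<>();

    /** Strict registration: fails when the name is already taken, like the error in the trace. */
    void registerStrict(String name, Object source) {
        if (sources.putIfAbsent(name, source) != null) {
            throw new IllegalStateException("Metrics source " + name + " already exists!");
        }
    }

    /** Tolerant registration: returns the source left behind by an earlier attempt, if any. */
    Object getOrRegister(String name, Object source) {
        Object existing = sources.putIfAbsent(name, source);
        return existing != null ? existing : source;
    }
}
```

With the strict variant, registering "QueueMetrics,q0=root,q1=c" a second time throws; with the tolerant variant, the second call simply gets back the first source.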
[jira] [Commented] (MAPREDUCE-3542) Support FileSystemCounter legacy counter group name for compatibility
[ https://issues.apache.org/jira/browse/MAPREDUCE-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432987#comment-13432987 ]

Jarek Jarcec Cecho commented on MAPREDUCE-3542:

Hi guys, I was investigating a related issue in the Sqoop project (http://sqoop.apache.org/). Basically, we report the number of written filesystem bytes back to the user, and on Hadoop 0.23/2.x we always get 0. I noticed that the FileSystem counter code had been refactored and found this issue requesting backward compatibility. The included patch seems to add the counter group name FileSystemCounter:

{code:title=hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/AbstractCounters.java:84}
legacyMap.put("FileSystemCounter", FileSystemCounter.class.getName());
{code}

But it appears that the original name is FileSystemCounters (notice the plural "s" at the end of the name):

{code:title=src/mapred/org/apache/hadoop/mapred/Task.java:91 (0.20.2)}
protected static final String FILESYSTEM_COUNTER_GROUP = "FileSystemCounters";
{code}

{code:title=src/mapred/org/apache/hadoop/mapred/Task.java:109 (1.0.3)}
protected static final String FILESYSTEM_COUNTER_GROUP = "FileSystemCounters";
{code}

I therefore believe that this legacy name should be renamed in order to provide backward compatibility. I could work around this discrepancy in Sqoop, but I believe that other projects and users might also be affected, so it would be better to fix it upstream.

I wanted to reopen this ticket, but apparently I do not have enough privileges to do so. Could I ask anyone with the proper privileges to do that, or should I create a new JIRA instead?

Support FileSystemCounter legacy counter group name for compatibility
Key: MAPREDUCE-3542
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3542
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Tom White
Assignee: Tom White
Fix For: 0.23.1
Attachments: MAPREDUCE-3542.patch

The group name changed from FileSystemCounter to org.apache.hadoop.mapreduce.FileSystemCounter, but we should support the old one for compatibility's sake. This came up in PIG-2347.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
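To make the naming discrepancy concrete, here is a minimal sketch of a legacy-name lookup. LegacyGroupNames and resolve are hypothetical names and this is not the actual AbstractCounters code. Registering both spellings is an assumption made for illustration; the comment above proposes renaming the entry to FileSystemCounters.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a legacy group-name lookup; not the actual AbstractCounters code. */
final class LegacyGroupNames {
    /** New group name, as stated in the issue description. */
    static final String NEW_FS_GROUP = "org.apache.hadoop.mapreduce.FileSystemCounter";

    private static final Map<String, String> LEGACY = new HashMap<>();
    static {
        // Name registered by the original patch.
        LEGACY.put("FileSystemCounter", NEW_FS_GROUP);
        // Name used by Task.java in 0.20.2 and 1.0.3 (note the trailing "s").
        LEGACY.put("FileSystemCounters", NEW_FS_GROUP);
    }

    /** Maps a legacy group name to its current name; unknown names pass through unchanged. */
    static String resolve(String groupName) {
        return LEGACY.getOrDefault(groupName, groupName);
    }
}
```

A client that still asks for "FileSystemCounters" would then be routed to the new group instead of reading an empty one.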
[jira] [Updated] (MAPREDUCE-3542) Support FileSystemCounter legacy counter group name for compatibility
[ https://issues.apache.org/jira/browse/MAPREDUCE-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Jarcec Cecho updated MAPREDUCE-3542:

Attachment: MAPREDUCE-3542-name-fix.patch

I've provided a fix that renames the backward-compatible name to the original name present in Hadoop 0.20 and Hadoop 1.0.3. I've also checked the linked Pig sources to make sure this change won't break them. They seem to have special classes for handling differences between Hadoop versions:

{code:title=shims/src/hadoop23/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java:83}
static public String getFsCounterGroupName() { return "org.apache.hadoop.mapreduce.FileSystemCounter"; }
{code}

{code:title=shims/src/hadoop20/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java:85}
static public String getFsCounterGroupName() { return "FileSystemCounters"; }
{code}

I therefore believe that this change will not affect Pig.

Support FileSystemCounter legacy counter group name for compatibility
Key: MAPREDUCE-3542
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3542
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Tom White
Assignee: Tom White
Fix For: 0.23.1
Attachments: MAPREDUCE-3542-name-fix.patch, MAPREDUCE-3542.patch
[jira] [Created] (MAPREDUCE-4549) Distributed cache conflicts breaks backwards compatability
Robert Joseph Evans created MAPREDUCE-4549:

Summary: Distributed cache conflicts breaks backwards compatability
Key: MAPREDUCE-4549
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4549
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Critical

I recently put in MAPREDUCE-4503, which went a bit too far and broke backwards compatibility with 1.0 for distributed cache entries. This issue is to change the behavior of the distributed cache to more closely match that of 1.0.

In 1.0, when adding a cache archive link, the first link would win (be the one that was created), not the last one as is the current behavior. When there were conflicts, all of the others were ignored and simply did not get a symlink created. Finally, no symlink was created for archives that did not have a fragment in the URL.

To simulate this behavior, after we parse the cache files and cache archives configuration we should walk through all conflicting links and pick the first link that has a fragment to win. If no link has a fragment, then the first link wins. All other conflicting links will produce a warning, and the name of the link will be changed to include a UUID. If the same file is in the distributed cache as both a cache file and a cache archive, we will throw an exception, for backwards compatibility.
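The conflict rule proposed in MAPREDUCE-4549 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: CacheLinkResolver is a hypothetical helper, the default link name (last path component when there is no fragment) and the "name-UUID" renaming format are guesses, and the cache-file versus cache-archive check is left out.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical helper illustrating the proposed conflict rule; not the actual Hadoop patch. */
final class CacheLinkResolver {

    /** Link name: the URI fragment if present, otherwise the last path component (assumption). */
    static String linkName(URI uri) {
        if (uri.getFragment() != null) {
            return uri.getFragment();
        }
        String path = uri.getPath(); // assumes hierarchical URIs such as hdfs://nn/dir/file
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Maps each link name to the URI it should point at, renaming the losers of a conflict. */
    static Map<String, URI> resolve(List<URI> entries) {
        // Group entries by link name, keeping configuration order.
        Map<String, List<URI>> byName = new LinkedHashMap<>();
        for (URI uri : entries) {
            byName.computeIfAbsent(linkName(uri), k -> new ArrayList<>()).add(uri);
        }
        Map<String, URI> links = new LinkedHashMap<>();
        for (Map.Entry<String, List<URI>> group : byName.entrySet()) {
            List<URI> candidates = group.getValue();
            // The first entry with a fragment wins; if none has one, the first entry wins.
            int winner = 0;
            for (int i = 0; i < candidates.size(); i++) {
                if (candidates.get(i).getFragment() != null) {
                    winner = i;
                    break;
                }
            }
            links.put(group.getKey(), candidates.get(winner));
            for (int i = 0; i < candidates.size(); i++) {
                if (i != winner) {
                    // Losing entries get a warning and a UUID-qualified link name.
                    String renamed = group.getKey() + "-" + UUID.randomUUID();
                    System.err.println("WARN: link name conflict for " + candidates.get(i)
                        + ", using " + renamed);
                    links.put(renamed, candidates.get(i));
                }
            }
        }
        return links;
    }
}
```

For example, given hdfs://nn/b/lib followed by hdfs://nn/a/lib.jar#lib, both want the link name "lib"; the second entry wins because it carries a fragment, and the first is kept under a renamed link.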
[jira] [Commented] (MAPREDUCE-4538) add Legacy Counter support to getGroupNames
[ https://issues.apache.org/jira/browse/MAPREDUCE-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433174#comment-13433174 ]

Robert Joseph Evans commented on MAPREDUCE-4538:

Correct, and since this JIRA is still messed up, I will probably just switch over to MAPREDUCE-4053 and post my patch there.

add Legacy Counter support to getGroupNames
Key: MAPREDUCE-4538
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4538
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.3, 2.1.0-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Attachments: MR-4538.txt

Oozie loops through counters using getGroupNames(). This does not include legacy counter names, so they get missed, which can result in a backwards compatibility issue in the Oozie counter API.
[jira] [Commented] (MAPREDUCE-4332) Add a yarn-client module
[ https://issues.apache.org/jira/browse/MAPREDUCE-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433179#comment-13433179 ]

Jason Lowe commented on MAPREDUCE-4332:

Took a quick look; in general it looks great. A couple of comments:
* Need to update the patch now that YARN has moved out of MapReduce in YARN-1.
* We should probably mark YarnClient as Public and Unstable or Evolving.
* It would also be nice to add javadocs to the interface methods, since most other public interfaces have them.

Add a yarn-client module
Key: MAPREDUCE-4332
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4332
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client, mrv2
Affects Versions: 2.0.0-alpha
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Fix For: 2.1.0-alpha
Attachments: MAPREDUCE-4332-20120621.txt, MAPREDUCE-4332-20120621-with-common-changes.txt, MAPREDUCE-4332-20120622.txt, MAPREDUCE-4332-20120625.txt

I see that we are duplicating (some) code for talking to the RM via the client API. In this light, a yarn-client module will be useful so that clients of all frameworks can use/extend it. That same module can also be the destination for all of YARN's command line tools.
[jira] [Assigned] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
[ https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans reassigned MAPREDUCE-4053:

Assignee: Robert Joseph Evans

Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
Key: MAPREDUCE-4053
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans

This is similar to the Configuration properties deprecation bug HADOOP-8167: iterator() retrieval of counter names only returns new names. Oozie breaks here because it is using the deprecated name and iterating over values (OOZIE-777). While it can be worked around easily in Oozie, this is breaking backwards compatibility.
[jira] [Updated] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
[ https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4053:

Attachment: MR-4053.txt

This adds in deprecated names to the getCounterNames.

Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
Key: MAPREDUCE-4053
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans
Attachments: MR-4053.txt
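The behavior requested in MAPREDUCE-4053 and MAPREDUCE-4538 (iteration over group names should also surface deprecated names) can be sketched as below. GroupNameView and its parameters are hypothetical and do not reflect the contents of MR-4053.txt; exposing a legacy alias only when its current group is present is an assumption.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: group-name iteration that also exposes deprecated aliases. */
final class GroupNameView {

    /**
     * @param current     names of the groups that actually exist
     * @param legacyToNew deprecated group name mapped to its current name
     * @return the current names followed by every legacy alias of a group that is present
     */
    static Set<String> groupNames(Set<String> current, Map<String, String> legacyToNew) {
        Set<String> names = new LinkedHashSet<>(current);
        for (Map.Entry<String, String> alias : legacyToNew.entrySet()) {
            if (current.contains(alias.getValue())) {
                names.add(alias.getKey());
            }
        }
        return names;
    }
}
```

A caller such as Oozie that loops over the returned set and matches on an old group name would then find it again.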
[jira] [Updated] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
[ https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4053:

Target Version/s: 0.23.3, 2.1.0-alpha, 3.0.0
Status: Patch Available (was: Open)

Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
Key: MAPREDUCE-4053
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans
Attachments: MR-4053.txt
[jira] [Commented] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
[ https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433202#comment-13433202 ]

Hadoop QA commented on MAPREDUCE-4053:

+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12540665/MR-4053.txt against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2724//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2724//console

This message is automatically generated.

Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
Key: MAPREDUCE-4053
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.24.0, 0.23.3
Reporter: Alejandro Abdelnur
Assignee: Robert Joseph Evans
Attachments: MR-4053.txt
[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection
[ https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433237#comment-13433237 ]

Benoy Antony commented on MAPREDUCE-4491:

To make reviewing this patch easier, I am dividing it into smaller patches. I am opening sub-tasks under this jira issue and attaching the patches to those jiras.

Encryption and Key Protection
Key: MAPREDUCE-4491
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: documentation, security, task-controller, tasktracker
Reporter: Benoy Antony
Assignee: Benoy Antony
Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf, MR_4491_1.1.patch, MR_4491_trunk.patch

When dealing with sensitive data, it is required to keep the data encrypted wherever it is stored. A common use case is to pull encrypted data out of a data source and store it in HDFS for analysis. The keys are stored in an external keystore.

The feature adds a customizable framework to integrate different types of keystores, support for Java KeyStore, reading keys from keystores, and transporting keys from JobClient to Tasks. The feature also adds PGP encryption as a codec and additional utilities to perform encryption-related steps.

The design document is attached. It explains the requirements, design and use cases. Kindly review and comment. Collaboration is very much welcome. I have a tested patch for this for 1.1 and will upload it soon as initial work for further refinement.
[jira] [Created] (MAPREDUCE-4550) Key Protection : Define Encryption and Key Protection interfaces and default implementations
Benoy Antony created MAPREDUCE-4550:

Summary: Key Protection : Define Encryption and Key Protection interfaces and default implementations
Key: MAPREDUCE-4550
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4550
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony

A secret key is read from a KeyStore and then encrypted during transport between JobClient and Task. The tasktrackers/nodemanagers decrypt the secrets and provide them to the child tasks which are part of the job.

This jira defines the interfaces to accomplish the above:
1) KeyProvider - to read keys from a KeyStore
2) Encrypter and Decrypter - to encrypt and decrypt secrets/data

The default/dummy implementations will also be added. This includes a KeyProvider implementation to read keys from a Java KeyStore.
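As a rough illustration of the KeyProvider idea, the sketch below reads a key by alias from a java.security.KeyStore. The interface shape (a single getKey method returning raw bytes) and the class name JavaKeyStoreProvider are assumptions made for this example; the attached patches may define the interfaces quite differently.

```java
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;
import java.security.KeyStoreException;

/** Assumed shape of the KeyProvider interface; the real signature may differ. */
interface KeyProvider {
    byte[] getKey(String alias) throws GeneralSecurityException;
}

/** Hypothetical KeyProvider backed by an already loaded Java KeyStore. */
final class JavaKeyStoreProvider implements KeyProvider {
    private final KeyStore keyStore;
    private final char[] keyPassword;

    JavaKeyStoreProvider(KeyStore keyStore, char[] keyPassword) {
        this.keyStore = keyStore;
        this.keyPassword = keyPassword.clone();
    }

    @Override
    public byte[] getKey(String alias) throws GeneralSecurityException {
        // KeyStore.getKey returns null when the alias is unknown or is not a key entry.
        Key key = keyStore.getKey(alias, keyPassword);
        if (key == null) {
            throw new KeyStoreException("No key found for alias " + alias);
        }
        return key.getEncoded();
    }
}
```

A different keystore mechanism would plug in as another implementation of the same interface.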
[jira] [Updated] (MAPREDUCE-4550) Key Protection : Define Encryption and Key Protection interfaces and default implementations
[ https://issues.apache.org/jira/browse/MAPREDUCE-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoy Antony updated MAPREDUCE-4550:

Attachment: MR_4550_1_1.patch
            MR_4550_trunk.patch

Key Protection : Define Encryption and Key Protection interfaces and default implementations
Key: MAPREDUCE-4550
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4550
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony
Attachments: MR_4550_1_1.patch, MR_4550_trunk.patch
[jira] [Created] (MAPREDUCE-4551) Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers
Benoy Antony created MAPREDUCE-4551:

Summary: Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers
Key: MAPREDUCE-4551
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4551
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony

The following requirements are addressed:
• Plug in different key store mechanisms.
• Retrieve specified keys from a configured keystore as part of job submission.
• Protect keys during their transport through the cluster.
• Make sure that keys are handed over only to the tasks of the correct job.

Based on cluster configuration, NodeManagers/TaskTrackers set up Decrypters to decrypt the job's secrets. Based on job configuration, JobClient reads secrets from a KeyStore using a KeyProvider implementation and encrypts them using the cluster's public key. The encrypted secrets are stored in the Job Credentials.
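The transport step described above (JobClient encrypts with the cluster's public key, NodeManagers/TaskTrackers decrypt) can be sketched with the JDK's own crypto classes. This is a minimal sketch under assumptions: the issue does not say which algorithm is used, so RSA with OAEP padding is chosen here purely for illustration, and SecretTransport is a hypothetical name rather than a class from the patch.

```java
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

/** Hypothetical sketch of protecting a job secret in transit; the algorithm is an assumption. */
final class SecretTransport {
    // Assumed transformation; the issue text does not name an algorithm.
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Client side: encrypt the secret with the cluster's public key. */
    static byte[] encrypt(PublicKey clusterPublicKey, byte[] secret)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, clusterPublicKey);
        return cipher.doFinal(secret);
    }

    /** Node side: recover the secret with the cluster's private key. */
    static byte[] decrypt(PrivateKey clusterPrivateKey, byte[] encrypted)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, clusterPrivateKey);
        return cipher.doFinal(encrypted);
    }
}
```

Only holders of the cluster's private key can recover the secret, which is what lets the encrypted bytes travel inside the job's credentials.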
[jira] [Updated] (MAPREDUCE-4551) Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers
[ https://issues.apache.org/jira/browse/MAPREDUCE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoy Antony updated MAPREDUCE-4551:

Description:
Based on cluster configuration, NodeManagers/TaskTrackers set up Decrypters to decrypt the job's secrets. Based on job configuration, JobClient reads secrets from a KeyStore using a KeyProvider implementation and encrypts them using the cluster's public key. The encrypted secrets are stored in the Job Credentials.

The task addresses the following requirements:
• Plug in different key store mechanisms.
• Retrieve specified keys from a configured keystore as part of job submission.
• Protect keys during their transport through the cluster.
• Make sure that keys are handed over only to the tasks of the correct job.

was:
The following requirements are addressed:
• Plug in different key store mechanisms.
• Retrieve specified keys from a configured keystore as part of job submission.
• Protect keys during their transport through the cluster.
• Make sure that keys are handed over only to the tasks of the correct job.

Based on cluster configuration, NodeManagers/TaskTrackers set up Decrypters to decrypt the job's secrets. Based on job configuration, JobClient reads secrets from a KeyStore using a KeyProvider implementation and encrypts them using the cluster's public key. The encrypted secrets are stored in the Job Credentials.

Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers
Key: MAPREDUCE-4551
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4551
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony
Attachments: MR_4551_1_1.patch, MR_4551_trunk.patch
[jira] [Updated] (MAPREDUCE-4551) Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers
[ https://issues.apache.org/jira/browse/MAPREDUCE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoy Antony updated MAPREDUCE-4551:

Attachment: MR_4551_trunk.patch
            MR_4551_1_1.patch

Key Protection : Add ability to read keys and protect keys in JobClient and TTS/NodeManagers
Key: MAPREDUCE-4551
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4551
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: job submission, security
Reporter: Benoy Antony
Assignee: Benoy Antony
Attachments: MR_4551_1_1.patch, MR_4551_trunk.patch
[jira] [Created] (MAPREDUCE-4552) Encryption: Add support for PGP Encryption
Benoy Antony created MAPREDUCE-4552:

Summary: Encryption: Add support for PGP Encryption
Key: MAPREDUCE-4552
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4552
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony

Provide support for PGP encryption by implementing the Encrypter and Decrypter interfaces defined in MAPREDUCE-4550. This can be used by the cluster to protect the job secrets. It can also be used by MapReduce jobs to encrypt/decrypt data.

Add PGPCodec as a CompressionCodec so that encrypted data can be processed transparently, like compressed data. The aliases of the keys can be specified as part of the Job.

Based on PGPCodec, a number of utilities are provided to encrypt and decrypt data in the cluster. They include:
1. DistributedSplitter - split an encrypted file into smaller files.
2. DistributedEncrypter - encrypt files in a cluster.
3. DistributedDecrypter - decrypt encrypted files in a cluster.
4. DistributedRecrypter - decrypt an encrypted file and encrypt it with another key.

Utilities are added to encrypt/decrypt files in the local file system:
1. Genkey - generate an asymmetric key pair (public and private keys) of a specified strength.
2. Encrypt - encrypt a file.
3. Decrypt - decrypt a file.

Added as a contrib project - hadoop-crypto.
[jira] [Updated] (MAPREDUCE-4552) Encryption: Add support for PGP Encryption
[ https://issues.apache.org/jira/browse/MAPREDUCE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoy Antony updated MAPREDUCE-4552:

Attachment: MR_4552_1_1.patch
            MR_4552_trunk.patch

Encryption: Add support for PGP Encryption
Key: MAPREDUCE-4552
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4552
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony
Attachments: MR_4552_1_1.patch, MR_4552_trunk.patch
[jira] [Created] (MAPREDUCE-4553) Key Protection : Implement KeyProvider to read key from a WebService Based KeyStore
Benoy Antony created MAPREDUCE-4553: --- Summary: Key Protection : Implement KeyProvider to read key from a WebService Based KeyStore Key: MAPREDUCE-4553 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4553 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: job submission, security Reporter: Benoy Antony Assignee: Benoy Antony

Normally, keys have to be stored in a central location using a custom key management system. Organizations can implement KeyProvider to integrate their custom key management system with Hadoop. This interface is specified in MAPREDUCE-4550.

Optionally, developers can use Safe to integrate a custom key management system with Hadoop. Safe is an open source, web-service-based keystore for securely storing secret keys and passwords. Safe authenticates the user using SPNEGO, checks whether the user is authorized to read the secret, and returns the secret. It is easy to plug in different mechanisms for authentication, authorization, and key storage.

Safe is kept as a separate open source project (http://benoyantony.github.com/safe/). The Hadoop proxy to Safe is added as a contrib project: hadoop-safe.
[jira] [Updated] (MAPREDUCE-4553) Key Protection : Implement KeyProvider to read key from a WebService Based KeyStore
[ https://issues.apache.org/jira/browse/MAPREDUCE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated MAPREDUCE-4553: Attachment: MR_4553_trunk.patch MR_4553_1_1.patch Key Protection : Implement KeyProvider to read key from a WebService Based KeyStore Key: MAPREDUCE-4553 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4553 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: job submission, security Reporter: Benoy Antony Assignee: Benoy Antony Attachments: MR_4553_1_1.patch, MR_4553_trunk.patch
[jira] [Updated] (MAPREDUCE-4491) Encryption and Key Protection
[ https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated MAPREDUCE-4491: Attachment: (was: MR_4491_1.1.patch) Encryption and Key Protection - Key: MAPREDUCE-4491 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491 Project: Hadoop Map/Reduce Issue Type: New Feature Components: documentation, security, task-controller, tasktracker Reporter: Benoy Antony Assignee: Benoy Antony Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf When dealing with sensitive data, it is required to keep the data encrypted wherever it is stored. Common use case is to pull encrypted data out of a datasource and store in HDFS for analysis. The keys are stored in an external keystore. The feature adds a customizable framework to integrate different types of keystores, support for Java KeyStore, read keys from keystores, and transport keys from JobClient to Tasks. The feature adds PGP encryption as a codec and additional utilities to perform encryption related steps. The design document is attached. It explains the requirement, design and use cases. Kindly review and comment. Collaboration is very much welcome. I have a tested patch for this for 1.1 and will upload it soon as an initial work for further refinement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4491) Encryption and Key Protection
[ https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated MAPREDUCE-4491: Attachment: (was: MR_4491_trunk.patch) Encryption and Key Protection - Key: MAPREDUCE-4491 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491 Project: Hadoop Map/Reduce Issue Type: New Feature Components: documentation, security, task-controller, tasktracker Reporter: Benoy Antony Assignee: Benoy Antony Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf
[jira] [Updated] (MAPREDUCE-4491) Encryption and Key Protection
[ https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated MAPREDUCE-4491: Description: When dealing with sensitive data, it is required to keep the data encrypted wherever it is stored. Common use case is to pull encrypted data out of a datasource and store in HDFS for analysis. The keys are stored in an external keystore. The feature adds a customizable framework to integrate different types of keystores, support for Java KeyStore, read keys from keystores, and transport keys from JobClient to Tasks. The feature adds PGP encryption as a codec and additional utilities to perform encryption related steps. The design document is attached. It explains the requirement, design and use cases. Kindly review and comment. Collaboration is very much welcome. I have a tested patch for this for 1.1 and will upload it soon as an initial work for further refinement. Update: The patches are uploaded to subtasks. was: When dealing with sensitive data, it is required to keep the data encrypted wherever it is stored. Common use case is to pull encrypted data out of a datasource and store in HDFS for analysis. The keys are stored in an external keystore. The feature adds a customizable framework to integrate different types of keystores, support for Java KeyStore, read keys from keystores, and transport keys from JobClient to Tasks. The feature adds PGP encryption as a codec and additional utilities to perform encryption related steps. The design document is attached. It explains the requirement, design and use cases. Kindly review and comment. Collaboration is very much welcome. I have a tested patch for this for 1.1 and will upload it soon as an initial work for further refinement. 
Encryption and Key Protection - Key: MAPREDUCE-4491 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491 Project: Hadoop Map/Reduce Issue Type: New Feature Components: documentation, security, task-controller, tasktracker Reporter: Benoy Antony Assignee: Benoy Antony Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf
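The description mentions support for the Java KeyStore and reading keys from keystores. As a rough illustration only, the sketch below stores a secret key under an alias in an in-memory JCEKS keystore and reads it back with the standard java.security.KeyStore API. The class KeyStoreSketch, the alias, and the password are hypothetical; the framework in the patches may organize this differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.util.Arrays;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper; illustrates keystore access, not the patch's KeyProvider. */
final class KeyStoreSketch {

    /** Serializes a new JCEKS keystore holding one secret key under the given alias. */
    static byte[] writeStore(String alias, SecretKey key, char[] password) {
        try {
            KeyStore store = KeyStore.getInstance("JCEKS");
            store.load(null, null); // start from an empty in-memory keystore
            store.setKeyEntry(alias, key, password, null); // no certificate chain for secret keys
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            store.store(out, password);
            return out.toByteArray();
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Loads the keystore bytes and returns the key stored under the alias, or null if absent. */
    static SecretKey readKey(byte[] storeBytes, String alias, char[] password) {
        try {
            KeyStore store = KeyStore.getInstance("JCEKS");
            store.load(new ByteArrayInputStream(storeBytes), password);
            return (SecretKey) store.getKey(alias, password);
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        char[] password = "changeit".toCharArray(); // illustrative password only
        SecretKey key = new SecretKeySpec(new byte[16], "AES"); // all-zero demo key
        byte[] storeBytes = writeStore("job-secret", key, password);
        SecretKey loaded = readKey(storeBytes, "job-secret", password);
        System.out.println("round trip ok: " + Arrays.equals(key.getEncoded(), loaded.getEncoded()));
    }
}
```

In a job-submission setting, the client side would read the key this way and hand it to the tasks; that transport step is outside this sketch.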
[jira] [Created] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off
Benoy Antony created MAPREDUCE-4554: --- Summary: Job Credentials are not transmitted if security is turned off Key: MAPREDUCE-4554 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission, security Reporter: Benoy Antony Assignee: Benoy Antony

Credentials (secret keys) can be passed to a job via mapreduce.job.credentials.json or mapreduce.job.credentials.binary. These credentials are submitted during job submission and are made available to the task processes.

In Hadoop 1, these credentials were submitted and routed to task processes even if security was off. In Hadoop 2, they are transmitted only when security is turned on.

This should be changed for two reasons:
1) It is not backward compatible.
2) Credentials should be passed even if security is turned off.
[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off
[ https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated MAPREDUCE-4554: Attachment: MR_4554_1_1.patch MR_4554_trunk.patch Job Credentials are not transmitted if security is turned off - Key: MAPREDUCE-4554 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission, security Reporter: Benoy Antony Assignee: Benoy Antony Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch
[jira] [Updated] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off
[ https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated MAPREDUCE-4554: Affects Version/s: 2.0.0-alpha Job Credentials are not transmitted if security is turned off - Key: MAPREDUCE-4554 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission, security Affects Versions: 2.0.0-alpha Reporter: Benoy Antony Assignee: Benoy Antony Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch
[jira] [Commented] (MAPREDUCE-4554) Job Credentials are not transmitted if security is turned off
[ https://issues.apache.org/jira/browse/MAPREDUCE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433301#comment-13433301 ] Benoy Antony commented on MAPREDUCE-4554: -

The patch adds a test case for 1.1 and 2.0. It also removes the security on/off checks when transmitting credentials.

Job Credentials are not transmitted if security is turned off - Key: MAPREDUCE-4554 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4554 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission, security Affects Versions: 2.0.0-alpha Reporter: Benoy Antony Assignee: Benoy Antony Attachments: MR_4554_1_1.patch, MR_4554_trunk.patch
[jira] [Commented] (MAPREDUCE-4511) Add IFile readahead
[ https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1340#comment-1340 ] Alejandro Abdelnur commented on MAPREDUCE-4511: ---

We need HADOOP-7754 in order for this patch to apply and be test-patch-ed. In the meantime, a better name for the method {{determineFd}} would be {{getFileDescriptorIfAvail}}.

Add IFile readahead --- Key: MAPREDUCE-4511 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4511_branch1.patch, MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_trunk.patch, MAPREDUCE-4511_trunk_rev2.patch, MAPREDUCE-4511_trunk_rev3.patch, MAPREDUCE-4511_trunk_rev4.patch

This ticket is to add IFile readahead as part of HADOOP-7714.
[jira] [Commented] (MAPREDUCE-4469) Resource calculation in child tasks is CPU-heavy
[ https://issues.apache.org/jira/browse/MAPREDUCE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1348#comment-1348 ] Alejandro Abdelnur commented on MAPREDUCE-4469: --- Ahmed, your suggested approach means that in the case of streaming jobs we may lose the info for up to 9 updates (default), right? Resource calculation in child tasks is CPU-heavy Key: MAPREDUCE-4469 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4469 Project: Hadoop Map/Reduce Issue Type: Bug Components: performance, task Affects Versions: 1.0.3 Reporter: Todd Lipcon Assignee: Ahmed Radwan Attachments: MAPREDUCE-4469.patch, MAPREDUCE-4469_rev2.patch In doing some benchmarking on a hadoop-1 derived codebase, I noticed that each of the child tasks was doing a ton of syscalls. Upon stracing, I noticed that it's spending a lot of time looping through all the files in /proc to calculate resource usage. As a test, I added a flag to disable use of the ResourceCalculatorPlugin within the tasks. On a CPU-bound 500G-sort workload, this improved total job runtime by about 10% (map slot-seconds by 14%, reduce slot seconds by 8%) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4068) Jars in lib subdirectory of the submittable JAR are not added to the classpath
[ https://issues.apache.org/jira/browse/MAPREDUCE-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433363#comment-13433363 ] Robert Kanter commented on MAPREDUCE-4068: -- The TestTaskTrackerLocalization file looks like it tests this, but only for mr1 Jars in lib subdirectory of the submittable JAR are not added to the classpath -- Key: MAPREDUCE-4068 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4068 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Ahmed Radwan Priority: Blocker Fix For: 2.2.0-alpha Prior to hadoop 0.23, users could add third party jars to the lib subdirectory of the submitted job jar and they become available in the task's classpath. I see this functionality was in TaskRunner.java, but I can't see similar functionality in hadoop 0.23 (neither in MapReduceChildJVM.java nor other places). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
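For readers unfamiliar with the behavior being discussed, the sketch below shows the general idea of turning the jars under the lib subdirectory of an unpacked job jar into classpath entries. It is a simplified illustration, not the TaskRunner.java logic: the class JobJarClasspathSketch is hypothetical, and it assumes that only direct children of lib/ ending in .jar are picked up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; a simplified model of "lib/*.jar goes on the task classpath". */
final class JobJarClasspathSketch {

    /**
     * Returns classpath entries for jars that sit directly under lib/ in the job jar.
     * entryNames are jar entry names such as "lib/guava.jar" or "com/example/Main.class".
     */
    static List<String> libJars(String unpackedJarDir, List<String> entryNames) {
        List<String> classpath = new ArrayList<>();
        for (String name : entryNames) {
            // "lib/" is 4 characters long; any further '/' means a nested directory.
            boolean directChildOfLib = name.startsWith("lib/") && name.indexOf('/', 4) < 0;
            if (directChildOfLib && name.endsWith(".jar")) {
                classpath.add(unpackedJarDir + "/" + name);
            }
        }
        return classpath;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
            "com/example/Main.class", "lib/a.jar", "lib/nested/b.jar", "lib/notes.txt");
        System.out.println(libJars("/tmp/jobcache/job.jar", entries));
    }
}
```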
[jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection
[ https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433401#comment-13433401 ] Benoy Antony commented on MAPREDUCE-4491: -

One of the goals of this feature is to encrypt files in transit and at rest (when stored on disk). One way to achieve this goal is to depend on software or hardware that provides encryption in the local file system, and to rely on HDFS-3637 and MR shuffle encryption. This JIRA explores an alternative approach that does not depend on special software for local file system encryption.

The key advantages of this approach over the local file system encryption approach are:
1) A file can be decrypted only if the user provides the correct key. Even if someone manages to read the file, they cannot read its contents without the key. Possession of the key is required in addition to read permission, so there are two levels of protection. The data stays protected when a user accidentally sets read permissions for everyone, or when a superuser reads the file.
2) No dependency on local file system encryption software. This approach allows encryption without such special setup.
3) A file is decrypted or encrypted only during processing, not whenever it is read. This results in fewer encryption and decryption operations.

Other key points:
1) Encrypted and plain text files can coexist in a normal file system.
2) Developers can plug in other encryption algorithms and standards (CMS, AES, custom encryption) and thus have more flexibility.
3) Keys, passwords, and tokens can be transported from the JobClient to tasks for use cases other than encryption, such as connecting to a web service. MAPREDUCE-4491 adds key protection, and encryption uses it.
4) Keys can be managed in one central location. The JobClient gets them on behalf of the user, like any other application.

Looking at these two approaches from a higher level, the local file system approach is an internal approach to encryption, and the MAPREDUCE-4491 approach is an external one. The same two choices exist in normal (non-distributed) application development, where developers can rely on the file system to provide encryption or do the encryption themselves. Both approaches have tradeoffs and flexibilities, and the choice depends on the use cases and needs. So I believe we should provide both alternatives in Hadoop. In addition, this feature provides key protection in general, which can be used for purposes other than encryption. The keys will also be encrypted when stored on disk and decrypted only in memory.

Encryption and Key Protection - Key: MAPREDUCE-4491 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491 Project: Hadoop Map/Reduce Issue Type: New Feature Components: documentation, security, task-controller, tasktracker Reporter: Benoy Antony Assignee: Benoy Antony Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf
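The first advantage in the comment above (read permission alone is not enough; the correct key is also required) can be illustrated with a small sketch. This is not the PGP codec from the patches: it assumes AES-GCM from the standard javax.crypto API as a stand-in algorithm, and the class FileEncryptionSketch and its methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical illustration: bytes on disk are useless without the right key. */
final class FileEncryptionSketch {
    private static final int IV_BYTES = 12;   // recommended IV size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    /** Encrypts and returns IV followed by ciphertext and tag. */
    static byte[] encrypt(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv); // fresh IV for every encryption
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = cipher.doFinal(plain);
            byte[] out = new byte[iv.length + body.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(body, 0, out, iv.length, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the plaintext, or null when the key is wrong or the data was altered. */
    static byte[] decrypt(SecretKey key, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, data, 0, IV_BYTES));
            return cipher.doFinal(data, IV_BYTES, data.length - IV_BYTES);
        } catch (AEADBadTagException e) {
            return null; // authentication failed: wrong key or tampered bytes
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey owner = new SecretKeySpec(new byte[16], "AES"); // demo key only
        byte[] otherBytes = new byte[16];
        otherBytes[0] = 1;
        SecretKey other = new SecretKeySpec(otherBytes, "AES");
        byte[] stored = encrypt(owner, "sensitive record".getBytes(StandardCharsets.UTF_8));
        System.out.println("owner can read: " + (decrypt(owner, stored) != null));
        System.out.println("other can read: " + (decrypt(other, stored) != null));
    }
}
```

A reader with file-system access but without the owner's key only ever sees the encrypted bytes, which is the second level of protection the comment describes.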
[jira] [Updated] (MAPREDUCE-4518) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation
[ https://issues.apache.org/jira/browse/MAPREDUCE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4518: Attachment: trunk-MR-4518.patch

Updated the patch for trunk:
- Added a constructor to FSQueueSchedulable for testing purposes.
- The test checks that the demand is less than or equal to maxResources.
- Verified the right number of iterations via the logs in the loop in updateDemand().

FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation --- Key: MAPREDUCE-4518 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4518 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/fair-share Affects Versions: 1.0.3 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: MR-4518_branch1.patch, trunk-MR-4518.patch, trunk-MR-4518.patch

In FS, PoolSchedulable#updateDemand() limits the demand to maxTasks only after iterating through all the pools and computing the final demand. By checking whether the demand has reached maxTasks in every iteration, we can avoid redundant work, at the expense of one condition check per iteration.
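The optimization described in this issue can be sketched in isolation as follows. This is not the FairScheduler code: DemandSketch and cappedDemand are hypothetical names, and the per-schedulable demands are modeled as a plain list of integers. The loop stops aggregating as soon as the running total reaches the cap, instead of summing everything and capping at the end.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the early-exit idea; not PoolSchedulable itself. */
final class DemandSketch {

    /**
     * Sums the demands until the cap is reached.
     * Returns a two-element array: {capped demand, number of entries visited}.
     */
    static int[] cappedDemand(List<Integer> demands, int maxTasks) {
        int demand = 0;
        int visited = 0;
        for (int d : demands) {
            demand += d;
            visited++;
            if (demand >= maxTasks) {
                demand = maxTasks; // cap reached: further aggregation would be redundant
                break;
            }
        }
        return new int[] {demand, visited};
    }

    public static void main(String[] args) {
        int[] result = cappedDemand(Arrays.asList(4, 5, 3, 7), 10);
        System.out.println("demand=" + result[0] + ", visited=" + result[1]);
    }
}
```

The returned visit count mirrors the check mentioned in the update, where the number of loop iterations was verified through logs.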
[jira] [Commented] (MAPREDUCE-4511) Add IFile readahead
[ https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433465#comment-13433465 ] Ahmed Radwan commented on MAPREDUCE-4511: - Here are the updated patches with the new method name. Add IFile readahead --- Key: MAPREDUCE-4511 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4511_branch1.patch, MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_branch-1_rev5.patch, MAPREDUCE-4511_trunk.patch, MAPREDUCE-4511_trunk_rev2.patch, MAPREDUCE-4511_trunk_rev3.patch, MAPREDUCE-4511_trunk_rev4.patch, MAPREDUCE-4511_trunk_rev5.patch This ticket is to add IFile readahead as part of HADOOP-7714. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4511) Add IFile readahead
[ https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan updated MAPREDUCE-4511: Attachment: MAPREDUCE-4511_trunk_rev5.patch Add IFile readahead --- Key: MAPREDUCE-4511 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4511_branch1.patch, MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_branch-1_rev5.patch, MAPREDUCE-4511_trunk.patch, MAPREDUCE-4511_trunk_rev2.patch, MAPREDUCE-4511_trunk_rev3.patch, MAPREDUCE-4511_trunk_rev4.patch, MAPREDUCE-4511_trunk_rev5.patch This ticket is to add IFile readahead as part of HADOOP-7714. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4511) Add IFile readahead
[ https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan updated MAPREDUCE-4511: Attachment: MAPREDUCE-4511_branch-1_rev5.patch Add IFile readahead --- Key: MAPREDUCE-4511 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4511_branch1.patch, MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_branch-1_rev5.patch, MAPREDUCE-4511_trunk.patch, MAPREDUCE-4511_trunk_rev2.patch, MAPREDUCE-4511_trunk_rev3.patch, MAPREDUCE-4511_trunk_rev4.patch, MAPREDUCE-4511_trunk_rev5.patch This ticket is to add IFile readahead as part of HADOOP-7714. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4511) Add IFile readahead
[ https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433470#comment-13433470 ] Hadoop QA commented on MAPREDUCE-4511: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12540738/MAPREDUCE-4511_trunk_rev5.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. -1 javac. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2725//console This message is automatically generated. Add IFile readahead --- Key: MAPREDUCE-4511 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4511_branch1.patch, MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_branch-1_rev5.patch, MAPREDUCE-4511_trunk.patch, MAPREDUCE-4511_trunk_rev2.patch, MAPREDUCE-4511_trunk_rev3.patch, MAPREDUCE-4511_trunk_rev4.patch, MAPREDUCE-4511_trunk_rev5.patch This ticket is to add IFile readahead as part of HADOOP-7714. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4503) Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles
[ https://issues.apache.org/jira/browse/MAPREDUCE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4503: --- Fix Version/s: (was: 0.23.3)

I just pulled this out of 0.23.3. We may add it back in later once we determine how MAPREDUCE-4549 is to be addressed.

Should throw InvalidJobConfException if duplicates found in cacheArchives or cacheFiles --- Key: MAPREDUCE-4503 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4503 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Fix For: 3.0.0, 2.2.0-alpha Attachments: MR-4503.txt, MR-4503.txt

In 1.0, if a file was in both a job's cache archives and cache files, an InvalidJobConfException was thrown. We should replicate this behavior on mrv2. We should also extend it so that a similar exception is thrown if a cache archive or cache file is not going to be downloaded at all because of conflicts in the names of the symlinks.
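The check requested in this issue can be sketched roughly as below. This is an illustration rather than the attached patch: CacheConflictSketch is a hypothetical class, IllegalArgumentException stands in for InvalidJobConfException so the example has no Hadoop dependency, and the symlink name is assumed to be the URI fragment after '#' when present, otherwise the last path segment.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical validation of cache files and cache archives for name conflicts. */
final class CacheConflictSketch {

    /** Assumed rule: the link name is the fragment after '#', else the last path segment. */
    static String linkName(String uri) {
        int hash = uri.lastIndexOf('#');
        if (hash >= 0) {
            return uri.substring(hash + 1);
        }
        return uri.substring(uri.lastIndexOf('/') + 1);
    }

    /** Throws if two entries, across both lists, would end up under the same link name. */
    static void validate(List<String> cacheFiles, List<String> cacheArchives) {
        Set<String> seen = new HashSet<>();
        for (String uri : cacheFiles) {
            check(seen, uri);
        }
        for (String uri : cacheArchives) {
            check(seen, uri);
        }
    }

    private static void check(Set<String> seen, String uri) {
        String name = linkName(uri);
        if (!seen.add(name)) {
            // Stand-in for InvalidJobConfException in this dependency-free sketch.
            throw new IllegalArgumentException(
                "Conflicting cache entry '" + uri + "' for link name '" + name + "'");
        }
    }

    public static void main(String[] args) {
        validate(Arrays.asList("hdfs:///apps/dict.txt"),
                 Arrays.asList("hdfs:///apps/tools.zip#tools"));
        System.out.println("no conflicts");
        try {
            validate(Arrays.asList("hdfs:///a/data.zip"),
                     Collections.singletonList("hdfs:///b/data.zip"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing at submission time, as sketched here, surfaces the conflict to the user instead of silently skipping one of the downloads.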
[jira] [Assigned] (MAPREDUCE-4288) ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running
[ https://issues.apache.org/jira/browse/MAPREDUCE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned MAPREDUCE-4288: --- Assignee: (was: Karthik Kambatla) Couldn't get around to this, marking it as unassigned should anyone be interested. Will pick it up again if still available later. ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running --- Key: MAPREDUCE-4288 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4288 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Nishan Shetty When no job is running in the cluster invoke the ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() API's Observed that these API's are returning one instead of zero(as no job is running) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (MAPREDUCE-4289) JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's not giving any values
[ https://issues.apache.org/jira/browse/MAPREDUCE-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned MAPREDUCE-4289: --- Assignee: (was: Karthik Kambatla) Marking as unassigned, should anyone be interested in working on it. Will come back to this if still available when time permits. JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's not giving any values Key: MAPREDUCE-4289 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4289 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Nishan Shetty 1.Run a simple job 2.Invoke JobStatus.getReduceProgress() and JobStatus.getMapProgress() API's Observe that these API's are giving zeros instead of showing map/reduce progress -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4518) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation
[ https://issues.apache.org/jira/browse/MAPREDUCE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433510#comment-13433510 ] Karthik Kambatla commented on MAPREDUCE-4518: - Given that this concerns YARN, should I convert this into a YARN issue? FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation --- Key: MAPREDUCE-4518 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4518 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/fair-share Affects Versions: 1.0.3 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: MR-4518_branch1.patch, trunk-MR-4518.patch, trunk-MR-4518.patch In FS, PoolSchedulable#updateDemand() limits the demand to maxTasks only after iterating through all the pools and computing the final demand. By checking whether the demand has reached maxTasks in every iteration, we can avoid redundant work, at the expense of one condition check per iteration.
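The proposed change in MAPREDUCE-4518 can be sketched as follows. This is a simplified model, not the FairScheduler code: the class and method names are hypothetical, each demand is reduced to a plain non-negative integer, and the per-item work that the real loop would skip is not modelled.

```java
import java.util.List;

public class DemandAggregation {
    /** Behaviour described in the ticket: sum every demand, then cap the total. */
    static int demandCappedAtEnd(List<Integer> demands, int maxTasks) {
        int demand = 0;
        for (int d : demands) {
            demand += d;
        }
        return Math.min(demand, maxTasks);
    }

    /** Proposed variant: stop iterating as soon as the cap is reached. */
    static int demandWithEarlyExit(List<Integer> demands, int maxTasks) {
        int demand = 0;
        for (int d : demands) {
            demand += d;
            if (demand >= maxTasks) {
                // Remaining items cannot change the capped result.
                return maxTasks;
            }
        }
        return demand;
    }
}
```

The two methods return the same value only when every demand is non-negative, which is the assumption made here; the saving comes from whatever work each skipped iteration would otherwise have done.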
[jira] [Updated] (MAPREDUCE-4455) RMAppImpl state machine does not handle event ATTEMPT_KILLED at ACCEPTED
[ https://issues.apache.org/jira/browse/MAPREDUCE-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated MAPREDUCE-4455: -- Fix Version/s: (was: trunk) Target Version/s: 2.2.0-alpha Status: Open (was: Patch Available) Mayank, these transitions don't seem to be valid. I got this wrong while talking to you about this earlier... sorry about that. The unit test is broken in this case. The InlineDispatcher, which is used for this test and others, behaves differently from the regular dispatcher. It handles all events inline, as against handling one event to completion before processing the next one. For this test - the App should have transitioned to the KILLED state, before the ATTEMPT_KILLED event came in. The DrainDispatcher seems like a much better option for unit tests. RMAppImpl state machine does not handle event ATTEMPT_KILLED at ACCEPTED Key: MAPREDUCE-4455 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4455 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Reporter: Jason Lowe Assignee: Mayank Bansal Priority: Minor Attachments: MAPREDUCE-4455-trunk-v1.patch TestRMAppTransitions#testAppSubmittedKilled causes an invalid event exception but the test doesn't catch the error since the final app state is still killed. Killed for the wrong reason, but the final state is the same. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
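The difference between the two dispatching styles mentioned in the MAPREDUCE-4455 comment can be illustrated with a toy model. None of this is the Hadoop dispatcher or RMAppImpl code: the class, the event names as strings, and the assumption that handling KILL raises ATTEMPT_KILLED before the state changes are all simplifications made for the sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class DispatchOrderDemo {
    /** Toy application that records the state in which each event is handled. */
    static class ToyApp {
        final boolean inline;
        final Deque<String> queue = new ArrayDeque<>();
        final List<String> log = new ArrayList<>();
        String state = "ACCEPTED";

        ToyApp(boolean inline) {
            this.inline = inline;
        }

        /** Inline mode handles the event immediately; queued mode defers it. */
        void dispatch(String event) {
            if (inline) {
                handle(event);
            } else {
                queue.add(event);
            }
        }

        void handle(String event) {
            log.add(event + "@" + state);
            if (event.equals("KILL")) {
                dispatch("ATTEMPT_KILLED"); // follow-up event raised mid-transition
                state = "KILLED";           // transition completes afterwards
            }
        }

        /** Processes queued events one at a time, each to completion. */
        void drain() {
            while (!queue.isEmpty()) {
                handle(queue.poll());
            }
        }
    }

    static List<String> run(boolean inline) {
        ToyApp app = new ToyApp(inline);
        app.dispatch("KILL");
        app.drain();
        return app.log;
    }
}
```

In this model, `run(true)` records ATTEMPT_KILLED while the app is still ACCEPTED, whereas `run(false)` records it after the app has reached KILLED, which is the ordering the comment says the regular dispatcher would produce.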
[jira] [Updated] (MAPREDUCE-4511) Add IFile readahead
[ https://issues.apache.org/jira/browse/MAPREDUCE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated MAPREDUCE-4511: --- Component/s: performance Add IFile readahead --- Key: MAPREDUCE-4511 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4511 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2, performance Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4511_branch1.patch, MAPREDUCE-4511_branch-1_rev2.patch, MAPREDUCE-4511_branch-1_rev3.patch, MAPREDUCE-4511_branch-1_rev4.patch, MAPREDUCE-4511_branch-1_rev5.patch, MAPREDUCE-4511_trunk.patch, MAPREDUCE-4511_trunk_rev2.patch, MAPREDUCE-4511_trunk_rev3.patch, MAPREDUCE-4511_trunk_rev4.patch, MAPREDUCE-4511_trunk_rev5.patch This ticket is to add IFile readahead as part of HADOOP-7714. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
[ https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433599#comment-13433599 ] Thomas Graves commented on MAPREDUCE-4053: -- +1 Thanks Bobby! Counters group names deprecation is wrong, iterating over group names deprecated names don't show up Key: MAPREDUCE-4053 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.24.0, 0.23.3 Reporter: Alejandro Abdelnur Assignee: Robert Joseph Evans Attachments: MR-4053.txt This is similar to the deprecation of Configuration properties bug HADOOP-8167: iterator() retrieval of counter names only returns new names. Oozie breaks here because it is using the deprecated name and iterating over values (OOZIE-777). While it can be worked around easily in Oozie, this is breaking backwards compatibility.
[jira] [Updated] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
[ https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4053: - Resolution: Fixed Fix Version/s: 2.2.0-alpha 3.0.0 2.1.0-alpha 0.23.3 Status: Resolved (was: Patch Available) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up Key: MAPREDUCE-4053 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.24.0, 0.23.3 Reporter: Alejandro Abdelnur Assignee: Robert Joseph Evans Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: MR-4053.txt This is similar to the deprecation of Configuration properties bug HADOOP-8167: iterator() retrieval of counter names only returns new names. Oozie breaks here because it is using the deprecated name and iterating over values (OOZIE-777). While it can be worked around easily in Oozie, this is breaking backwards compatibility.
[jira] [Commented] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
[ https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433611#comment-13433611 ] Hudson commented on MAPREDUCE-4053: --- Integrated in Hadoop-Hdfs-trunk-Commit #2639 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2639/]) MAPREDUCE-4053. Counters group names deprecation is wrong, iterating over group names deprecated names don't show up (Robert Evans via tgraves) (Revision 1372636) Result = SUCCESS tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372636 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/AbstractCounters.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestCounters.java Counters group names deprecation is wrong, iterating over group names deprecated names don't show up Key: MAPREDUCE-4053 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.24.0, 0.23.3 Reporter: Alejandro Abdelnur Assignee: Robert Joseph Evans Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: MR-4053.txt This is similar to the deprecation of Configuration properties bug HADOOP-8167: iterator() retrieval of counter names only returns new names. Oozie breaks here because it is using the deprecated name and iterating over values (OOZIE-777). While it can be worked around easily in Oozie, this is breaking backwards compatibility.
[jira] [Commented] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
[ https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433622#comment-13433622 ] Hudson commented on MAPREDUCE-4053: --- Integrated in Hadoop-Common-trunk-Commit #2574 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2574/]) MAPREDUCE-4053. Counters group names deprecation is wrong, iterating over group names deprecated names don't show up (Robert Evans via tgraves) (Revision 1372636) Result = SUCCESS tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372636 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/AbstractCounters.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestCounters.java Counters group names deprecation is wrong, iterating over group names deprecated names don't show up Key: MAPREDUCE-4053 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.24.0, 0.23.3 Reporter: Alejandro Abdelnur Assignee: Robert Joseph Evans Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: MR-4053.txt This is similar to the deprecation of Configuration properties bug HADOOP-8167: iterator() retrieval of counter names only returns new names. Oozie breaks here because it is using the deprecated name and iterating over values (OOZIE-777). While it can be worked around easily in Oozie, this is breaking backwards compatibility.
[jira] [Created] (MAPREDUCE-4555) make user's mapred .staging area permissions configurable
Alexander Alten-Lorenz created MAPREDUCE-4555: - Summary: make user's mapred .staging area permissions configurable Key: MAPREDUCE-4555 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4555 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client, job submission Affects Versions: 1.0.3 Reporter: Alexander Alten-Lorenz The directories are created in JobTracker and LocalRunner, but they are currently forced to be 0700. There is even a segment of the source code that checks whether the permissions are 0700 and, if not, changes them to 0700. For monitoring purposes the permissions should be configurable. Please note: 1. We can make the hard-coded 0700 configurable at clients (it's the client who creates it) but there are two issues here: 1.1. It violates security principles (as it is client-side and overridable) 1.2. It can't be consistent, since some users may ignore configs provided to them and create it with 0700.
[jira] [Updated] (MAPREDUCE-4466) Using URI for yarn.nodemanager log dirs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated MAPREDUCE-4466: - Attachment: MAPREDUCE-4466-trunk-v4.patch Thanks Sid for your comments. Incorporated all of those. Thanks, Mayank Using URI for yarn.nodemanager log dirs fails - Key: MAPREDUCE-4466 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4466 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.3 Reporter: Eli Collins Assignee: Mayank Bansal Priority: Minor Attachments: MAPREDUCE-4466-trunk-v1.patch, MAPREDUCE-4466-trunk-v2.patch, MAPREDUCE-4466-trunk-v3.patch, MAPREDUCE-4466-trunk-v4.patch If I use URIs (eg file:///home/eli/hadoop/dirs) for yarn.nodemanager.log-dirs or yarn.nodemanager.remote-app-log-dir the container log servlet fails with an NPE (works if I remove the file scheme). Using a URI for yarn.nodemanager.local-dirs works. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
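One way to picture the problem in MAPREDUCE-4466 is that a configured value such as file:///home/eli/hadoop/dirs has to be reduced to a plain local path before it is used as one. The sketch below only illustrates that idea with the JDK's `java.net.URI`; the class and method names are hypothetical, and it is not taken from the attached patches.

```java
import java.net.URI;

public class LogDirPath {
    /**
     * Returns a local filesystem path for a configured directory value.
     * Values with a "file" scheme are reduced to their path component;
     * anything else is returned unchanged (an assumption of this sketch).
     */
    static String toLocalPath(String configured) {
        URI uri = URI.create(configured);
        if ("file".equals(uri.getScheme())) {
            return uri.getPath();
        }
        return configured;
    }
}
```

With this helper, both `file:///home/eli/hadoop/dirs` and `/home/eli/hadoop/dirs` resolve to the same local path. Note that `URI.create` rejects strings containing characters that are illegal in URIs, such as spaces, so real configuration handling would need to be more forgiving.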
[jira] [Updated] (MAPREDUCE-4466) Using URI for yarn.nodemanager log dirs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated MAPREDUCE-4466: - Status: Patch Available (was: Open) Using URI for yarn.nodemanager log dirs fails - Key: MAPREDUCE-4466 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4466 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.3 Reporter: Eli Collins Assignee: Mayank Bansal Priority: Minor Attachments: MAPREDUCE-4466-trunk-v1.patch, MAPREDUCE-4466-trunk-v2.patch, MAPREDUCE-4466-trunk-v3.patch, MAPREDUCE-4466-trunk-v4.patch If I use URIs (eg file:///home/eli/hadoop/dirs) for yarn.nodemanager.log-dirs or yarn.nodemanager.remote-app-log-dir the container log servlet fails with an NPE (works if I remove the file scheme). Using a URI for yarn.nodemanager.local-dirs works. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4053) Counters group names deprecation is wrong, iterating over group names deprecated names don't show up
[ https://issues.apache.org/jira/browse/MAPREDUCE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433694#comment-13433694 ] Hudson commented on MAPREDUCE-4053: --- Integrated in Hadoop-Mapreduce-trunk-Commit #2597 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2597/]) MAPREDUCE-4053. Counters group names deprecation is wrong, iterating over group names deprecated names don't show up (Robert Evans via tgraves) (Revision 1372636) Result = FAILURE tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372636 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/counters/AbstractCounters.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestCounters.java Counters group names deprecation is wrong, iterating over group names deprecated names don't show up Key: MAPREDUCE-4053 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4053 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.24.0, 0.23.3 Reporter: Alejandro Abdelnur Assignee: Robert Joseph Evans Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: MR-4053.txt This is similar to the deprecation of Configuration properties bug HADOOP-8167: iterator() retrieval of counter names only returns new names. Oozie breaks here because it is using the deprecated name and iterating over values (OOZIE-777). While it can be worked around easily in Oozie, this is breaking backwards compatibility.
[jira] [Updated] (MAPREDUCE-4367) mapred job -kill tries to connect to history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated MAPREDUCE-4367: - Attachment: MAPREDUCE-4367-trunk-v2.patch Fixing test Thanks, Mayank mapred job -kill tries to connect to history server --- Key: MAPREDUCE-4367 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, mrv2 Affects Versions: 0.23.3 Reporter: Jason Lowe Assignee: Mayank Bansal Priority: Minor Fix For: trunk Attachments: MAPREDUCE-4367-trunk-v1.patch, MAPREDUCE-4367-trunk-v2.patch The {{mapred job -kill}} command attempts to connect to the history server, even though it is unrelated to the process of killing a job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4518) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation
[ https://issues.apache.org/jira/browse/MAPREDUCE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4518: Attachment: (was: trunk-MR-4518.patch) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation --- Key: MAPREDUCE-4518 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4518 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/fair-share Affects Versions: 1.0.3 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: MR-4518_branch1.patch, trunk-MR-4518.patch, trunk-MR-4518.patch In FS, PoolSchedulable#updateDemand() limits the demand to maxTasks only after iterating through all the pools and computing the final demand. By checking whether the demand has reached maxTasks in every iteration, we can avoid redundant work, at the expense of one condition check per iteration.
[jira] [Updated] (MAPREDUCE-4518) FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation
[ https://issues.apache.org/jira/browse/MAPREDUCE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4518: Attachment: trunk-MR-4518.patch FairScheduler: PoolSchedulable#updateDemand() - potential redundant aggregation --- Key: MAPREDUCE-4518 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4518 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/fair-share Affects Versions: 1.0.3 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: MR-4518_branch1.patch, trunk-MR-4518.patch, trunk-MR-4518.patch In FS, PoolSchedulable#updateDemand() limits the demand to maxTasks only after iterating through all the pools and computing the final demand. By checking whether the demand has reached maxTasks in every iteration, we can avoid redundant work, at the expense of one condition check per iteration.
[jira] [Updated] (MAPREDUCE-3202) Integrating Hadoop Vaidya with JobHistory Server
[ https://issues.apache.org/jira/browse/MAPREDUCE-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vitthal (Suhas) Gogate updated MAPREDUCE-3202: -- Affects Version/s: 0.20.205.0 Integrating Hadoop Vaidya with JobHistory Server Key: MAPREDUCE-3202 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3202 Project: Hadoop Map/Reduce Issue Type: New Feature Components: jobhistoryserver Affects Versions: 0.20.205.0, 1.0.0 Reporter: vitthal (Suhas) Gogate At present, the jobdetailshistory page served by the JobHistory Server provides elementary job analysis through the link "Analyze This job". Hadoop Vaidya provides a detailed analysis of the M/R job in terms of various execution inefficiencies and the associated remedies, which a user can easily understand and fix. Integrating Hadoop Vaidya with the JobHistory Server would really improve the usability of this tool and also help many naive users understand the various performance problems and/or best-practice violations associated with their jobs. The integration would also aim to give users a convenient interface where they can manage the existing rules as well as write their own new rules. During my tenure at Yahoo, the Vaidya tool was successfully deployed in production, analyzing tens of thousands of jobs every day with many more useful rules than the sample ones present in the contrib project. Many of these rules are open sourced already (big thanks to Yahoo! MAPREDUCE-1530) but are yet to be integrated with the tool. I will add more design details for this feature in the near future as I work towards getting a prototype running. Any thoughts/comments are welcome.
[jira] [Commented] (MAPREDUCE-4367) mapred job -kill tries to connect to history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433785#comment-13433785 ] Hadoop QA commented on MAPREDUCE-4367: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12540794/MAPREDUCE-4367-trunk-v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient: org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2726//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2726//console This message is automatically generated. mapred job -kill tries to connect to history server --- Key: MAPREDUCE-4367 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, mrv2 Affects Versions: 0.23.3 Reporter: Jason Lowe Assignee: Mayank Bansal Priority: Minor Fix For: trunk Attachments: MAPREDUCE-4367-trunk-v1.patch, MAPREDUCE-4367-trunk-v2.patch The {{mapred job -kill}} command attempts to connect to the history server, even though it is unrelated to the process of killing a job. 