[jira] [Commented] (HADOOP-10959) A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework
[ https://issues.apache.org/jira/browse/HADOOP-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093765#comment-14093765 ]

Kai Zheng commented on HADOOP-10959:

Below is a brief introduction to the proposed solution.

We propose adding a token-preauth mechanism to Kerberos, similar to PKINIT and OTP and built on the Pre-Authentication framework, which allows users to authenticate to the KDC with a JWT token instead of a password. The KDC validates the JWT token and issues a TGT, because it trusts the token authority/issuer via a PKI mechanism. The proposal was submitted to Kerberos and the IETF Kitten WG, and they are interested. We are currently collaborating with the MIT team to work on the draft and standardize the mechanism. We also built a POC that implements the token-preauth mechanism as an MIT Kerberos plugin. The plugin can be packaged separately as a Linux .so module and deployed on top of existing installations. MIT would also like us to contribute the code so that it can ship in their future releases. Until then, we can make the plugin binary and source code available to the community for experimental use and review.

So ideally, the token-preauth plugin is deployed to an MIT Kerberos installation, end users authenticate to third-party JWT token authorities to obtain tokens, and then use those tokens to acquire a Kerberos TGT from the KDC. Based on that, we implemented token authentication for Hadoop with only a few central modifications to the code base, since we do not have to add another Authentication Method and the solution leverages the existing Kerberos support.

We added KrbTokenLoginModule, which extends Krb5LoginModule and adds support for logging in with a token or a token cache. The new module is compatible with Krb5LoginModule in configuration and functionality, so it can be used safely.

We also added KerberosTokenAuthenticationHandler to support the Hadoop web interfaces. It extends KerberosAuthenticationHandler, adds support for token authentication, and performs the SPNEGO negotiation purely on the server side. Again, the new handler is compatible with KerberosAuthenticationHandler and can be used safely.

The token is exchanged for a Kerberos ticket, and the ticket goes to Hadoop services as it normally does. In addition, so that token attributes can be used to enforce fine-grained authorization and similar policies, a derivation of the token is encapsulated in the ticket as authorization data when the KDC issues the ticket for the token. On the service (Hadoop services) side, the token can then be queried and extracted from the service ticket. We made this work in both GSSAPI and SASL contexts, as both are used in Hadoop.

As far as we can see, the main concerns with this solution are that it requires deploying an additional plugin to existing Kerberos installations and that it involves syncing identity accounts from identity management systems to the Kerberos KDC. Most importantly, it requires a Kerberos deployment as a prerequisite. We are also discussing with the MIT team how to simplify Kerberos deployment, especially for large Hadoop clusters, and how to reduce the overhead of employing PKINIT/token-preauth mechanisms, such as identity sync.

> A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework
>
> Key: HADOOP-10959
> URL: https://issues.apache.org/jira/browse/HADOOP-10959
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security
> Reporter: Kai Zheng
> Assignee: Kai Zheng
> Labels: Rhino
> Attachments: KerbToken-v2.pdf
>
> To implement and integrate pluggable authentication providers, enhance single sign-on for end users, and help enforce centralized access control on the platform, the community has discussed widely and concluded that token-based authentication could be the appropriate approach. TokenAuth (HADOOP-9392) was proposed and is under development to implement another Authentication Method alongside Simple and Kerberos. It is a big, long-term effort to support TokenAuth across the entire ecosystem. Here we propose a short-term replacement based on Kerberos that can complement TokenAuth. Our solution involves fewer code changes with limited risk, and the main development work has already been done in our POC. Users can adopt it as a short-term way to support tokens inside Hadoop.
> This effort and the resulting solution will be fully described in the design document to be attached. A brief introduction will be posted as a comment.

-- This message was sent by Atlassian JIRA (v6.2#6252)
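As a rough illustration of what a JWT token carries before a KDC could decide whether to trust its issuer, the sketch below splits a compact JWT into its three dot-separated parts and base64url-decodes the payload. It is not the token-preauth plugin's logic: the class name `JwtPayloadSketch` and the sample claims are made up for this example, and signature verification against the issuer's certificate, which the PKI-based trust mentioned in the comment would need, is deliberately left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper for illustration; not part of Hadoop or MIT Kerberos. */
public class JwtPayloadSketch {

    /** Returns the decoded JSON payload of a compact JWT (header.payload.signature). */
    public static String decodePayload(String compactJwt) {
        // Limit -1 keeps a trailing empty signature part, as in unsigned tokens.
        String[] parts = compactJwt.split("\\.", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected header.payload.signature");
        }
        // JWT parts are base64url-encoded without padding; this decoder accepts that.
        byte[] json = Base64.getUrlDecoder().decode(parts[1]);
        return new String(json, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        // Made-up header and claims, only to build a sample token.
        String header = enc.encodeToString(
                "{\"alg\":\"none\"}".getBytes(StandardCharsets.UTF_8));
        String payload = enc.encodeToString(
                "{\"iss\":\"example-issuer\"}".getBytes(StandardCharsets.UTF_8));
        System.out.println(decodePayload(header + "." + payload + "."));
    }
}
```

A real deployment would verify the signature before reading any claim; this sketch only shows where the issuer and attribute claims live inside the token.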
[jira] [Updated] (HADOOP-10959) A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework
[ https://issues.apache.org/jira/browse/HADOOP-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HADOOP-10959: --- Description: To implement and integrate pluggable authentication providers, enhance desirable single sign on for end users, and help enforce centralized access control on the platform, the community has widely discussed and concluded token based authentication could be the appropriate approach. TokenAuth (HADOOP-9392) was proposed and is under development to implement another Authentication Method in lieu with Simple and Kerberos. It is a big and long term effort to support TokenAuth across the entire ecosystem. We here propose a short term replacement based on Kerberos that can complement to TokenAuth. Our solution involves less codes changes with limited risk and the main development work has already been done in our POC. Users can use our solution as a short term solution to support token inside Hadoop. This effort and resultant solution will be fully described in the design document to be attached. And the brief introduction will be commented. was: To implement and integrate pluggable authentication providers, enhance desirable single sign on for end users, and help enforce centralized access control on the platform, the community has widely discussed and concluded token based authentication could be the appropriate approach. TokenAuth (HADOOP-9392) was proposed and is under development to implement another Authentication Method in lieu with Simple and Kerberos. It is a big and long term effort to support TokenAuth across the entire ecosystem. We here propose a short term replacement based on Kerberos that can complement to TokenAuth. Our solution involves less codes changes with limited risk and the main development work has already been done in our POC. Users can use our solution as a short term solution to support token inside Hadoop. 
This effort and resultant solution will be fully described in the design document to be attached soon. Below is the brief introduction. We proposed to add token-preauth mechanism similar to PKINIT and OTP for Kerberos based on the Pre-Authentication framework, which allows users to authenticate to KDC using a JWT token instead of password. KDC authenticates the JWT token and issues TGT as it would trust the token authority/issuer via PKI mechanism. The proposal was submitted to Kerberos and IETF Kitten WG and they’re interested. Currently we’re collaborating with MIT team to work on the draft and standardize the mechanism. We also did a POC which implemented the token-preauth mechanism as a MIT Kerberos plugin. The plugin can be separately packaged as a Linux .so module and deployed additionally for existing installations. MIT also wish we could contribute the codes and make it available in their future releases. Before that we can make the plugin binary and source codes available to the community for experimental usage and review. So ideally token-preauth plugin can be deployed to a MIT Kerberos installation, the end users can authenticate to 3rd party JWT token authorities and get tokens, and then use the tokens to acquire Kerberos TGT from KDC. Based on that, we implemented the token authentication for Hadoop, with only a few of central modifications into the code base, as we don’t have to add another Authentication Method and the solution leverages the existing Kerberos support. We added KrbTokenLoginModule that extends the Krb5LoginModule and adds to support logging in using a token or token cache. The new module is compatible with Krb5LoginModule in configuration and functionality, thus can be used safely. We also added KerberosTokenAuthenticationHandler to support Hadoop web interfaces. It extends KerberosAuthenticationHandler and adds to support token authentication and perform the SPNEGO negotiation purely in server side in the new handler. 
Again the new handler is compatible with KerberosAuthenticationHandler and can be used safely. Token is used to exchange Kerberos ticket and ticket goes to Hadoop services as normally does. In addition to that, to employ the token attributes to enforce fine-grained authorization or whatever, a token derivation is encapsulated into ticket as Authorization data when KDC issues the ticket with the token. Then in service (Hadoop services) side, token can be queried and extracted from service ticket. We made this happen in both GSSAPI and SASL contexts as the both are used in Hadoop. As we can see or think of, the main concern for this solution may be that it requires to deploy additional plugin for existing Kerberos installations, and involves necessary identity accounts sync from identity management systems to Kerberos KDC. Most importantly, it requires Kerberos deployment as its prerequisite setup. We’re also discussing with MIT team about how to simplify Kerbero
[jira] [Commented] (HADOOP-10960) hadoop cause system crash with “soft lock” and “hard lock”
[ https://issues.apache.org/jira/browse/HADOOP-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093762#comment-14093762 ]

Jean-Baptiste Note commented on HADOOP-10960:

We saw something very similar to this problem (among others) recurring on RHEL5. It appears that the kernel, after an initial, legitimate soft lockup report (for instance because of IO contention), can go into a loop of reporting soft lockups and, by the sheer amount of data spewed, lock itself up until it panics. For us it was due to dumping data to the (relatively slow) serial console; for you it may be dumping data to disk, which is presumably the cause of the contention in your case.

Once you've ruled out real issues (controller problems, for instance), you may want to investigate one of the following:

0) reduce kernel verbosity on the console (provided it reduces the amount of data dumped to /var/log/messages; I'm not familiar with your setup, we log everything remotely)
1) disable softlockup reboot
2) disable disk logging of kernel messages / log to tmpfs / log to a separate, dedicated system *disk*

HTH

> hadoop cause system crash with “soft lock” and “hard lock”
>
> Key: HADOOP-10960
> URL: https://issues.apache.org/jira/browse/HADOOP-10960
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 2.2.0
> Environment: redhat rhel 6.3,6,4,6.5
> jdk1.7.0_45
> hadoop2.2
> Reporter: linbao111
> Priority: Critical
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> I am running hadoop2.2 on redhat6.3-6.5,and all of my machines crashed after a while. /var/log/messages shows repeatedly:
> Aug 11 06:30:42 jn4_73_128 kernel: BUG: soft lockup - CPU#1 stuck for 67s!
> [jsvc:11508] > Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc > iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode > dcdbas serio_raw iTCO_w > dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod > crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash > dm_log dm_m > od [last unloaded: scsi_wait_scan] > Aug 11 06:30:42 jn4_73_128 kernel: CPU 1 > Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc > iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode > dcdbas serio_raw iTCO_w > dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod > crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash > dm_log dm_m > od [last unloaded: scsi_wait_scan] > Aug 11 06:30:42 jn4_73_128 kernel: > Aug 11 06:30:42 jn4_73_128 kernel: Pid: 11508, comm: jsvc Tainted: GW > ---2.6.32-279.el6.x86_64 #1 Dell Inc. PowerEdge R510/084YMW > Aug 11 06:30:42 jn4_73_128 kernel: RIP: 0010:[] > [] wait_for_rqlock+0x28/0x40 > Aug 11 06:30:42 jn4_73_128 kernel: RSP: 0018:8807786c3ee8 EFLAGS: > 0202 > Aug 11 06:30:42 jn4_73_128 kernel: RAX: f6e9f6e1 RBX: > 8807786c3ee8 RCX: 880028216680 > Aug 11 06:30:42 jn4_73_128 kernel: RDX: f6e9 RSI: > 88061cd29370 RDI: 0286 > Aug 11 06:30:42 jn4_73_128 kernel: RBP: 8100bc0e R08: > 0001 R09: 0001 > Aug 11 06:30:42 jn4_73_128 kernel: R10: R11: > R12: 0286 > Aug 11 06:30:42 jn4_73_128 kernel: R13: 8807786c3eb8 R14: > 810e0f6e R15: 8807786c3e48 > Aug 11 06:30:42 jn4_73_128 kernel: FS: () > GS:88002820() knlGS: > Aug 11 06:30:42 jn4_73_128 kernel: CS: 0010 DS: ES: CR0: > 80050033 > Aug 11 06:30:42 jn4_73_128 kernel: CR2: 00e5bd70 CR3: > 01a85000 CR4: 06e0 > Aug 11 06:30:42 jn4_73_128 kernel: DR0: DR1: > DR2: > Aug 11 06:30:42 jn4_73_128 kernel: DR3: DR6: > 0ff0 DR7: 0400 > Aug 11 06:30:42 jn4_73_128 kernel: Process jsvc (pid: 11508, threadinfo > 8807786c2000, task 880c1def3500) > Aug 11 
06:30:42 jn4_73_128 kernel: Stack: > Aug 11 06:30:42 jn4_73_128 kernel: 8807786c3f68 8107091b > 8807786c3f28 > Aug 11 06:30:42 jn4_73_128 kernel: 880701735260 880c1def39c8 > 880c1def39c8 > Aug 11 06:30:42 jn4_73_128 kernel: 8807786c3f28 8807786c3f28 > 8807786c3f78 7f092d0ad700 > Aug 11 06:30:42 jn4_73_128 kernel: Call Trace: > Aug 11 06:30:42 jn4_73_128 kernel: [] ? do_exit+0x5ab/0x870 > Aug 11 06:30:42 jn4_73_128 kernel: [] ? sys_exit+0x17/0x20 > Aug 11 06:30:42 jn4_73_128 kernel: [] ? > system_call_fastpath+0x16/0x1b > Aug 11 06:30:42 jn4_73_128 kernel: Code: ff ff 90 55 48 89 e5 0f 1f
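To gauge how much of /var/log/messages is taken up by repeated soft lockup reports (the feedback loop described in the comment), one option is to pick out the report lines and tally them per CPU or process. The sketch below only parses a single report line of the form quoted in this issue; the class name `SoftLockupSketch` and the regular expression are assumptions written for this example, not part of Hadoop or any kernel tooling, and other kernel versions may word the message differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical log helper for illustration; not part of Hadoop. */
public class SoftLockupSketch {

    // Matches e.g. "BUG: soft lockup - CPU#1 stuck for 67s! [jsvc:11508]".
    private static final Pattern SOFT_LOCKUP = Pattern.compile(
            "BUG: soft lockup - CPU#(\\d+) stuck for (\\d+)s! \\[([^:\\]]+):(\\d+)\\]");

    /** Returns {cpu, seconds, command, pid}, or null if the line is not a soft lockup report. */
    public static String[] parse(String line) {
        Matcher m = SOFT_LOCKUP.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String line = "Aug 11 06:30:42 jn4_73_128 kernel: "
                + "BUG: soft lockup - CPU#1 stuck for 67s! [jsvc:11508]";
        String[] f = parse(line);
        System.out.println("cpu=" + f[0] + " stuck=" + f[1] + "s comm=" + f[2] + " pid=" + f[3]);
    }
}
```

Counting how often `parse` returns a non-null result over a time window would give a first indication of whether log volume itself is becoming a source of contention.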
[jira] [Commented] (HADOOP-10957) The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard
[ https://issues.apache.org/jira/browse/HADOOP-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093751#comment-14093751 ]

Hadoop QA commented on HADOOP-10957:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12661109/HADOOP-10957.001.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
org.apache.hadoop.fs.TestGlobPaths

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4453//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4453//console

This message is automatically generated.

> The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard
>
> Key: HADOOP-10957
> URL: https://issues.apache.org/jira/browse/HADOOP-10957
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 2.3.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HADOOP-10957.001.patch
>
> The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard. The existing unit tests don't catch this, because it doesn't happen for superusers.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-10960) hadoop cause system crash with “soft lock” and “hard lock”
[ https://issues.apache.org/jira/browse/HADOOP-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HADOOP-10960. Resolution: Invalid Hadoop core has no kernel mode components so it cannot cause a kernel panic. You likely have a buggy device driver or hit a kernel bug. Resolving as Invalid. > hadoop cause system crash with “soft lock” and “hard lock” > -- > > Key: HADOOP-10960 > URL: https://issues.apache.org/jira/browse/HADOOP-10960 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.2.0 > Environment: redhat rhel 6.3,6,4,6.5 > jdk1.7.0_45 > hadoop2.2 >Reporter: linbao111 >Priority: Critical > Original Estimate: 168h > Remaining Estimate: 168h > > I am running hadoop2.2 on redhat6.3-6.5,and all of my machines crashed after > a while. /var/log/messages shows repeatedly: > Aug 11 06:30:42 jn4_73_128 kernel: BUG: soft lockup - CPU#1 stuck for 67s! > [jsvc:11508] > Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc > iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode > dcdbas serio_raw iTCO_w > dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod > crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash > dm_log dm_m > od [last unloaded: scsi_wait_scan] > Aug 11 06:30:42 jn4_73_128 kernel: CPU 1 > Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc > iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode > dcdbas serio_raw iTCO_w > dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod > crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash > dm_log dm_m > od [last unloaded: scsi_wait_scan] > Aug 11 06:30:42 jn4_73_128 kernel: > Aug 11 06:30:42 jn4_73_128 kernel: Pid: 11508, comm: jsvc Tainted: GW > ---2.6.32-279.el6.x86_64 #1 Dell Inc. 
PowerEdge R510/084YMW > Aug 11 06:30:42 jn4_73_128 kernel: RIP: 0010:[] > [] wait_for_rqlock+0x28/0x40 > Aug 11 06:30:42 jn4_73_128 kernel: RSP: 0018:8807786c3ee8 EFLAGS: > 0202 > Aug 11 06:30:42 jn4_73_128 kernel: RAX: f6e9f6e1 RBX: > 8807786c3ee8 RCX: 880028216680 > Aug 11 06:30:42 jn4_73_128 kernel: RDX: f6e9 RSI: > 88061cd29370 RDI: 0286 > Aug 11 06:30:42 jn4_73_128 kernel: RBP: 8100bc0e R08: > 0001 R09: 0001 > Aug 11 06:30:42 jn4_73_128 kernel: R10: R11: > R12: 0286 > Aug 11 06:30:42 jn4_73_128 kernel: R13: 8807786c3eb8 R14: > 810e0f6e R15: 8807786c3e48 > Aug 11 06:30:42 jn4_73_128 kernel: FS: () > GS:88002820() knlGS: > Aug 11 06:30:42 jn4_73_128 kernel: CS: 0010 DS: ES: CR0: > 80050033 > Aug 11 06:30:42 jn4_73_128 kernel: CR2: 00e5bd70 CR3: > 01a85000 CR4: 06e0 > Aug 11 06:30:42 jn4_73_128 kernel: DR0: DR1: > DR2: > Aug 11 06:30:42 jn4_73_128 kernel: DR3: DR6: > 0ff0 DR7: 0400 > Aug 11 06:30:42 jn4_73_128 kernel: Process jsvc (pid: 11508, threadinfo > 8807786c2000, task 880c1def3500) > Aug 11 06:30:42 jn4_73_128 kernel: Stack: > Aug 11 06:30:42 jn4_73_128 kernel: 8807786c3f68 8107091b > 8807786c3f28 > Aug 11 06:30:42 jn4_73_128 kernel: 880701735260 880c1def39c8 > 880c1def39c8 > Aug 11 06:30:42 jn4_73_128 kernel: 8807786c3f28 8807786c3f28 > 8807786c3f78 7f092d0ad700 > Aug 11 06:30:42 jn4_73_128 kernel: Call Trace: > Aug 11 06:30:42 jn4_73_128 kernel: [] ? do_exit+0x5ab/0x870 > Aug 11 06:30:42 jn4_73_128 kernel: [] ? sys_exit+0x17/0x20 > Aug 11 06:30:42 jn4_73_128 kernel: [] ? > system_call_fastpath+0x16/0x1b > Aug 11 06:30:42 jn4_73_128 kernel: Code: ff ff 90 55 48 89 e5 0f 1f 44 00 00 > 48 c7 c0 80 66 01 00 65 48 8b 0c 25 b0 e0 00 00 0f ae f0 48 01 c1 eb 09 0f 1f > 80 00 00 00 00 90 8b 01 89 c2 c1 fa 10 66 39 c2 75 f2 c9 c3 0f 1f 84 00 > 00 > Aug 11 06:30:42 jn4_73_128 kernel: Call Trace: > Aug 11 06:30:42 jn4_73_128 kernel: [] ? do_exit+0x5ab/0x870 > Aug 11 06:30:42 jn4_73_128 kernel: [] ? sys_exit+0x17/0x20 > Aug 11 06:30:42 jn4_73_128 kernel: [] ? 
> system_call_fastpath+0x16/0x1b > > and finally crashed > crash /usr/lib/debug/lib/modules/2.6.32-431.5.1.el6.x86_64/vmlinux > /opt/crash/127.0.0.1-2014-08-10-09\:47\:38/vmcore > crash 6.1.0-5.el6 > Copyright (C) 2002-2012 Red Hat, Inc. > Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation > Copyright (C) 1999-2006 Hewlett-Packard Co > Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited > Copyright (C) 2006, 2007
[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized
[ https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093726#comment-14093726 ]

zhihai xu commented on HADOOP-10820:

I ran TestWebDelegationToken locally and did not see the error. The reported error is "java.net.BindException: Address already in use", which looks like an address conflict in the test environment.

---
 T E S T S
---
Running org.apache.hadoop.security.token.delegation.web.TestWebDelegationToken
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.484 sec - in org.apache.hadoop.security.token.delegation.web.TestWebDelegationToken

Results :

Tests run: 7, Failures: 0, Errors: 0, Skipped: 0

> Empty entry in libjars results in working directory being recursively localized
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Alex Holmes
> Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, HADOOP-10820-3.patch, HADOOP-10820.patch
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JARs specified in libjars copied,
> # but in addition the contents of the working directory are also recursively copied.
> $ hadoop fs -lsr /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}

-- This message was sent by Atlassian JIRA (v6.2#6252)
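The behaviour reported in this issue is consistent with an empty path entry being resolved against the current working directory, which is how `java.io.File` treats an empty pathname. The sketch below shows that resolution and one possible defensive measure, dropping empty entries while splitting the comma-separated value. It is an illustration under that assumption, not the attached patch and not Hadoop's actual option-parsing code; `LibJarsSketch` and `splitNonEmpty` are hypothetical names.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not the HADOOP-10820 patch. */
public class LibJarsSketch {

    /** Splits a comma-separated -libjars style value, dropping empty or blank entries. */
    public static List<String> splitNonEmpty(String value) {
        List<String> entries = new ArrayList<>();
        for (String entry : value.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                entries.add(trimmed);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // An empty pathname resolves to the current working directory,
        // so an empty list entry can end up naming the whole directory.
        System.out.println(new File("").getAbsolutePath());

        // Filtering first leaves only the JARs that were actually named.
        System.out.println(splitNonEmpty("a.jar,,c.jar"));
    }
}
```

Whether to skip empty entries silently or reject them with an error is a design choice; rejecting makes typos in the option visible to the user.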
[jira] [Created] (HADOOP-10960) hadoop cause system crash with “soft lock” and “hard lock”
linbao111 created HADOOP-10960: -- Summary: hadoop cause system crash with “soft lock” and “hard lock” Key: HADOOP-10960 URL: https://issues.apache.org/jira/browse/HADOOP-10960 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.2.0 Environment: redhat rhel 6.3,6,4,6.5 jdk1.7.0_45 hadoop2.2 Reporter: linbao111 Priority: Critical I am running hadoop2.2 on redhat6.3-6.5,and all of my machines crashed after a while. /var/log/messages shows repeatedly: Aug 11 06:30:42 jn4_73_128 kernel: BUG: soft lockup - CPU#1 stuck for 67s! [jsvc:11508] Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode dcdbas serio_raw iTCO_w dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash dm_log dm_m od [last unloaded: scsi_wait_scan] Aug 11 06:30:42 jn4_73_128 kernel: CPU 1 Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode dcdbas serio_raw iTCO_w dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash dm_log dm_m od [last unloaded: scsi_wait_scan] Aug 11 06:30:42 jn4_73_128 kernel: Aug 11 06:30:42 jn4_73_128 kernel: Pid: 11508, comm: jsvc Tainted: GW ---2.6.32-279.el6.x86_64 #1 Dell Inc. 
PowerEdge R510/084YMW Aug 11 06:30:42 jn4_73_128 kernel: RIP: 0010:[] [] wait_for_rqlock+0x28/0x40 Aug 11 06:30:42 jn4_73_128 kernel: RSP: 0018:8807786c3ee8 EFLAGS: 0202 Aug 11 06:30:42 jn4_73_128 kernel: RAX: f6e9f6e1 RBX: 8807786c3ee8 RCX: 880028216680 Aug 11 06:30:42 jn4_73_128 kernel: RDX: f6e9 RSI: 88061cd29370 RDI: 0286 Aug 11 06:30:42 jn4_73_128 kernel: RBP: 8100bc0e R08: 0001 R09: 0001 Aug 11 06:30:42 jn4_73_128 kernel: R10: R11: R12: 0286 Aug 11 06:30:42 jn4_73_128 kernel: R13: 8807786c3eb8 R14: 810e0f6e R15: 8807786c3e48 Aug 11 06:30:42 jn4_73_128 kernel: FS: () GS:88002820() knlGS: Aug 11 06:30:42 jn4_73_128 kernel: CS: 0010 DS: ES: CR0: 80050033 Aug 11 06:30:42 jn4_73_128 kernel: CR2: 00e5bd70 CR3: 01a85000 CR4: 06e0 Aug 11 06:30:42 jn4_73_128 kernel: DR0: DR1: DR2: Aug 11 06:30:42 jn4_73_128 kernel: DR3: DR6: 0ff0 DR7: 0400 Aug 11 06:30:42 jn4_73_128 kernel: Process jsvc (pid: 11508, threadinfo 8807786c2000, task 880c1def3500) Aug 11 06:30:42 jn4_73_128 kernel: Stack: Aug 11 06:30:42 jn4_73_128 kernel: 8807786c3f68 8107091b 8807786c3f28 Aug 11 06:30:42 jn4_73_128 kernel: 880701735260 880c1def39c8 880c1def39c8 Aug 11 06:30:42 jn4_73_128 kernel: 8807786c3f28 8807786c3f28 8807786c3f78 7f092d0ad700 Aug 11 06:30:42 jn4_73_128 kernel: Call Trace: Aug 11 06:30:42 jn4_73_128 kernel: [] ? do_exit+0x5ab/0x870 Aug 11 06:30:42 jn4_73_128 kernel: [] ? sys_exit+0x17/0x20 Aug 11 06:30:42 jn4_73_128 kernel: [] ? system_call_fastpath+0x16/0x1b Aug 11 06:30:42 jn4_73_128 kernel: Code: ff ff 90 55 48 89 e5 0f 1f 44 00 00 48 c7 c0 80 66 01 00 65 48 8b 0c 25 b0 e0 00 00 0f ae f0 48 01 c1 eb 09 0f 1f 80 00 00 00 00 90 8b 01 89 c2 c1 fa 10 66 39 c2 75 f2 c9 c3 0f 1f 84 00 00 Aug 11 06:30:42 jn4_73_128 kernel: Call Trace: Aug 11 06:30:42 jn4_73_128 kernel: [] ? do_exit+0x5ab/0x870 Aug 11 06:30:42 jn4_73_128 kernel: [] ? sys_exit+0x17/0x20 Aug 11 06:30:42 jn4_73_128 kernel: [] ? 
system_call_fastpath+0x16/0x1b and finally crashed crash /usr/lib/debug/lib/modules/2.6.32-431.5.1.el6.x86_64/vmlinux /opt/crash/127.0.0.1-2014-08-10-09\:47\:38/vmcore crash 6.1.0-5.el6 Copyright (C) 2002-2012 Red Hat, Inc. Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation Copyright (C) 1999-2006 Hewlett-Packard Co Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited Copyright (C) 2006, 2007 VA Linux Systems Japan K.K. Copyright (C) 2005, 2011 NEC Corporation Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc. Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc. This program is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Enter "help copying" to see the conditions. This program has absolutely no warranty. Enter "help warranty" for details. GNU gdb (GDB) 7.3.1 Copyright (C) 2011 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later
[jira] [Updated] (HADOOP-10959) A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework
[ https://issues.apache.org/jira/browse/HADOOP-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kai Zheng updated HADOOP-10959:
-------------------------------
    Attachment: KerbToken-v2.pdf

Attached the design doc. Your comments are welcome. Thanks.

> A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework
> -------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-10959
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10959
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: security
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>              Labels: Rhino
>         Attachments: KerbToken-v2.pdf
>
>
> To implement and integrate pluggable authentication providers, improve single sign-on for end users, and help enforce centralized access control on the platform, the community has discussed the options widely and concluded that token-based authentication could be the appropriate approach. TokenAuth (HADOOP-9392) was proposed and is under development to implement another Authentication Method alongside Simple and Kerberos. Supporting TokenAuth across the entire ecosystem is a big, long-term effort. Here we propose a short-term alternative based on Kerberos that can complement TokenAuth. Our solution involves fewer code changes with limited risk, and the main development work has already been done in our POC. Users can adopt it as a short-term way to support tokens inside Hadoop.
> This effort and the resulting solution will be fully described in the design document to be attached soon. Below is a brief introduction.
> We propose adding a token-preauth mechanism for Kerberos, similar to PKINIT and OTP and based on the Pre-Authentication framework, which allows users to authenticate to the KDC using a JWT token instead of a password. The KDC authenticates the JWT token and issues a TGT, because it trusts the token authority/issuer via a PKI mechanism. The proposal was submitted to Kerberos and the IETF Kitten WG, and they are interested. We are currently collaborating with the MIT team to work on the draft and standardize the mechanism. We also built a POC that implements the token-preauth mechanism as an MIT Kerberos plugin. The plugin can be packaged separately as a Linux .so module and deployed on top of existing installations. MIT would also like us to contribute the code so that it becomes available in their future releases. Until then, we can make the plugin binary and source code available to the community for experimental use and review.
> So ideally, the token-preauth plugin can be deployed to an MIT Kerberos installation; end users can then authenticate to third-party JWT token authorities, obtain tokens, and use them to acquire a Kerberos TGT from the KDC. Based on that, we implemented token authentication for Hadoop with only a few central modifications to the code base, because we do not have to add another Authentication Method and the solution leverages the existing Kerberos support.
> We added KrbTokenLoginModule, which extends Krb5LoginModule and adds support for logging in with a token or a token cache. The new module is compatible with Krb5LoginModule in configuration and functionality, and thus can be used safely.
> We also added KerberosTokenAuthenticationHandler to support Hadoop web interfaces. It extends KerberosAuthenticationHandler, adds support for token authentication, and performs the SPNEGO negotiation purely on the server side. Again, the new handler is compatible with KerberosAuthenticationHandler and can be used safely.
> The token is exchanged for a Kerberos ticket, and the ticket goes to Hadoop services as it normally does. In addition, so that token attributes can be used to enforce fine-grained authorization or for other purposes, a token derivation is encapsulated into the ticket as Authorization data when the KDC issues the ticket for the token. On the service side (Hadoop services), the token can then be queried and extracted from the service ticket. We made this work in both GSSAPI and SASL contexts, as both are used in Hadoop.
> As far as we can see, the main concern with this solution may be that it requires deploying an additional plugin to existing Kerberos installations and syncing identity accounts from identity management systems to the Kerberos KDC. Most importantly, it requires a Kerberos deployment as a prerequisite. We are also discussing with the MIT team how to simplify Kerberos deployment, especially for large Hadoop clusters, and how to reduce the overhead, such as identity sync, of employing PKINIT/token-preauth mechanisms.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10959) A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework
Kai Zheng created HADOOP-10959:
--------------------------------

             Summary: A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework
                 Key: HADOOP-10959
                 URL: https://issues.apache.org/jira/browse/HADOOP-10959
             Project: Hadoop Common
          Issue Type: New Feature
          Components: security
            Reporter: Kai Zheng
            Assignee: Kai Zheng

To implement and integrate pluggable authentication providers, improve single sign-on for end users, and help enforce centralized access control on the platform, the community has discussed the options widely and concluded that token-based authentication could be the appropriate approach. TokenAuth (HADOOP-9392) was proposed and is under development to implement another Authentication Method alongside Simple and Kerberos. Supporting TokenAuth across the entire ecosystem is a big, long-term effort. Here we propose a short-term alternative based on Kerberos that can complement TokenAuth. Our solution involves fewer code changes with limited risk, and the main development work has already been done in our POC. Users can adopt it as a short-term way to support tokens inside Hadoop.

This effort and the resulting solution will be fully described in the design document to be attached soon. Below is a brief introduction.

We propose adding a token-preauth mechanism for Kerberos, similar to PKINIT and OTP and based on the Pre-Authentication framework, which allows users to authenticate to the KDC using a JWT token instead of a password. The KDC authenticates the JWT token and issues a TGT, because it trusts the token authority/issuer via a PKI mechanism. The proposal was submitted to Kerberos and the IETF Kitten WG, and they are interested. We are currently collaborating with the MIT team to work on the draft and standardize the mechanism. We also built a POC that implements the token-preauth mechanism as an MIT Kerberos plugin. The plugin can be packaged separately as a Linux .so module and deployed on top of existing installations. MIT would also like us to contribute the code so that it becomes available in their future releases. Until then, we can make the plugin binary and source code available to the community for experimental use and review.

So ideally, the token-preauth plugin can be deployed to an MIT Kerberos installation; end users can then authenticate to third-party JWT token authorities, obtain tokens, and use them to acquire a Kerberos TGT from the KDC. Based on that, we implemented token authentication for Hadoop with only a few central modifications to the code base, because we do not have to add another Authentication Method and the solution leverages the existing Kerberos support.

We added KrbTokenLoginModule, which extends Krb5LoginModule and adds support for logging in with a token or a token cache. The new module is compatible with Krb5LoginModule in configuration and functionality, and thus can be used safely.

We also added KerberosTokenAuthenticationHandler to support Hadoop web interfaces. It extends KerberosAuthenticationHandler, adds support for token authentication, and performs the SPNEGO negotiation purely on the server side. Again, the new handler is compatible with KerberosAuthenticationHandler and can be used safely.

The token is exchanged for a Kerberos ticket, and the ticket goes to Hadoop services as it normally does. In addition, so that token attributes can be used to enforce fine-grained authorization or for other purposes, a token derivation is encapsulated into the ticket as Authorization data when the KDC issues the ticket for the token. On the service side (Hadoop services), the token can then be queried and extracted from the service ticket. We made this work in both GSSAPI and SASL contexts, as both are used in Hadoop.

As far as we can see, the main concern with this solution may be that it requires deploying an additional plugin to existing Kerberos installations and syncing identity accounts from identity management systems to the Kerberos KDC. Most importantly, it requires a Kerberos deployment as a prerequisite. We are also discussing with the MIT team how to simplify Kerberos deployment, especially for large Hadoop clusters, and how to reduce the overhead, such as identity sync, of employing PKINIT/token-preauth mechanisms.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-8944) Shell command fs -count should include human readable option
[ https://issues.apache.org/jira/browse/HADOOP-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-8944: - Attachment: HADOOP-8944-2.patch > Shell command fs -count should include human readable option > > > Key: HADOOP-8944 > URL: https://issues.apache.org/jira/browse/HADOOP-8944 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Jonathan Allen >Assignee: Allen Wittenauer >Priority: Trivial > Labels: newbie > Attachments: HADOOP-8944-1.patch, HADOOP-8944-2.patch, > HADOOP-8944.patch > > > The shell command fs -count reports sizes in bytes. The command should accept > a -h option to display the sizes in a human-readable format (K, M, G, > etc.). -- This message was sent by Atlassian JIRA (v6.2#6252)
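The kind of conversion a -h option implies can be sketched as below. This is a minimal illustration, not the code from the HADOOP-8944 patches: the class name `HumanReadableSize`, the 1024-based units, and the one-decimal output format are all assumptions made for the example.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not taken from the HADOOP-8944 patches. */
final class HumanReadableSize {
    // Assumed unit suffixes, using binary (1024-based) steps.
    private static final String[] UNITS = {"", "K", "M", "G", "T", "P", "E"};

    /** Formats a non-negative byte count, e.g. 1536 becomes "1.5 K". */
    static String format(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("negative size: " + bytes);
        }
        if (bytes < 1024) {
            return Long.toString(bytes); // small values stay as plain byte counts
        }
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        // Rounding to one decimal can show "1024.0 K" just below a unit boundary.
        return String.format(Locale.ROOT, "%.1f %s", value, UNITS[unit]);
    }

    public static void main(String[] args) {
        for (long size : new long[] {0L, 1023L, 1024L, 1536L, 1L << 30}) {
            System.out.println(size + " -> " + format(size));
        }
    }
}
```

Whether the real option uses these suffixes or this rounding is up to the patch; the sketch only shows the shape of the conversion.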
[jira] [Updated] (HADOOP-8944) Shell command fs -count should include human readable option
[ https://issues.apache.org/jira/browse/HADOOP-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-8944: - Attachment: (was: HADOOP-8944-2.patch) > Shell command fs -count should include human readable option > > > Key: HADOOP-8944 > URL: https://issues.apache.org/jira/browse/HADOOP-8944 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Jonathan Allen >Assignee: Allen Wittenauer >Priority: Trivial > Labels: newbie > Attachments: HADOOP-8944-1.patch, HADOOP-8944.patch > > > The shell command fs -count reports sizes in bytes. The command should accept > a -h option to display the sizes in a human-readable format (K, M, G, > etc.). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized
[ https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093677#comment-14093677 ] Hadoop QA commented on HADOOP-10820: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12661108/HADOOP-10820-3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.security.token.delegation.web.TestWebDelegationToken {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4454//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4454//console This message is automatically generated. > Empty entry in libjars results in working directory being recursively > localized > --- > > Key: HADOOP-10820 > URL: https://issues.apache.org/jira/browse/HADOOP-10820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Alex Holmes >Priority: Minor > Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, > HADOOP-10820-3.patch, HADOOP-10820.patch > > > An empty token (e.g. 
"a.jar,,b.jar") in the -libjars option causes the > current working directory to be recursively localized. > Here's an example of this in action (using Hadoop 2.2.0): > {code} > # create a temp directory and touch three JAR files > mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar > # Run an example job only specifying two of the JARs. > # Include an empty entry in libjars. > hadoop jar > /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar > pi -libjars a.jar,,c.jar 2 10 > # As the job is running examine the localized directory in HDFS. > # Notice that not only are the two JAR's specified in libjars copied, > # but in addition the contents of the working directory are also recursively > copied. > $ hadoop fs -lsr > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9869) Configuration.getSocketAddr()/getEnum() should use getTrimmed()
[ https://issues.apache.org/jira/browse/HADOOP-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093666#comment-14093666 ] Tsuyoshi OZAWA commented on HADOOP-9869: The test failure is not related to the patch. [~ste...@apache.org], do you mind reviewing the patch? > Configuration.getSocketAddr()/getEnum() should use getTrimmed() > > > Key: HADOOP-9869 > URL: https://issues.apache.org/jira/browse/HADOOP-9869 > Project: Hadoop Common > Issue Type: Improvement > Components: conf >Affects Versions: 3.0.0, 2.1.0-beta, 1.3.0 >Reporter: Steve Loughran >Assignee: Tsuyoshi OZAWA >Priority: Minor > Attachments: HADOOP-9869.1.patch, HADOOP-9869.2.patch, > HADOOP-9869.3.patch, HADOOP-9869.4.patch > > > YARN-1059 has shown that the hostname:port string used for the address of > things like the RM isn't trimmed before it's parsed, leading to errors that > aren't that obvious. > We should trim it; it's clearly not going to break any existing (valid) > configurations. -- This message was sent by Atlassian JIRA (v6.2#6252)
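Why an untrimmed hostname:port value fails can be shown with plain JDK calls. The sketch below deliberately avoids Hadoop's Configuration class; the class name `TrimmedAddress`, its methods, and the host `rm.example.com` are assumptions for illustration, not the contents of the HADOOP-9869 patches.

```java
import java.net.InetSocketAddress;

/** Hypothetical helper for illustration; it does not use Hadoop's Configuration. */
final class TrimmedAddress {
    /** Returns true if the port part of a raw "host:port" value parses as-is. */
    static boolean portParsesUntrimmed(String raw) {
        try {
            // Integer.parseInt rejects surrounding whitespace such as a trailing newline.
            Integer.parseInt(raw.substring(raw.lastIndexOf(':') + 1));
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Trims the raw value first, then splits it into host and port. */
    static InetSocketAddress parse(String raw) {
        String value = raw.trim();
        int colon = value.lastIndexOf(':');
        if (colon <= 0 || colon == value.length() - 1) {
            throw new IllegalArgumentException("expected host:port but got '" + raw + "'");
        }
        String host = value.substring(0, colon);
        int port = Integer.parseInt(value.substring(colon + 1));
        // Unresolved on purpose: the sketch should not perform DNS lookups.
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        String raw = " rm.example.com:8032\n"; // e.g. an XML value with stray whitespace
        System.out.println("untrimmed port parses: " + portParsesUntrimmed(raw));
        InetSocketAddress addr = parse(raw);
        System.out.println(addr.getHostString() + ":" + addr.getPort());
    }
}
```

A valid configuration value never contains leading or trailing whitespace, which is why trimming should not break existing setups.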
[jira] [Updated] (HADOOP-8944) Shell command fs -count should include human readable option
[ https://issues.apache.org/jira/browse/HADOOP-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-8944: - Status: Patch Available (was: Open) > Shell command fs -count should include human readable option > > > Key: HADOOP-8944 > URL: https://issues.apache.org/jira/browse/HADOOP-8944 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Jonathan Allen >Assignee: Allen Wittenauer >Priority: Trivial > Labels: newbie > Attachments: HADOOP-8944-1.patch, HADOOP-8944-2.patch, > HADOOP-8944.patch > > > The shell command fs -count reports sizes in bytes. The command should accept > a -h option to display the sizes in a human-readable format (K, M, G, > etc.). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-8944) Shell command fs -count should include human readable option
[ https://issues.apache.org/jira/browse/HADOOP-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-8944: - Attachment: HADOOP-8944-2.patch This patch reworks the test cases and undoes the change to QUOTA_HEADER. > Shell command fs -count should include human readable option > > > Key: HADOOP-8944 > URL: https://issues.apache.org/jira/browse/HADOOP-8944 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Jonathan Allen >Assignee: Allen Wittenauer >Priority: Trivial > Labels: newbie > Attachments: HADOOP-8944-1.patch, HADOOP-8944-2.patch, > HADOOP-8944.patch > > > The shell command fs -count reports sizes in bytes. The command should accept > a -h option to display the sizes in a human-readable format (K, M, G, > etc.). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9869) Configuration.getSocketAddr()/getEnum() should use getTrimmed()
[ https://issues.apache.org/jira/browse/HADOOP-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093660#comment-14093660 ] Hadoop QA commented on HADOOP-9869: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/1266/HADOOP-9869.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4452//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4452//console This message is automatically generated. 
> Configuration.getSocketAddr()/getEnum() should use getTrimmed() > > > Key: HADOOP-9869 > URL: https://issues.apache.org/jira/browse/HADOOP-9869 > Project: Hadoop Common > Issue Type: Improvement > Components: conf >Affects Versions: 3.0.0, 2.1.0-beta, 1.3.0 >Reporter: Steve Loughran >Assignee: Tsuyoshi OZAWA >Priority: Minor > Attachments: HADOOP-9869.1.patch, HADOOP-9869.2.patch, > HADOOP-9869.3.patch, HADOOP-9869.4.patch > > > YARN-1059 has shown that the hostname:port string used for the address of > things like the RM isn't trimmed before it's parsed, leading to errors that > aren't that obvious. > We should trim it; it's clearly not going to break any existing (valid) > configurations. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10836) Replace HttpFS custom proxyuser handling with common implementation
[ https://issues.apache.org/jira/browse/HADOOP-10836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093639#comment-14093639 ] Hadoop QA commented on HADOOP-10836: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12661100/HADOOP-10836.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-httpfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4451//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4451//console This message is automatically generated. 
> Replace HttpFS custom proxyuser handling with common implementation > --- > > Key: HADOOP-10836 > URL: https://issues.apache.org/jira/browse/HADOOP-10836 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: 2.4.1 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: COMBO.patch, HADOOP-10836.patch, HADOOP-10836.patch, > HADOOP-10836.patch, HADOOP-10836.patch > > > Use HADOOP-10835 to implement proxyuser logic in HttpFS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10957) The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard
[ https://issues.apache.org/jira/browse/HADOOP-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093632#comment-14093632 ] Colin Patrick McCabe commented on HADOOP-10957: --- Credit goes to Daryn for originally identifying this issue in HADOOP-10942 > The globber will sometimes erroneously return a permission denied exception > when there is a non-terminal wildcard > - > > Key: HADOOP-10957 > URL: https://issues.apache.org/jira/browse/HADOOP-10957 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.3.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HADOOP-10957.001.patch > > > The globber will sometimes erroneously return a permission denied exception > when there is a non-terminal wildcard. The existing unit tests don't catch > this, because it doesn't happen for superusers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10942) Globbing optimizations and regression fix
[ https://issues.apache.org/jira/browse/HADOOP-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093634#comment-14093634 ] Colin Patrick McCabe commented on HADOOP-10942: --- I created HADOOP-10958 for the globber test rework (I think it's going to be a giant patch, although simple in concept). I created HADOOP-10957 for the urgent globber bug, and posted a small bugfix that we can easily backport. Can you file a JIRA for the FileContext issue, if there's not one out there already? And perhaps one for any other miscellaneous optimizations / refactoring we should do in the globber. > Globbing optimizations and regression fix > - > > Key: HADOOP-10942 > URL: https://issues.apache.org/jira/browse/HADOOP-10942 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 3.0.0, 2.1.0-beta >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HADOOP-10942.patch > > > When globbing was commonized to support both filesystem and filecontext, it > regressed a fix that prevents an intermediate glob that matches a file from > throwing a confusing permissions exception. The hdfs traverse check requires > the exec bit, which a file does not have. > Additional optimizations to reduce RPCs actually increase them if > directories contain 1 item. -- This message was sent by Atlassian JIRA (v6.2#6252)
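The guard described in the issue (do not descend into a file that matched a non-terminal glob component) can be sketched on the local file system with java.nio. This is only an analogy under stated assumptions: the class `NonTerminalGlob` and its methods are hypothetical, it is not Hadoop's Globber, and a local file system does not raise the HDFS permission exception, so here the check merely skips pointless lookups.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical local-filesystem analogy; not Hadoop's Globber. */
final class NonTerminalGlob {
    /** Expands "dirGlob/leafName" under root, descending only into directories. */
    static List<Path> expand(Path root, String dirGlob, String leafName) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> candidates = Files.newDirectoryStream(root, dirGlob)) {
            for (Path candidate : candidates) {
                // A file that matches a non-terminal component can never contain
                // the leaf, so skip it instead of trying to traverse into it.
                if (!Files.isDirectory(candidate)) {
                    continue;
                }
                Path leaf = candidate.resolve(leafName);
                if (Files.exists(leaf)) {
                    matches.add(leaf);
                }
            }
        }
        return matches;
    }

    /** Builds a small temp tree (left behind afterwards) and counts the matches. */
    static int demo() {
        try {
            Path root = Files.createTempDirectory("globdemo");
            Files.createDirectory(root.resolve("a"));
            Files.createFile(root.resolve("a").resolve("x.txt"));
            Files.createFile(root.resolve("b")); // a plain file matching "*"
            return expand(root, "*", "x.txt").size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("matches: " + demo());
    }
}
```

In the demo, both `a` and `b` match `*`, but only the directory `a` is searched for `x.txt`.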
[jira] [Updated] (HADOOP-9869) Configuration.getSocketAddr()/getEnum() should use getTrimmed()
[ https://issues.apache.org/jira/browse/HADOOP-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated HADOOP-9869: --- Attachment: HADOOP-9869.4.patch Rebased on trunk. > Configuration.getSocketAddr()/getEnum() should use getTrimmed() > > > Key: HADOOP-9869 > URL: https://issues.apache.org/jira/browse/HADOOP-9869 > Project: Hadoop Common > Issue Type: Improvement > Components: conf >Affects Versions: 3.0.0, 2.1.0-beta, 1.3.0 >Reporter: Steve Loughran >Assignee: Tsuyoshi OZAWA >Priority: Minor > Attachments: HADOOP-9869.1.patch, HADOOP-9869.2.patch, > HADOOP-9869.3.patch, HADOOP-9869.4.patch > > > YARN-1059 has shown that the hostname:port string used for the address of > things like the RM isn't trimmed before it's parsed, leading to errors that > aren't that obvious. > We should trim it; it's clearly not going to break any existing (valid) > configurations. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10957) The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard
[ https://issues.apache.org/jira/browse/HADOOP-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HADOOP-10957: -- Status: Patch Available (was: Open) > The globber will sometimes erroneously return a permission denied exception > when there is a non-terminal wildcard > - > > Key: HADOOP-10957 > URL: https://issues.apache.org/jira/browse/HADOOP-10957 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.3.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HADOOP-10957.001.patch > > > The globber will sometimes erroneously return a permission denied exception > when there is a non-terminal wildcard. The existing unit tests don't catch > this, because it doesn't happen for superusers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized
[ https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093628#comment-14093628 ] Andrew Wang commented on HADOOP-10820: -- +1 pending Jenkins, thanks Zhihai! > Empty entry in libjars results in working directory being recursively > localized > --- > > Key: HADOOP-10820 > URL: https://issues.apache.org/jira/browse/HADOOP-10820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Alex Holmes >Priority: Minor > Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, > HADOOP-10820-3.patch, HADOOP-10820.patch > > > An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the > current working directory to be recursively localized. > Here's an example of this in action (using Hadoop 2.2.0): > {code} > # create a temp directory and touch three JAR files > mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar > # Run an example job only specifying two of the JARs. > # Include an empty entry in libjars. > hadoop jar > /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar > pi -libjars a.jar,,c.jar 2 10 > # As the job is running examine the localized directory in HDFS. > # Notice that not only are the two JAR's specified in libjars copied, > # but in addition the contents of the working directory are also recursively > copied. 
> $ hadoop fs -lsr > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10957) The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard
[ https://issues.apache.org/jira/browse/HADOOP-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HADOOP-10957: -- Attachment: HADOOP-10957.001.patch > The globber will sometimes erroneously return a permission denied exception > when there is a non-terminal wildcard > - > > Key: HADOOP-10957 > URL: https://issues.apache.org/jira/browse/HADOOP-10957 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.3.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HADOOP-10957.001.patch > > > The globber will sometimes erroneously return a permission denied exception > when there is a non-terminal wildcard. The existing unit tests don't catch > this, because it doesn't happen for superusers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10958) TestGlobPaths should do more tests of globbing by unprivileged users
Colin Patrick McCabe created HADOOP-10958: - Summary: TestGlobPaths should do more tests of globbing by unprivileged users Key: HADOOP-10958 URL: https://issues.apache.org/jira/browse/HADOOP-10958 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.3.0 Reporter: Colin Patrick McCabe TestGlobPaths should do more tests of globbing by unprivileged users. Right now, most of the tests are of globbing by the superuser, but this tends to hide permission exception issues such as HADOOP-10957. We should keep a few tests operating with privileged globs, but do most of them unprivileged. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes
[ https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093620#comment-14093620 ] Charles Lamb commented on HADOOP-10919: --- I should clarify case (1). If you are distcp'ing from the ez root or higher, then you don't need to pre-create the EZ because all of the raw.* xattrs will be preserved. Given that, I'm wondering what would the purpose be for checking that the target is an EZ? > Copy command should preserve raw.* namespace extended attributes > > > Key: HADOOP-10919 > URL: https://issues.apache.org/jira/browse/HADOOP-10919 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 3.0.0 >Reporter: Charles Lamb >Assignee: Charles Lamb > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch > > > Refer to the doc attached to HDFS-6509 for background. > Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve > extended attributes in the raw.* namespace by default whenever the src and > target are in /.reserved/raw. To not preserve raw xattrs, don't specify > /.reserved/raw in either the src or target. -- This message was sent by Atlassian JIRA (v6.2#6252)
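The rule stated in the description (preserve raw.* xattrs only when both src and target are under /.reserved/raw) reduces to a small predicate. The sketch below is a hypothetical illustration of that rule on plain path strings; the class `RawXAttrPolicy` and its methods are assumptions, not code from the HADOOP-10919 patches, and it ignores URI schemes and path normalization.

```java
/** Hypothetical illustration of the rule in the description; not Hadoop code. */
final class RawXAttrPolicy {
    static final String RAW_PREFIX = "/.reserved/raw";

    /** True if the path is /.reserved/raw itself or lies beneath it. */
    static boolean isRawPath(String path) {
        return path.equals(RAW_PREFIX) || path.startsWith(RAW_PREFIX + "/");
    }

    /** raw.* xattrs are preserved only when both ends use the raw prefix. */
    static boolean preserveRawXAttrs(String src, String target) {
        return isRawPath(src) && isRawPath(target);
    }

    public static void main(String[] args) {
        System.out.println(preserveRawXAttrs("/.reserved/raw/src", "/.reserved/raw/dest"));
        System.out.println(preserveRawXAttrs("/.reserved/raw/src", "/dest"));
        System.out.println(preserveRawXAttrs("/src", "/dest"));
    }
}
```

The `startsWith(RAW_PREFIX + "/")` check keeps a sibling such as `/.reserved/rawdata` from being treated as a raw path.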
[jira] [Commented] (HADOOP-10958) TestGlobPaths should do more tests of globbing by unprivileged users
[ https://issues.apache.org/jira/browse/HADOOP-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093621#comment-14093621 ] Colin Patrick McCabe commented on HADOOP-10958: --- I think we should consider renaming {{TestGlobPaths#fs}} to {{TestGlobPaths#privFs}}, and renaming {{TestGlobPaths#unprivilegedFs}} to {{TestGlobPaths#unprivFs}} to make it less unwieldy to type. (And similar for FC, and all the other wrappers, etc.) This will also make it clear which one we're using. > TestGlobPaths should do more tests of globbing by unprivileged users > > > Key: HADOOP-10958 > URL: https://issues.apache.org/jira/browse/HADOOP-10958 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.3.0 >Reporter: Colin Patrick McCabe > > TestGlobPaths should do more tests of globbing by unprivileged users. Right > now, most of the tests are of globbing by the superuser, but this tends to > hide permission exception issues such as HADOOP-10957. We should keep a few > tests operating with privileged globs, but do most of them unprivileged. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10957) The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard
Colin Patrick McCabe created HADOOP-10957: - Summary: The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard Key: HADOOP-10957 URL: https://issues.apache.org/jira/browse/HADOOP-10957 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.3.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard. The existing unit tests don't catch this, because it doesn't happen for superusers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized
[ https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093616#comment-14093616 ]

zhihai xu commented on HADOOP-10820:
------------------------------------

As discussed with [~andrew.wang] offline, I made two changes in HADOOP-10820-3.patch:
1. Changed the message "File list length can't be zero" to "File name can't be empty string", because both cases are caused by an empty string.
2. Used tmp.isEmpty() instead of "".equals(tmp), which reads more clearly. This is safe because String's split function guarantees that the string elements in the returned array are not null.

> Empty entry in libjars results in working directory being recursively localized
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-10820
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10820
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Alex Holmes
>            Priority: Minor
>         Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, HADOOP-10820-3.patch, HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # Create a temp directory and touch four JAR files (one in a subdirectory).
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job specifying only two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar \
>   /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar \
>   pi -libjars a.jar,,c.jar 2 10
> # As the job is running, examine the localized directory in HDFS.
> # Notice that not only are the two JARs specified in libjars copied,
> # but the contents of the working directory are also recursively copied.
> $ hadoop fs -lsr /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}

-- This message was sent by Atlassian JIRA (v6.2#6252)
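The validation discussed in the comment above (reject an empty entry with the message "File name can't be empty string", using an isEmpty() check on each split element) can be sketched in standalone Java. The class `LibJarsList` and its `parse` method are hypothetical; this is not the code in HADOOP-10820-3.patch, and the use of `split(",", -1)` to also catch a trailing comma is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not the code in HADOOP-10820-3.patch. */
final class LibJarsList {
    /** Splits a comma-separated list and rejects empty entries up front. */
    static List<String> parse(String commaSeparated) {
        List<String> names = new ArrayList<>();
        // The -1 limit keeps trailing empty strings, so "a.jar," is rejected too.
        for (String entry : commaSeparated.split(",", -1)) {
            String tmp = entry.trim();
            // split never yields null elements, so isEmpty() is safe here.
            if (tmp.isEmpty()) {
                throw new IllegalArgumentException("File name can't be empty string");
            }
            names.add(tmp);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(parse("a.jar,c.jar"));
        try {
            parse("a.jar,,c.jar");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast on the empty entry means it never reaches the code that resolves entries to paths, which is where the report shows the working directory being picked up.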
[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes
[ https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093611#comment-14093611 ] Charles Lamb commented on HADOOP-10919: --- Sanjay, There are three scenarios. (1) An administrator who does not have access to the keys in the KMS would use the /.reserved/raw prefix on src and dest: distcp /.reserved/raw/src /.reserved/raw/dest The /.reserved/raw path is the only interface that exposes the raw.* xattrs holding the encryption metadata. This allows the raw.* xattrs to be preserved on the dest and the files to be copied without decrypting them. This scenario assumes that an encryption zone (ez) has been set up on dest. As you suggested, it would be a good idea to check that the dest is actually an ez. (2) A non-admin user who has access to some subset of files in an ez could use paths without the /.reserved/raw prefix and copy a hierarchy from one ez to another. In that case, the raw.* xattrs from the src ez would not be preserved. This scenario assumes that the dest ez is already set up. Of course the dest files will have new keys associated with them since they'll be new copies. (3) Neither src nor dest uses the /.reserved/raw prefix, and one or the other of src/dest is not an ez. It is not necessary to have the target also be an ez. The use case would be that the user wants to copy a subset of the ez into/out-of a non-encrypted file system. distcp without the /.reserved/raw prefix could be used for this. Does this all make sense? > Copy command should preserve raw.* namespace extended attributes > > > Key: HADOOP-10919 > URL: https://issues.apache.org/jira/browse/HADOOP-10919 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 3.0.0 >Reporter: Charles Lamb >Assignee: Charles Lamb > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch > > > Refer to the doc attached to HDFS-6509 for background. 
> Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve > extended attributes in the raw.* namespace by default whenever the src and > target are in /.reserved/raw. To not preserve raw xattrs, don't specify > /.reserved/raw in either the src or target. -- This message was sent by Atlassian JIRA (v6.2#6252)
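As a rough illustration of the rule in this issue description, the Java sketch below decides whether raw.* xattrs would be preserved from the src and target paths alone. The class and method names are hypothetical, and rejecting a mixed pair of paths is an assumption based on the both-or-neither requirement that another comment in this thread states for distcp; this is not the patch's implementation.

```java
/** Hypothetical helper sketching the /.reserved/raw both-or-neither rule. */
final class RawPathCheck {
    static final String RAW_PREFIX = "/.reserved/raw";

    /** True if the path is /.reserved/raw itself or lies underneath it. */
    static boolean isRaw(String path) {
        return path.equals(RAW_PREFIX) || path.startsWith(RAW_PREFIX + "/");
    }

    /**
     * Returns true when both paths are under /.reserved/raw (raw.* xattrs
     * preserved), false when neither is, and throws when only one is
     * (assumed to be an error in this sketch).
     */
    static boolean shouldPreserveRawXattrs(String src, String target) {
        boolean srcRaw = isRaw(src);
        boolean targetRaw = isRaw(target);
        if (srcRaw != targetRaw) {
            throw new IllegalArgumentException(
                "src and target must both be in " + RAW_PREFIX + " or neither");
        }
        return srcRaw;
    }
}
```

Checking for the prefix followed by a slash avoids treating a sibling such as /.reserved/rawdata as a raw path.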
[jira] [Updated] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized
[ https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HADOOP-10820: --- Attachment: HADOOP-10820-3.patch > Empty entry in libjars results in working directory being recursively > localized > --- > > Key: HADOOP-10820 > URL: https://issues.apache.org/jira/browse/HADOOP-10820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Alex Holmes >Priority: Minor > Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, > HADOOP-10820-3.patch, HADOOP-10820.patch > > > An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the > current working directory to be recursively localized. > Here's an example of this in action (using Hadoop 2.2.0): > {code} > # create a temp directory and touch three JAR files > mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar > # Run an example job only specifying two of the JARs. > # Include an empty entry in libjars. > hadoop jar > /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar > pi -libjars a.jar,,c.jar 2 10 > # As the job is running examine the localized directory in HDFS. > # Notice that not only are the two JAR's specified in libjars copied, > # but in addition the contents of the working directory are also recursively > copied. 
> $ hadoop fs -lsr > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10955) FSShell's get operation should have the ability to take "start" and "length" argument
[ https://issues.apache.org/jira/browse/HADOOP-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guo Ruijing updated HADOOP-10955: - Description: Use Case: if HDFS file is corrupted, some tool can be used to copy out good part of corrupted file. We may enhance "hdfs -get" to copy out good part. Existing: hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] ... Proposal: hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-start] [-length] ... was: Use Case: if HDFS file is corrupted, some tool can be used to copy out good part of corrupted file. We may enhance "hdfs -get" to copy out good part. Existing: hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] ... Proposal: hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-length] ... Summary: FSShell's get operation should have the ability to take "start" and "length" argument (was: FSShell's get operation should have the ability to take a "length" argument) > FSShell's get operation should have the ability to take "start" and "length" > argument > - > > Key: HADOOP-10955 > URL: https://issues.apache.org/jira/browse/HADOOP-10955 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Guo Ruijing > > Use Case: if HDFS file is corrupted, some tool can be used to copy out good > part of corrupted file. We may enhance "hdfs -get" to copy out good part. > Existing: > hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] ... > Proposal: > hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-start] [-length] > ... -- This message was sent by Atlassian JIRA (v6.2#6252)
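The proposed "start" and "length" arguments amount to copying a byte range out of a file. A minimal sketch of that idea using only java.io streams follows; the RangeCopy class and copyRange method are hypothetical names, and nothing here reflects how FsShell would actually implement the option.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch of copying a byte range; not FsShell code. */
final class RangeCopy {

    /**
     * Copies up to {@code length} bytes from {@code in}, beginning at byte
     * offset {@code start}, into {@code out}. Returns the bytes copied.
     */
    static long copyRange(InputStream in, OutputStream out, long start, long length)
            throws IOException {
        if (start < 0 || length < 0) {
            throw new IllegalArgumentException("start and length must be non-negative");
        }
        // skip() may skip fewer bytes than requested, so loop until done.
        long toSkip = start;
        while (toSkip > 0) {
            long skipped = in.skip(toSkip);
            if (skipped <= 0) {
                // Fall back to a single read to distinguish EOF from a short skip.
                if (in.read() < 0) {
                    return 0; // stream ended before reaching start
                }
                skipped = 1;
            }
            toSkip -= skipped;
        }
        byte[] buf = new byte[8192];
        long copied = 0;
        while (copied < length) {
            int want = (int) Math.min(buf.length, length - copied);
            int n = in.read(buf, 0, want);
            if (n < 0) {
                break; // stream ended before length bytes were available
            }
            out.write(buf, 0, n);
            copied += n;
        }
        return copied;
    }
}
```

The return value lets a caller detect a range that runs past the end of the input, which matters for the corrupted-file use case where only part of the data is readable.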
[jira] [Commented] (HADOOP-10955) FSShell's get operation should have the ability to take a "length" argument
[ https://issues.apache.org/jira/browse/HADOOP-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093605#comment-14093605 ] Guo Ruijing commented on HADOOP-10955: -- Hi, Colin. The "start" argument is a good point; I have updated the JIRA title and description according to your comments. > FSShell's get operation should have the ability to take a "length" argument > --- > > Key: HADOOP-10955 > URL: https://issues.apache.org/jira/browse/HADOOP-10955 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Guo Ruijing > > Use Case: if HDFS file is corrupted, some tool can be used to copy out good > part of corrupted file. We may enhance "hdfs -get" to copy out good part. > Existing: > hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] ... > Proposal: > hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-length] ... > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes
[ https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093594#comment-14093594 ] Sanjay Radia commented on HADOOP-10919: --- Charles, what is the usage model for distcp of encrypted files: * distcp path1 path2 - where distcp will insert /.reserved/.raw into the pathnames if they are in an encrypted zone. * OR distcp /.reserved/.raw/path1 /.reserved/.raw/path2 BTW, is the proposal that both src and dest MUST be encrypted zones, or neither? (Because of your "misspoke" comment I am a little confused.) > Copy command should preserve raw.* namespace extended attributes > > > Key: HADOOP-10919 > URL: https://issues.apache.org/jira/browse/HADOOP-10919 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 3.0.0 >Reporter: Charles Lamb >Assignee: Charles Lamb > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch > > > Refer to the doc attached to HDFS-6509 for background. > Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve > extended attributes in the raw.* namespace by default whenever the src and > target are in /.reserved/raw. To not preserve raw xattrs, don't specify > /.reserved/raw in either the src or target. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level
[ https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093593#comment-14093593 ] Hadoop QA commented on HADOOP-10281: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12661094/HADOOP-10281.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4450//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4450//console This message is automatically generated. > Create a scheduler, which assigns schedulables a priority level > --- > > Key: HADOOP-10281 > URL: https://issues.apache.org/jira/browse/HADOOP-10281 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Chris Li >Assignee: Chris Li > Attachments: HADOOP-10281.patch, HADOOP-10281.patch, > HADOOP-10281.patch, HADOOP-10281.patch > > > The Scheduler decides which sub-queue to assign a given Call. 
It implements a > single method getPriorityLevel(Schedulable call) which returns an integer > corresponding to the subqueue the FairCallQueue should place the call in. > The HistoryRpcScheduler is one such implementation which uses the username of > each call and determines what % of calls in recent history were made by this > user. > It is configured with a historyLength (how many calls to track) and a list of > integer thresholds which determine the boundaries between priority levels. > For instance, if the scheduler has a historyLength of 8; and priority > thresholds of 4,2,1; and saw calls made by these users in order: > Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice > * Another call by Alice would be placed in queue 3, since she has already > made >= 4 calls > * Another call by Bob would be placed in queue 2, since he has >= 2 but less > than 4 calls > * A call by Carlos would be placed in queue 0, since he has no calls in the > history > Also, some versions of this patch include the concept of a 'service user', > which is a user that is always scheduled high-priority. Currently this seems > redundant and will probably be removed in later patches, since it's not too > useful. > > As of now, the current scheduler is the DecayRpcScheduler, which only keeps > track of the number of each type of call and decays these counts periodically. -- This message was sent by Atlassian JIRA (v6.2#6252)
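The worked example in this description (historyLength of 8, thresholds 4,2,1) can be reproduced with a small Java sketch. The class below is a hypothetical simplification keyed on a plain user name rather than a Schedulable, it separates recording a call from looking up its priority, and it is not the HistoryRpcScheduler or DecayRpcScheduler code from the patch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a history-based priority lookup. */
final class HistorySchedulerSketch {
    private final int historyLength;
    private final int[] thresholds; // descending, e.g. {4, 2, 1}
    private final Deque<String> history = new ArrayDeque<>();
    private final Map<String, Integer> counts = new HashMap<>();

    HistorySchedulerSketch(int historyLength, int... thresholds) {
        this.historyLength = historyLength;
        this.thresholds = thresholds.clone();
    }

    /** Records a call, evicting the oldest one once the history is full. */
    void record(String user) {
        history.addLast(user);
        counts.merge(user, 1, Integer::sum);
        if (history.size() > historyLength) {
            String evicted = history.removeFirst();
            // Drop the map entry entirely when the count would reach zero.
            counts.computeIfPresent(evicted, (k, v) -> v == 1 ? null : v - 1);
        }
    }

    /** Returns the queue index for the user's next call; 0 means no history. */
    int getPriorityLevel(String user) {
        int count = counts.getOrDefault(user, 0);
        for (int i = 0; i < thresholds.length; i++) {
            if (count >= thresholds[i]) {
                return thresholds.length - i;
            }
        }
        return 0;
    }
}
```

After recording the eight calls from the example in order, this sketch places Alice in queue 3, Bob in queue 2, and Carlos in queue 0, matching the description; Jerry, with one call, lands in queue 1.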
[jira] [Updated] (HADOOP-10836) Replace HttpFS custom proxyuser handling with common implementation
[ https://issues.apache.org/jira/browse/HADOOP-10836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10836: Attachment: HADOOP-10836.patch rebasing patch on trunk now that HADOOP-10835 is committed. > Replace HttpFS custom proxyuser handling with common implementation > --- > > Key: HADOOP-10836 > URL: https://issues.apache.org/jira/browse/HADOOP-10836 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: 2.4.1 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Attachments: COMBO.patch, HADOOP-10836.patch, HADOOP-10836.patch, > HADOOP-10836.patch, HADOOP-10836.patch > > > Use HADOOP-10835 to implement proxyuser logic in HttpFS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes
[ https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093565#comment-14093565 ] Charles Lamb commented on HADOOP-10919: --- Sanjay, I just re-read your comment and I realized that I mis-spoke. Yes, I think it would make sense. I'll open a jira for that. Thanks. > Copy command should preserve raw.* namespace extended attributes > > > Key: HADOOP-10919 > URL: https://issues.apache.org/jira/browse/HADOOP-10919 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 3.0.0 >Reporter: Charles Lamb >Assignee: Charles Lamb > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch > > > Refer to the doc attached to HDFS-6509 for background. > Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve > extended attributes in the raw.* namespace by default whenever the src and > target are in /.reserved/raw. To not preserve raw xattrs, don't specify > /.reserved/raw in either the src or target. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10956) Fix create-release script to include docs in the binary
Karthik Kambatla created HADOOP-10956: - Summary: Fix create-release script to include docs in the binary Key: HADOOP-10956 URL: https://issues.apache.org/jira/browse/HADOOP-10956 Project: Hadoop Common Issue Type: Bug Components: scripts Affects Versions: 2.5.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker The create-release script doesn't include docs in the binary tarball. We should fix that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized
[ https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093560#comment-14093560 ] Andrew Wang commented on HADOOP-10820: -- Hi Zhihai, the only comment I have is that it would be nice to validate for an empty string before checking the file list length, e.g.: {noformat} -> % hadoop fs -files ,, Exception in thread "main" java.lang.IllegalArgumentException: File list length can't be zero {noformat} I think we just need to do this check on the finalArr at the end instead. +1 pending this. > Empty entry in libjars results in working directory being recursively > localized > --- > > Key: HADOOP-10820 > URL: https://issues.apache.org/jira/browse/HADOOP-10820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Alex Holmes >Priority: Minor > Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, > HADOOP-10820.patch > > > An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the > current working directory to be recursively localized. > Here's an example of this in action (using Hadoop 2.2.0): > {code} > # create a temp directory and touch three JAR files > mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar > # Run an example job only specifying two of the JARs. > # Include an empty entry in libjars. > hadoop jar > /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar > pi -libjars a.jar,,c.jar 2 10 > # As the job is running examine the localized directory in HDFS. > # Notice that not only are the two JAR's specified in libjars copied, > # but in addition the contents of the working directory are also recursively > copied. 
> $ hadoop fs -lsr > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes
[ https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093557#comment-14093557 ] Charles Lamb commented on HADOOP-10919: --- bq. Charles, you list disadvantage for the .raw scheme where the target of a distcp is not an encrypted zone. Would it make sense for distcp to check for that and to fail the distcp? Hi Sanjay, Presently distcp requires both src and target to be either both in /.reserved/raw or neither in /.reserved/raw. I'll update the HDFS-6509 document and comments. Thanks for catching that. > Copy command should preserve raw.* namespace extended attributes > > > Key: HADOOP-10919 > URL: https://issues.apache.org/jira/browse/HADOOP-10919 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 3.0.0 >Reporter: Charles Lamb >Assignee: Charles Lamb > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch > > > Refer to the doc attached to HDFS-6509 for background. > Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve > extended attributes in the raw.* namespace by default whenever the src and > target are in /.reserved/raw. To not preserve raw xattrs, don't specify > /.reserved/raw in either the src or target. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes
[ https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093548#comment-14093548 ] Sanjay Radia commented on HADOOP-10919: --- Charles, you list disadvantage for the .raw scheme where the target of a distcp is not an encrypted zone. Would it make sense for distcp to check for that and to fail the distcp? > Copy command should preserve raw.* namespace extended attributes > > > Key: HADOOP-10919 > URL: https://issues.apache.org/jira/browse/HADOOP-10919 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 3.0.0 >Reporter: Charles Lamb >Assignee: Charles Lamb > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch > > > Refer to the doc attached to HDFS-6509 for background. > Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve > extended attributes in the raw.* namespace by default whenever the src and > target are in /.reserved/raw. To not preserve raw xattrs, don't specify > /.reserved/raw in either the src or target. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-8944) Shell command fs -count should include human readable option
[ https://issues.apache.org/jira/browse/HADOOP-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-8944: - Status: Open (was: Patch Available) > Shell command fs -count should include human readable option > > > Key: HADOOP-8944 > URL: https://issues.apache.org/jira/browse/HADOOP-8944 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Jonathan Allen >Assignee: Allen Wittenauer >Priority: Trivial > Labels: newbie > Attachments: HADOOP-8944-1.patch, HADOOP-8944.patch > > > The shell command fs -count report sizes in bytes. The command should accept > a -h option to display the sizes in a human readable format, i.e. K, M, G, > etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
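To make the requested -h behaviour concrete, here is a minimal Java sketch of converting a byte count to a K/M/G style string. The HumanReadableSize class, the 1024 base, and the one-decimal output format are all assumptions chosen for illustration; the attached patches may format sizes differently.

```java
import java.util.Locale;

/** Hypothetical sketch of human-readable size formatting; not the patch code. */
final class HumanReadableSize {
    private static final String[] UNITS = {"", "K", "M", "G", "T", "P", "E"};

    /** Formats a non-negative byte count using binary (base-1024) units. */
    static String format(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative");
        }
        if (bytes < 1024) {
            return Long.toString(bytes); // small values are printed as plain bytes
        }
        double value = bytes;
        int unit = 0;
        // Divide by 1024 until the value fits the current unit.
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        return String.format(Locale.ROOT, "%.1f %s", value, UNITS[unit]);
    }
}
```

Using Locale.ROOT keeps the decimal separator stable regardless of the JVM's default locale, which matters if other tools parse the shell output.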
[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes
[ https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093550#comment-14093550 ] Sanjay Radia commented on HADOOP-10919: --- Charles, the work you did for distcp needs to be also applied to har. I suspect .raw would also work. > Copy command should preserve raw.* namespace extended attributes > > > Key: HADOOP-10919 > URL: https://issues.apache.org/jira/browse/HADOOP-10919 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 3.0.0 >Reporter: Charles Lamb >Assignee: Charles Lamb > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch > > > Refer to the doc attached to HDFS-6509 for background. > Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve > extended attributes in the raw.* namespace by default whenever the src and > target are in /.reserved/raw. To not preserve raw xattrs, don't specify > /.reserved/raw in either the src or target. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries
[ https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093539#comment-14093539 ] Hudson commented on HADOOP-10835: - FAILURE: Integrated in Hadoop-trunk-Commit #6049 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6049/]) HADOOP-10835. Implement HTTP proxyuser support in HTTP authentication client/server libraries. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617384) * /hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationFilter.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/web/DelegationTokenAuthenticatedURL.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/web/DelegationTokenAuthenticationFilter.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/web/DelegationTokenAuthenticationHandler.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/web/HttpUserGroupInformation.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/token/delegation/web/TestWebDelegationToken.java > Implement HTTP proxyuser support in HTTP authentication client/server > libraries > --- > > Key: HADOOP-10835 > URL: https://issues.apache.org/jira/browse/HADOOP-10835 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: 2.4.1 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Fix For: 2.6.0 > > Attachments: HADOOP-10835.patch, HADOOP-10835.patch, > HADOOP-10835.patch, HADOOP-10835.patch > > > This is to implement generic handling of proxyuser in 
the > {{DelegationTokenAuthenticatedURL}} and > {{DelegationTokenAuthenticationFilter}} classes and to wire properly UGI on > the server side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10942) Globbing optimizations and regression fix
[ https://issues.apache.org/jira/browse/HADOOP-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093535#comment-14093535 ] Colin Patrick McCabe commented on HADOOP-10942: --- It seems like there's a bunch of things going on here: * The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard. For example, when listing {{/a*/b}}, if there is a file named /alpha, the glob will fail. This bug does *not* occur for superusers, which is why the existing tests and casual testing didn't catch it. You mention that this was a "fix" which was regressed between 0.23 and branch-2... is there a JIRA number for this already? * Optimizations: you mention "doing a simple immediate file status if the path contains no globs, etc". The existing code already does this. It was added in HADOOP-9877. Are we missing a case? I didn't understand the comment about "Additional optimizations to reduce rpcs actually increases them if directories contain 1 item." Which specific optimization(s) are increasing RPCs for you and how can we avoid this? * You added a comment that "FileContext returns a path to the home dir of the user that started the jvm instead of the ugi user so we'll just workaround it." I wasn't aware of this issue. Is there a JIRA number? This seems like an inconsistency that we should note in the test, along with a link to the JIRA that should fix it. * There's a bunch of reorganization here, perhaps almost a rewrite of the main part of the globber. Let's split these into separate JIRAs so that it's easier to review. 
> Globbing optimizations and regression fix > - > > Key: HADOOP-10942 > URL: https://issues.apache.org/jira/browse/HADOOP-10942 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 3.0.0, 2.1.0-beta >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HADOOP-10942.patch > > > When globbing was commonized to support both filesystem and filecontext, it > regressed a fix that prevents an intermediate glob that matches a file from > throwing a confusing permissions exception. The hdfs traverse check requires > the exec bit which a file does not have. > Additional optimizations to reduce rpcs actually increases them if > directories contain 1 item. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries
[ https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10835: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) committed to trunk and branch-2. > Implement HTTP proxyuser support in HTTP authentication client/server > libraries > --- > > Key: HADOOP-10835 > URL: https://issues.apache.org/jira/browse/HADOOP-10835 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: 2.4.1 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Fix For: 2.6.0 > > Attachments: HADOOP-10835.patch, HADOOP-10835.patch, > HADOOP-10835.patch, HADOOP-10835.patch > > > This is to implement generic handling of proxyuser in the > {{DelegationTokenAuthenticatedURL}} and > {{DelegationTokenAuthenticationFilter}} classes and to wire properly UGI on > the server side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries
[ https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093527#comment-14093527 ] Aaron T. Myers commented on HADOOP-10835: - +1, the latest patch looks good to me. > Implement HTTP proxyuser support in HTTP authentication client/server > libraries > --- > > Key: HADOOP-10835 > URL: https://issues.apache.org/jira/browse/HADOOP-10835 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: 2.4.1 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Fix For: 2.6.0 > > Attachments: HADOOP-10835.patch, HADOOP-10835.patch, > HADOOP-10835.patch, HADOOP-10835.patch > > > This is to implement generic handling of proxyuser in the > {{DelegationTokenAuthenticatedURL}} and > {{DelegationTokenAuthenticationFilter}} classes and to wire properly UGI on > the server side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level
[ https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Li updated HADOOP-10281: -- Attachment: (was: HADOOP-10281-preview.patch) > Create a scheduler, which assigns schedulables a priority level > --- > > Key: HADOOP-10281 > URL: https://issues.apache.org/jira/browse/HADOOP-10281 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Chris Li >Assignee: Chris Li > Attachments: HADOOP-10281.patch, HADOOP-10281.patch, > HADOOP-10281.patch, HADOOP-10281.patch > > > The Scheduler decides which sub-queue to assign a given Call. It implements a > single method getPriorityLevel(Schedulable call) which returns an integer > corresponding to the subqueue the FairCallQueue should place the call in. > The HistoryRpcScheduler is one such implementation which uses the username of > each call and determines what % of calls in recent history were made by this > user. > It is configured with a historyLength (how many calls to track) and a list of > integer thresholds which determine the boundaries between priority levels. > For instance, if the scheduler has a historyLength of 8; and priority > thresholds of 4,2,1; and saw calls made by these users in order: > Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice > * Another call by Alice would be placed in queue 3, since she has already > made >= 4 calls > * Another call by Bob would be placed in queue 2, since he has >= 2 but less > than 4 calls > * A call by Carlos would be placed in queue 0, since he has no calls in the > history > Also, some versions of this patch include the concept of a 'service user', > which is a user that is always scheduled high-priority. Currently this seems > redundant and will probably be removed in later patches, since it's not too > useful. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level
[ https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Li updated HADOOP-10281: -- Description: The Scheduler decides which sub-queue to assign a given Call. It implements a single method getPriorityLevel(Schedulable call) which returns an integer corresponding to the subqueue the FairCallQueue should place the call in. The HistoryRpcScheduler is one such implementation which uses the username of each call and determines what % of calls in recent history were made by this user. It is configured with a historyLength (how many calls to track) and a list of integer thresholds which determine the boundaries between priority levels. For instance, if the scheduler has a historyLength of 8; and priority thresholds of 4,2,1; and saw calls made by these users in order: Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice * Another call by Alice would be placed in queue 3, since she has already made >= 4 calls * Another call by Bob would be placed in queue 2, since he has >= 2 but less than 4 calls * A call by Carlos would be placed in queue 0, since he has no calls in the history Also, some versions of this patch include the concept of a 'service user', which is a user that is always scheduled high-priority. Currently this seems redundant and will probably be removed in later patches, since it's not too useful. As of now, the current scheduler is the DecayRpcScheduler, which only keeps track of the number of each type of call and decays these counts periodically.
was: The Scheduler decides which sub-queue to assign a given Call. It implements a single method getPriorityLevel(Schedulable call) which returns an integer corresponding to the subqueue the FairCallQueue should place the call in. The HistoryRpcScheduler is one such implementation which uses the username of each call and determines what % of calls in recent history were made by this user. It is configured with a historyLength (how many calls to track) and a list of integer thresholds which determine the boundaries between priority levels. For instance, if the scheduler has a historyLength of 8; and priority thresholds of 4,2,1; and saw calls made by these users in order: Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice * Another call by Alice would be placed in queue 3, since she has already made >= 4 calls * Another call by Bob would be placed in queue 2, since he has >= 2 but less than 4 calls * A call by Carlos would be placed in queue 0, since he has no calls in the history Also, some versions of this patch include the concept of a 'service user', which is a user that is always scheduled high-priority. Currently this seems redundant and will probably be removed in later patches, since it's not too useful.
> Create a scheduler, which assigns schedulables a priority level > --- > > Key: HADOOP-10281 > URL: https://issues.apache.org/jira/browse/HADOOP-10281 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Chris Li >Assignee: Chris Li > Attachments: HADOOP-10281.patch, HADOOP-10281.patch, > HADOOP-10281.patch, HADOOP-10281.patch > > > The Scheduler decides which sub-queue to assign a given Call. It implements a > single method getPriorityLevel(Schedulable call) which returns an integer > corresponding to the subqueue the FairCallQueue should place the call in. > The HistoryRpcScheduler is one such implementation which uses the username of > each call and determines what % of calls in recent history were made by this > user. > It is configured with a historyLength (how many calls to track) and a list of > integer thresholds which determine the boundaries between priority levels. > For instance, if the scheduler has a historyLength of 8; and priority > thresholds of 4,2,1; and saw calls made by these users in order: > Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice > * Another call by Alice would be placed in queue 3, since she has already > made >= 4 calls > * Another call by Bob would be placed in queue 2, since he has >= 2 but less > than 4 calls > * A call by Carlos would be placed in queue 0, since he has no calls in the > history > Also, some versions of this patch include the concept of a 'service user', > which is a user that is always scheduled high-priority. Currently this seems > redundant and will probably be removed in later patches, since it's not too > useful. > > As of now, the current scheduler is the DecayRpcScheduler, which only keeps > track of the number of each type of call and decays these counts periodically. -- This message was sent by Atlassian JIRA (v6.2#6252)
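The threshold example in the description above can be sketched roughly as follows. This is only an illustration of the counting rule, not code from any attached patch: the class name HistoryPrioritySketch, its constructor and the record method are hypothetical, and callers are identified by a plain user-name string rather than a Schedulable.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: priority level derived from a sliding window of recent callers. */
public class HistoryPrioritySketch {
  private final int historyLength;
  private final int[] thresholds;
  private final Deque<String> history = new ArrayDeque<>();
  private final Map<String, Integer> counts = new HashMap<>();

  public HistoryPrioritySketch(int historyLength, int... thresholds) {
    this.historyLength = historyLength;
    this.thresholds = thresholds.clone();
  }

  /** Remembers one call and forgets the oldest call once the window is full. */
  public void record(String user) {
    history.addLast(user);
    counts.merge(user, 1, Integer::sum);
    if (history.size() > historyLength) {
      String evicted = history.removeFirst();
      // Returning null from the remapping function removes the entry.
      counts.computeIfPresent(evicted, (u, n) -> n == 1 ? null : n - 1);
    }
  }

  /** Queue index = number of thresholds the user's recent call count has reached. */
  public int getPriorityLevel(String user) {
    int calls = counts.getOrDefault(user, 0);
    int level = 0;
    for (int t : thresholds) {
      if (calls >= t) {
        level++;
      }
    }
    return level;
  }
}
```

With a historyLength of 8, thresholds 4,2,1 and the call order from the description, Alice has five calls in the window and Bob has two, so the sketch places them in queues 3 and 2, while an unseen user such as Carlos lands in queue 0.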
[jira] [Updated] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level
[ https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Li updated HADOOP-10281: -- Attachment: HADOOP-10281.patch
[jira] [Updated] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level
[ https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Li updated HADOOP-10281: -- Status: Patch Available (was: Open)
[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized
[ https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093516#comment-14093516 ] Hadoop QA commented on HADOOP-10820: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12661083/HADOOP-10820-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4449//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4449//console This message is automatically generated. > Empty entry in libjars results in working directory being recursively > localized > --- > > Key: HADOOP-10820 > URL: https://issues.apache.org/jira/browse/HADOOP-10820 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Alex Holmes >Priority: Minor > Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, > HADOOP-10820.patch > > > An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the > current working directory to be recursively localized. 
> Here's an example of this in action (using Hadoop 2.2.0): > {code} > # create a temp directory and touch three JAR files > mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar > # Run an example job only specifying two of the JARs. > # Include an empty entry in libjars. > hadoop jar > /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar > pi -libjars a.jar,,c.jar 2 10 > # As the job is running examine the localized directory in HDFS. > # Notice that not only are the two JAR's specified in libjars copied, > # but in addition the contents of the working directory are also recursively > copied. > $ hadoop fs -lsr > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path > /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
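One way to guard against the empty token described above is to validate the comma-separated value before any entry is resolved to a path. The sketch below is an assumption-laden illustration, not the attached patch: the class LibJarsSketch and the method parseLibJars are hypothetical, and rejecting blank entries (instead of silently skipping them) is a design choice made here for clarity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: validate a -libjars value before resolving any paths. */
public class LibJarsSketch {
  /** Splits a comma-separated value and rejects empty or blank entries. */
  public static List<String> parseLibJars(String value) {
    List<String> jars = new ArrayList<>();
    // The -1 limit keeps trailing empty tokens, so "a.jar," is rejected too.
    for (String token : value.split(",", -1)) {
      if (token.trim().isEmpty()) {
        throw new IllegalArgumentException(
            "Empty entry in libjars value: \"" + value + "\"");
      }
      jars.add(token);
    }
    return jars;
  }
}
```

Failing fast keeps a typo such as "a.jar,,c.jar" from being interpreted as a reference to the current working directory.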
[jira] [Commented] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries
[ https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093482#comment-14093482 ] Alejandro Abdelnur commented on HADOOP-10835: - failure is unrelated. > Implement HTTP proxyuser support in HTTP authentication client/server > libraries > --- > > Key: HADOOP-10835 > URL: https://issues.apache.org/jira/browse/HADOOP-10835 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: 2.4.1 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Fix For: 2.6.0 > > Attachments: HADOOP-10835.patch, HADOOP-10835.patch, > HADOOP-10835.patch, HADOOP-10835.patch > > > This is to implement generic handling of proxyuser in the > {{DelegationTokenAuthenticatedURL}} and > {{DelegationTokenAuthenticationFilter}} classes and to wire properly UGI on > the server side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized
[ https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093480#comment-14093480 ] zhihai xu commented on HADOOP-10820: I attached a new patch, HADOOP-10820-2.patch, based on [~andrew.wang]'s comments. I also found that a filename containing a space character is not permitted by the URI parser, so my previous concern is already handled in the original code. I therefore added a test case, "a, ,b", which triggers a URISyntaxException at line 394 in GenericOptionsParser.java.
[jira] [Commented] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries
[ https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093479#comment-14093479 ] Hadoop QA commented on HADOOP-10835: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12661074/HADOOP-10835.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-auth hadoop-common-project/hadoop-common: org.apache.hadoop.ha.TestZKFailoverControllerStress {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4448//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4448//console This message is automatically generated. 
[jira] [Updated] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized
[ https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated HADOOP-10820: --- Attachment: HADOOP-10820-2.patch
[jira] [Commented] (HADOOP-10955) FSShell's get operation should have the ability to take a "length" argument
[ https://issues.apache.org/jira/browse/HADOOP-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093471#comment-14093471 ] Colin Patrick McCabe commented on HADOOP-10955: --- also, you might consider adding a start argument as well as a length, while you're at it > FSShell's get operation should have the ability to take a "length" argument > --- > > Key: HADOOP-10955 > URL: https://issues.apache.org/jira/browse/HADOOP-10955 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Guo Ruijing > > Use Case: if HDFS file is corrupted, some tool can be used to copy out good > part of corrupted file. We may enhance "hdfs -get" to copy out good part. > Existing: > hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] ... > Proposal: > hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-length] ... > -- This message was sent by Atlassian JIRA (v6.2#6252)
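The proposed "length" option, together with the suggested "start" argument, amounts to copying a byte range out of a stream. The following is a minimal sketch using only java.io; the class RangeCopySketch and the method copyRange are hypothetical and say nothing about how FSShell would actually implement or name the option.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch: copy a byte range, as a start/length option on -get might. */
public class RangeCopySketch {
  /** Copies up to length bytes beginning at offset start; returns the bytes copied. */
  public static long copyRange(InputStream in, OutputStream out, long start, long length)
      throws IOException {
    long toSkip = start;
    while (toSkip > 0) {
      long skipped = in.skip(toSkip);
      if (skipped <= 0) {
        // skip() may return 0 without being at EOF, so probe with a single read.
        if (in.read() < 0) {
          return 0; // the stream ended before the requested start offset
        }
        skipped = 1;
      }
      toSkip -= skipped;
    }
    byte[] buf = new byte[8192];
    long copied = 0;
    while (copied < length) {
      int n = in.read(buf, 0, (int) Math.min(buf.length, length - copied));
      if (n < 0) {
        break; // fewer than length bytes were available
      }
      out.write(buf, 0, n);
      copied += n;
    }
    return copied;
  }
}
```

Returning the number of bytes actually copied lets a caller detect that the readable part of the file was shorter than the requested length.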
[jira] [Commented] (HADOOP-10955) FSShell's get operation should have the ability to take a "length" argument
[ https://issues.apache.org/jira/browse/HADOOP-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093470#comment-14093470 ] Colin Patrick McCabe commented on HADOOP-10955: --- Moving from HDFS to HADOOP, since FSShell is part of the common code.
[jira] [Moved] (HADOOP-10955) FSShell's get operation should have the ability to take a "length" argument
[ https://issues.apache.org/jira/browse/HADOOP-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe moved HDFS-6818 to HADOOP-10955: - Key: HADOOP-10955 (was: HDFS-6818) Project: Hadoop Common (was: Hadoop HDFS)
[jira] [Updated] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries
[ https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10835: Attachment: HADOOP-10835.patch thanks @atm, new patch addressing your comments.
[jira] [Commented] (HADOOP-10940) RPC client does no bounds checking of responses
[ https://issues.apache.org/jira/browse/HADOOP-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093330#comment-14093330 ] Colin Patrick McCabe commented on HADOOP-10940: --- nit: maxDataLength should be final, since it can't change {code} \@InterfaceAudience.Private // ONLY exposed for SaslRpcClient public static class IpcStreams implements Closeable { {code} Is this comment still valid? It looks like even non-SASL clients are now using {{IpcStreams}}. {code} // don't flush! we need to avoid broken pipes if server closes or // rejects the connection. the perils of multiple sends before a read // insecure: header+context+call, flush // secure : header+negotiate, flush, (sasl), context+call, flush {code} Hmm. I wonder if we could rephrase this to be clearer. Maybe something like "At this point, the data is buffered by the output stream. We do not want to flush yet, since that would generate unnecessary context switches. Another advantage of deferring the TCP write operation is that we do not get a "broken pipe" exception if the server closes or rejects the connection at this point." {code} // again, don't flush! see writeConnectionHeader {code} Do we need this comment here? There wasn't a flush here earlier. {code} public void sendRequest(RpcRequestHeaderProto header, Message request, boolean flush) throws IOException { try { header.writeDelimitedTo(dob); request.writeDelimitedTo(dob); sendRequest(dob, flush); } finally { dob.reset(); } } public void sendRequest(DataOutputBuffer buffer, boolean flush) throws IOException { out.writeInt(buffer.size()); // total Length buffer.writeTo(out); // request header + payload if (flush) { out.flush(); } } {code} Rather than having a boolean argument, why not just have the callers who want to flush call {{ioStreams.out.flush()}}? There seems to be no advantage to folding it into {{sendRequest}}, and it means that we need a comment to explain the value of the boolean everywhere. 
{code} public boolean useWrap() { {code} add VisibleForTesting? > RPC client does no bounds checking of responses > --- > > Key: HADOOP-10940 > URL: https://issues.apache.org/jira/browse/HADOOP-10940 > Project: Hadoop Common > Issue Type: Bug > Components: ipc >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HADOOP-10940.patch, HADOOP-10940.patch > > > The rpc client does no bounds checking of server responses. In the case of > communicating with an older and incompatible RPC, this may lead to OOM issues > and leaking of resources. -- This message was sent by Atlassian JIRA (v6.2#6252)
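The bounds check this issue asks for can be illustrated with a length-prefixed frame reader, matching the "total length, then header + payload" framing visible in the quoted sendRequest code. This is a sketch under assumptions, not the patch under review: the class BoundedResponseReader and the method readResponse are hypothetical; only the field name maxDataLength, and the suggestion to make it final, come from the review above.

```java
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical sketch: refuse to allocate a buffer for an implausible response length. */
public class BoundedResponseReader {
  // final, as the review suggests, since the limit cannot change after construction
  private final int maxDataLength;

  public BoundedResponseReader(int maxDataLength) {
    this.maxDataLength = maxDataLength;
  }

  /** Reads one length-prefixed frame, rejecting lengths outside [0, maxDataLength]. */
  public byte[] readResponse(DataInputStream in) throws IOException {
    int length = in.readInt();
    if (length < 0 || length > maxDataLength) {
      throw new IOException("RPC response has invalid length " + length
          + " (limit " + maxDataLength + ")");
    }
    byte[] data = new byte[length];
    in.readFully(data);
    return data;
  }
}
```

Checking the length before allocating is what prevents the OOM scenario: bytes from an incompatible server, misread as a length prefix, fail the check instead of triggering a huge allocation.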
[jira] [Commented] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level
[ https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093331#comment-14093331 ] Arpit Agarwal commented on HADOOP-10281: It is probably not worth running the test again just for that. > Create a scheduler, which assigns schedulables a priority level > --- > > Key: HADOOP-10281 > URL: https://issues.apache.org/jira/browse/HADOOP-10281 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Chris Li >Assignee: Chris Li > Attachments: HADOOP-10281-preview.patch, HADOOP-10281.patch, > HADOOP-10281.patch, HADOOP-10281.patch > > > The Scheduler decides which sub-queue to assign a given Call. It implements a > single method getPriorityLevel(Schedulable call) which returns an integer > corresponding to the subqueue the FairCallQueue should place the call in. > The HistoryRpcScheduler is one such implementation which uses the username of > each call and determines what % of calls in recent history were made by this > user. > It is configured with a historyLength (how many calls to track) and a list of > integer thresholds which determine the boundaries between priority levels. > For instance, if the scheduler has a historyLength of 8; and priority > thresholds of 4,2,1; and saw calls made by these users in order: > Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice > * Another call by Alice would be placed in queue 3, since she has already > made >= 4 calls > * Another call by Bob would be placed in queue 2, since he has >= 2 but less > than 4 calls > * A call by Carlos would be placed in queue 0, since he has no calls in the > history > Also, some versions of this patch include the concept of a 'service user', > which is a user that is always scheduled high-priority. Currently this seems > redundant and will probably be removed in later patches, since its not too > useful. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries
[ https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093321#comment-14093321 ] Aaron T. Myers commented on HADOOP-10835: - Patch looks pretty good to me, Tucu. A few small comments/nits. +1 once these are addressed. Comments: # Is it definitely correct that in {{DelegationTokenAuthenticationFilter#getProxyuserConfiguration}} we create a {{Configuration}} object without loading the defaults? That surprised me a bit, but maybe it's reasonable. Perhaps add a comment explaining why we're doing that here? # Recommend using some constants for the many repeated strings in the tests, e.g. "ok-user" is repeated many times. Nits: # This change seems unnecessary and unhelpful: {code} - * Sets an external DelegationTokenSecretManager instance to + * Sets an external DelegationTokenSecretManager instance to {code} # Should have a comma here, instead of a period: {code} + * Returns the remote {@link UserGroupInformation} in context for the current + * HTTP request. taking into account proxy user requests. {code} # One too many "using": {code} + // requests using using delegation token as auth do not honor doAs {code} > Implement HTTP proxyuser support in HTTP authentication client/server > libraries > --- > > Key: HADOOP-10835 > URL: https://issues.apache.org/jira/browse/HADOOP-10835 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: 2.4.1 >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Fix For: 2.6.0 > > Attachments: HADOOP-10835.patch, HADOOP-10835.patch, > HADOOP-10835.patch > > > This is to implement generic handling of proxyuser in the > {{DelegationTokenAuthenticatedURL}} and > {{DelegationTokenAuthenticationFilter}} classes and to wire properly UGI on > the server side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10946) Fix a bunch of typos in log messages
[ https://issues.apache.org/jira/browse/HADOOP-10946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093315#comment-14093315 ] Ray Chiang commented on HADOOP-10946: - Good to know. Thanks. I just thought it odd that on Jenkins the only two non-PASS/non-FAILs in the recent job list were the two runs I listed above. > Fix a bunch of typos in log messages > > > Key: HADOOP-10946 > URL: https://issues.apache.org/jira/browse/HADOOP-10946 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.4.1 >Reporter: Ray Chiang >Priority: Trivial > Labels: newbie > Attachments: HADOOP10946-01.patch, HADOOP10946-02.patch > > > There are a bunch of typos in various log messages. These need cleaning up. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10946) Fix a bunch of typos in log messages
[ https://issues.apache.org/jira/browse/HADOOP-10946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093307#comment-14093307 ] Allen Wittenauer commented on HADOOP-10946: --- FYI, HDFS-6694 and HDFS-4663. So those TestPipeline failures are almost certainly not real failures.
[jira] [Commented] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level
[ https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093300#comment-14093300 ] Chris Li commented on HADOOP-10281: --- I didn't record it, but I can if you're interested. I suspect it'll be slightly worse than the minority user's latency in the LinkedBlockingQueue (since the resources have to come from somewhere). > Create a scheduler, which assigns schedulables a priority level > --- > > Key: HADOOP-10281 > URL: https://issues.apache.org/jira/browse/HADOOP-10281 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Chris Li >Assignee: Chris Li > Attachments: HADOOP-10281-preview.patch, HADOOP-10281.patch, > HADOOP-10281.patch, HADOOP-10281.patch > > > The Scheduler decides which sub-queue to assign a given Call. It implements a > single method getPriorityLevel(Schedulable call) which returns an integer > corresponding to the subqueue the FairCallQueue should place the call in. > The HistoryRpcScheduler is one such implementation which uses the username of > each call and determines what % of calls in recent history were made by this > user. > It is configured with a historyLength (how many calls to track) and a list of > integer thresholds which determine the boundaries between priority levels. > For instance, if the scheduler has a historyLength of 8; and priority > thresholds of 4,2,1; and saw calls made by these users in order: > Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice > * Another call by Alice would be placed in queue 3, since she has already > made >= 4 calls > * Another call by Bob would be placed in queue 2, since he has >= 2 but less > than 4 calls > * A call by Carlos would be placed in queue 0, since he has no calls in the > history > Also, some versions of this patch include the concept of a 'service user', > which is a user that is always scheduled high-priority. 
Currently this seems > redundant and will probably be removed in later patches, since it's not too > useful. -- This message was sent by Atlassian JIRA (v6.2#6252)
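The threshold example in the HADOOP-10281 description (historyLength of 8, thresholds 4,2,1) can be illustrated with a minimal sketch. This is not the code from the attached patches: the class name HistoryPrioritySketch, the record method, and the sliding-window bookkeeping are assumptions made for this illustration. Only the getPriorityLevel name and the example numbers come from the description.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only; a hypothetical stand-in, not the HistoryRpcScheduler from the patch. */
class HistoryPrioritySketch {
    private final int historyLength;          // how many recent calls to track
    private final int[] thresholds;           // descending, e.g. {4, 2, 1}
    private final Deque<String> history = new ArrayDeque<>();
    private final Map<String, Integer> counts = new HashMap<>();

    HistoryPrioritySketch(int historyLength, int[] thresholds) {
        if (historyLength <= 0) {
            throw new IllegalArgumentException("historyLength must be positive");
        }
        this.historyLength = historyLength;
        this.thresholds = thresholds.clone();
    }

    /** Records one call, evicting the oldest call once the window is full. */
    void record(String user) {
        if (history.size() == historyLength) {
            String oldest = history.removeFirst();
            counts.merge(oldest, -1, Integer::sum);
        }
        history.addLast(user);
        counts.merge(user, 1, Integer::sum);
    }

    /** Returns the queue index for the next call by this user, without recording it. */
    int getPriorityLevel(String user) {
        int seen = counts.getOrDefault(user, 0);
        for (int i = 0; i < thresholds.length; i++) {
            if (seen >= thresholds[i]) {
                // The first (largest) threshold met maps to the highest queue index.
                return thresholds.length - i;
            }
        }
        return 0; // below every threshold, e.g. a user with no calls in the history
    }
}
```

Feeding the sketch the call sequence from the description (Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice) reproduces the queue numbers given there for Alice, Bob, and Carlos.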
[jira] [Commented] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level
[ https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093288#comment-14093288 ] Arpit Agarwal commented on HADOOP-10281: Thanks Chris. Just curious if you measured the impact to the majority user latency.
[jira] [Commented] (HADOOP-10942) Globbing optimizations and regression fix
[ https://issues.apache.org/jira/browse/HADOOP-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093278#comment-14093278 ] Colin Patrick McCabe commented on HADOOP-10942: --- bq. Colin Patrick McCabe, could you take a look since you made most of the changes after my 0.23 overhaul? Will take a look. I was on vacation last week, which is why I haven't responded until now. > Globbing optimizations and regression fix > - > > Key: HADOOP-10942 > URL: https://issues.apache.org/jira/browse/HADOOP-10942 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 3.0.0, 2.1.0-beta >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HADOOP-10942.patch > > > When globbing was commonized to support both filesystem and filecontext, it > regressed a fix that prevents an intermediate glob that matches a file from > throwing a confusing permissions exception. The hdfs traverse check requires > the exec bit which a file does not have. > Additional optimizations to reduce rpcs actually increase them if > directories contain 1 item. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9902) Shell script rewrite
[ https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093274#comment-14093274 ] Aaron T. Myers commented on HADOOP-9902: I agree with [~aw] on the trunk/branch-2 question. We quite clearly can't commit this patch to branch-2 because of the compat issues, at least not without some fairly substantial scaling back of this change. Based on some recent discussions on some of the lists, seems like the motivation for a release off of trunk (i.e. 3.x) is building. This change being only on trunk would add to the motivation to make a release from that branch. > Shell script rewrite > > > Key: HADOOP-9902 > URL: https://issues.apache.org/jira/browse/HADOOP-9902 > Project: Hadoop Common > Issue Type: Improvement > Components: scripts >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Allen Wittenauer > Labels: releasenotes > Attachments: HADOOP-9902-10.patch, HADOOP-9902-11.patch, > HADOOP-9902-12.patch, HADOOP-9902-13-branch-2.patch, HADOOP-9902-13.patch, > HADOOP-9902-14.patch, HADOOP-9902-2.patch, HADOOP-9902-3.patch, > HADOOP-9902-4.patch, HADOOP-9902-5.patch, HADOOP-9902-6.patch, > HADOOP-9902-7.patch, HADOOP-9902-8.patch, HADOOP-9902-9.patch, > HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt > > > Umbrella JIRA for shell script rewrite. See more-info.txt for more details. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9902) Shell script rewrite
[ https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093253#comment-14093253 ] Alejandro Abdelnur commented on HADOOP-9902: bq. If trunk is getting 'stale', then that sounds like an issue for the PMC to take up. I'm being proactive on this one. I'm trying to avoid getting into that situation. I'd love to get this in, just in a way it is exercised and refined ASAP. Else, a year from now or more we'll be battling with it. What are the key issues to be addressed for getting this in branch-2 and how can we take care of it?
[jira] [Commented] (HADOOP-9902) Shell script rewrite
[ https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093244#comment-14093244 ] Allen Wittenauer commented on HADOOP-9902: -- bq. I was under the impression we were targeting this for branch-2? is not the case? It hasn't been my intention to commit this to branch-2 for a very long time. Others have expressed interest in a back port, though. Of course, while this patch definitely moves the needle the largest, there are still lots of smaller projects that need to be finished (see the blocked by list) for a comprehensive fix. bq. we are at risk of getting things stale in trunk as people add changes in branch-2 only. There are already changes in trunk that aren't in branch-2. This would just be another one (albeit probably the biggest one). If trunk is getting 'stale', then that sounds like an issue for the PMC to take up. It doesn't really have much bearing on this patch, IMO.
[jira] [Commented] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level
[ https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093229#comment-14093229 ] Chris Li commented on HADOOP-10281: --- Hi [~arpitagarwal] that's correct. It's not very scientific, but it's a sanity check to make sure that the scheduler performs under various loads. The workloads are mapreduce jobs that coordinate to perform a ddos attack on the namenode. Each job runs under 10 users, each job maps to 20 nodes, and spams the namenode using a varying number of threads.
Rest: No load
Equal: 100 threads each
Balanced: 10, 20, 30..., 80, 90, 100 threads respectively
Majority: 100, then 1-2 for the rest
I think this is ready. I will post a patch shortly for CI
[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized
[ https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093158#comment-14093158 ] zhihai xu commented on HADOOP-10820: Hi [~alex.holmes], It looks like you don't have time to work on this issue. If you don't mind, I will create a patch based on your patch to address the comment from [~andrew.wang]. Thanks, zhihai
> Empty entry in libjars results in working directory being recursively localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Alex Holmes
> Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820.patch
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JAR's specified in libjars copied,
> # but in addition the contents of the working directory are also recursively copied.
> $ hadoop fs -lsr /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized
[ https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093139#comment-14093139 ] zhihai xu commented on HADOOP-10820: A filename consisting only of space characters is valid, so my suggestion is not good. We shouldn't trim spaces when checking for an empty string.
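The point made in the HADOOP-10820 comments (skip empty -libjars entries, but do not trim whitespace, because a name made only of spaces is a valid filename) can be sketched as follows. This is a hypothetical illustration, not the attached patch: the class LibJarsSketch and the method nonEmptyEntries are names invented here, and the sketch does not touch any Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; names are assumptions, not taken from the HADOOP-10820 patch. */
class LibJarsSketch {
    /** Splits a comma-separated list and drops zero-length entries only. */
    static List<String> nonEmptyEntries(String commaSeparated) {
        List<String> entries = new ArrayList<>();
        if (commaSeparated == null) {
            return entries;
        }
        // A limit of -1 keeps trailing empty strings, so every token is inspected.
        for (String token : commaSeparated.split(",", -1)) {
            // Deliberately no trim(): " " is kept because it can be a real filename.
            if (!token.isEmpty()) {
                entries.add(token);
            }
        }
        return entries;
    }
}
```

With this filter, the input "a.jar,,c.jar" from the issue yields only a.jar and c.jar, while an entry consisting of a single space is passed through unchanged.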
[jira] [Commented] (HADOOP-9902) Shell script rewrite
[ https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093115#comment-14093115 ] Alejandro Abdelnur commented on HADOOP-9902: [~aw], I was under the impression we were targeting this for branch-2? is not the case? If we don't do that, given that we don't have imminent plans to create a branch-3 out of trunk, we are at risk of getting things stale in trunk as people add changes in branch-2 only.
[jira] [Commented] (HADOOP-9902) Shell script rewrite
[ https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093075#comment-14093075 ] Allen Wittenauer commented on HADOOP-9902: -- Test failures are obviously unrelated. Patch -14 deals with the issues that [~rvs] discovered.
[jira] [Commented] (HADOOP-10402) Configuration.getValByRegex does not substitute for variables
[ https://issues.apache.org/jira/browse/HADOOP-10402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092844#comment-14092844 ] Hudson commented on HADOOP-10402: - SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1860 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1860/]) HADOOP-10402. Configuration.getValByRegex does not substitute for variables. (Robert Kanter via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617166) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java > Configuration.getValByRegex does not substitute for variables > - > > Key: HADOOP-10402 > URL: https://issues.apache.org/jira/browse/HADOOP-10402 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.3.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 2.6.0 > > Attachments: HADOOP-10402.patch > > > When using Configuration.getValByRegex(...), variables are not resolved. > For example: > {code:xml} > <property> > <name>bar</name> > <value>woot</value> > </property> > <property> > <name>foo3</name> > <value>${bar}</value> > </property> > {code} > If you then try to do something like {{Configuration.getValByRegex(foo.*)}}, > it will return a Map containing "foo3=$\{bar}" instead of "foo3=woot" -- This message was sent by Atlassian JIRA (v6.2#6252)
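The behaviour HADOOP-10402 asks for (a regex lookup that returns "foo3=woot" rather than the raw "${bar}" reference) can be illustrated with a small self-contained sketch over a plain Map. This is not Hadoop's Configuration code and not the committed fix: the class RegexLookupSketch, the methods substitute and valuesMatching, the single-pass substitution, and the use of Matcher.find() on the key are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only; a hypothetical stand-in, not org.apache.hadoop.conf.Configuration. */
class RegexLookupSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} with props.get(name) in one pass; unknown names are left as-is. */
    static String substitute(Map<String, String> props, String value) {
        Matcher m = VAR.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String resolved = props.get(m.group(1));
            String replacement = (resolved != null) ? resolved : m.group();
            // quoteReplacement keeps '$' and '\' in values from being treated as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Returns every entry whose key matches the regex, with variables in the value resolved. */
    static Map<String, String> valuesMatching(Map<String, String> props, String regex) {
        Pattern p = Pattern.compile(regex);
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (p.matcher(e.getKey()).find()) {
                result.put(e.getKey(), substitute(props, e.getValue()));
            }
        }
        return result;
    }
}
```

Applied to the two properties from the issue (bar=woot, foo3=${bar}) with the pattern foo.*, the sketch returns a map whose only entry is foo3=woot.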
[jira] [Created] (HADOOP-10954) Adding site documents of hadoop-tools
Masatake Iwasaki created HADOOP-10954: - Summary: Adding site documents of hadoop-tools Key: HADOOP-10954 URL: https://issues.apache.org/jira/browse/HADOOP-10954 Project: Hadoop Common Issue Type: Improvement Components: documentation Affects Versions: 2.4.1 Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor There are no pages for hadoop-tools in the site documents of branch-2 or later. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10954) Adding site documents of hadoop-tools
[ https://issues.apache.org/jira/browse/HADOOP-10954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092774#comment-14092774 ] Masatake Iwasaki commented on HADOOP-10954: --- In the site documents of branch-1, there are pages such as http://hadoop.apache.org/docs/current1/hadoop_archives.html or http://hadoop.apache.org/docs/current1/gridmix.html . Those could be migrated from forrest to maven-site format.
[jira] [Commented] (HADOOP-10402) Configuration.getValByRegex does not substitute for variables
[ https://issues.apache.org/jira/browse/HADOOP-10402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092701#comment-14092701 ] Hudson commented on HADOOP-10402: - ABORTED: Integrated in Hadoop-Hdfs-trunk #1834 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1834/]) HADOOP-10402. Configuration.getValByRegex does not substitute for variables. (Robert Kanter via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617166) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
[jira] [Commented] (HADOOP-10953) a minor concurrent bug inside NetworkTopology
[ https://issues.apache.org/jira/browse/HADOOP-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092651#comment-14092651 ] Hadoop QA commented on HADOOP-10953: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12660965/HADOOP-10953.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4447//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4447//console This message is automatically generated. > a minor concurrent bug inside NetworkTopology > - > > Key: HADOOP-10953 > URL: https://issues.apache.org/jira/browse/HADOOP-10953 > Project: Hadoop Common > Issue Type: Bug > Components: net >Affects Versions: 3.0.0 >Reporter: Liang Xie >Assignee: Liang Xie >Priority: Minor > Attachments: HADOOP-10953.txt > > > Found this issue while reading the related code. 
In the NetworkTopology.toString() method, there is no direct thread-safety guarantee. It is called by add/remove, and inside add/remove most of the this.toString() calls are protected by the rwlock, except in a couple of error-handling paths. One possible fix is to move those calls inside the lock as well; since they are not heavy operations, no obvious degradation should be observed, to my current knowledge. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10953) a minor concurrent bug inside NetworkTopology
[ https://issues.apache.org/jira/browse/HADOOP-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HADOOP-10953: --- Status: Patch Available (was: Open)
[jira] [Updated] (HADOOP-10953) a minor concurrent bug inside NetworkTopology
[ https://issues.apache.org/jira/browse/HADOOP-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HADOOP-10953: --- Attachment: HADOOP-10953.txt
[jira] [Created] (HADOOP-10953) a minor concurrent bug inside NetworkTopology
Liang Xie created HADOOP-10953: -- Summary: a minor concurrent bug inside NetworkTopology Key: HADOOP-10953 URL: https://issues.apache.org/jira/browse/HADOOP-10953 Project: Hadoop Common Issue Type: Bug Components: net Affects Versions: 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Found this issue while reading the related code. In the NetworkTopology.toString() method, there is no direct thread-safety guarantee. It is called by add/remove, and inside add/remove most of the this.toString() calls are protected by the rwlock, except in a couple of error-handling paths. One possible fix is to move those calls inside the lock as well; since they are not heavy operations, no obvious degradation should be observed, to my current knowledge. -- This message was sent by Atlassian JIRA (v6.2#6252)
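The fix proposed in HADOOP-10953 (build the error message, which calls this.toString(), while the lock is still held) can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the NetworkTopology code or the attached patch: TopologySketch, its list of node names, and the duplicate-node error condition are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative only; a hypothetical stand-in, not org.apache.hadoop.net.NetworkTopology. */
class TopologySketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> nodes = new ArrayList<>();

    void add(String node) {
        lock.writeLock().lock();
        try {
            if (node == null || nodes.contains(node)) {
                // Error path: the message (and the toString() call it implies) is
                // built before the finally block releases the write lock.
                throw new IllegalArgumentException("Cannot add " + node + " to " + this);
            }
            nodes.add(node);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Not synchronized itself; in this sketch callers are expected to hold the lock. */
    @Override
    public String toString() {
        return "nodes=" + nodes;
    }
}
```

The design point is only about placement: because the string concatenation happens inside the try block, no other writer can modify the node list while the error message is being assembled.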
[jira] [Commented] (HADOOP-10402) Configuration.getValByRegex does not substitute for variables
[ https://issues.apache.org/jira/browse/HADOOP-10402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092603#comment-14092603 ] Hudson commented on HADOOP-10402: - FAILURE: Integrated in Hadoop-Yarn-trunk #641 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/641/]) HADOOP-10402. Configuration.getValByRegex does not substitute for variables. (Robert Kanter via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617166) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
[jira] [Commented] (HADOOP-8989) hadoop dfs -find feature
[ https://issues.apache.org/jira/browse/HADOOP-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092562#comment-14092562 ] Akira AJISAKA commented on HADOOP-8989: --- Thanks [~jonallen] for the update. +1 (non-binding). Sorry for the late response. > hadoop dfs -find feature > > > Key: HADOOP-8989 > URL: https://issues.apache.org/jira/browse/HADOOP-8989 > Project: Hadoop Common > Issue Type: New Feature >Reporter: Marco Nicosia >Assignee: Jonathan Allen > Attachments: HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, > HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, > HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, > HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, > HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, > HADOOP-8989.patch, HADOOP-8989.patch > > > Both sysadmins and users make frequent use of the unix 'find' command, but > Hadoop has no correlate. Without this, users are writing scripts which make > heavy use of hadoop dfs -lsr, and implementing find one-offs. I think hdfs > -lsr is somewhat taxing on the NameNode, and a really slow experience on the > client side. Possibly an in-NameNode find operation would be only a bit more > taxing on the NameNode, but significantly faster from the client's point of > view? > The minimum set of options I can think of which would make a Hadoop find > command generally useful is (in priority order): > * -type (file or directory, for now) > * -atime/-ctime/-mtime (... and -creationtime?) (both + and - arguments) > * -print0 (for piping to xargs -0) > * -depth > * -owner/-group (and -nouser/-nogroup) > * -name (allowing for shell pattern, or even regex?) > * -perm > * -size > One possible special case, but could possibly be really cool if it ran from > within the NameNode: > * -delete > The "hadoop dfs -lsr | hadoop dfs -rm" cycle is really, really slow. 
> Lower priority, some people do use operators, mostly to execute -or searches
> such as:
> * find / \( -nouser -or -nogroup \)
> Finally, I thought I'd include a link to the [Posix spec for
> find|http://www.opengroup.org/onlinepubs/009695399/utilities/find.html]

--
This message was sent by Atlassian JIRA
(v6.2#6252)
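To make the requested options and the -or operator concrete, here is a minimal client-side sketch of how find-style tests could compose as predicates. It is an illustration only, not the HADOOP-8989 patch and not an in-NameNode implementation: the FindSketch class, its Entry type, the helper names (type, name, owner, find) and the sample listing are all hypothetical, and the entries are plain in-memory values rather than Hadoop file statuses.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class FindSketch {
    /** Hypothetical stand-in for one listing entry; not a Hadoop class. */
    static final class Entry {
        final String path;
        final boolean isDirectory;
        final String owner;

        Entry(String path, boolean isDirectory, String owner) {
            this.path = path;
            this.isDirectory = isDirectory;
            this.owner = owner;
        }
    }

    /** -type: 'd' matches directories, any other value matches files. */
    static Predicate<Entry> type(char t) {
        return e -> (t == 'd') == e.isDirectory;
    }

    /** -name: shell-style glob applied to the last path component. */
    static Predicate<Entry> name(String glob) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return e -> {
            Path fileName = Paths.get(e.path).getFileName();
            return fileName != null && matcher.matches(fileName);
        };
    }

    /** -owner: exact match on the owner name. */
    static Predicate<Entry> owner(String user) {
        return e -> e.owner.equals(user);
    }

    /** Returns the paths of all entries accepted by the expression, in listing order. */
    static List<String> find(List<Entry> listing, Predicate<Entry> expr) {
        List<String> hits = new ArrayList<>();
        for (Entry e : listing) {
            if (expr.test(e)) {
                hits.add(e.path);
            }
        }
        return hits;
    }

    /** Made-up sample data for the demonstration. */
    static List<Entry> sampleListing() {
        return Arrays.asList(
            new Entry("/logs", true, "alice"),
            new Entry("/logs/app.log", false, "alice"),
            new Entry("/data/part-0", false, "bob"),
            new Entry("/data/readme.txt", false, "alice"));
    }

    public static void main(String[] args) {
        // Roughly: find / -type f \( -name '*.log' -or -owner bob \)
        Predicate<Entry> expr = type('f').and(name("*.log").or(owner("bob")));
        System.out.println(find(sampleListing(), expr));
        // prints [/logs/app.log, /data/part-0]
    }
}
```

Modelling each test as a Predicate keeps implicit -and and explicit -or as ordinary composition, so further tests such as -size or -perm would be additional factory methods rather than changes to the traversal.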
[jira] [Commented] (HADOOP-10885) Fix dead links to the javadocs of o.a.h.security.authorize
[ https://issues.apache.org/jira/browse/HADOOP-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092526#comment-14092526 ]

Ray Chiang commented on HADOOP-10885:
-------------------------------------

Just observing. Peeking through the org.apache.hadoop.security.authorize files:

AccessControlList.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
AuthorizationException.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
DefaultImpersonationProvider.java:@InterfaceAudience.Public
ImpersonationProvider.java:@InterfaceAudience.Public
PolicyProvider.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
ProxyUsers.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce", "HBase", "Hive"})
RefreshAuthorizationPolicyProtocol.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
Service.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
ServiceAuthorizationManager.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
package-info.java:@InterfaceAudience.LimitedPrivate({"HBase", "HDFS", "MapReduce"})

It also looks like package-info.java will override the "Hive" access part of ProxyUsers.java.

Is the right fix:

a) Update DefaultImpersonationProvider/ImpersonationProvider to be LimitedPrivate and fix the fields for ProxyUsers.java. If so, what setting(s)?
b) Fix package-info.java to be @InterfaceAudience.Public.

I'd guess a), but it would be good to check with someone who actually has the right answer.
> Fix dead links to the javadocs of o.a.h.security.authorize
> ----------------------------------------------------------
>
>                 Key: HADOOP-10885
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10885
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: documentation
>    Affects Versions: 2.6.0
>            Reporter: Akira AJISAKA
>            Priority: Minor
>              Labels: newbie
>
> In API doc ([my trunk
> build|http://aajisaka.github.io/hadoop-project/api/index.html]),
> {{ImpersonationProvider}} and {{DefaultImpersonationProvider}} classes are
> linked but these documents are not generated.
> There's an inconsistency about {{@InterfaceAudience}} between package-info
> and these classes, so these dead links are generated.