[jira] [Commented] (HADOOP-10959) A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework

2014-08-11 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093765#comment-14093765
 ] 

Kai Zheng commented on HADOOP-10959:


Below is a brief introduction to the proposed solution.

We propose adding a token-preauth mechanism for Kerberos, similar to PKINIT and 
OTP, built on the Pre-Authentication framework, which allows users to 
authenticate to the KDC with a JWT token instead of a password. The KDC 
validates the JWT token and issues a TGT, trusting the token authority/issuer 
via a PKI mechanism. The proposal was submitted to the Kerberos community and 
the IETF Kitten WG, and both have expressed interest; we are currently 
collaborating with the MIT team to draft and standardize the mechanism. We also 
built a POC that implements the token-preauth mechanism as an MIT Kerberos 
plugin. The plugin can be packaged separately as a Linux .so module and 
deployed on top of existing installations. MIT would also like us to contribute 
the code and make it available in their future releases; until then, we can 
make the plugin binary and source code available to the community for 
experimental use and review.
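
As a rough illustration of the trust model (the actual plugin is a C module for 
MIT Kerberos; this Java sketch is purely conceptual and the class name is made 
up), the KDC-side check boils down to verifying the JWT's signature with the 
issuer's public key obtained via PKI:

{code}
import java.nio.charset.StandardCharsets;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

public class JwtPreauthCheck {
  /**
   * Verifies an RS256-signed JWT (header.payload.signature) against the
   * issuer's public key. A real implementation would go on to check the
   * claims (expiry, audience, issuer) before issuing the TGT.
   */
  public static boolean verify(String jwt, PublicKey issuerKey) throws Exception {
    String[] parts = jwt.split("\\.");
    if (parts.length != 3) {
      return false;
    }
    byte[] signedData = (parts[0] + "." + parts[1]).getBytes(StandardCharsets.US_ASCII);
    byte[] signature = Base64.getUrlDecoder().decode(parts[2]);
    Signature verifier = Signature.getInstance("SHA256withRSA");
    verifier.initVerify(issuerKey);
    verifier.update(signedData);
    return verifier.verify(signature);
  }
}
{code}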

So, ideally, the token-preauth plugin can be deployed into an MIT Kerberos 
installation: end users authenticate to third-party JWT token authorities to 
obtain tokens, and then use those tokens to acquire a Kerberos TGT from the 
KDC. On top of that, we implemented token authentication for Hadoop with only a 
few central modifications to the code base, since we do not have to add another 
Authentication Method and the solution leverages the existing Kerberos support.

We added a KrbTokenLoginModule that extends Krb5LoginModule and adds support 
for logging in with a token or a token cache. The new module is compatible with 
Krb5LoginModule in both configuration and functionality, so it can be adopted 
safely.
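
For instance, a client could switch over with a JAAS entry like the following 
sketch (the package and option names here are illustrative assumptions, not the 
final ones):

{code}
// Hypothetical jaas.conf entry; KrbTokenLoginModule drops in where
// Krb5LoginModule is configured today (option names are made up):
//
//   HadoopClient {
//     org.apache.hadoop.security.token.KrbTokenLoginModule required
//       useTokenCache=true
//       tokenCache="/tmp/jwt_token_cache";
//   };

import javax.security.auth.Subject;
import javax.security.auth.login.LoginContext;

public class TokenLogin {
  public static void main(String[] args) throws Exception {
    // Uses the "HadoopClient" entry above; under the hood the module
    // exchanges the JWT token for a TGT via the KDC's token-preauth.
    LoginContext login = new LoginContext("HadoopClient");
    login.login();
    Subject subject = login.getSubject();
    System.out.println("Authenticated principals: " + subject.getPrincipals());
  }
}
{code}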

We also added a KerberosTokenAuthenticationHandler to support the Hadoop web 
interfaces. It extends KerberosAuthenticationHandler, adds support for token 
authentication, and performs the SPNEGO negotiation purely on the server side. 
Again, the new handler is compatible with KerberosAuthenticationHandler and can 
be adopted safely.
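
A minimal sketch of the layering (the authenticate signature matches the 
hadoop-auth AuthenticationHandler API; the token header name and the validation 
helper are assumptions for illustration):

{code}
import java.io.IOException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.hadoop.security.authentication.client.AuthenticationException;
import org.apache.hadoop.security.authentication.server.AuthenticationToken;
import org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler;

public class KerberosTokenAuthenticationHandler
    extends KerberosAuthenticationHandler {

  // Hypothetical request header carrying the JWT token.
  private static final String TOKEN_HEADER = "X-Hadoop-Token";

  @Override
  public AuthenticationToken authenticate(HttpServletRequest request,
      HttpServletResponse response) throws IOException, AuthenticationException {
    String jwt = request.getHeader(TOKEN_HEADER);
    if (jwt == null) {
      // No token supplied: fall back to the usual SPNEGO negotiation.
      return super.authenticate(request, response);
    }
    // Placeholder for the token checks (signature, expiry, issuer trust).
    String user = validateAndExtractUser(jwt);
    return new AuthenticationToken(user, user, getType());
  }

  private String validateAndExtractUser(String jwt)
      throws AuthenticationException {
    throw new AuthenticationException("token validation elided in this sketch");
  }
}
{code}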

The token is exchanged for a Kerberos ticket, and the ticket flows to Hadoop 
services as it normally does. In addition, so that token attributes can be used 
to enforce fine-grained authorization and the like, a derivation of the token 
is encapsulated into the ticket as authorization data when the KDC issues the 
ticket. On the service (Hadoop services) side, the token can then be queried 
and extracted from the service ticket. We made this work in both GSSAPI and 
SASL contexts, since both are used in Hadoop.
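
On the Java side this can build on the com.sun.security.jgss extension 
available in Oracle/OpenJDK, which exposes ticket authorization data from an 
established context; a sketch of the service-side extraction (the ad-type 
number for the token derivation is an assumption):

{code}
import com.sun.security.jgss.AuthorizationDataEntry;
import com.sun.security.jgss.ExtendedGSSContext;
import com.sun.security.jgss.InquireType;
import org.ietf.jgss.GSSContext;

public class TokenFromTicket {
  // Hypothetical authorization-data type registered for the token.
  private static final int AD_TOKEN_TYPE = 142;

  /**
   * Returns the raw token bytes carried in the accepted context's service
   * ticket, or null when no token authorization data is present.
   */
  public static byte[] extractToken(GSSContext context) throws Exception {
    ExtendedGSSContext ext = (ExtendedGSSContext) context;
    AuthorizationDataEntry[] entries = (AuthorizationDataEntry[])
        ext.inquireSecContext(InquireType.KRB5_GET_AUTHZ_DATA);
    if (entries != null) {
      for (AuthorizationDataEntry entry : entries) {
        if (entry.getType() == AD_TOKEN_TYPE) {
          return entry.getData();
        }
      }
    }
    return null;
  }
}
{code}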

The main concerns with this solution are likely that it requires deploying an 
additional plugin into existing Kerberos installations and syncing identity 
accounts from identity management systems into the Kerberos KDC. Most 
importantly, it requires a Kerberos deployment as a prerequisite. We are also 
discussing with the MIT team how to simplify Kerberos deployment, especially 
for large Hadoop clusters, and how to reduce the overhead of employing 
PKINIT/token-preauth mechanisms, such as the identity sync. 

> A Complement and Short Term Solution to TokenAuth Based on Kerberos 
> Pre-Authentication Framework
> 
>
> Key: HADOOP-10959
> URL: https://issues.apache.org/jira/browse/HADOOP-10959
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>  Labels: Rhino
> Attachments: KerbToken-v2.pdf
>
>
> To implement and integrate pluggable authentication providers, enable the 
> desired single sign-on for end users, and help enforce centralized access 
> control on the platform, the community has widely discussed and concluded 
> that token-based authentication could be the appropriate approach. TokenAuth 
> (HADOOP-9392) was proposed and is under development to implement another 
> Authentication Method alongside Simple and Kerberos. Supporting TokenAuth 
> across the entire ecosystem is a big, long-term effort. Here we propose a 
> short-term alternative based on Kerberos that complements TokenAuth. Our 
> solution involves fewer code changes, carries limited risk, and the main 
> development work has already been done in our POC. Users can adopt it as a 
> short-term solution to support tokens inside Hadoop.
> This effort and the resulting solution will be fully described in the design 
> document to be attached; a brief introduction will be posted as a comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10959) A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework

2014-08-11 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HADOOP-10959:
---

Description: 
To implement and integrate pluggable authentication providers, enable the 
desired single sign-on for end users, and help enforce centralized access 
control on the platform, the community has widely discussed and concluded that 
token-based authentication could be the appropriate approach. TokenAuth 
(HADOOP-9392) was proposed and is under development to implement another 
Authentication Method alongside Simple and Kerberos. Supporting TokenAuth 
across the entire ecosystem is a big, long-term effort. Here we propose a 
short-term alternative based on Kerberos that complements TokenAuth. Our 
solution involves fewer code changes, carries limited risk, and the main 
development work has already been done in our POC. Users can adopt it as a 
short-term solution to support tokens inside Hadoop.

This effort and the resulting solution will be fully described in the design 
document to be attached; a brief introduction will be posted as a comment.


  was:
To implement and integrate pluggable authentication providers, enable the 
desired single sign-on for end users, and help enforce centralized access 
control on the platform, the community has widely discussed and concluded that 
token-based authentication could be the appropriate approach. TokenAuth 
(HADOOP-9392) was proposed and is under development to implement another 
Authentication Method alongside Simple and Kerberos. Supporting TokenAuth 
across the entire ecosystem is a big, long-term effort. Here we propose a 
short-term alternative based on Kerberos that complements TokenAuth. Our 
solution involves fewer code changes, carries limited risk, and the main 
development work has already been done in our POC. Users can adopt it as a 
short-term solution to support tokens inside Hadoop.

This effort and the resulting solution will be fully described in the design 
document to be attached soon. Below is a brief introduction.

We propose adding a token-preauth mechanism for Kerberos, similar to PKINIT and 
OTP, built on the Pre-Authentication framework, which allows users to 
authenticate to the KDC with a JWT token instead of a password. The KDC 
validates the JWT token and issues a TGT, trusting the token authority/issuer 
via a PKI mechanism. The proposal was submitted to the Kerberos community and 
the IETF Kitten WG, and both have expressed interest; we are currently 
collaborating with the MIT team to draft and standardize the mechanism. We also 
built a POC that implements the token-preauth mechanism as an MIT Kerberos 
plugin. The plugin can be packaged separately as a Linux .so module and 
deployed on top of existing installations. MIT would also like us to contribute 
the code and make it available in their future releases; until then, we can 
make the plugin binary and source code available to the community for 
experimental use and review.

So, ideally, the token-preauth plugin can be deployed into an MIT Kerberos 
installation: end users authenticate to third-party JWT token authorities to 
obtain tokens, and then use those tokens to acquire a Kerberos TGT from the 
KDC. On top of that, we implemented token authentication for Hadoop with only a 
few central modifications to the code base, since we do not have to add another 
Authentication Method and the solution leverages the existing Kerberos support.

We added a KrbTokenLoginModule that extends Krb5LoginModule and adds support 
for logging in with a token or a token cache. The new module is compatible with 
Krb5LoginModule in both configuration and functionality, so it can be adopted 
safely.

We also added a KerberosTokenAuthenticationHandler to support the Hadoop web 
interfaces. It extends KerberosAuthenticationHandler, adds support for token 
authentication, and performs the SPNEGO negotiation purely on the server side. 
Again, the new handler is compatible with KerberosAuthenticationHandler and can 
be adopted safely.

The token is exchanged for a Kerberos ticket, and the ticket flows to Hadoop 
services as it normally does. In addition, so that token attributes can be used 
to enforce fine-grained authorization and the like, a derivation of the token 
is encapsulated into the ticket as authorization data when the KDC issues the 
ticket. On the service (Hadoop services) side, the token can then be queried 
and extracted from the service ticket. We made this work in both GSSAPI and 
SASL contexts, since both are used in Hadoop.

The main concerns with this solution are likely that it requires deploying an 
additional plugin into existing Kerberos installations and syncing identity 
accounts from identity management systems into the Kerberos KDC. Most 
importantly, it requires a Kerberos deployment as a prerequisite. We are also 
discussing with the MIT team how to simplify Kerberos deployment, especially 
for large Hadoop clusters, and how to reduce the overhead of employing 
PKINIT/token-preauth mechanisms, such as the identity sync.

[jira] [Commented] (HADOOP-10960) hadoop cause system crash with “soft lock” and “hard lock”

2014-08-11 Thread Jean-Baptiste Note (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093762#comment-14093762
 ] 

Jean-Baptiste Note commented on HADOOP-10960:
-

We saw something very similar to this problem (among others) recurring on RHEL5.
It appears that the kernel, after an initial, legit, softlockup report (for 
instance because of IO contention), can go into a loop of reporting soft 
lockups and -- by the sheer amount of data spewed -- work itself into a panic.

For us it was due to dumping data to the (relatively slow) serial console, for 
you it may be by dumping data to disk, which is presumably the cause for 
contention in your case.

Once you've cleared the way for real issues (controller problems, for 
instance), you may want to investigate one of the following:
0) reduce kernel verbosity on the console (provided it reduces the amount of 
data dumped to /var/log/messages; I'm not familiar with your setup, we're 
remote logging everything)
1) disable softlockup reboot
2) disable disk logging of kernel messages / log to tmpfs / log to a separate, 
dedicated system *disk*

HTH

> hadoop cause system crash with “soft lock” and “hard lock”
> --
>
> Key: HADOOP-10960
> URL: https://issues.apache.org/jira/browse/HADOOP-10960
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
> Environment: redhat rhel 6.3, 6.4, 6.5
> jdk1.7.0_45
> hadoop2.2
>Reporter: linbao111
>Priority: Critical
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I am running hadoop2.2 on redhat 6.3-6.5, and all of my machines crashed after 
> a while. /var/log/messages shows repeatedly:
> Aug 11 06:30:42 jn4_73_128 kernel: BUG: soft lockup - CPU#1 stuck for 67s! 
> [jsvc:11508]
> Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc 
> iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode 
> dcdbas serio_raw iTCO_w
> dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod 
> crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash 
> dm_log dm_m
> od [last unloaded: scsi_wait_scan]
> Aug 11 06:30:42 jn4_73_128 kernel: CPU 1 
> Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc 
> iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode 
> dcdbas serio_raw iTCO_w
> dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod 
> crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash 
> dm_log dm_m
> od [last unloaded: scsi_wait_scan]
> Aug 11 06:30:42 jn4_73_128 kernel: 
> Aug 11 06:30:42 jn4_73_128 kernel: Pid: 11508, comm: jsvc Tainted: GW 
>  ---2.6.32-279.el6.x86_64 #1 Dell Inc. PowerEdge R510/084YMW
> Aug 11 06:30:42 jn4_73_128 kernel: RIP: 0010:[]  
> [] wait_for_rqlock+0x28/0x40
> Aug 11 06:30:42 jn4_73_128 kernel: RSP: 0018:8807786c3ee8  EFLAGS: 
> 0202
> Aug 11 06:30:42 jn4_73_128 kernel: RAX: f6e9f6e1 RBX: 
> 8807786c3ee8 RCX: 880028216680
> Aug 11 06:30:42 jn4_73_128 kernel: RDX: f6e9 RSI: 
> 88061cd29370 RDI: 0286
> Aug 11 06:30:42 jn4_73_128 kernel: RBP: 8100bc0e R08: 
> 0001 R09: 0001
> Aug 11 06:30:42 jn4_73_128 kernel: R10:  R11: 
>  R12: 0286
> Aug 11 06:30:42 jn4_73_128 kernel: R13: 8807786c3eb8 R14: 
> 810e0f6e R15: 8807786c3e48
> Aug 11 06:30:42 jn4_73_128 kernel: FS:  () 
> GS:88002820() knlGS:
> Aug 11 06:30:42 jn4_73_128 kernel: CS:  0010 DS:  ES:  CR0: 
> 80050033
> Aug 11 06:30:42 jn4_73_128 kernel: CR2: 00e5bd70 CR3: 
> 01a85000 CR4: 06e0
> Aug 11 06:30:42 jn4_73_128 kernel: DR0:  DR1: 
>  DR2: 
> Aug 11 06:30:42 jn4_73_128 kernel: DR3:  DR6: 
> 0ff0 DR7: 0400
> Aug 11 06:30:42 jn4_73_128 kernel: Process jsvc (pid: 11508, threadinfo 
> 8807786c2000, task 880c1def3500)
> Aug 11 06:30:42 jn4_73_128 kernel: Stack:
> Aug 11 06:30:42 jn4_73_128 kernel: 8807786c3f68 8107091b 
>  8807786c3f28
> Aug 11 06:30:42 jn4_73_128 kernel:  880701735260 880c1def39c8 
> 880c1def39c8 
> Aug 11 06:30:42 jn4_73_128 kernel:  8807786c3f28 8807786c3f28 
> 8807786c3f78 7f092d0ad700
> Aug 11 06:30:42 jn4_73_128 kernel: Call Trace:
> Aug 11 06:30:42 jn4_73_128 kernel: [] ? do_exit+0x5ab/0x870
> Aug 11 06:30:42 jn4_73_128 kernel: [] ? sys_exit+0x17/0x20
> Aug 11 06:30:42 jn4_73_128 kernel: [] ? 
> system_call_fastpath+0x16/0x1b
> Aug 11 06:30:42 jn4_73_128 kernel: Code: ff ff 90 55 48 89 e5 0f 1f

[jira] [Commented] (HADOOP-10957) The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093751#comment-14093751
 ] 

Hadoop QA commented on HADOOP-10957:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12661109/HADOOP-10957.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
  org.apache.hadoop.fs.TestGlobPaths

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4453//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4453//console

This message is automatically generated.

> The globber will sometimes erroneously return a permission denied exception 
> when there is a non-terminal wildcard
> -
>
> Key: HADOOP-10957
> URL: https://issues.apache.org/jira/browse/HADOOP-10957
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HADOOP-10957.001.patch
>
>
> The globber will sometimes erroneously return a permission denied exception 
> when there is a non-terminal wildcard.  The existing unit tests don't catch 
> this, because it doesn't happen for superusers.
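
For what it's worth, a minimal way to reproduce the symptom is a glob whose 
wildcard sits in a non-terminal path component, run as a non-superuser (the 
paths here are hypothetical):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class NonTerminalGlob {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // The '*' is followed by another component, so the globber must
    // list/traverse the intermediate matches; run as a non-superuser,
    // this can surface the erroneous permission denied exception.
    FileStatus[] matches = fs.globStatus(new Path("/user/*/logs"));
    if (matches != null) {
      for (FileStatus st : matches) {
        System.out.println(st.getPath());
      }
    }
  }
}
{code}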



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-10960) hadoop cause system crash with “soft lock” and “hard lock”

2014-08-11 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HADOOP-10960.


Resolution: Invalid

Hadoop core has no kernel mode components so it cannot cause a kernel panic. 
You likely have a buggy device driver or hit a kernel bug.

Resolving as Invalid.

> hadoop cause system crash with “soft lock” and “hard lock”
> --
>
> Key: HADOOP-10960
> URL: https://issues.apache.org/jira/browse/HADOOP-10960
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
> Environment: redhat rhel 6.3, 6.4, 6.5
> jdk1.7.0_45
> hadoop2.2
>Reporter: linbao111
>Priority: Critical
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I am running hadoop2.2 on redhat 6.3-6.5, and all of my machines crashed after 
> a while. /var/log/messages shows repeatedly:
> Aug 11 06:30:42 jn4_73_128 kernel: BUG: soft lockup - CPU#1 stuck for 67s! 
> [jsvc:11508]
> Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc 
> iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode 
> dcdbas serio_raw iTCO_w
> dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod 
> crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash 
> dm_log dm_m
> od [last unloaded: scsi_wait_scan]
> Aug 11 06:30:42 jn4_73_128 kernel: CPU 1 
> Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc 
> iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode 
> dcdbas serio_raw iTCO_w
> dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod 
> crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash 
> dm_log dm_m
> od [last unloaded: scsi_wait_scan]
> Aug 11 06:30:42 jn4_73_128 kernel: 
> Aug 11 06:30:42 jn4_73_128 kernel: Pid: 11508, comm: jsvc Tainted: GW 
>  ---2.6.32-279.el6.x86_64 #1 Dell Inc. PowerEdge R510/084YMW
> Aug 11 06:30:42 jn4_73_128 kernel: RIP: 0010:[]  
> [] wait_for_rqlock+0x28/0x40
> Aug 11 06:30:42 jn4_73_128 kernel: RSP: 0018:8807786c3ee8  EFLAGS: 
> 0202
> Aug 11 06:30:42 jn4_73_128 kernel: RAX: f6e9f6e1 RBX: 
> 8807786c3ee8 RCX: 880028216680
> Aug 11 06:30:42 jn4_73_128 kernel: RDX: f6e9 RSI: 
> 88061cd29370 RDI: 0286
> Aug 11 06:30:42 jn4_73_128 kernel: RBP: 8100bc0e R08: 
> 0001 R09: 0001
> Aug 11 06:30:42 jn4_73_128 kernel: R10:  R11: 
>  R12: 0286
> Aug 11 06:30:42 jn4_73_128 kernel: R13: 8807786c3eb8 R14: 
> 810e0f6e R15: 8807786c3e48
> Aug 11 06:30:42 jn4_73_128 kernel: FS:  () 
> GS:88002820() knlGS:
> Aug 11 06:30:42 jn4_73_128 kernel: CS:  0010 DS:  ES:  CR0: 
> 80050033
> Aug 11 06:30:42 jn4_73_128 kernel: CR2: 00e5bd70 CR3: 
> 01a85000 CR4: 06e0
> Aug 11 06:30:42 jn4_73_128 kernel: DR0:  DR1: 
>  DR2: 
> Aug 11 06:30:42 jn4_73_128 kernel: DR3:  DR6: 
> 0ff0 DR7: 0400
> Aug 11 06:30:42 jn4_73_128 kernel: Process jsvc (pid: 11508, threadinfo 
> 8807786c2000, task 880c1def3500)
> Aug 11 06:30:42 jn4_73_128 kernel: Stack:
> Aug 11 06:30:42 jn4_73_128 kernel: 8807786c3f68 8107091b 
>  8807786c3f28
> Aug 11 06:30:42 jn4_73_128 kernel:  880701735260 880c1def39c8 
> 880c1def39c8 
> Aug 11 06:30:42 jn4_73_128 kernel:  8807786c3f28 8807786c3f28 
> 8807786c3f78 7f092d0ad700
> Aug 11 06:30:42 jn4_73_128 kernel: Call Trace:
> Aug 11 06:30:42 jn4_73_128 kernel: [] ? do_exit+0x5ab/0x870
> Aug 11 06:30:42 jn4_73_128 kernel: [] ? sys_exit+0x17/0x20
> Aug 11 06:30:42 jn4_73_128 kernel: [] ? 
> system_call_fastpath+0x16/0x1b
> Aug 11 06:30:42 jn4_73_128 kernel: Code: ff ff 90 55 48 89 e5 0f 1f 44 00 00 
> 48 c7 c0 80 66 01 00 65 48 8b 0c 25 b0 e0 00 00 0f ae f0 48 01 c1 eb 09 0f 1f 
> 80 00 00 00 00  90 8b 01 89 c2 c1 fa 10 66 39 c2 75 f2 c9 c3 0f 1f 84 00 
> 00 
> Aug 11 06:30:42 jn4_73_128 kernel: Call Trace:
> Aug 11 06:30:42 jn4_73_128 kernel: [] ? do_exit+0x5ab/0x870
> Aug 11 06:30:42 jn4_73_128 kernel: [] ? sys_exit+0x17/0x20
> Aug 11 06:30:42 jn4_73_128 kernel: [] ? 
> system_call_fastpath+0x16/0x1b
> 
> and finally crashed
> crash /usr/lib/debug/lib/modules/2.6.32-431.5.1.el6.x86_64/vmlinux  
> /opt/crash/127.0.0.1-2014-08-10-09\:47\:38/vmcore
> crash 6.1.0-5.el6
> Copyright (C) 2002-2012  Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
> Copyright (C) 1999-2006  Hewlett-Packard Co
> Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
> Copyright (C) 2006, 2007  

[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized

2014-08-11 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093726#comment-14093726
 ] 

zhihai xu commented on HADOOP-10820:


I tested TestWebDelegationToken locally and didn't see the error. Also, the 
error is 
"java.net.BindException: Address already in use", which looks like a port 
conflict in the test environment.


---
 T E S T S
---
Running org.apache.hadoop.security.token.delegation.web.TestWebDelegationToken
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.484 sec - in 
org.apache.hadoop.security.token.delegation.web.TestWebDelegationToken

Results :

Tests run: 7, Failures: 0, Errors: 0, Skipped: 0
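
If the conflict keeps recurring, the usual remedy in tests is to bind to port 0 
and let the OS pick a free ephemeral port; a generic sketch of the idea (not a 
change to TestWebDelegationToken itself):

{code}
import java.net.ServerSocket;

public class EphemeralPort {
  public static void main(String[] args) throws Exception {
    // Port 0 asks the OS for any free port, which keeps tests immune
    // to "Address already in use" collisions between runs.
    try (ServerSocket socket = new ServerSocket(0)) {
      System.out.println("allocated port: " + socket.getLocalPort());
    }
  }
}
{code}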



> Empty entry in libjars results in working directory being recursively 
> localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Alex Holmes
>Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, 
> HADOOP-10820-3.patch, HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the 
> current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar 
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JARs specified in libjars copied,
> # but in addition the contents of the working directory are also recursively 
> copied.
> $ hadoop fs -lsr 
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10960) hadoop cause system crash with “soft lock” and “hard lock”

2014-08-11 Thread linbao111 (JIRA)
linbao111 created HADOOP-10960:
--

 Summary: hadoop cause system crash with “soft lock” and “hard lock”
 Key: HADOOP-10960
 URL: https://issues.apache.org/jira/browse/HADOOP-10960
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.2.0
 Environment: redhat rhel 6.3, 6.4, 6.5
jdk1.7.0_45
hadoop2.2
Reporter: linbao111
Priority: Critical


I am running hadoop2.2 on redhat 6.3-6.5, and all of my machines crashed after a 
while. /var/log/messages shows repeatedly:

Aug 11 06:30:42 jn4_73_128 kernel: BUG: soft lockup - CPU#1 stuck for 67s! 
[jsvc:11508]
Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc 
iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode 
dcdbas serio_raw iTCO_w
dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod 
crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash 
dm_log dm_m
od [last unloaded: scsi_wait_scan]
Aug 11 06:30:42 jn4_73_128 kernel: CPU 1 
Aug 11 06:30:42 jn4_73_128 kernel: Modules linked in: bridge stp llc 
iptable_filter ip_tables mptctl mptbase xfs exportfs power_meter microcode 
dcdbas serio_raw iTCO_w
dt iTCO_vendor_support i7core_edac edac_core sg bnx2 ext4 mbcache jbd2 sd_mod 
crc_t10dif wmi mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash 
dm_log dm_m
od [last unloaded: scsi_wait_scan]
Aug 11 06:30:42 jn4_73_128 kernel: 
Aug 11 06:30:42 jn4_73_128 kernel: Pid: 11508, comm: jsvc Tainted: GW  
---2.6.32-279.el6.x86_64 #1 Dell Inc. PowerEdge R510/084YMW
Aug 11 06:30:42 jn4_73_128 kernel: RIP: 0010:[]  
[] wait_for_rqlock+0x28/0x40
Aug 11 06:30:42 jn4_73_128 kernel: RSP: 0018:8807786c3ee8  EFLAGS: 0202
Aug 11 06:30:42 jn4_73_128 kernel: RAX: f6e9f6e1 RBX: 8807786c3ee8 
RCX: 880028216680
Aug 11 06:30:42 jn4_73_128 kernel: RDX: f6e9 RSI: 88061cd29370 
RDI: 0286
Aug 11 06:30:42 jn4_73_128 kernel: RBP: 8100bc0e R08: 0001 
R09: 0001
Aug 11 06:30:42 jn4_73_128 kernel: R10:  R11:  
R12: 0286
Aug 11 06:30:42 jn4_73_128 kernel: R13: 8807786c3eb8 R14: 810e0f6e 
R15: 8807786c3e48
Aug 11 06:30:42 jn4_73_128 kernel: FS:  () 
GS:88002820() knlGS:
Aug 11 06:30:42 jn4_73_128 kernel: CS:  0010 DS:  ES:  CR0: 
80050033
Aug 11 06:30:42 jn4_73_128 kernel: CR2: 00e5bd70 CR3: 01a85000 
CR4: 06e0
Aug 11 06:30:42 jn4_73_128 kernel: DR0:  DR1:  
DR2: 
Aug 11 06:30:42 jn4_73_128 kernel: DR3:  DR6: 0ff0 
DR7: 0400
Aug 11 06:30:42 jn4_73_128 kernel: Process jsvc (pid: 11508, threadinfo 
8807786c2000, task 880c1def3500)
Aug 11 06:30:42 jn4_73_128 kernel: Stack:
Aug 11 06:30:42 jn4_73_128 kernel: 8807786c3f68 8107091b 
 8807786c3f28
Aug 11 06:30:42 jn4_73_128 kernel:  880701735260 880c1def39c8 
880c1def39c8 
Aug 11 06:30:42 jn4_73_128 kernel:  8807786c3f28 8807786c3f28 
8807786c3f78 7f092d0ad700
Aug 11 06:30:42 jn4_73_128 kernel: Call Trace:
Aug 11 06:30:42 jn4_73_128 kernel: [] ? do_exit+0x5ab/0x870
Aug 11 06:30:42 jn4_73_128 kernel: [] ? sys_exit+0x17/0x20
Aug 11 06:30:42 jn4_73_128 kernel: [] ? 
system_call_fastpath+0x16/0x1b
Aug 11 06:30:42 jn4_73_128 kernel: Code: ff ff 90 55 48 89 e5 0f 1f 44 00 00 48 
c7 c0 80 66 01 00 65 48 8b 0c 25 b0 e0 00 00 0f ae f0 48 01 c1 eb 09 0f 1f 80 
00 00 00 00  90 8b 01 89 c2 c1 fa 10 66 39 c2 75 f2 c9 c3 0f 1f 84 00 00 
Aug 11 06:30:42 jn4_73_128 kernel: Call Trace:
Aug 11 06:30:42 jn4_73_128 kernel: [] ? do_exit+0x5ab/0x870
Aug 11 06:30:42 jn4_73_128 kernel: [] ? sys_exit+0x17/0x20
Aug 11 06:30:42 jn4_73_128 kernel: [] ? 
system_call_fastpath+0x16/0x1b

and finally crashed

crash /usr/lib/debug/lib/modules/2.6.32-431.5.1.el6.x86_64/vmlinux  
/opt/crash/127.0.0.1-2014-08-10-09\:47\:38/vmcore

crash 6.1.0-5.el6
Copyright (C) 2002-2012  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later

[jira] [Updated] (HADOOP-10959) A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework

2014-08-11 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HADOOP-10959:
---

Attachment: KerbToken-v2.pdf

The design doc. Your comments are welcome. Thanks.

> A Complement and Short Term Solution to TokenAuth Based on Kerberos 
> Pre-Authentication Framework
> 
>
> Key: HADOOP-10959
> URL: https://issues.apache.org/jira/browse/HADOOP-10959
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>  Labels: Rhino
> Attachments: KerbToken-v2.pdf
>
>
> To implement and integrate pluggable authentication providers, enable the 
> desired single sign-on for end users, and help enforce centralized access 
> control on the platform, the community has widely discussed and concluded 
> that token-based authentication could be the appropriate approach. TokenAuth 
> (HADOOP-9392) was proposed and is under development to implement another 
> Authentication Method alongside Simple and Kerberos. Supporting TokenAuth 
> across the entire ecosystem is a big, long-term effort. Here we propose a 
> short-term alternative based on Kerberos that complements TokenAuth. Our 
> solution involves fewer code changes, carries limited risk, and the main 
> development work has already been done in our POC. Users can adopt it as a 
> short-term solution to support tokens inside Hadoop.
> This effort and the resulting solution will be fully described in the design 
> document to be attached soon. Below is a brief introduction.
> We propose adding a token-preauth mechanism for Kerberos, similar to PKINIT 
> and OTP, built on the Pre-Authentication framework, which allows users to 
> authenticate to the KDC with a JWT token instead of a password. The KDC 
> validates the JWT token and issues a TGT, trusting the token authority/issuer 
> via a PKI mechanism. The proposal was submitted to the Kerberos community and 
> the IETF Kitten WG, and both have expressed interest; we are currently 
> collaborating with the MIT team to draft and standardize the mechanism. We 
> also built a POC that implements the token-preauth mechanism as an MIT 
> Kerberos plugin. The plugin can be packaged separately as a Linux .so module 
> and deployed on top of existing installations. MIT would also like us to 
> contribute the code and make it available in their future releases; until 
> then, we can make the plugin binary and source code available to the 
> community for experimental use and review.
> So, ideally, the token-preauth plugin can be deployed into an MIT Kerberos 
> installation: end users authenticate to third-party JWT token authorities to 
> obtain tokens, and then use those tokens to acquire a Kerberos TGT from the 
> KDC. On top of that, we implemented token authentication for Hadoop with only 
> a few central modifications to the code base, since we do not have to add 
> another Authentication Method and the solution leverages the existing 
> Kerberos support.
> We added a KrbTokenLoginModule that extends Krb5LoginModule and adds support 
> for logging in with a token or a token cache. The new module is compatible 
> with Krb5LoginModule in both configuration and functionality, so it can be 
> adopted safely.
> We also added a KerberosTokenAuthenticationHandler to support the Hadoop web 
> interfaces. It extends KerberosAuthenticationHandler, adds support for token 
> authentication, and performs the SPNEGO negotiation purely on the server 
> side. Again, the new handler is compatible with KerberosAuthenticationHandler 
> and can be adopted safely.
> The token is exchanged for a Kerberos ticket, and the ticket flows to Hadoop 
> services as it normally does. In addition, so that token attributes can be 
> used to enforce fine-grained authorization and the like, a derivation of the 
> token is encapsulated into the ticket as authorization data when the KDC 
> issues the ticket. On the service (Hadoop services) side, the token can then 
> be queried and extracted from the service ticket. We made this work in both 
> GSSAPI and SASL contexts, since both are used in Hadoop.
> The main concerns with this solution are likely that it requires deploying an 
> additional plugin into existing Kerberos installations and syncing identity 
> accounts from identity management systems into the Kerberos KDC. Most 
> importantly, it requires a Kerberos deployment as a prerequisite. We are also 
> discussing with the MIT team how to simplify Kerberos deployment, especially 
> for large Hadoop clusters, and how to reduce the overhead of employing 
> PKINIT/token-preauth mechanisms, such as the identity sync. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10959) A Complement and Short Term Solution to TokenAuth Based on Kerberos Pre-Authentication Framework

2014-08-11 Thread Kai Zheng (JIRA)
Kai Zheng created HADOOP-10959:
--

 Summary: A Complement and Short Term Solution to TokenAuth Based 
on Kerberos Pre-Authentication Framework
 Key: HADOOP-10959
 URL: https://issues.apache.org/jira/browse/HADOOP-10959
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Kai Zheng
Assignee: Kai Zheng


To implement and integrate pluggable authentication providers, enable the 
desired single sign-on for end users, and help enforce centralized access 
control on the platform, the community has widely discussed and concluded that 
token-based authentication could be the appropriate approach. TokenAuth 
(HADOOP-9392) was proposed and is under development to implement another 
Authentication Method alongside Simple and Kerberos. Supporting TokenAuth 
across the entire ecosystem is a big, long-term effort. Here we propose a 
short-term alternative based on Kerberos that complements TokenAuth. Our 
solution involves fewer code changes, carries limited risk, and the main 
development work has already been done in our POC. Users can adopt it as a 
short-term solution to support tokens inside Hadoop.

This effort and the resulting solution will be fully described in the design 
document to be attached soon. Below is a brief introduction.

We propose adding a token-preauth mechanism for Kerberos, similar to PKINIT and 
OTP, built on the Pre-Authentication framework, which allows users to 
authenticate to the KDC with a JWT token instead of a password. The KDC 
validates the JWT token and issues a TGT, trusting the token authority/issuer 
via a PKI mechanism. The proposal was submitted to the Kerberos community and 
the IETF Kitten WG, and both have expressed interest; we are currently 
collaborating with the MIT team to draft and standardize the mechanism. We also 
built a POC that implements the token-preauth mechanism as an MIT Kerberos 
plugin. The plugin can be packaged separately as a Linux .so module and 
deployed on top of existing installations. MIT would also like us to contribute 
the code and make it available in their future releases; until then, we can 
make the plugin binary and source code available to the community for 
experimental use and review.

So, ideally, the token-preauth plugin can be deployed into an MIT Kerberos 
installation: end users authenticate to third-party JWT token authorities to 
obtain tokens, and then use those tokens to acquire a Kerberos TGT from the 
KDC. On top of that, we implemented token authentication for Hadoop with only a 
few central modifications to the code base, since we do not have to add another 
Authentication Method and the solution leverages the existing Kerberos support.

We added a KrbTokenLoginModule that extends Krb5LoginModule and adds support 
for logging in with a token or a token cache. The new module is compatible with 
Krb5LoginModule in both configuration and functionality, so it can be adopted 
safely.

We also added a KerberosTokenAuthenticationHandler to support the Hadoop web 
interfaces. It extends KerberosAuthenticationHandler, adds support for token 
authentication, and performs the SPNEGO negotiation purely on the server side. 
Again, the new handler is compatible with KerberosAuthenticationHandler and can 
be adopted safely.

The token is exchanged for a Kerberos ticket, and the ticket flows to Hadoop 
services as it normally does. In addition, so that token attributes can be used 
to enforce fine-grained authorization and the like, a derivation of the token 
is encapsulated into the ticket as authorization data when the KDC issues the 
ticket. On the service (Hadoop services) side, the token can then be queried 
and extracted from the service ticket. We made this work in both GSSAPI and 
SASL contexts, since both are used in Hadoop.

The main concerns with this solution are likely that it requires deploying an 
additional plugin into existing Kerberos installations and syncing identity 
accounts from identity management systems into the Kerberos KDC. Most 
importantly, it requires a Kerberos deployment as a prerequisite. We are also 
discussing with the MIT team how to simplify Kerberos deployment, especially 
for large Hadoop clusters, and how to reduce the overhead of employing 
PKINIT/token-preauth mechanisms, such as the identity sync. 




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-8944) Shell command fs -count should include human readable option

2014-08-11 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-8944:
-

Attachment: HADOOP-8944-2.patch

> Shell command fs -count should include human readable option
> 
>
> Key: HADOOP-8944
> URL: https://issues.apache.org/jira/browse/HADOOP-8944
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jonathan Allen
>Assignee: Allen Wittenauer
>Priority: Trivial
>  Labels: newbie
> Attachments: HADOOP-8944-1.patch, HADOOP-8944-2.patch, 
> HADOOP-8944.patch
>
>
> The shell command fs -count reports sizes in bytes.  The command should 
> accept a -h option to display the sizes in a human readable format, e.g. K, 
> M, G, etc.
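
As a sketch of what the -h output could look like, Hadoop's own StringUtils can 
already scale a byte count into binary prefixes (whether the patch uses exactly 
this helper is an assumption):

{code}
import org.apache.hadoop.util.StringUtils;

public class HumanReadableCount {
  public static void main(String[] args) {
    long bytes = 1536L * 1024 * 1024;
    // Scales a raw byte count into K/M/G/... form, e.g. "1.5 G", which is
    // the kind of value an fs -count -h column would display.
    System.out.println(
        StringUtils.TraditionalBinaryPrefix.long2String(bytes, "", 1));
  }
}
{code}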



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-8944) Shell command fs -count should include human readable option

2014-08-11 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-8944:
-

Attachment: (was: HADOOP-8944-2.patch)

> Shell command fs -count should include human readable option
> 
>
> Key: HADOOP-8944
> URL: https://issues.apache.org/jira/browse/HADOOP-8944
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jonathan Allen
>Assignee: Allen Wittenauer
>Priority: Trivial
>  Labels: newbie
> Attachments: HADOOP-8944-1.patch, HADOOP-8944.patch
>
>
> The shell command fs -count reports sizes in bytes.  The command should 
> accept a -h option to display the sizes in a human readable format, e.g. K, 
> M, G, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093677#comment-14093677
 ] 

Hadoop QA commented on HADOOP-10820:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12661108/HADOOP-10820-3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  
org.apache.hadoop.security.token.delegation.web.TestWebDelegationToken

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4454//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4454//console

This message is automatically generated.

> Empty entry in libjars results in working directory being recursively 
> localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Alex Holmes
>Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, 
> HADOOP-10820-3.patch, HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the 
> current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar 
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JARs specified in libjars copied,
> # but in addition the contents of the working directory are also recursively 
> copied.
> $ hadoop fs -lsr 
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9869) Configuration.getSocketAddr()/getEnum() should use getTrimmed()

2014-08-11 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093666#comment-14093666
 ] 

Tsuyoshi OZAWA commented on HADOOP-9869:


The test failure is not related to the patch. [~ste...@apache.org], would you 
mind reviewing the patch?

>  Configuration.getSocketAddr()/getEnum() should use getTrimmed()
> 
>
> Key: HADOOP-9869
> URL: https://issues.apache.org/jira/browse/HADOOP-9869
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: conf
>Affects Versions: 3.0.0, 2.1.0-beta, 1.3.0
>Reporter: Steve Loughran
>Assignee: Tsuyoshi OZAWA
>Priority: Minor
> Attachments: HADOOP-9869.1.patch, HADOOP-9869.2.patch, 
> HADOOP-9869.3.patch, HADOOP-9869.4.patch
>
>
> YARN-1059 has shown that the hostname:port string used for the address of 
> things like the RM isn't trimmed before it's parsed, leading to errors that 
> aren't that obvious. 
> We should trim it - it's clearly not going to break any existing (valid) 
> configurations.
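
The failure mode and the fix are easy to see in a small sketch (the property 
name and value are just for illustration):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;

public class TrimmedAddr {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    // Trailing whitespace, e.g. left over from a hand-edited XML file.
    conf.set("yarn.resourcemanager.address", "rm.example.com:8032 ");

    // conf.get() keeps the stray space, so parsing the port fails with a
    // NumberFormatException deep inside NetUtils.createSocketAddr().
    // conf.getTrimmed() strips surrounding whitespace first, which is why
    // getSocketAddr()/getEnum() should be built on it.
    String trimmed = conf.getTrimmed("yarn.resourcemanager.address");
    System.out.println(NetUtils.createSocketAddr(trimmed));
  }
}
{code}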



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-8944) Shell command fs -count should include human readable option

2014-08-11 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-8944:
-

Status: Patch Available  (was: Open)

> Shell command fs -count should include human readable option
> 
>
> Key: HADOOP-8944
> URL: https://issues.apache.org/jira/browse/HADOOP-8944
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jonathan Allen
>Assignee: Allen Wittenauer
>Priority: Trivial
>  Labels: newbie
> Attachments: HADOOP-8944-1.patch, HADOOP-8944-2.patch, 
> HADOOP-8944.patch
>
>
> The shell command fs -count reports sizes in bytes.  The command should 
> accept a -h option to display the sizes in a human readable format, e.g. K, 
> M, G, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-8944) Shell command fs -count should include human readable option

2014-08-11 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-8944:
-

Attachment: HADOOP-8944-2.patch

This reworks the test cases and undoes the change to QUOTA_HEADER.

> Shell command fs -count should include human readable option
> 
>
> Key: HADOOP-8944
> URL: https://issues.apache.org/jira/browse/HADOOP-8944
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jonathan Allen
>Assignee: Allen Wittenauer
>Priority: Trivial
>  Labels: newbie
> Attachments: HADOOP-8944-1.patch, HADOOP-8944-2.patch, 
> HADOOP-8944.patch
>
>
> The shell command fs -count reports sizes in bytes.  The command should 
> accept a -h option to display the sizes in a human readable format, e.g. K, 
> M, G, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9869) Configuration.getSocketAddr()/getEnum() should use getTrimmed()

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093660#comment-14093660
 ] 

Hadoop QA commented on HADOOP-9869:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/1266/HADOOP-9869.4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4452//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4452//console

This message is automatically generated.

>  Configuration.getSocketAddr()/getEnum() should use getTrimmed()
> 
>
> Key: HADOOP-9869
> URL: https://issues.apache.org/jira/browse/HADOOP-9869
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: conf
>Affects Versions: 3.0.0, 2.1.0-beta, 1.3.0
>Reporter: Steve Loughran
>Assignee: Tsuyoshi OZAWA
>Priority: Minor
> Attachments: HADOOP-9869.1.patch, HADOOP-9869.2.patch, 
> HADOOP-9869.3.patch, HADOOP-9869.4.patch
>
>
> YARN-1059 has shown that the hostname:port string used for the address of 
> things like the RM isn't trimmed before it's parsed, leading to errors that 
> aren't that obvious. 
> We should trim it - it's clearly not going to break any existing (valid) 
> configurations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10836) Replace HttpFS custom proxyuser handling with common implementation

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093639#comment-14093639
 ] 

Hadoop QA commented on HADOOP-10836:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12661100/HADOOP-10836.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs-httpfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4451//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4451//console

This message is automatically generated.

> Replace HttpFS custom proxyuser handling with common implementation
> ---
>
> Key: HADOOP-10836
> URL: https://issues.apache.org/jira/browse/HADOOP-10836
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: COMBO.patch, HADOOP-10836.patch, HADOOP-10836.patch, 
> HADOOP-10836.patch, HADOOP-10836.patch
>
>
> Use HADOOP-10835 to implement proxyuser logic in HttpFS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10957) The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard

2014-08-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093632#comment-14093632
 ] 

Colin Patrick McCabe commented on HADOOP-10957:
---

Credit goes to Daryn for originally identifying this issue in HADOOP-10942

> The globber will sometimes erroneously return a permission denied exception 
> when there is a non-terminal wildcard
> -
>
> Key: HADOOP-10957
> URL: https://issues.apache.org/jira/browse/HADOOP-10957
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HADOOP-10957.001.patch
>
>
> The globber will sometimes erroneously return a permission denied exception 
> when there is a non-terminal wildcard.  The existing unit tests don't catch 
> this, because it doesn't happen for superusers.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10942) Globbing optimizations and regression fix

2014-08-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093634#comment-14093634
 ] 

Colin Patrick McCabe commented on HADOOP-10942:
---

I created HADOOP-10958 for the globber test rework (I think it's going to be a 
giant patch, although simple in concept).  I created HADOOP-10957 for the 
urgent globber bug, and posted a small bugfix that we can easily backport.  Can 
you file a JIRA for the FileContext issue, if there's not one out there 
already?  And perhaps one for any other miscellaneous optimizations / 
refactoring we should do in the globber.

> Globbing optimizations and regression fix
> -
>
> Key: HADOOP-10942
> URL: https://issues.apache.org/jira/browse/HADOOP-10942
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0, 2.1.0-beta
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HADOOP-10942.patch
>
>
> When globbing was commonized to support both filesystem and filecontext, it 
> regressed a fix that prevents an intermediate glob that matches a file from 
> throwing a confusing permissions exception.  The hdfs traverse check requires 
> the exec bit, which a file does not have.
> Additional optimizations to reduce RPCs actually increase them if 
> directories contain 1 item.
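
To spell out the traverse-check point (a minimal sketch): execute permission is 
what a path-component traversal demands, and files never carry it, hence the 
confusing exception instead of simply no match:

{code}
import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.fs.permission.FsPermission;

public class TraverseCheck {
  public static void main(String[] args) {
    // Default file permissions (0666 before umask) carry no execute bit,
    // so a traverse (execute) check against a file can never pass.
    FsPermission file = FsPermission.getFileDefault();
    System.out.println("file implies execute? "
        + file.getUserAction().implies(FsAction.EXECUTE)); // false
  }
}
{code}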



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-9869) Configuration.getSocketAddr()/getEnum() should use getTrimmed()

2014-08-11 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated HADOOP-9869:
---

Attachment: HADOOP-9869.4.patch

Rebased on trunk.

>  Configuration.getSocketAddr()/getEnum() should use getTrimmed()
> 
>
> Key: HADOOP-9869
> URL: https://issues.apache.org/jira/browse/HADOOP-9869
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: conf
>Affects Versions: 3.0.0, 2.1.0-beta, 1.3.0
>Reporter: Steve Loughran
>Assignee: Tsuyoshi OZAWA
>Priority: Minor
> Attachments: HADOOP-9869.1.patch, HADOOP-9869.2.patch, 
> HADOOP-9869.3.patch, HADOOP-9869.4.patch
>
>
> YARN-1059 has shown that the hostname:port string used for the address of 
> things like the RM isn't trimmed before it's parsed, leading to errors that 
> aren't that obvious. 
> We should trim it - it's clearly not going to break any existing (valid) 
> configurations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10957) The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard

2014-08-11 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HADOOP-10957:
--

Status: Patch Available  (was: Open)

> The globber will sometimes erroneously return a permission denied exception 
> when there is a non-terminal wildcard
> -
>
> Key: HADOOP-10957
> URL: https://issues.apache.org/jira/browse/HADOOP-10957
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HADOOP-10957.001.patch
>
>
> The globber will sometimes erroneously return a permission denied exception 
> when there is a non-terminal wildcard.  The existing unit tests don't catch 
> this, because it doesn't happen for superusers.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized

2014-08-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093628#comment-14093628
 ] 

Andrew Wang commented on HADOOP-10820:
--

+1 pending Jenkins, thanks Zhihai!

> Empty entry in libjars results in working directory being recursively 
> localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Alex Holmes
>Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, 
> HADOOP-10820-3.patch, HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the 
> current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar 
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JARs specified in libjars copied,
> # but in addition the contents of the working directory are also recursively 
> copied.
> $ hadoop fs -lsr 
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10957) The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard

2014-08-11 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HADOOP-10957:
--

Attachment: HADOOP-10957.001.patch

> The globber will sometimes erroneously return a permission denied exception 
> when there is a non-terminal wildcard
> -
>
> Key: HADOOP-10957
> URL: https://issues.apache.org/jira/browse/HADOOP-10957
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HADOOP-10957.001.patch
>
>
> The globber will sometimes erroneously return a permission denied exception 
> when there is a non-terminal wildcard.  The existing unit tests don't catch 
> this, because it doesn't happen for superusers.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10958) TestGlobPaths should do more tests of globbing by unprivileged users

2014-08-11 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HADOOP-10958:
-

 Summary: TestGlobPaths should do more tests of globbing by 
unprivileged users
 Key: HADOOP-10958
 URL: https://issues.apache.org/jira/browse/HADOOP-10958
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Colin Patrick McCabe


TestGlobPaths should do more tests of globbing by unprivileged users.  Right 
now, most of the tests are of globbing by the superuser, but this tends to hide 
permission exception issues such as HADOOP-10957.  We should keep a few tests 
operating with privileged globs, but do most of them unprivileged.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes

2014-08-11 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093620#comment-14093620
 ] 

Charles Lamb commented on HADOOP-10919:
---

I should clarify case (1). If you are distcp'ing from the ez root or higher, 
then you don't need to pre-create the EZ, because all of the raw.* xattrs will 
be preserved.

Given that, I'm wondering what the purpose would be of checking that the 
target is an EZ?


> Copy command should preserve raw.* namespace extended attributes
> 
>
> Key: HADOOP-10919
> URL: https://issues.apache.org/jira/browse/HADOOP-10919
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch
>
>
> Refer to the doc attached to HDFS-6509 for background.
> Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve 
> extended attributes in the raw.* namespace by default whenever the src and 
> target are in /.reserved/raw. To not preserve raw xattrs, don't specify 
> /.reserved/raw in either the src or target. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10958) TestGlobPaths should do more tests of globbing by unprivileged users

2014-08-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093621#comment-14093621
 ] 

Colin Patrick McCabe commented on HADOOP-10958:
---

I think we should consider renaming {{TestGlobPaths#fs}} to 
{{TestGlobPaths#privFs}}, and renaming {{TestGlobPaths#unprivilegedFs}} to 
{{TestGlobPaths#unprivFs}} to make it less unwieldy to type.  (And similar for 
FC, and all the other wrappers, etc.) This will also make it clear which one 
we're using.

> TestGlobPaths should do more tests of globbing by unprivileged users
> 
>
> Key: HADOOP-10958
> URL: https://issues.apache.org/jira/browse/HADOOP-10958
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Colin Patrick McCabe
>
> TestGlobPaths should do more tests of globbing by unprivileged users.  Right 
> now, most of the tests are of globbing by the superuser, but this tends to 
> hide permission exception issues such as HADOOP-10957.  We should keep a few 
> tests operating with privileged globs, but do most of them unprivileged.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10957) The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard

2014-08-11 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HADOOP-10957:
-

 Summary: The globber will sometimes erroneously return a 
permission denied exception when there is a non-terminal wildcard
 Key: HADOOP-10957
 URL: https://issues.apache.org/jira/browse/HADOOP-10957
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


The globber will sometimes erroneously return a permission denied exception 
when there is a non-terminal wildcard.  The existing unit tests don't catch 
this, because it doesn't happen for superusers.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized

2014-08-11 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093616#comment-14093616
 ] 

zhihai xu commented on HADOOP-10820:


As discussed with [~andrew.wang] offline, I made two changes in 
HADOOP-10820-3.patch:
1. Changed the message "File list length can't be zero" to "File name can't be 
empty string", because both conditions are caused by an empty string.
2. Used tmp.isEmpty() instead of "".equals(tmp) for clearer semantics; also, 
String's split function guarantees that the elements in the returned array are 
not null (a sketch of the check follows below).
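
A minimal illustration of that check (the validateFiles helper and class here are hypothetical, not the actual GenericOptionsParser code):

{code}
// Hypothetical sketch of the empty-entry check described above.
public class LibjarsCheckSketch {
  static String[] validateFiles(String fileList) {
    // split() never yields null elements, so an isEmpty() check is
    // sufficient; "a.jar,,c.jar" produces one empty interior token.
    String[] files = fileList.split(",");
    for (String tmp : files) {
      if (tmp.isEmpty()) {
        throw new IllegalArgumentException("File name can't be empty string");
      }
    }
    return files;
  }

  public static void main(String[] args) {
    validateFiles("a.jar,,c.jar"); // throws IllegalArgumentException
  }
}
{code}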

> Empty entry in libjars results in working directory being recursively 
> localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Alex Holmes
>Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, 
> HADOOP-10820-3.patch, HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the 
> current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar 
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JARs specified in libjars copied,
> # but in addition the contents of the working directory are also recursively 
> copied.
> $ hadoop fs -lsr 
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes

2014-08-11 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093611#comment-14093611
 ] 

Charles Lamb commented on HADOOP-10919:
---

Sanjay,

There are three scenarios. 

(1) An administrator who does not have access to the keys in the KMS would use 
the /.reserved/raw prefix on src and dest:

distcp /.reserved/raw/src /.reserved/raw/dest

/.reserved/raw is the only interface that exposes the raw.* xattrs holding 
the encryption metadata. This allows the raw.* xattrs to be preserved on the 
dest, and allows the files to be copied without decrypting them. This scenario 
assumes that an ez has been set up on the dest. As you suggested, it would be 
a good idea to check that the dest is actually an ez.

(2) A non-admin user who has access to some subset of files in an ez could use 
the non-/.reserved/raw prefix and copy a hierarchy from one ez to another. In 
that case, the raw.* xattrs from the src ez would not be preserved. This 
scenario assumes that the dest ez is already set up. Of course the dest files 
will have new keys associated with them since they'll be new copies. 

(3) Neither src nor dest has /.reserved/raw, and one or the other of src/dest 
is not an ez. It is not necessary for the target to also be an ez. The use 
case would be that the user wants to copy a subset of the ez into or out of a 
non-encrypted file system. distcp without the /.reserved/raw prefix could be 
used for this.

Does this all make sense?




> Copy command should preserve raw.* namespace extended attributes
> 
>
> Key: HADOOP-10919
> URL: https://issues.apache.org/jira/browse/HADOOP-10919
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch
>
>
> Refer to the doc attached to HDFS-6509 for background.
> Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve 
> extended attributes in the raw.* namespace by default whenever the src and 
> target are in /.reserved/raw. To not preserve raw xattrs, don't specify 
> /.reserved/raw in either the src or target. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized

2014-08-11 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated HADOOP-10820:
---

Attachment: HADOOP-10820-3.patch

> Empty entry in libjars results in working directory being recursively 
> localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Alex Holmes
>Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, 
> HADOOP-10820-3.patch, HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the 
> current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar 
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JARs specified in libjars copied,
> # but in addition the contents of the working directory are also recursively 
> copied.
> $ hadoop fs -lsr 
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10955) FSShell's get operation should have the ability to take "start" and "length" argument

2014-08-11 Thread Guo Ruijing (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guo Ruijing updated HADOOP-10955:
-

Description: 
Use case: if an HDFS file is corrupted, some tool can be used to copy out the 
good part of the corrupted file. We could enhance "hdfs -get" to copy out the 
good part.

Existing:
hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>

Proposal:

hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-start] [-length] 
<src> ... <localdst>
  was:
Use case: if an HDFS file is corrupted, some tool can be used to copy out the 
good part of the corrupted file. We could enhance "hdfs -get" to copy out the 
good part.

Existing:
hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>

Proposal:

hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-length] <src> ... <localdst>


Summary: FSShell's get operation should have the ability to take 
"start" and "length" argument  (was: FSShell's get operation should have the 
ability to take a "length" argument)

> FSShell's get operation should have the ability to take "start" and "length" 
> argument
> -
>
> Key: HADOOP-10955
> URL: https://issues.apache.org/jira/browse/HADOOP-10955
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Guo Ruijing
>
> Use case: if an HDFS file is corrupted, some tool can be used to copy out 
> the good part of the corrupted file. We could enhance "hdfs -get" to copy 
> out the good part.
> Existing:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
> Proposal:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-start] [-length] 
> <src> ... <localdst>
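
As a hedged sketch of what -start/-length could do internally with existing FileSystem APIs (paths and range values are illustrative; option parsing is omitted):

{code}
// Sketch only: copy a byte range out of an HDFS file, which is roughly
// what a -start/-length option to -get would do under the hood.
import java.io.FileOutputStream;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RangedGetSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    long start = 0L;
    long length = 1024L * 1024L; // copy the first 1 MB

    try (FSDataInputStream in = fs.open(new Path("/data/corrupt.bin"));
         OutputStream out = new FileOutputStream("good-part.bin")) {
      in.seek(start);                     // jump to the requested offset
      byte[] buf = new byte[8192];
      long remaining = length;
      while (remaining > 0) {
        int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
        if (n < 0) {
          break;                          // EOF before the range ended
        }
        out.write(buf, 0, n);
        remaining -= n;
      }
    }
  }
}
{code}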



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10955) FSShell's get operation should have the ability to take a "length" argument

2014-08-11 Thread Guo Ruijing (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093605#comment-14093605
 ] 

Guo Ruijing commented on HADOOP-10955:
--

Hi, Colin. The "start" argument is a good point; I updated the JIRA title and 
description according to your comments.

> FSShell's get operation should have the ability to take a "length" argument
> ---
>
> Key: HADOOP-10955
> URL: https://issues.apache.org/jira/browse/HADOOP-10955
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Guo Ruijing
>
> Use case: if an HDFS file is corrupted, some tool can be used to copy out 
> the good part of the corrupted file. We could enhance "hdfs -get" to copy 
> out the good part.
> Existing:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
> Proposal:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-length] <src> ... <localdst>
> 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes

2014-08-11 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093594#comment-14093594
 ] 

Sanjay Radia commented on HADOOP-10919:
---

Charles, what is the usage model for distcp of encrypted files:
* distcp path1 path2 - where distcp will insert /.reserved/raw into the 
pathnames if in an encrypted zone.
* OR distcp /.reserved/raw/path1 /.reserved/raw/path2


BTW, is the proposal that both src and dest MUST be encrypted zones, or 
neither? (Because of your "misspoke" comment I am a little confused.)


> Copy command should preserve raw.* namespace extended attributes
> 
>
> Key: HADOOP-10919
> URL: https://issues.apache.org/jira/browse/HADOOP-10919
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch
>
>
> Refer to the doc attached to HDFS-6509 for background.
> Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve 
> extended attributes in the raw.* namespace by default whenever the src and 
> target are in /.reserved/raw. To not preserve raw xattrs, don't specify 
> /.reserved/raw in either the src or target. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093593#comment-14093593
 ] 

Hadoop QA commented on HADOOP-10281:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12661094/HADOOP-10281.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4450//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4450//console

This message is automatically generated.

> Create a scheduler, which assigns schedulables a priority level
> ---
>
> Key: HADOOP-10281
> URL: https://issues.apache.org/jira/browse/HADOOP-10281
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Chris Li
>Assignee: Chris Li
> Attachments: HADOOP-10281.patch, HADOOP-10281.patch, 
> HADOOP-10281.patch, HADOOP-10281.patch
>
>
> The Scheduler decides which sub-queue to assign a given Call. It implements a 
> single method getPriorityLevel(Schedulable call) which returns an integer 
> corresponding to the subqueue the FairCallQueue should place the call in.
> The HistoryRpcScheduler is one such implementation which uses the username of 
> each call and determines what % of calls in recent history were made by this 
> user.
> It is configured with a historyLength (how many calls to track) and a list of 
> integer thresholds which determine the boundaries between priority levels.
> For instance, if the scheduler has a historyLength of 8; and priority 
> thresholds of 4,2,1; and saw calls made by these users in order:
> Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice
> * Another call by Alice would be placed in queue 3, since she has already 
> made >= 4 calls
> * Another call by Bob would be placed in queue 2, since he has >= 2 but less 
> than 4 calls
> * A call by Carlos would be placed in queue 0, since he has no calls in the 
> history
> Also, some versions of this patch include the concept of a 'service user', 
> which is a user that is always scheduled high-priority. Currently this seems 
> redundant and will probably be removed in later patches, since it's not too 
> useful.
> 
> As of now, the current scheduler is the DecayRpcScheduler, which only keeps 
> track of the number of each type of call and decays these counts periodically.
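
For reference, a minimal sketch of the priority logic described above (historyLength of 8, thresholds of 4,2,1); the class and field names are illustrative, not the actual HistoryRpcScheduler code:

{code}
// Illustrative sketch of the history-based priority policy described
// in this issue; not the actual HistoryRpcScheduler implementation.
import java.util.ArrayDeque;
import java.util.Deque;

public class HistorySchedulerSketch {
  private final int historyLength = 8;
  private final int[] thresholds = {4, 2, 1}; // boundaries for queues 3,2,1
  private final Deque<String> history = new ArrayDeque<>();

  public synchronized int getPriorityLevel(String user) {
    // Count this user's calls in the recent-history window.
    int count = 0;
    for (String u : history) {
      if (u.equals(user)) {
        count++;
      }
    }
    // Record the new call, evicting the oldest once the window is full.
    if (history.size() == historyLength) {
      history.removeFirst();
    }
    history.addLast(user);
    // The highest threshold met picks the queue: >= 4 calls -> queue 3,
    // >= 2 -> queue 2, >= 1 -> queue 1, no recent calls -> queue 0.
    for (int i = 0; i < thresholds.length; i++) {
      if (count >= thresholds[i]) {
        return thresholds.length - i;
      }
    }
    return 0;
  }
}
{code}

Feeding it the call order from the example (Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice) yields 3 for Alice's next call, 2 for Bob's, and 0 for Carlos.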



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10836) Replace HttpFS custom proxyuser handling with common implementation

2014-08-11 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated HADOOP-10836:


Attachment: HADOOP-10836.patch

Rebasing the patch on trunk now that HADOOP-10835 is committed.

> Replace HttpFS custom proxyuser handling with common implementation
> ---
>
> Key: HADOOP-10836
> URL: https://issues.apache.org/jira/browse/HADOOP-10836
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: COMBO.patch, HADOOP-10836.patch, HADOOP-10836.patch, 
> HADOOP-10836.patch, HADOOP-10836.patch
>
>
> Use HADOOP-10835 to implement proxyuser logic in HttpFS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes

2014-08-11 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093565#comment-14093565
 ] 

Charles Lamb commented on HADOOP-10919:
---

Sanjay,

I just re-read your comment and I realized that I mis-spoke.

Yes, I think it would make sense. I'll open a jira for that.

Thanks.


> Copy command should preserve raw.* namespace extended attributes
> 
>
> Key: HADOOP-10919
> URL: https://issues.apache.org/jira/browse/HADOOP-10919
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch
>
>
> Refer to the doc attached to HDFS-6509 for background.
> Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve 
> extended attributes in the raw.* namespace by default whenever the src and 
> target are in /.reserved/raw. To not preserve raw xattrs, don't specify 
> /.reserved/raw in either the src or target. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10956) Fix create-release script to include docs in the binary

2014-08-11 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created HADOOP-10956:
-

 Summary: Fix create-release script to include docs in the binary
 Key: HADOOP-10956
 URL: https://issues.apache.org/jira/browse/HADOOP-10956
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 2.5.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Blocker


The create-release script doesn't include docs in the binary tarball. We should 
fix that. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized

2014-08-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093560#comment-14093560
 ] 

Andrew Wang commented on HADOOP-10820:
--

Hi Zhihai, the only comment I have is that it would be nice to validate for an 
empty string before the file list length check, e.g.:

{noformat}
-> % hadoop fs -files ,, 
Exception in thread "main" java.lang.IllegalArgumentException: File list length 
can't be zero
{noformat}

I think we just need to do this check on finalArr instead, at the end. +1 
pending this.

> Empty entry in libjars results in working directory being recursively 
> localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Alex Holmes
>Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, 
> HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the 
> current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar 
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JARs specified in libjars copied,
> # but in addition the contents of the working directory are also recursively 
> copied.
> $ hadoop fs -lsr 
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes

2014-08-11 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093557#comment-14093557
 ] 

Charles Lamb commented on HADOOP-10919:
---

bq. Charles, you list a disadvantage of the .raw scheme where the target of a 
distcp is not an encrypted zone. Would it make sense for distcp to check for 
that and fail the distcp?

Hi Sanjay,

Presently distcp requires src and target to be either both in 
/.reserved/raw or neither in /.reserved/raw.

I'll update the HDFS-6509 document and comments.

Thanks for catching that.


> Copy command should preserve raw.* namespace extended attributes
> 
>
> Key: HADOOP-10919
> URL: https://issues.apache.org/jira/browse/HADOOP-10919
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch
>
>
> Refer to the doc attached to HDFS-6509 for background.
> Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve 
> extended attributes in the raw.* namespace by default whenever the src and 
> target are in /.reserved/raw. To not preserve raw xattrs, don't specify 
> /.reserved/raw in either the src or target. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes

2014-08-11 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093548#comment-14093548
 ] 

Sanjay Radia commented on HADOOP-10919:
---

Charles, you list a disadvantage of the .raw scheme where the target of a 
distcp is not an encrypted zone. Would it make sense for distcp to check for 
that and fail the distcp?

> Copy command should preserve raw.* namespace extended attributes
> 
>
> Key: HADOOP-10919
> URL: https://issues.apache.org/jira/browse/HADOOP-10919
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch
>
>
> Refer to the doc attached to HDFS-6509 for background.
> Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve 
> extended attributes in the raw.* namespace by default whenever the src and 
> target are in /.reserved/raw. To not preserve raw xattrs, don't specify 
> /.reserved/raw in either the src or target. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-8944) Shell command fs -count should include human readable option

2014-08-11 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-8944:
-

Status: Open  (was: Patch Available)

> Shell command fs -count should include human readable option
> 
>
> Key: HADOOP-8944
> URL: https://issues.apache.org/jira/browse/HADOOP-8944
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jonathan Allen
>Assignee: Allen Wittenauer
>Priority: Trivial
>  Labels: newbie
> Attachments: HADOOP-8944-1.patch, HADOOP-8944.patch
>
>
> The shell command fs -count reports sizes in bytes.  The command should accept 
> a -h option to display the sizes in a human-readable format, i.e. K, M, G, 
> etc.
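
A hedged sketch of such formatting, assuming hadoop-common's StringUtils.TraditionalBinaryPrefix.long2String(long, String, int) helper (the size value is illustrative):

{code}
// Sketch: turning a raw byte count into human-readable form, roughly
// what a -count -h option would print. Assumes the long2String helper.
import org.apache.hadoop.util.StringUtils;

public class HumanReadableCountSketch {
  public static void main(String[] args) {
    long contentSize = 1536L * 1024 * 1024; // 1.5 GiB in bytes
    System.out.println(contentSize);        // raw, as fs -count prints today
    // Human-readable, e.g. "1.5 G" (exact spacing/precision may differ).
    System.out.println(
        StringUtils.TraditionalBinaryPrefix.long2String(contentSize, "", 1));
  }
}
{code}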



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10919) Copy command should preserve raw.* namespace extended attributes

2014-08-11 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093550#comment-14093550
 ] 

Sanjay Radia commented on HADOOP-10919:
---

Charles, the work you did for distcp also needs to be applied to har. I 
suspect .raw would work there as well.

> Copy command should preserve raw.* namespace extended attributes
> 
>
> Key: HADOOP-10919
> URL: https://issues.apache.org/jira/browse/HADOOP-10919
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10919.001.patch, HADOOP-10919.002.patch
>
>
> Refer to the doc attached to HDFS-6509 for background.
> Like distcp -p (see MAPREDUCE-6007), the copy command also needs to preserve 
> extended attributes in the raw.* namespace by default whenever the src and 
> target are in /.reserved/raw. To not preserve raw xattrs, don't specify 
> /.reserved/raw in either the src or target. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries

2014-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093539#comment-14093539
 ] 

Hudson commented on HADOOP-10835:
-

FAILURE: Integrated in Hadoop-trunk-Commit #6049 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6049/])
HADOOP-10835. Implement HTTP proxyuser support in HTTP authentication 
client/server libraries. (tucu) (tucu: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617384)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationFilter.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/web/DelegationTokenAuthenticatedURL.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/web/DelegationTokenAuthenticationFilter.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/web/DelegationTokenAuthenticationHandler.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/web/HttpUserGroupInformation.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/token/delegation/web/TestWebDelegationToken.java


> Implement HTTP proxyuser support in HTTP authentication client/server 
> libraries
> ---
>
> Key: HADOOP-10835
> URL: https://issues.apache.org/jira/browse/HADOOP-10835
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 2.6.0
>
> Attachments: HADOOP-10835.patch, HADOOP-10835.patch, 
> HADOOP-10835.patch, HADOOP-10835.patch
>
>
> This is to implement generic handling of proxyuser in the 
> {{DelegationTokenAuthenticatedURL}} and 
> {{DelegationTokenAuthenticationFilter}} classes and to properly wire UGI on 
> the server side.
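
A hedged client-side sketch of what this enables; it assumes the doAs overload of DelegationTokenAuthenticatedURL.openConnection introduced by this change, and the URL and user name are illustrative:

{code}
// Illustrative sketch: an authenticated caller acting on behalf of
// another user over HTTP. Assumes the doAs overload added by this JIRA.
import java.net.HttpURLConnection;
import java.net.URL;
import org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL;

public class ProxyUserClientSketch {
  public static void main(String[] args) throws Exception {
    DelegationTokenAuthenticatedURL authUrl =
        new DelegationTokenAuthenticatedURL();
    DelegationTokenAuthenticatedURL.Token token =
        new DelegationTokenAuthenticatedURL.Token();
    // Ask to act as "joe"; the server side is expected to validate this
    // request against its proxyuser configuration.
    HttpURLConnection conn = authUrl.openConnection(
        new URL("http://httpfs.example.com:14000/webhdfs/v1/?op=GETHOMEDIRECTORY"),
        token, "joe");
    System.out.println(conn.getResponseCode());
  }
}
{code}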



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10942) Globbing optimizations and regression fix

2014-08-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093535#comment-14093535
 ] 

Colin Patrick McCabe commented on HADOOP-10942:
---

It seems like there are a bunch of things going on here:

* The globber will sometimes erroneously return a permission denied exception 
when there is a non-terminal wildcard.  For example, when listing {{/a*/b}}, if 
there is a file named /alpha, the glob will fail.  This bug does *not* occur 
for superusers, which is why the existing tests and casual testing didn't catch 
it.  You mention that this was a "fix" which was regressed between 0.23 and 
branch-2... is there a JIRA number for this already?

* Optimizations: you mention "doing a simple immediate file status if the path 
contains no globs, etc".  The existing code already does this.  It was added in 
HADOOP-9877.  Are we missing a case?  I didn't understand the comment about 
"Additional optimizations to reduce rpcs actually increases them if directories 
contain 1 item."  Which specific optimization(s) are increasing RPCs for you 
and how can we avoid this?

* You added a comment that "FileContext returns a path to the home dir of the 
user that started the jvm instead of the ugi user so we'll just workaround it." 
 I wasn't aware of this issue.  Is there a JIRA number?  This seems like an 
inconsistency that we should note in the test, along with a link to the JIRA 
that should fix it.

* There's a bunch of reorganization here, perhaps almost a rewrite of the main 
part of the globber.

Let's split these into separate JIRAs so that it's easier to review.
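
To make the first bullet concrete, here is a sketch of the guard idea (not the actual HADOOP-10957 patch): an intermediate match that is a plain file should simply be dropped rather than listed, since attempting to descend through it is what trips the traverse check:

{code}
// Sketch of the guard idea for the /a*/b case: a candidate like the
// file /alpha can never contain /alpha/b, so skip it instead of
// letting listStatus trip HDFS's traverse (exec-bit) check.
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GlobGuardSketch {
  static FileStatus[] childrenOf(FileSystem fs, FileStatus candidate)
      throws IOException {
    if (!candidate.isDirectory()) {
      return new FileStatus[0]; // silently drop non-directory matches
    }
    return fs.listStatus(candidate.getPath());
  }
}
{code}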

> Globbing optimizations and regression fix
> -
>
> Key: HADOOP-10942
> URL: https://issues.apache.org/jira/browse/HADOOP-10942
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0, 2.1.0-beta
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HADOOP-10942.patch
>
>
> When globbing was commonized to support both filesystem and filecontext, it 
> regressed a fix that prevents an intermediate glob that matches a file from 
> throwing a confusing permissions exception.  The hdfs traverse check requires 
> the exec bit, which a file does not have.
> Additional optimizations to reduce rpcs actually increase them if 
> directories contain 1 item.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries

2014-08-11 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated HADOOP-10835:


  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

committed to trunk and branch-2.

> Implement HTTP proxyuser support in HTTP authentication client/server 
> libraries
> ---
>
> Key: HADOOP-10835
> URL: https://issues.apache.org/jira/browse/HADOOP-10835
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 2.6.0
>
> Attachments: HADOOP-10835.patch, HADOOP-10835.patch, 
> HADOOP-10835.patch, HADOOP-10835.patch
>
>
> This is to implement generic handling of proxyuser in the 
> {{DelegationTokenAuthenticatedURL}} and 
> {{DelegationTokenAuthenticationFilter}} classes and to properly wire UGI on 
> the server side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries

2014-08-11 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093527#comment-14093527
 ] 

Aaron T. Myers commented on HADOOP-10835:
-

+1, the latest patch looks good to me.

> Implement HTTP proxyuser support in HTTP authentication client/server 
> libraries
> ---
>
> Key: HADOOP-10835
> URL: https://issues.apache.org/jira/browse/HADOOP-10835
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 2.6.0
>
> Attachments: HADOOP-10835.patch, HADOOP-10835.patch, 
> HADOOP-10835.patch, HADOOP-10835.patch
>
>
> This is to implement generic handling of proxyuser in the 
> {{DelegationTokenAuthenticatedURL}} and 
> {{DelegationTokenAuthenticationFilter}} classes and to properly wire UGI on 
> the server side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level

2014-08-11 Thread Chris Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Li updated HADOOP-10281:
--

Attachment: (was: HADOOP-10281-preview.patch)

> Create a scheduler, which assigns schedulables a priority level
> ---
>
> Key: HADOOP-10281
> URL: https://issues.apache.org/jira/browse/HADOOP-10281
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Chris Li
>Assignee: Chris Li
> Attachments: HADOOP-10281.patch, HADOOP-10281.patch, 
> HADOOP-10281.patch, HADOOP-10281.patch
>
>
> The Scheduler decides which sub-queue to assign a given Call. It implements a 
> single method getPriorityLevel(Schedulable call) which returns an integer 
> corresponding to the subqueue the FairCallQueue should place the call in.
> The HistoryRpcScheduler is one such implementation which uses the username of 
> each call and determines what % of calls in recent history were made by this 
> user.
> It is configured with a historyLength (how many calls to track) and a list of 
> integer thresholds which determine the boundaries between priority levels.
> For instance, if the scheduler has a historyLength of 8; and priority 
> thresholds of 4,2,1; and saw calls made by these users in order:
> Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice
> * Another call by Alice would be placed in queue 3, since she has already 
> made >= 4 calls
> * Another call by Bob would be placed in queue 2, since he has >= 2 but less 
> than 4 calls
> * A call by Carlos would be placed in queue 0, since he has no calls in the 
> history
> Also, some versions of this patch include the concept of a 'service user', 
> which is a user that is always scheduled high-priority. Currently this seems 
> redundant and will probably be removed in later patches, since it's not too 
> useful.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level

2014-08-11 Thread Chris Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Li updated HADOOP-10281:
--

Description: 
The Scheduler decides which sub-queue to assign a given Call. It implements a 
single method getPriorityLevel(Schedulable call) which returns an integer 
corresponding to the subqueue the FairCallQueue should place the call in.

The HistoryRpcScheduler is one such implementation which uses the username of 
each call and determines what % of calls in recent history were made by this 
user.

It is configured with a historyLength (how many calls to track) and a list of 
integer thresholds which determine the boundaries between priority levels.

For instance, if the scheduler has a historyLength of 8; and priority 
thresholds of 4,2,1; and saw calls made by these users in order:
Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice

* Another call by Alice would be placed in queue 3, since she has already made 
>= 4 calls
* Another call by Bob would be placed in queue 2, since he has >= 2 but less 
than 4 calls
* A call by Carlos would be placed in queue 0, since he has no calls in the 
history

Also, some versions of this patch include the concept of a 'service user', 
which is a user that is always scheduled high-priority. Currently this seems 
redundant and will probably be removed in later patches, since it's not too 
useful.



As of now, the current scheduler is the DecayRpcScheduler, which only keeps 
track of the number of each type of call and decays these counts periodically.

  was:
The Scheduler decides which sub-queue to assign a given Call. It implements a 
single method getPriorityLevel(Schedulable call) which returns an integer 
corresponding to the subqueue the FairCallQueue should place the call in.

The HistoryRpcScheduler is one such implementation which uses the username of 
each call and determines what % of calls in recent history were made by this 
user.

It is configured with a historyLength (how many calls to track) and a list of 
integer thresholds which determine the boundaries between priority levels.

For instance, if the scheduler has a historyLength of 8; and priority 
thresholds of 4,2,1; and saw calls made by these users in order:
Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice

* Another call by Alice would be placed in queue 3, since she has already made 
>= 4 calls
* Another call by Bob would be placed in queue 2, since he has >= 2 but less 
than 4 calls
* A call by Carlos would be placed in queue 0, since he has no calls in the 
history

Also, some versions of this patch include the concept of a 'service user', 
which is a user that is always scheduled high-priority. Currently this seems 
redundant and will probably be removed in later patches, since it's not too 
useful.


> Create a scheduler, which assigns schedulables a priority level
> ---
>
> Key: HADOOP-10281
> URL: https://issues.apache.org/jira/browse/HADOOP-10281
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Chris Li
>Assignee: Chris Li
> Attachments: HADOOP-10281.patch, HADOOP-10281.patch, 
> HADOOP-10281.patch, HADOOP-10281.patch
>
>
> The Scheduler decides which sub-queue to assign a given Call. It implements a 
> single method getPriorityLevel(Schedulable call) which returns an integer 
> corresponding to the subqueue the FairCallQueue should place the call in.
> The HistoryRpcScheduler is one such implementation which uses the username of 
> each call and determines what % of calls in recent history were made by this 
> user.
> It is configured with a historyLength (how many calls to track) and a list of 
> integer thresholds which determine the boundaries between priority levels.
> For instance, if the scheduler has a historyLength of 8; and priority 
> thresholds of 4,2,1; and saw calls made by these users in order:
> Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice
> * Another call by Alice would be placed in queue 3, since she has already 
> made >= 4 calls
> * Another call by Bob would be placed in queue 2, since he has >= 2 but less 
> than 4 calls
> * A call by Carlos would be placed in queue 0, since he has no calls in the 
> history
> Also, some versions of this patch include the concept of a 'service user', 
> which is a user that is always scheduled high-priority. Currently this seems 
> redundant and will probably be removed in later patches, since it's not too 
> useful.
> 
> As of now, the current scheduler is the DecayRpcScheduler, which only keeps 
> track of the number of each type of call and decays these counts periodically.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level

2014-08-11 Thread Chris Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Li updated HADOOP-10281:
--

Attachment: HADOOP-10281.patch

> Create a scheduler, which assigns schedulables a priority level
> ---
>
> Key: HADOOP-10281
> URL: https://issues.apache.org/jira/browse/HADOOP-10281
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Chris Li
>Assignee: Chris Li
> Attachments: HADOOP-10281.patch, HADOOP-10281.patch, 
> HADOOP-10281.patch, HADOOP-10281.patch
>
>
> The Scheduler decides which sub-queue to assign a given Call. It implements a 
> single method getPriorityLevel(Schedulable call) which returns an integer 
> corresponding to the subqueue the FairCallQueue should place the call in.
> The HistoryRpcScheduler is one such implementation which uses the username of 
> each call and determines what % of calls in recent history were made by this 
> user.
> It is configured with a historyLength (how many calls to track) and a list of 
> integer thresholds which determine the boundaries between priority levels.
> For instance, if the scheduler has a historyLength of 8; and priority 
> thresholds of 4,2,1; and saw calls made by these users in order:
> Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice
> * Another call by Alice would be placed in queue 3, since she has already 
> made >= 4 calls
> * Another call by Bob would be placed in queue 2, since he has >= 2 but less 
> than 4 calls
> * A call by Carlos would be placed in queue 0, since he has no calls in the 
> history
> Also, some versions of this patch include the concept of a 'service user', 
> which is a user that is always scheduled high-priority. Currently this seems 
> redundant and will probably be removed in later patches, since it's not too 
> useful.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level

2014-08-11 Thread Chris Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Li updated HADOOP-10281:
--

Status: Patch Available  (was: Open)

> Create a scheduler, which assigns schedulables a priority level
> ---
>
> Key: HADOOP-10281
> URL: https://issues.apache.org/jira/browse/HADOOP-10281
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Chris Li
>Assignee: Chris Li
> Attachments: HADOOP-10281.patch, HADOOP-10281.patch, 
> HADOOP-10281.patch, HADOOP-10281.patch
>
>
> The Scheduler decides which sub-queue to assign a given Call. It implements a 
> single method getPriorityLevel(Schedulable call) which returns an integer 
> corresponding to the subqueue the FairCallQueue should place the call in.
> The HistoryRpcScheduler is one such implementation which uses the username of 
> each call and determines what % of calls in recent history were made by this 
> user.
> It is configured with a historyLength (how many calls to track) and a list of 
> integer thresholds which determine the boundaries between priority levels.
> For instance, if the scheduler has a historyLength of 8; and priority 
> thresholds of 4,2,1; and saw calls made by these users in order:
> Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice
> * Another call by Alice would be placed in queue 3, since she has already 
> made >= 4 calls
> * Another call by Bob would be placed in queue 2, since he has >= 2 but less 
> than 4 calls
> * A call by Carlos would be placed in queue 0, since he has no calls in the 
> history
> Also, some versions of this patch include the concept of a 'service user', 
> which is a user that is always scheduled high-priority. Currently this seems 
> redundant and will probably be removed in later patches, since it's not too 
> useful.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093516#comment-14093516
 ] 

Hadoop QA commented on HADOOP-10820:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12661083/HADOOP-10820-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4449//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4449//console

This message is automatically generated.

> Empty entry in libjars results in working directory being recursively 
> localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Alex Holmes
>Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, 
> HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the 
> current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar 
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JARs specified in libjars copied,
> # but in addition the contents of the working directory are also recursively 
> copied.
> $ hadoop fs -lsr 
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries

2014-08-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093482#comment-14093482
 ] 

Alejandro Abdelnur commented on HADOOP-10835:
-

failure is unrelated.

> Implement HTTP proxyuser support in HTTP authentication client/server 
> libraries
> ---
>
> Key: HADOOP-10835
> URL: https://issues.apache.org/jira/browse/HADOOP-10835
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 2.6.0
>
> Attachments: HADOOP-10835.patch, HADOOP-10835.patch, 
> HADOOP-10835.patch, HADOOP-10835.patch
>
>
> This is to implement generic handling of proxyuser in the 
> {{DelegationTokenAuthenticatedURL}} and 
> {{DelegationTokenAuthenticationFilter}} classes and to properly wire UGI on 
> the server side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized

2014-08-11 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093480#comment-14093480
 ] 

zhihai xu commented on HADOOP-10820:


I attached a new patch HADOOP-10820-2.patch based on [~andrew.wang]'s comments.
I also found that a filename with a space character is not permitted by the URI 
parser, and my previous concern is already handled in the original code.
So I added a test case "a, ,b" which will trigger a URISyntaxException at line 
394 in GenericOptionsParser.java.
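
For illustration, a minimal sketch of rejecting empty entries without trimming 
(not the actual GenericOptionsParser code; the method name is hypothetical):

{code}
// Hedged sketch: split a -libjars value and reject empty entries.
// Note the deliberate absence of trim(): a filename of only spaces is
// legal on the filesystem, and the URI parser rejects it on its own.
public static void validateEntries(String libjars) {
  for (String entry : libjars.split(",", -1)) { // limit -1 keeps trailing empty tokens
    if (entry.isEmpty()) {
      throw new IllegalArgumentException(
          "File name can't be empty string in: " + libjars);
    }
  }
}
{code}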

> Empty entry in libjars results in working directory being recursively 
> localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Alex Holmes
>Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, 
> HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the 
> current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar 
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JAR's specified in libjars copied,
> # but in addition the contents of the working directory are also recursively 
> copied.
> $ hadoop fs -lsr 
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093479#comment-14093479
 ] 

Hadoop QA commented on HADOOP-10835:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12661074/HADOOP-10835.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-auth hadoop-common-project/hadoop-common:

  org.apache.hadoop.ha.TestZKFailoverControllerStress

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4448//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4448//console

This message is automatically generated.

> Implement HTTP proxyuser support in HTTP authentication client/server 
> libraries
> ---
>
> Key: HADOOP-10835
> URL: https://issues.apache.org/jira/browse/HADOOP-10835
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 2.6.0
>
> Attachments: HADOOP-10835.patch, HADOOP-10835.patch, 
> HADOOP-10835.patch, HADOOP-10835.patch
>
>
> This is to implement generic handling of proxyuser in the 
> {{DelegationTokenAuthenticatedURL}} and 
> {{DelegationTokenAuthenticationFilter}} classes and to wire properly UGI on 
> the server side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized

2014-08-11 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated HADOOP-10820:
---

Attachment: HADOOP-10820-2.patch

> Empty entry in libjars results in working directory being recursively 
> localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Alex Holmes
>Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820-2.patch, 
> HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the 
> current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar 
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JAR's specified in libjars copied,
> # but in addition the contents of the working directory are also recursively 
> copied.
> $ hadoop fs -lsr 
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10955) FSShell's get operation should have the ability to take a "length" argument

2014-08-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093471#comment-14093471
 ] 

Colin Patrick McCabe commented on HADOOP-10955:
---

Also, you might consider adding a start argument as well as a length, while 
you're at it.

> FSShell's get operation should have the ability to take a "length" argument
> ---
>
> Key: HADOOP-10955
> URL: https://issues.apache.org/jira/browse/HADOOP-10955
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Guo Ruijing
>
> Use Case: if an HDFS file is corrupted, some tool can be used to copy out the 
> good part of the corrupted file. We may enhance "hdfs -get" to copy out the 
> good part.
> Existing:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
> Proposal:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-length] <src> ... <localdst>



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10955) FSShell's get operation should have the ability to take a "length" argument

2014-08-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093470#comment-14093470
 ] 

Colin Patrick McCabe commented on HADOOP-10955:
---

Moving from HDFS to HADOOP, since FSShell is part of the common code.

> FSShell's get operation should have the ability to take a "length" argument
> ---
>
> Key: HADOOP-10955
> URL: https://issues.apache.org/jira/browse/HADOOP-10955
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Guo Ruijing
>
> Use Case: if an HDFS file is corrupted, some tool can be used to copy out the 
> good part of the corrupted file. We may enhance "hdfs -get" to copy out the 
> good part.
> Existing:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
> Proposal:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-length] <src> ... <localdst>



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Moved] (HADOOP-10955) FSShell's get operation should have the ability to take a "length" argument

2014-08-11 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe moved HDFS-6818 to HADOOP-10955:
-

Key: HADOOP-10955  (was: HDFS-6818)
Project: Hadoop Common  (was: Hadoop HDFS)

> FSShell's get operation should have the ability to take a "length" argument
> ---
>
> Key: HADOOP-10955
> URL: https://issues.apache.org/jira/browse/HADOOP-10955
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Guo Ruijing
>
> Use Case: if an HDFS file is corrupted, some tool can be used to copy out the 
> good part of the corrupted file. We may enhance "hdfs -get" to copy out the 
> good part.
> Existing:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
> Proposal:
> hadoop fs [generic options] -get [-p] [-ignoreCrc] [-crc] [-length] <src> ... <localdst>



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries

2014-08-11 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated HADOOP-10835:


Attachment: HADOOP-10835.patch

Thanks @atm, new patch addressing your comments.

> Implement HTTP proxyuser support in HTTP authentication client/server 
> libraries
> ---
>
> Key: HADOOP-10835
> URL: https://issues.apache.org/jira/browse/HADOOP-10835
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 2.6.0
>
> Attachments: HADOOP-10835.patch, HADOOP-10835.patch, 
> HADOOP-10835.patch, HADOOP-10835.patch
>
>
> This is to implement generic handling of proxyuser in the 
> {{DelegationTokenAuthenticatedURL}} and 
> {{DelegationTokenAuthenticationFilter}} classes and to wire properly UGI on 
> the server side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10940) RPC client does no bounds checking of responses

2014-08-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093330#comment-14093330
 ] 

Colin Patrick McCabe commented on HADOOP-10940:
---

nit: maxDataLength should be final, since it can't change

{code}
  @InterfaceAudience.Private  // ONLY exposed for SaslRpcClient
  public static class IpcStreams implements Closeable {
{code}
Is this comment still valid?  It looks like even non-SASL clients are now using 
{{IpcStreams}}.

{code}
  // don't flush!  we need to avoid broken pipes if server closes or
  // rejects the connection.  the perils of multiple sends before a read
  // insecure: header+context+call, flush
  // secure  : header+negotiate, flush, (sasl), context+call, flush
{code}
Hmm.  I wonder if we could rephrase this to be clearer.  Maybe something like 
"At this point, the data is buffered by the output stream.  We do not want to 
flush yet, since that would generate unnecessary context switches.  Another 
advantage of deferring the TCP write operation is that we do not get a "broken 
pipe" exception if the server closes or rejects the connection at this point."

{code}
  // again, don't flush!  see writeConnectionHeader
{code}
Do we need this comment here?  There wasn't a flush here earlier.

{code}
public void sendRequest(RpcRequestHeaderProto header, Message request,
boolean flush) throws IOException {
  try {
header.writeDelimitedTo(dob);
request.writeDelimitedTo(dob);
sendRequest(dob, flush);
  } finally {
dob.reset();
  }
}

public void sendRequest(DataOutputBuffer buffer, boolean flush)
throws IOException {
  out.writeInt(buffer.size()); // total Length
  buffer.writeTo(out); // request header + payload
  if (flush) {
out.flush();
  }
}
{code}

Rather than having a boolean argument, why not just have the callers who want 
to flush call {{ioStreams.out.flush()}}?  There seems to be no advantage to 
folding it into {{sendRequest}}, and it means that we need a comment to explain 
the value of the boolean everywhere.
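
A minimal sketch of that alternative, assuming {{out}} stays accessible on 
{{IpcStreams}} (not the actual patch):

{code}
public void sendRequest(DataOutputBuffer buffer) throws IOException {
  out.writeInt(buffer.size()); // total length
  buffer.writeTo(out);         // request header + payload
}

// only the callers that actually need it flush explicitly:
ipcStreams.sendRequest(dob);
ipcStreams.out.flush();
{code}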

{code}
  public boolean useWrap() {
{code}
add VisibleForTesting?
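i.e., something like the following, assuming Guava's annotation, which Hadoop 
already uses elsewhere:
{code}
@VisibleForTesting
public boolean useWrap() {
{code}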

> RPC client does no bounds checking of responses
> ---
>
> Key: HADOOP-10940
> URL: https://issues.apache.org/jira/browse/HADOOP-10940
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HADOOP-10940.patch, HADOOP-10940.patch
>
>
> The rpc client does no bounds checking of server responses.  In the case of 
> communicating with an older and incompatible RPC, this may lead to OOM issues 
> and leaking of resources.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level

2014-08-11 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093331#comment-14093331
 ] 

Arpit Agarwal commented on HADOOP-10281:


It is probably not worth running the test again just for that.

> Create a scheduler, which assigns schedulables a priority level
> ---
>
> Key: HADOOP-10281
> URL: https://issues.apache.org/jira/browse/HADOOP-10281
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Chris Li
>Assignee: Chris Li
> Attachments: HADOOP-10281-preview.patch, HADOOP-10281.patch, 
> HADOOP-10281.patch, HADOOP-10281.patch
>
>
> The Scheduler decides which sub-queue to assign a given Call. It implements a 
> single method getPriorityLevel(Schedulable call) which returns an integer 
> corresponding to the subqueue the FairCallQueue should place the call in.
> The HistoryRpcScheduler is one such implementation which uses the username of 
> each call and determines what % of calls in recent history were made by this 
> user.
> It is configured with a historyLength (how many calls to track) and a list of 
> integer thresholds which determine the boundaries between priority levels.
> For instance, if the scheduler has a historyLength of 8; and priority 
> thresholds of 4,2,1; and saw calls made by these users in order:
> Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice
> * Another call by Alice would be placed in queue 3, since she has already 
> made >= 4 calls
> * Another call by Bob would be placed in queue 2, since he has >= 2 but less 
> than 4 calls
> * A call by Carlos would be placed in queue 0, since he has no calls in the 
> history
> Also, some versions of this patch include the concept of a 'service user', 
> which is a user that is always scheduled high-priority. Currently this seems 
> redundant and will probably be removed in later patches, since it's not too 
> useful.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10835) Implement HTTP proxyuser support in HTTP authentication client/server libraries

2014-08-11 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093321#comment-14093321
 ] 

Aaron T. Myers commented on HADOOP-10835:
-

Patch looks pretty good to me, Tucu. A few small comments/nits. +1 once these 
are addressed.

Comments:

# Is it definitely correct that in 
{{DelegationTokenAuthenticationFilter#getProxyuserConfiguration}} we create a 
{{Configuration}} object without loading the defaults? That surprised me a bit, 
but maybe it's reasonable. Perhaps add a comment explaining why we're doing 
that here? (A small illustration of the distinction follows after the nits.)
# Recommend using some constants for the many repeated strings in the tests, 
e.g. "ok-user" is repeated many times.

Nits:

# This change seems unnecessary and unhelpful:
{code}
-   * Sets an external DelegationTokenSecretManager instance to
+   * Sets an external   DelegationTokenSecretManager instance to
{code}
# Should have a comma here, instead of a period:
{code}
+   * Returns the remote {@link UserGroupInformation} in context for the current
+   * HTTP request. taking into account proxy user requests.
{code}
# One too many "using":
{code}
+  // requests using using delegation token as auth do not honor doAs
{code}
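
Regarding comment #1 above, a small illustration of the distinction in 
question (this is standard {{Configuration}} behavior, not part of the patch):

{code}
Configuration conf = new Configuration(false); // does NOT load core-default.xml / core-site.xml
Configuration withDefaults = new Configuration(); // loads the default resources
{code}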

> Implement HTTP proxyuser support in HTTP authentication client/server 
> libraries
> ---
>
> Key: HADOOP-10835
> URL: https://issues.apache.org/jira/browse/HADOOP-10835
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 2.6.0
>
> Attachments: HADOOP-10835.patch, HADOOP-10835.patch, 
> HADOOP-10835.patch
>
>
> This is to implement generic handling of proxyuser in the 
> {{DelegationTokenAuthenticatedURL}} and 
> {{DelegationTokenAuthenticationFilter}} classes and to wire properly UGI on 
> the server side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10946) Fix a bunch of typos in log messages

2014-08-11 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093315#comment-14093315
 ] 

Ray Chiang commented on HADOOP-10946:
-

Good to know.  Thanks.

I just thought it odd that on Jenkins the only two non-PASS/non-FAILs in the 
recent job list were the two runs I listed above.

> Fix a bunch of typos in log messages
> 
>
> Key: HADOOP-10946
> URL: https://issues.apache.org/jira/browse/HADOOP-10946
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.1
>Reporter: Ray Chiang
>Priority: Trivial
>  Labels: newbie
> Attachments: HADOOP10946-01.patch, HADOOP10946-02.patch
>
>
> There are a bunch of typos in various log messages.  These need cleaning up.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10946) Fix a bunch of typos in log messages

2014-08-11 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093307#comment-14093307
 ] 

Allen Wittenauer commented on HADOOP-10946:
---

FYI,

HDFS-6694 and HDFS-4663. So those TestPipeline failures are almost certainly 
not real failures.

> Fix a bunch of typos in log messages
> 
>
> Key: HADOOP-10946
> URL: https://issues.apache.org/jira/browse/HADOOP-10946
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.1
>Reporter: Ray Chiang
>Priority: Trivial
>  Labels: newbie
> Attachments: HADOOP10946-01.patch, HADOOP10946-02.patch
>
>
> There are a bunch of typos in various log messages.  These need cleaning up.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level

2014-08-11 Thread Chris Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093300#comment-14093300
 ] 

Chris Li commented on HADOOP-10281:
---

I didn't record it, but I can if you're interested. I suspect it'll be slightly 
worse than the minority user's latency in the LinkedBlockingQueue (since the 
resources have to come from somewhere).

> Create a scheduler, which assigns schedulables a priority level
> ---
>
> Key: HADOOP-10281
> URL: https://issues.apache.org/jira/browse/HADOOP-10281
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Chris Li
>Assignee: Chris Li
> Attachments: HADOOP-10281-preview.patch, HADOOP-10281.patch, 
> HADOOP-10281.patch, HADOOP-10281.patch
>
>
> The Scheduler decides which sub-queue to assign a given Call. It implements a 
> single method getPriorityLevel(Schedulable call) which returns an integer 
> corresponding to the subqueue the FairCallQueue should place the call in.
> The HistoryRpcScheduler is one such implementation which uses the username of 
> each call and determines what % of calls in recent history were made by this 
> user.
> It is configured with a historyLength (how many calls to track) and a list of 
> integer thresholds which determine the boundaries between priority levels.
> For instance, if the scheduler has a historyLength of 8; and priority 
> thresholds of 4,2,1; and saw calls made by these users in order:
> Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice
> * Another call by Alice would be placed in queue 3, since she has already 
> made >= 4 calls
> * Another call by Bob would be placed in queue 2, since he has >= 2 but less 
> than 4 calls
> * A call by Carlos would be placed in queue 0, since he has no calls in the 
> history
> Also, some versions of this patch include the concept of a 'service user', 
> which is a user that is always scheduled high-priority. Currently this seems 
> redundant and will probably be removed in later patches, since it's not too 
> useful.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level

2014-08-11 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093288#comment-14093288
 ] 

Arpit Agarwal commented on HADOOP-10281:


Thanks Chris. Just curious whether you measured the impact on the majority 
user's latency.

> Create a scheduler, which assigns schedulables a priority level
> ---
>
> Key: HADOOP-10281
> URL: https://issues.apache.org/jira/browse/HADOOP-10281
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Chris Li
>Assignee: Chris Li
> Attachments: HADOOP-10281-preview.patch, HADOOP-10281.patch, 
> HADOOP-10281.patch, HADOOP-10281.patch
>
>
> The Scheduler decides which sub-queue to assign a given Call. It implements a 
> single method getPriorityLevel(Schedulable call) which returns an integer 
> corresponding to the subqueue the FairCallQueue should place the call in.
> The HistoryRpcScheduler is one such implementation which uses the username of 
> each call and determines what % of calls in recent history were made by this 
> user.
> It is configured with a historyLength (how many calls to track) and a list of 
> integer thresholds which determine the boundaries between priority levels.
> For instance, if the scheduler has a historyLength of 8; and priority 
> thresholds of 4,2,1; and saw calls made by these users in order:
> Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice
> * Another call by Alice would be placed in queue 3, since she has already 
> made >= 4 calls
> * Another call by Bob would be placed in queue 2, since he has >= 2 but less 
> than 4 calls
> * A call by Carlos would be placed in queue 0, since he has no calls in the 
> history
> Also, some versions of this patch include the concept of a 'service user', 
> which is a user that is always scheduled high-priority. Currently this seems 
> redundant and will probably be removed in later patches, since it's not too 
> useful.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10942) Globbing optimizations and regression fix

2014-08-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093278#comment-14093278
 ] 

Colin Patrick McCabe commented on HADOOP-10942:
---

bq. Colin Patrick McCabe, could you take a look since you made most of the 
changes after my 0.23 overhaul?

Will take a look.  I was on vacation last week, so that's why I haven't 
responded until now.

> Globbing optimizations and regression fix
> -
>
> Key: HADOOP-10942
> URL: https://issues.apache.org/jira/browse/HADOOP-10942
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0, 2.1.0-beta
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HADOOP-10942.patch
>
>
> When globbing was commonized to support both filesystem and filecontext, it 
> regressed a fix that prevents an intermediate glob that matches a file from 
> throwing a confusing permissions exception.  The hdfs traverse check requires 
> the exec bit which a file does not have.
> Additional optimizations to reduce rpcs actually increases them if 
> directories contain 1 item.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9902) Shell script rewrite

2014-08-11 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093274#comment-14093274
 ] 

Aaron T. Myers commented on HADOOP-9902:


I agree with [~aw] on the trunk/branch-2 question. We quite clearly can't 
commit this patch to branch-2 because of the compat issues, at least not 
without some fairly substantial scaling back of this change.

Based on some recent discussions on some of the lists, it seems like the 
motivation for a release off of trunk (i.e. 3.x) is building. This change being 
only on trunk would add to the motivation to make a release from that branch.

> Shell script rewrite
> 
>
> Key: HADOOP-9902
> URL: https://issues.apache.org/jira/browse/HADOOP-9902
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>  Labels: releasenotes
> Attachments: HADOOP-9902-10.patch, HADOOP-9902-11.patch, 
> HADOOP-9902-12.patch, HADOOP-9902-13-branch-2.patch, HADOOP-9902-13.patch, 
> HADOOP-9902-14.patch, HADOOP-9902-2.patch, HADOOP-9902-3.patch, 
> HADOOP-9902-4.patch, HADOOP-9902-5.patch, HADOOP-9902-6.patch, 
> HADOOP-9902-7.patch, HADOOP-9902-8.patch, HADOOP-9902-9.patch, 
> HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt
>
>
> Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9902) Shell script rewrite

2014-08-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093253#comment-14093253
 ] 

Alejandro Abdelnur commented on HADOOP-9902:


bq. If trunk is getting 'stale', then that sounds like an issue for the PMC to 
take up.

I'm being proactive on this one. I'm trying to avoid getting into that 
situation. I'd love to get this in, just in a way that it is exercised and 
refined ASAP. Otherwise, a year or more from now we'll be battling with it.

What are the key issues to be addressed for getting this in branch-2 and how 
can we take care of it?

> Shell script rewrite
> 
>
> Key: HADOOP-9902
> URL: https://issues.apache.org/jira/browse/HADOOP-9902
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>  Labels: releasenotes
> Attachments: HADOOP-9902-10.patch, HADOOP-9902-11.patch, 
> HADOOP-9902-12.patch, HADOOP-9902-13-branch-2.patch, HADOOP-9902-13.patch, 
> HADOOP-9902-14.patch, HADOOP-9902-2.patch, HADOOP-9902-3.patch, 
> HADOOP-9902-4.patch, HADOOP-9902-5.patch, HADOOP-9902-6.patch, 
> HADOOP-9902-7.patch, HADOOP-9902-8.patch, HADOOP-9902-9.patch, 
> HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt
>
>
> Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9902) Shell script rewrite

2014-08-11 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093244#comment-14093244
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

bq. I was under the impression we were targeting this for branch-2? is not the 
case?

It hasn't been my intention to commit this to branch-2 for a very long time.  
Others have expressed interest in a backport, though.  Of course, while this 
patch definitely moves the needle the most, there are still lots of smaller 
projects that need to be finished (see the blocked-by list) for a comprehensive 
fix.

bq.  we are at risk of getting things stale in trunk as people add changes in 
branch-2 only. 

There are already changes in trunk that aren't in branch-2.  This would just be 
another one (albeit probably the biggest one).  If trunk is getting 'stale', 
then that sounds like an issue for the PMC to take up.  It doesn't really have 
much bearing on this patch, IMO.

> Shell script rewrite
> 
>
> Key: HADOOP-9902
> URL: https://issues.apache.org/jira/browse/HADOOP-9902
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>  Labels: releasenotes
> Attachments: HADOOP-9902-10.patch, HADOOP-9902-11.patch, 
> HADOOP-9902-12.patch, HADOOP-9902-13-branch-2.patch, HADOOP-9902-13.patch, 
> HADOOP-9902-14.patch, HADOOP-9902-2.patch, HADOOP-9902-3.patch, 
> HADOOP-9902-4.patch, HADOOP-9902-5.patch, HADOOP-9902-6.patch, 
> HADOOP-9902-7.patch, HADOOP-9902-8.patch, HADOOP-9902-9.patch, 
> HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt
>
>
> Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10281) Create a scheduler, which assigns schedulables a priority level

2014-08-11 Thread Chris Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093229#comment-14093229
 ] 

Chris Li commented on HADOOP-10281:
---

Hi [~arpitagarwal], that's correct. It's not very scientific, but it's a sanity 
check to make sure that the scheduler performs under various loads. The 
workloads are mapreduce jobs that coordinate to perform a DDoS attack on the 
namenode. Each job runs under 10 users, each job maps to 20 nodes, and spams 
the namenode using a varying number of threads.

Rest: no load
Equal: 100 threads each
Balanced: 10, 20, 30, ..., 80, 90, 100 threads respectively
Majority: 100 threads for one user, then 1-2 for the rest

I think this is ready. I will post a patch shortly for CI.


> Create a scheduler, which assigns schedulables a priority level
> ---
>
> Key: HADOOP-10281
> URL: https://issues.apache.org/jira/browse/HADOOP-10281
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Chris Li
>Assignee: Chris Li
> Attachments: HADOOP-10281-preview.patch, HADOOP-10281.patch, 
> HADOOP-10281.patch, HADOOP-10281.patch
>
>
> The Scheduler decides which sub-queue to assign a given Call. It implements a 
> single method getPriorityLevel(Schedulable call) which returns an integer 
> corresponding to the subqueue the FairCallQueue should place the call in.
> The HistoryRpcScheduler is one such implementation which uses the username of 
> each call and determines what % of calls in recent history were made by this 
> user.
> It is configured with a historyLength (how many calls to track) and a list of 
> integer thresholds which determine the boundaries between priority levels.
> For instance, if the scheduler has a historyLength of 8; and priority 
> thresholds of 4,2,1; and saw calls made by these users in order:
> Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice
> * Another call by Alice would be placed in queue 3, since she has already 
> made >= 4 calls
> * Another call by Bob would be placed in queue 2, since he has >= 2 but less 
> than 4 calls
> * A call by Carlos would be placed in queue 0, since he has no calls in the 
> history
> Also, some versions of this patch include the concept of a 'service user', 
> which is a user that is always scheduled high-priority. Currently this seems 
> redundant and will probably be removed in later patches, since it's not too 
> useful.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized

2014-08-11 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093158#comment-14093158
 ] 

zhihai xu commented on HADOOP-10820:


Hi [~alex.holmes], it looks like you don't have time to work on this issue. If 
you don't mind, I will create a patch based on your patch to address the 
comment from [~andrew.wang]. Thanks, zhihai

> Empty entry in libjars results in working directory being recursively 
> localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Alex Holmes
>Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the 
> current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar 
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JAR's specified in libjars copied,
> # but in addition the contents of the working directory are also recursively 
> copied.
> $ hadoop fs -lsr 
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10820) Empty entry in libjars results in working directory being recursively localized

2014-08-11 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093139#comment-14093139
 ] 

zhihai xu commented on HADOOP-10820:


A filename consisting of only space characters is valid, so my suggestion is 
not good. We shouldn't trim spaces when checking for an empty string.
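
A minimal demonstration of the URI behavior in question:

{code}
import java.net.URI;
import java.net.URISyntaxException;

public class SpaceInUri {
  public static void main(String[] args) {
    try {
      new URI(" "); // a bare-space filename is legal on the filesystem...
    } catch (URISyntaxException e) {
      // ...but the URI parser rejects it ("Illegal character in path")
      System.out.println(e.getMessage());
    }
  }
}
{code}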

> Empty entry in libjars results in working directory being recursively 
> localized
> ---
>
> Key: HADOOP-10820
> URL: https://issues.apache.org/jira/browse/HADOOP-10820
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Alex Holmes
>Priority: Minor
> Attachments: HADOOP-10820-1.patch, HADOOP-10820.patch
>
>
> An empty token (e.g. "a.jar,,b.jar") in the -libjars option causes the 
> current working directory to be recursively localized.
> Here's an example of this in action (using Hadoop 2.2.0):
> {code}
> # create a temp directory and touch three JAR files
> mkdir -p tmp/path && cd tmp && touch a.jar b.jar c.jar path/d.jar
> # Run an example job only specifying two of the JARs.
> # Include an empty entry in libjars.
> hadoop jar 
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar 
> pi -libjars a.jar,,c.jar 2 10
> # As the job is running examine the localized directory in HDFS.
> # Notice that not only are the two JAR's specified in libjars copied,
> # but in addition the contents of the working directory are also recursively 
> copied.
> $ hadoop fs -lsr 
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/a.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/b.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/c.jar
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path
> /tmp/hadoop-yarn/staging/aholmes/.staging/job_1404752711144_0018/libjars/tmp/path/d.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9902) Shell script rewrite

2014-08-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093115#comment-14093115
 ] 

Alejandro Abdelnur commented on HADOOP-9902:


[~aw], I was under the impression we were targeting this for branch-2? Is that 
not the case? If we don't do that, given that we don't have imminent plans to 
create a branch-3 out of trunk, we are at risk of things getting stale in trunk 
as people add changes in branch-2 only. 

> Shell script rewrite
> 
>
> Key: HADOOP-9902
> URL: https://issues.apache.org/jira/browse/HADOOP-9902
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>  Labels: releasenotes
> Attachments: HADOOP-9902-10.patch, HADOOP-9902-11.patch, 
> HADOOP-9902-12.patch, HADOOP-9902-13-branch-2.patch, HADOOP-9902-13.patch, 
> HADOOP-9902-14.patch, HADOOP-9902-2.patch, HADOOP-9902-3.patch, 
> HADOOP-9902-4.patch, HADOOP-9902-5.patch, HADOOP-9902-6.patch, 
> HADOOP-9902-7.patch, HADOOP-9902-8.patch, HADOOP-9902-9.patch, 
> HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt
>
>
> Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9902) Shell script rewrite

2014-08-11 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093075#comment-14093075
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Test failures are obviously unrelated.

Patch -14 deals with the issues that [~rvs] discovered.

> Shell script rewrite
> 
>
> Key: HADOOP-9902
> URL: https://issues.apache.org/jira/browse/HADOOP-9902
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>  Labels: releasenotes
> Attachments: HADOOP-9902-10.patch, HADOOP-9902-11.patch, 
> HADOOP-9902-12.patch, HADOOP-9902-13-branch-2.patch, HADOOP-9902-13.patch, 
> HADOOP-9902-14.patch, HADOOP-9902-2.patch, HADOOP-9902-3.patch, 
> HADOOP-9902-4.patch, HADOOP-9902-5.patch, HADOOP-9902-6.patch, 
> HADOOP-9902-7.patch, HADOOP-9902-8.patch, HADOOP-9902-9.patch, 
> HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt
>
>
> Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10402) Configuration.getValByRegex does not substitute for variables

2014-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092844#comment-14092844
 ] 

Hudson commented on HADOOP-10402:
-

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1860 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1860/])
HADOOP-10402. Configuration.getValByRegex does not substitute for variables. 
(Robert Kanter via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617166)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java


> Configuration.getValByRegex does not substitute for variables
> -
>
> Key: HADOOP-10402
> URL: https://issues.apache.org/jira/browse/HADOOP-10402
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: 2.6.0
>
> Attachments: HADOOP-10402.patch
>
>
> When using Configuration.getValByRegex(...), variables are not resolved.  
> For example:
> {code:xml}
> <property>
>   <name>bar</name>
>   <value>woot</value>
> </property>
> <property>
>   <name>foo3</name>
>   <value>${bar}</value>
> </property>
> {code}
> If you then try to do something like {{Configuration.getValByRegex(foo.*)}}, 
> it will return a Map containing "foo3=$\{bar}" instead of "foo3=woot"
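
For context, a small usage sketch of the API in question (the resource name is 
hypothetical):

{code}
Configuration conf = new Configuration();
conf.addResource("my-site.xml"); // contains the two properties above
Map<String, String> vals = conf.getValByRegex("foo.*");
// before this fix: {foo3=${bar}} -- after this fix: {foo3=woot}
{code}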



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10954) Adding site documents of hadoop-tools

2014-08-11 Thread Masatake Iwasaki (JIRA)
Masatake Iwasaki created HADOOP-10954:
-

 Summary: Adding site documents of hadoop-tools
 Key: HADOOP-10954
 URL: https://issues.apache.org/jira/browse/HADOOP-10954
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.4.1
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor


There are no pages for hadoop-tools in the site documents of branch-2 or later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10954) Adding site documents of hadoop-tools

2014-08-11 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092774#comment-14092774
 ] 

Masatake Iwasaki commented on HADOOP-10954:
---

In the site documents of branch-1, there are pages such as
http://hadoop.apache.org/docs/current1/hadoop_archives.html
or
http://hadoop.apache.org/docs/current1/gridmix.html .
Those could be migrated from Forrest to maven-site format.


> Adding site documents of hadoop-tools
> -
>
> Key: HADOOP-10954
> URL: https://issues.apache.org/jira/browse/HADOOP-10954
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.4.1
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
>
> There are no pages for hadoop-tools in the site documents of branch-2 or 
> later.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10402) Configuration.getValByRegex does not substitute for variables

2014-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092701#comment-14092701
 ] 

Hudson commented on HADOOP-10402:
-

ABORTED: Integrated in Hadoop-Hdfs-trunk #1834 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1834/])
HADOOP-10402. Configuration.getValByRegex does not substitute for variables. 
(Robert Kanter via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617166)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java


> Configuration.getValByRegex does not substitute for variables
> -
>
> Key: HADOOP-10402
> URL: https://issues.apache.org/jira/browse/HADOOP-10402
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: 2.6.0
>
> Attachments: HADOOP-10402.patch
>
>
> When using Configuration.getValByRegex(...), variables are not resolved.  
> For example:
> {code:xml}
> <property>
>   <name>bar</name>
>   <value>woot</value>
> </property>
> <property>
>   <name>foo3</name>
>   <value>${bar}</value>
> </property>
> {code}
> If you then try to do something like {{Configuration.getValByRegex(foo.*)}}, 
> it will return a Map containing "foo3=$\{bar}" instead of "foo3=woot"



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10953) a minor concurrent bug inside NetworkTopology

2014-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092651#comment-14092651
 ] 

Hadoop QA commented on HADOOP-10953:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12660965/HADOOP-10953.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4447//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4447//console

This message is automatically generated.

> a minor concurrent bug inside NetworkTopology
> -
>
> Key: HADOOP-10953
> URL: https://issues.apache.org/jira/browse/HADOOP-10953
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: net
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
>Priority: Minor
> Attachments: HADOOP-10953.txt
>
>
> Found this issue while reading the related code. The 
> NetworkTopology.toString() method has no thread safety guarantee of its own; 
> it is called by add/remove, and inside add/remove most of the this.toString() 
> calls are protected by the rwlock, except for a couple of error handling 
> paths. One possible fix is to move those into the lock as well; since these 
> are not heavy operations, no obvious degradation should be observed, per my 
> current knowledge.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10953) a minor concurrent bug inside NetworkTopology

2014-08-11 Thread Liang Xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Xie updated HADOOP-10953:
---

Status: Patch Available  (was: Open)

> a minor concurrent bug inside NetworkTopology
> -
>
> Key: HADOOP-10953
> URL: https://issues.apache.org/jira/browse/HADOOP-10953
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: net
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
>Priority: Minor
> Attachments: HADOOP-10953.txt
>
>
> Found this issue while reading the related code. The 
> NetworkTopology.toString() method has no thread safety guarantee of its own; 
> it is called by add/remove, and inside add/remove most of the this.toString() 
> calls are protected by the rwlock, except for a couple of error handling 
> paths. One possible fix is to move those into the lock as well; since these 
> are not heavy operations, no obvious degradation should be observed, per my 
> current knowledge.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10953) a minor concurrent bug inside NetworkTopology

2014-08-11 Thread Liang Xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Xie updated HADOOP-10953:
---

Attachment: HADOOP-10953.txt

> a minor concurrent bug inside NetworkTopology
> -
>
> Key: HADOOP-10953
> URL: https://issues.apache.org/jira/browse/HADOOP-10953
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: net
>Affects Versions: 3.0.0
>Reporter: Liang Xie
>Assignee: Liang Xie
>Priority: Minor
> Attachments: HADOOP-10953.txt
>
>
> Found this issue while reading the related code. The 
> NetworkTopology.toString() method has no thread safety guarantee of its own; 
> it is called by add/remove, and inside add/remove most of the this.toString() 
> calls are protected by the rwlock, except for a couple of error handling 
> paths. One possible fix is to move those into the lock as well; since these 
> are not heavy operations, no obvious degradation should be observed, per my 
> current knowledge.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10953) a minor concurrent bug inside NetworkTopology

2014-08-11 Thread Liang Xie (JIRA)
Liang Xie created HADOOP-10953:
--

 Summary: a minor concurrent bug inside NetworkTopology
 Key: HADOOP-10953
 URL: https://issues.apache.org/jira/browse/HADOOP-10953
 Project: Hadoop Common
  Issue Type: Bug
  Components: net
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie
Priority: Minor


Found this issue while reading the related code. The NetworkTopology.toString() 
method has no thread safety guarantee of its own; it is called by add/remove, 
and inside add/remove most of the this.toString() calls are protected by the 
rwlock, except for a couple of error handling paths. One possible fix is to 
move those into the lock as well; since these are not heavy operations, no 
obvious degradation should be observed, per my current knowledge.
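
A sketch of the direction described above (lock and logger names assumed, not 
the actual patch):

{code}
// move the error-handling toString() calls under the read lock as well
netlock.readLock().lock();
try {
  LOG.error("Unexpected state; topology is: " + this.toString());
} finally {
  netlock.readLock().unlock();
}
{code}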



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10402) Configuration.getValByRegex does not substitute for variables

2014-08-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092603#comment-14092603
 ] 

Hudson commented on HADOOP-10402:
-

FAILURE: Integrated in Hadoop-Yarn-trunk #641 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/641/])
HADOOP-10402. Configuration.getValByRegex does not substitute for variables. 
(Robert Kanter via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1617166)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java


> Configuration.getValByRegex does not substitute for variables
> -
>
> Key: HADOOP-10402
> URL: https://issues.apache.org/jira/browse/HADOOP-10402
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: 2.6.0
>
> Attachments: HADOOP-10402.patch
>
>
> When using Configuration.getValByRegex(...), variables are not resolved.  
> For example:
> {code:xml}
> <property>
>   <name>bar</name>
>   <value>woot</value>
> </property>
> <property>
>   <name>foo3</name>
>   <value>${bar}</value>
> </property>
> {code}
> If you then try to do something like {{Configuration.getValByRegex(foo.*)}}, 
> it will return a Map containing "foo3=$\{bar}" instead of "foo3=woot"



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-8989) hadoop dfs -find feature

2014-08-11 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092562#comment-14092562
 ] 

Akira AJISAKA commented on HADOOP-8989:
---

Thanks [~jonallen] for the update. +1 (non-binding).
Sorry for the late response.

> hadoop dfs -find feature
> 
>
> Key: HADOOP-8989
> URL: https://issues.apache.org/jira/browse/HADOOP-8989
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: Marco Nicosia
>Assignee: Jonathan Allen
> Attachments: HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
> HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
> HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
> HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
> HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
> HADOOP-8989.patch, HADOOP-8989.patch
>
>
> Both sysadmins and users make frequent use of the unix 'find' command, but 
> Hadoop has no correlate. Without this, users are writing scripts which make 
> heavy use of hadoop dfs -lsr, and implementing find one-offs. I think hdfs 
> -lsr is somewhat taxing on the NameNode, and a really slow experience on the 
> client side. Possibly an in-NameNode find operation would be only a bit more 
> taxing on the NameNode, but significantly faster from the client's point of 
> view?
> The minimum set of options I can think of which would make a Hadoop find 
> command generally useful is (in priority order):
> * -type (file or directory, for now)
> * -atime/-ctime/-mtime (... and -creationtime?) (both + and - arguments)
> * -print0 (for piping to xargs -0)
> * -depth
> * -owner/-group (and -nouser/-nogroup)
> * -name (allowing for shell pattern, or even regex?)
> * -perm
> * -size
> One possible special case, but could possibly be really cool if it ran from 
> within the NameNode:
> * -delete
> The "hadoop dfs -lsr | hadoop dfs -rm" cycle is really, really slow.
> Lower priority, some people do use operators, mostly to execute -or searches 
> such as:
> * find / \(-nouser -or -nogroup\)
> Finally, I thought I'd include a link to the [Posix spec for 
> find|http://www.opengroup.org/onlinepubs/009695399/utilities/find.html]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10885) Fix dead links to the javadocs of o.a.h.security.authorize

2014-08-11 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092526#comment-14092526
 ] 

Ray Chiang commented on HADOOP-10885:
-

Just observing.  Peeking through the org.apache.hadoop.security.authorize files:

  AccessControlList.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
  AuthorizationException.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
  DefaultImpersonationProvider.java:@InterfaceAudience.Public
  ImpersonationProvider.java:@InterfaceAudience.Public
  PolicyProvider.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
  ProxyUsers.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce", "HBase", "Hive"})
  RefreshAuthorizationPolicyProtocol.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
  Service.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
  ServiceAuthorizationManager.java:@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
  package-info.java:@InterfaceAudience.LimitedPrivate({"HBase", "HDFS", "MapReduce"})

It also looks like package-info.java will override the "Hive" access part of 
ProxyUsers.java.  Is the right fix:

a) Update DefaultImpersonationProvider/ImpersonationProvider to be 
LimitedPrivate and fix the fields for ProxyUsers.java.  If so, what setting(s)?

b) Fix package-info.java to be @InterfaceAudience.Public.

I'd guess a), but it would be good to check with someone who actually has the 
right answer.
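
For reference, option b) would look roughly like this in package-info.java:

{code}
@InterfaceAudience.Public
package org.apache.hadoop.security.authorize;

import org.apache.hadoop.classification.InterfaceAudience;
{code}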

> Fix dead links to the javadocs of o.a.h.security.authorize
> --
>
> Key: HADOOP-10885
> URL: https://issues.apache.org/jira/browse/HADOOP-10885
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.6.0
>Reporter: Akira AJISAKA
>Priority: Minor
>  Labels: newbie
>
> In API doc ([my trunk 
> build|http://aajisaka.github.io/hadoop-project/api/index.html]), 
> {{ImpersonationProvider}} and {{DefaultImpersonationProvider}} classes are 
> linked but these documents are not generated.
> There's an inconsistency about {{@InterfaceAudience}} between package-info 
> and these classes, so these dead links are generated.



--
This message was sent by Atlassian JIRA
(v6.2#6252)