[jira] [Commented] (HADOOP-12723) S3A: Add ability to plug in any AWSCredentialsProvider

2016-05-13 Thread Steven Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283416#comment-15283416
 ] 

Steven Wong commented on HADOOP-12723:
--

The new Checkstyle warning "Missing package-info.java file" seems innocuous.

> S3A: Add ability to plug in any AWSCredentialsProvider
> --
>
> Key: HADOOP-12723
> URL: https://issues.apache.org/jira/browse/HADOOP-12723
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: 2.7.1
>Reporter: Steven Wong
>Assignee: Steven Wong
> Attachments: HADOOP-12723.0.patch, HADOOP-12723.1.patch, 
> HADOOP-12723.2.patch, HADOOP-12723.3.patch
>
>
> Although S3A currently has built-in support for 
> {{org.apache.hadoop.fs.s3a.BasicAWSCredentialsProvider}}, 
> {{com.amazonaws.auth.InstanceProfileCredentialsProvider}}, and 
> {{org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider}}, it does not 
> support any other credentials provider that implements the 
> {{com.amazonaws.auth.AWSCredentialsProvider}} interface. Supporting the 
> ability to plug in any {{com.amazonaws.auth.AWSCredentialsProvider}} instance 
> will expand the options for S3 credentials, such as:
> * temporary credentials from STS, e.g. via 
> {{com.amazonaws.auth.STSSessionCredentialsProvider}}
> * IAM role-based credentials, e.g. via 
> {{com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider}}
> * a custom credentials provider that satisfies one's own needs, e.g. 
> bucket-specific credentials, user-specific credentials, etc.
> To support this, we can add a configuration for the fully qualified class 
> name of a credentials provider, to be loaded by 
> {{S3AFileSystem.initialize(URI, Configuration)}}.
> The configured credentials provider should implement 
> {{com.amazonaws.auth.AWSCredentialsProvider}} and have a constructor that 
> accepts {{(URI uri, Configuration conf)}}.
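For illustration, a minimal sketch of a provider satisfying the proposed contract. The class name and the per-bucket configuration keys are hypothetical, not part of any patch; only the {{(URI, Configuration)}} constructor and the {{AWSCredentialsProvider}} interface come from the proposal above:

{code:title=BucketCredentialsProvider.java (sketch)}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;

// Hypothetical example: resolves bucket-specific keys from the Hadoop
// configuration, keyed by the bucket name in the filesystem URI.
public class BucketCredentialsProvider implements AWSCredentialsProvider {
  private final URI uri;
  private final Configuration conf;

  // Constructor signature required by the proposal: (URI, Configuration).
  public BucketCredentialsProvider(URI uri, Configuration conf) {
    this.uri = uri;
    this.conf = conf;
  }

  @Override
  public AWSCredentials getCredentials() {
    String bucket = uri.getHost();
    // Illustrative property names, not actual Hadoop configuration keys.
    return new BasicAWSCredentials(
        conf.getTrimmed("fs.s3a.bucket." + bucket + ".access.key"),
        conf.getTrimmed("fs.s3a.bucket." + bucket + ".secret.key"));
  }

  @Override
  public void refresh() {
    // Stateless sketch; nothing to refresh.
  }
}
{code}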






[jira] [Updated] (HADOOP-13035) Add states INITING and STARTING to YARN Service model to cover in-transition states.

2016-05-13 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated HADOOP-13035:
--
Status: Open  (was: Patch Available)

> Add states INITING and STARTING to YARN Service model to cover in-transition 
> states.
> 
>
> Key: HADOOP-13035
> URL: https://issues.apache.org/jira/browse/HADOOP-13035
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
> Attachments: 0001-HADOOP-13035.patch, 0002-HADOOP-13035.patch, 
> 0003-HADOOP-13035.patch
>
>
> As per the discussion in YARN-3971, we should set the service state to 
> STARTED only after {{serviceStart()}} completes.
> Currently, {{AbstractService#start()}} does:
> {noformat}
>  if (stateModel.enterState(STATE.STARTED) != STATE.STARTED) {
>    try {
>      startTime = System.currentTimeMillis();
>      serviceStart();
>      ..
>  }
> {noformat}
> {{enterState}} sets the service state to the proposed state, so 
> {{service.getServiceState}} called inside {{serviceStart()}} will already 
> return STARTED.
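A minimal subclass (hypothetical, for illustration only) that makes the effect visible:

{code:title=Illustration.java (sketch)}
import org.apache.hadoop.service.AbstractService;

// Hypothetical service: because AbstractService#start() calls
// enterState(STATE.STARTED) before serviceStart(), the state observed
// inside serviceStart() is already STARTED even though startup is still
// in progress.
public class Illustration extends AbstractService {
  public Illustration() {
    super("Illustration");
  }

  @Override
  protected void serviceStart() throws Exception {
    System.out.println(getServiceState()); // prints STARTED, not INITED
  }

  public static void main(String[] args) {
    Illustration s = new Illustration();
    s.init(new org.apache.hadoop.conf.Configuration());
    s.start();
  }
}
{code}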






[jira] [Commented] (HADOOP-13146) Refactor RetryInvocationHandler

2016-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283396#comment-15283396
 ] 

Hadoop QA commented on HADOOP-13146:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 11s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 56s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 47s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 47s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 24s 
{color} | {color:red} hadoop-common-project/hadoop-common: The patch generated 
4 new + 7 unchanged - 4 fixed = 11 total (was 11) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 47s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 20s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 40s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.net.TestDNS |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12803999/c13146_20160513b.patch
 |
| JIRA Issue | HADOOP-13146 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 536afcbbb6a4 3.13.0-36-lowlatency #63-Ubuntu SMP 

[jira] [Commented] (HADOOP-12666) Support Microsoft Azure Data Lake - as a file system in Hadoop

2016-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283385#comment-15283385
 ] 

Hadoop QA commented on HADOOP-12666:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 49s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 46s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 37s 
{color} | {color:green} trunk passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 14s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
54s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-tools/hadoop-tools-dist hadoop-tools {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 21s 
{color} | {color:green} trunk passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 7s 
{color} | {color:red} hadoop-tools in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 5s 
{color} | {color:red} hadoop-azure-datalake in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 8s 
{color} | {color:red} hadoop-tools-dist in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 6s 
{color} | {color:red} root in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 6s {color} 
| {color:red} root in the patch failed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 7s 
{color} | {color:red} root in the patch failed with JDK v1.7.0_101. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 7s {color} 
| {color:red} root in the patch failed with JDK v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
7s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 6s 
{color} | {color:red} hadoop-tools in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 6s 
{color} | {color:red} hadoop-azure-datalake in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 7s 
{color} | {color:red} hadoop-tools-dist in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red} 0m 6s 
{color} | {color:red} hadoop-tools in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red} 0m 6s 
{color} | {color:red} hadoop-azure-datalake in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red} 0m 8s 
{color} | {color:red} hadoop-tools-dist in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} 

[jira] [Updated] (HADOOP-13146) Refactor RetryInvocationHandler

2016-05-13 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HADOOP-13146:
-
Attachment: c13146_20160513b.patch

c13146_20160513b.patch: sync'ed with trunk.

> Refactor RetryInvocationHandler
> ---
>
> Key: HADOOP-13146
> URL: https://issues.apache.org/jira/browse/HADOOP-13146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: c13146_20160513.patch, c13146_20160513b.patch
>
>
> - The exception handling is quite long.  It is better to refactor it to a 
> separated method.
> - The failover logic and synchronization can be moved to a new inner class.






[jira] [Commented] (HADOOP-13146) Refactor RetryInvocationHandler

2016-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283375#comment-15283375
 ] 

Hadoop QA commented on HADOOP-13146:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s {color} 
| {color:red} HADOOP-13146 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12803993/c13146_20160513.patch 
|
| JIRA Issue | HADOOP-13146 |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/9422/console |
| Powered by | Apache Yetus 0.3.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Refactor RetryInvocationHandler
> ---
>
> Key: HADOOP-13146
> URL: https://issues.apache.org/jira/browse/HADOOP-13146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: c13146_20160513.patch
>
>
> - The exception handling is quite long.  It is better to refactor it to a 
> separated method.
> - The failover logic and synchronization can be moved to a new inner class.






[jira] [Updated] (HADOOP-13146) Refactor RetryInvocationHandler

2016-05-13 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HADOOP-13146:
-
Status: Patch Available  (was: Open)

> Refactor RetryInvocationHandler
> ---
>
> Key: HADOOP-13146
> URL: https://issues.apache.org/jira/browse/HADOOP-13146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: c13146_20160513.patch
>
>
> - The exception handling is quite long.  It is better to refactor it to a 
> separated method.
> - The failover logic and synchronization can be moved to a new inner class.






[jira] [Updated] (HADOOP-13146) Refactor RetryInvocationHandler

2016-05-13 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HADOOP-13146:
-
Attachment: c13146_20160513.patch

c13146_20160513.patch: 1st patch.

> Refactor RetryInvocationHandler
> ---
>
> Key: HADOOP-13146
> URL: https://issues.apache.org/jira/browse/HADOOP-13146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: c13146_20160513.patch
>
>
> - The exception handling is quite long.  It is better to refactor it to a 
> separated method.
> - The failover logic and synchronization can be moved to a new inner class.






[jira] [Created] (HADOOP-13146) Refactor RetryInvocationHandler

2016-05-13 Thread Tsz Wo Nicholas Sze (JIRA)
Tsz Wo Nicholas Sze created HADOOP-13146:


 Summary: Refactor RetryInvocationHandler
 Key: HADOOP-13146
 URL: https://issues.apache.org/jira/browse/HADOOP-13146
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor


- The exception handling is quite long.  It is better to refactor it to a 
separated method.
- The failover logic and synchronization can be moved to a new inner class.
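A schematic of the proposed shape, assuming nothing beyond the two bullets above; the attached patch may differ:

{code:title=RetryInvocationHandler.java (schematic)}
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class RetryInvocationHandler<T> implements InvocationHandler {

  // Failover state and its synchronization isolated in a new inner class.
  private static final class ProxyDescriptor<T> {
    private T proxy;
    private long failoverCount;

    ProxyDescriptor(T proxy) {
      this.proxy = proxy;
    }

    synchronized T getProxy() {
      return proxy;
    }

    synchronized void failover(T newProxy) {
      proxy = newProxy;
      failoverCount++;
    }
  }

  private final ProxyDescriptor<T> descriptor;

  public RetryInvocationHandler(T initialProxy) {
    this.descriptor = new ProxyDescriptor<>(initialProxy);
  }

  @Override
  public Object invoke(Object proxy, Method method, Object[] args)
      throws Throwable {
    while (true) {
      try {
        return method.invoke(descriptor.getProxy(), args);
      } catch (InvocationTargetException e) {
        // The long exception-handling body lives in its own method.
        handleException(method, e.getCause());
      }
    }
  }

  private void handleException(Method method, Throwable cause)
      throws Throwable {
    // Retry/failover decision logic would go here; this sketch rethrows.
    throw cause;
  }
}
{code}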






[jira] [Updated] (HADOOP-12666) Support Microsoft Azure Data Lake - as a file system in Hadoop

2016-05-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HADOOP-12666:
---
Attachment: HADOOP-12666-012.patch

CachedRefreshTokenBasedAccessTokenProvider
- Since the AccessTokenProvider is only created by reflection, the Timer 
constructor is for testing and does not require an override in this subclass
- The static instance should be final and created during class initialization, 
but...
- {{ConfRefreshTokenBasedAccessTokenProvider}} is not threadsafe. {{setConf}} 
will update the static instance without synchronization, and that instance is 
shared by every instance of {{CachedRTBATP}}. This could cause undefined 
behavior. Is the intent to pool clients with the same parameters? Would it 
make sense to add a small cache (v12)? (See the sketch after this list.)
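If pooling is the intent, a small LRU cache keyed by the refresh-token parameters would avoid mutating shared static state. A sketch of that idea, with illustrative names only (not the actual HADOOP-12666 classes):

{code:title=TokenProviderCache.java (sketch)}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: pools provider instances per parameter key so
// setConf() never rewrites an instance shared by unrelated callers.
class TokenProviderCache<P> {
  private static final int MAX_ENTRIES = 4;

  private final Map<String, P> cache =
      new LinkedHashMap<String, P>(MAX_ENTRIES, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<String, P> eldest) {
          return size() > MAX_ENTRIES; // evict the least-recently-used entry
        }
      };

  synchronized P get(String paramsKey, Function<String, P> factory) {
    return cache.computeIfAbsent(paramsKey, factory);
  }
}
{code}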

PrivateCachedRefreshTokenBasedAccessTokenProvider
- The override doesn't seem to serve a purpose. Since it's a workaround, adding 
audience/visibility annotations (HADOOP-5073) would emphasize that this is 
temporary.

PrivateAzureDataLakeFileSystem
- Catching {{ArrayIndexOutOfBoundsException}} instead of performing proper 
bounds checking in {{BufferManager::get}} is inefficient (a bounds-checked 
sketch follows the snippets below):
{code:title=PrivateAzureDataLakeFileSystem.java}
synchronized (BufferManager.getLock()) {
  if (bm.hasData(fsPath.toString(), fileOffset, len)) {
try {
  bm.get(data, fileOffset);
  validDataHoldingSize = data.length;
  currentFileOffset = fileOffset;
} catch (ArrayIndexOutOfBoundsException e) {
  fetchDataOverNetwork = true;
}
  } else {
fetchDataOverNetwork = true;
  }
}
{code}
{code:title=BufferManager.java}
void get(byte[] data, long offset) {
  System.arraycopy(buffer.data, (int) (offset - buffer.offset), data, 0,
  data.length);
}
{code}
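A bounds-checked alternative, sketched against the snippet above; the {{buffer.data}} and {{buffer.offset}} fields are assumed from it:

{code:title=BufferManager.java (bounds-checked sketch)}
// Returns false on a miss instead of relying on AIOOBE; the caller can
// then set fetchDataOverNetwork = true without catching anything.
boolean get(byte[] data, long offset) {
  long start = offset - buffer.offset;
  if (start < 0 || start + data.length > buffer.data.length) {
    return false;
  }
  System.arraycopy(buffer.data, (int) start, data, 0, data.length);
  return true;
}
{code}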

The BufferManager/PrivateAzureDataLakeFileSystem synchronization is unorthodox, 
and verifying its correctness is not straightforward. Layering that complexity 
on top of the readahead logic without simplifying abstractions makes it very 
difficult to review. I hope subsequent revisions will replace this code with a 
clearer model, because the current code will be very difficult to maintain.

> Support Microsoft Azure Data Lake - as a file system in Hadoop
> --
>
> Key: HADOOP-12666
> URL: https://issues.apache.org/jira/browse/HADOOP-12666
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, fs/azure, tools
>Reporter: Vishwajeet Dusane
>Assignee: Vishwajeet Dusane
> Attachments: Create_Read_Hadoop_Adl_Store_Semantics.pdf, 
> HADOOP-12666-002.patch, HADOOP-12666-003.patch, HADOOP-12666-004.patch, 
> HADOOP-12666-005.patch, HADOOP-12666-006.patch, HADOOP-12666-007.patch, 
> HADOOP-12666-008.patch, HADOOP-12666-009.patch, HADOOP-12666-010.patch, 
> HADOOP-12666-011.patch, HADOOP-12666-012.patch, HADOOP-12666-1.patch
>
>   Original Estimate: 336h
>  Time Spent: 336h
>  Remaining Estimate: 0h
>
> h2. Description
> This JIRA describes a new file system implementation for accessing Microsoft 
> Azure Data Lake Store (ADL) from within Hadoop. This would enable existing 
> Hadoop applications such as MR, Hive, HBase, etc. to use the ADL store as 
> input or output.
>  
> ADL is an ultra-high-capacity store, optimized for massive throughput, with 
> rich management and security features. More details are available at 
> https://azure.microsoft.com/en-us/services/data-lake-store/






[jira] [Commented] (HADOOP-12893) Verify LICENSE.txt and NOTICE.txt

2016-05-13 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283312#comment-15283312
 ] 

Sean Busbey commented on HADOOP-12893:
--

{quote}
My understanding is LICENSE / NOTICE of binary distribution should be a 
superset of source distribution. Is it good enough to have a separate 
binary-distribution-only LICENSE / NOTICE file and we can concat 
binary-distribution-only and source-distribution L/N while releasing?
{quote}

This is not necessarily true, though I haven't done a sufficient review to say 
if it is for Hadoop or not. As an example, one could have some third party code 
bundled in the test sources and produce a binary distribution tarball with no 
test files in it. Similarly, if the main classes include some third party work 
but the tests do not, then the main jar and the test jar would be different. 
(which would matter if the test jar is published to maven.)

{quote}
In the L we say whether something applies to the binary or the source
distribution. I saw this elsewhere, and it really reduces the POM work
required.
{quote}

I've seen this in a few places, but unfortunately it's incorrect. I've been 
slowly working through projects to help correct them, but it's a long slog.

{quote}
I'd like to appeal to a reasonable person standard. We're making a big
effort here to be compliant, and if we do the above, it'll be clear what
does and doesn't apply to each artifact. In the meanwhile, our releases are
blocked.

If additional work really is required, maybe it could also be done as a
follow-on.
{quote}

That's entirely up to the Hadoop PMC. I can certainly understand the reasoning 
of an incremental approach that starts with getting us out of violating the 
licenses of third parties and works towards compliance with ASF Policy.

I would be concerned if "follow-on" turned into "next release" perpetually; 
having releases blocked provides a kind of motivation that little else can. We 
need to end up in a place where everything we distribute meets ASF Policy, but 
folks generally understand that this can take some time.

Keep in mind that release voting is majority, so it might be worth a straw poll 
of how the PMC would vote if a given release met the requirements for third 
party licenses but did not yet meet ASF policy on license notifications.

> Verify LICENSE.txt and NOTICE.txt
> -
>
> Key: HADOOP-12893
> URL: https://issues.apache.org/jira/browse/HADOOP-12893
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
>Reporter: Allen Wittenauer
>Assignee: Xiao Chen
>Priority: Blocker
> Attachments: HADOOP-12893.01.patch
>
>
> We have many bundled dependencies in both the source and the binary artifacts 
> that are not in LICENSE.txt and NOTICE.txt.






[jira] [Commented] (HADOOP-13145) In DistCp, prevent unnecessary getFileStatus call when not preserving metadata.

2016-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283302#comment-15283302
 ] 

Hadoop QA commented on HADOOP-13145:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 49s 
{color} | {color:green} hadoop-distcp in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 52s 
{color} | {color:green} hadoop-distcp in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 31s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12803972/HADOOP-13145.001.patch
 |
| JIRA Issue | HADOOP-13145 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 6ad684579be4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HADOOP-13140) GlobalStorageStatistics should check null FileSystem scheme to avoid NPE

2016-05-13 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283289#comment-15283289
 ] 

Mingliang Liu commented on HADOOP-13140:


Thanks [~ste...@apache.org] and [~arpitagarwal] for your insightful comments.

The new design uses a TreeMap in GlobalStorageStatistics for a "natural 
ordering" iterator, and a TreeMap does not accept a null key. That is the root 
cause of this problem. Actually, I think it's fine to disallow a null scheme 
in the newly added GlobalStorageStatistics. To me, a null file system scheme 
always looks like magic, though I'm fine with a schemeless URI.

For backward compatibility, schemeless statistics seem a good idea to me. Ping 
[~cmccabe].

> GlobalStorageStatistics should check null FileSystem scheme to avoid NPE
> 
>
> Key: HADOOP-13140
> URL: https://issues.apache.org/jira/browse/HADOOP-13140
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Brahma Reddy Battula
>Assignee: Mingliang Liu
> Attachments: HADOOP-13140.000.patch
>
>
> {{org.apache.hadoop.fs.GlobalStorageStatistics#put}} does not check for a 
> null scheme, so the internal map will throw an NPE. This was reported by a 
> flaky test, {{TestFileSystemApplicationHistoryStore}}. Thanks [~brahmareddy] 
> for reporting.
> To address this,
> # Fix the test by providing a valid URI, e.g. {{file:///}}
> # Guard against a null scheme in {{GlobalStorageStatistics#put}} (a sketch 
> follows below)
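A sketch of the guard in #2, in a hypothetical shape; the actual {{put}} signature and the fix in HADOOP-13140.000.patch may differ:

{code:title=GlobalStorageStatistics guard (sketch)}
import java.util.TreeMap;

// Sketch only; names mirror the discussion, not the actual patch.
class GlobalStorageStatisticsSketch {
  private final TreeMap<String, Object> map = new TreeMap<>();

  // Reject a null scheme up front instead of letting the TreeMap throw NPE.
  Object put(String scheme, Object stats) {
    if (scheme == null) {
      throw new IllegalArgumentException("FileSystem scheme must not be null");
    }
    return map.put(scheme, stats);
  }
}
{code}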






[jira] [Commented] (HADOOP-11487) FileNotFound on distcp to s3n/s3a due to creation inconsistency

2016-05-13 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283275#comment-15283275
 ] 

Chris Nauroth commented on HADOOP-11487:


I just attached a patch to HADOOP-13145 to prevent this {{getFileInfo}} call 
when DistCp is run without the {{-p}} option.  I suspect that patch can help us 
get past this problem with eventual consistency on DistCp to a destination on 
S3A, at least when DistCp is run without the {{-p}} option.

I filed that patch on a new JIRA instead of here, because the discussion here 
indicates that it may expand to a larger scope for addressing a wider set of 
eventual consistency concerns.  HADOOP-13145 is more of a spot performance 
enhancement.

> FileNotFound on distcp to s3n/s3a due to creation inconsistency 
> 
>
> Key: HADOOP-11487
> URL: https://issues.apache.org/jira/browse/HADOOP-11487
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs, fs/s3
>Affects Versions: 2.7.2
>Reporter: Paulo Motta
>
> I'm trying to copy a large amount of files from HDFS to S3 via distcp and I'm 
> getting the following exception:
> {code:java}
> 2015-01-16 20:53:18,187 ERROR [main] 
> org.apache.hadoop.tools.mapred.CopyMapper: Failure in copying 
> hdfs://10.165.35.216/hdfsFolder/file.gz to s3n://s3-bucket/file.gz
> java.io.FileNotFoundException: No such file or directory 
> 's3n://s3-bucket/file.gz'
>   at 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:445)
>   at 
> org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:187)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:233)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> 2015-01-16 20:53:18,276 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.io.FileNotFoundException: No such file or 
> directory 's3n://s3-bucket/file.gz'
>   at 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:445)
>   at 
> org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:187)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:233)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> {code}
> However, when I try {{hadoop fs -ls s3n://s3-bucket/file.gz}}, the file is 
> there, so the job failure is probably due to Amazon's S3 eventual 
> consistency.
> In my opinion, to fix this problem, {{NativeS3FileSystem.getFileStatus}} must 
> use the fs.s3.maxRetries property to avoid failures like this (see the retry 
> sketch below).
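A sketch of that retry idea, as a hypothetical helper rather than the actual {{NativeS3FileSystem}} code; the default of 4 retries is assumed:

{code:title=retry sketch}
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper: retries getFileStatus up to fs.s3.maxRetries times
// before giving up, to ride out S3 eventual consistency.
class GetFileStatusWithRetries {
  static FileStatus get(FileSystem fs, Path path, Configuration conf)
      throws IOException, InterruptedException {
    int maxRetries = conf.getInt("fs.s3.maxRetries", 4);
    for (int attempt = 0; ; attempt++) {
      try {
        return fs.getFileStatus(path);
      } catch (FileNotFoundException e) {
        if (attempt >= maxRetries) {
          throw e; // still missing after the configured number of retries
        }
        Thread.sleep(1000L * (attempt + 1)); // simple linear backoff
      }
    }
  }
}
{code}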






[jira] [Updated] (HADOOP-13145) In DistCp, prevent unnecessary getFileStatus call when not preserving metadata.

2016-05-13 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HADOOP-13145:
---
Status: Patch Available  (was: Open)

> In DistCp, prevent unnecessary getFileStatus call when not preserving 
> metadata.
> ---
>
> Key: HADOOP-13145
> URL: https://issues.apache.org/jira/browse/HADOOP-13145
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HADOOP-13145.001.patch
>
>
> After DistCp copies a file, it calls {{getFileStatus}} to get the 
> {{FileStatus}} from the destination so that it can compare to the source and 
> update metadata if necessary.  If the DistCp command was run without the 
> option to preserve metadata attributes, then this additional 
> {{getFileStatus}} call is wasteful.






[jira] [Updated] (HADOOP-13145) In DistCp, prevent unnecessary getFileStatus call when not preserving metadata.

2016-05-13 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HADOOP-13145:
---
Attachment: HADOOP-13145.001.patch

The attached v001 patch avoids the unnecessary {{getFileStatus}} call.

The effect is particularly pronounced when running DistCp with a destination on 
S3A, where eventual consistency on S3 can cause the {{getFileStatus}} call to 
fail with {{FileNotFoundException}}.  Then, the whole MapReduce task fails, 
retries, and repeats copying all the data.  [~rajesh.balamohan], I know you saw 
this with some recent large copies to S3A.  Would you be interested in trying a 
test with this patch?  So far, I don't have my own repro.  Note that this patch 
is only helpful as long as the DistCp command is not preserving metadata 
attributes, so don't use the {{-p}} option.

Cc [~ste...@apache.org].

> In DistCp, prevent unnecessary getFileStatus call when not preserving 
> metadata.
> ---
>
> Key: HADOOP-13145
> URL: https://issues.apache.org/jira/browse/HADOOP-13145
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HADOOP-13145.001.patch
>
>
> After DistCp copies a file, it calls {{getFileStatus}} to get the 
> {{FileStatus}} from the destination so that it can compare to the source and 
> update metadata if necessary.  If the DistCp command was run without the 
> option to preserve metadata attributes, then this additional 
> {{getFileStatus}} call is wasteful.






[jira] [Created] (HADOOP-13145) In DistCp, prevent unnecessary getFileStatus call when not preserving metadata.

2016-05-13 Thread Chris Nauroth (JIRA)
Chris Nauroth created HADOOP-13145:
--

 Summary: In DistCp, prevent unnecessary getFileStatus call when 
not preserving metadata.
 Key: HADOOP-13145
 URL: https://issues.apache.org/jira/browse/HADOOP-13145
 Project: Hadoop Common
  Issue Type: Improvement
  Components: tools/distcp
Reporter: Chris Nauroth
Assignee: Chris Nauroth


After DistCp copies a file, it calls {{getFileStatus}} to get the 
{{FileStatus}} from the destination so that it can compare to the source and 
update metadata if necessary.  If the DistCp command was run without the option 
to preserve metadata attributes, then this additional {{getFileStatus}} call is 
wasteful.
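A sketch of the idea, with a hypothetical helper and a local enum standing in for DistCp's attribute type; HADOOP-13145.001.patch may differ:

{code:title=DistCp preserve sketch}
import java.io.IOException;
import java.util.EnumSet;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper: skip the destination getFileStatus() entirely when
// no -p attributes were requested, since there is nothing to preserve.
class PreserveSketch {
  enum FileAttribute { REPLICATION, BLOCKSIZE, USER, GROUP, PERMISSION }

  static void maybePreserve(FileSystem targetFS, Path target,
      FileStatus sourceStatus, EnumSet<FileAttribute> attributes)
      throws IOException {
    if (attributes.isEmpty()) {
      return; // no -p option: the extra RPC is avoided
    }
    FileStatus targetStatus = targetFS.getFileStatus(target);
    // ... compare targetStatus with sourceStatus and update metadata ...
  }
}
{code}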






[jira] [Commented] (HADOOP-12893) Verify LICENSE.txt and NOTICE.txt

2016-05-13 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283260#comment-15283260
 ] 

Andrew Wang commented on HADOOP-12893:
--

In the L we say whether something applies to the binary or the source
distribution. I saw this elsewhere, and it really reduces the POM work
required.

I'd like to do something similar for our JARs. We can list out which JARs
are affected by each L entry. Fortunately we only need to do this for the
source L, which are not too numerous.

I'd like to appeal to a reasonable person standard. We're making a big
effort here to be compliant, and if we do the above, it'll be clear what
does and doesn't apply to each artifact. In the meanwhile, our releases are
blocked.

If additional work really is required, maybe it could also be done as a
follow-on.




> Verify LICENSE.txt and NOTICE.txt
> -
>
> Key: HADOOP-12893
> URL: https://issues.apache.org/jira/browse/HADOOP-12893
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
>Reporter: Allen Wittenauer
>Assignee: Xiao Chen
>Priority: Blocker
> Attachments: HADOOP-12893.01.patch
>
>
> We have many bundled dependencies in both the source and the binary artifacts 
> that are not in LICENSE.txt and NOTICE.txt.






[jira] [Commented] (HADOOP-13144) Enhancing IPC client throughput via multiple connections per user

2016-05-13 Thread Jason Kace (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283247#comment-15283247
 ] 

Jason Kace commented on HADOOP-13144:
-

One solution is to allow each user + remote address to utilize multiple 
connection threads.  This will require reconfiguration of the Client class to 
permit a pool of connections per ConnectionId.

An alternative solution is to create multiple ConnectionIds per user + remote 
address.  The current {{ConnectionId}} class does not support multiple 
hashCodes per user + remote address.  The {{ConnectionId}} can either be 
modified or made visible (attached solution) for inheritance; a sketch of this 
pooled-ConnectionId idea follows below.

Our use case for this feature requires a single user to be able to issue a 
large number of RPC requests to a single NN via the IPC client.  Better 
throughput is required in the existing IPC client to allow up to 100k 
requests/second from the same user to the same remote address.
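A sketch of the second approach, with illustrative field types (the real {{ConnectionId}} is keyed by an InetSocketAddress, a ticket, and a protocol):

{code:title=PooledConnectionId.java (sketch)}
import java.util.Objects;

// Hypothetical: adding an index to the identity lets one user + remote
// address map onto N distinct connections instead of exactly one.
class PooledConnectionId {
  private final String remoteAddress; // stands in for InetSocketAddress
  private final String ticket;        // stands in for the user's ticket/UGI
  private final int index;            // 0..poolSize-1, e.g. round-robin

  PooledConnectionId(String remoteAddress, String ticket, int index) {
    this.remoteAddress = remoteAddress;
    this.ticket = ticket;
    this.index = index;
  }

  @Override
  public int hashCode() {
    return Objects.hash(remoteAddress, ticket, index);
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof PooledConnectionId)) {
      return false;
    }
    PooledConnectionId that = (PooledConnectionId) o;
    return index == that.index
        && remoteAddress.equals(that.remoteAddress)
        && ticket.equals(that.ticket);
  }
}
{code}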


> Enhancing IPC client throughput via multiple connections per user
> -
>
> Key: HADOOP-13144
> URL: https://issues.apache.org/jira/browse/HADOOP-13144
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Jason Kace
>Priority: Minor
> Fix For: 2.8.0
>
>
> The generic IPC client ({{org.apache.hadoop.ipc.Client}}) utilizes a single 
> connection thread for each {{ConnectionId}}.  The {{ConnectionId}} is unique 
> to the connection's remote address, ticket and protocol.  Each ConnectionId 
> is 1:1 mapped to a connection thread by the client via a map cache.
> The result is to serialize all IPC read/write activity through a single 
> thread for each user/ticket + address.  If a single user makes repeated 
> calls (1k-100k/sec) to the same destination, the IPC client becomes a 
> bottleneck.






[jira] [Created] (HADOOP-13144) Enhancing IPC client throughput via multiple connections per user

2016-05-13 Thread Jason Kace (JIRA)
Jason Kace created HADOOP-13144:
---

 Summary: Enhancing IPC client throughput via multiple connections 
per user
 Key: HADOOP-13144
 URL: https://issues.apache.org/jira/browse/HADOOP-13144
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Reporter: Jason Kace
Priority: Minor
 Fix For: 2.8.0


The generic IPC client ({{org.apache.hadoop.ipc.Client}}) utilizes a single 
connection thread for each {{ConnectionId}}.  The {{ConnectionId}} is unique to 
the connection's remote address, ticket and protocol.  Each ConnectionId is 1:1 
mapped to a connection thread by the client via a map cache.

The result is to serialize all IPC read/write activity through a single thread 
for each user/ticket + address.  If a single user makes repeated calls 
(1k-100k/sec) to the same destination, the IPC client becomes a bottleneck.









[jira] [Commented] (HADOOP-12893) Verify LICENSE.txt and NOTICE.txt

2016-05-13 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283241#comment-15283241
 ] 

Wangda Tan commented on HADOOP-12893:
-

 [~xiaochen], [~andrew.wang] and [~ajisakaa], thanks a lot for the great work!

I can help with reviewing this patch.

bq.  ASF policy requires that the LICENSE/NOTICE file cover exactly the 
contents of each distributed artifact.

My understanding is LICENSE / NOTICE of binary distribution should be a 
superset of source distribution. Is it good enough to have a separate 
binary-distribution-only LICENSE / NOTICE file and we can concat 
binary-distribution-only and source-distribution L/N while releasing?

> Verify LICENSE.txt and NOTICE.txt
> -
>
> Key: HADOOP-12893
> URL: https://issues.apache.org/jira/browse/HADOOP-12893
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
>Reporter: Allen Wittenauer
>Assignee: Xiao Chen
>Priority: Blocker
> Attachments: HADOOP-12893.01.patch
>
>
> We have many bundled dependencies in both the source and the binary artifacts 
> that are not in LICENSE.txt and NOTICE.txt.






[jira] [Commented] (HADOOP-12723) S3A: Add ability to plug in any AWSCredentialsProvider

2016-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283232#comment-15283232
 ] 

Hadoop QA commented on HADOOP-12723:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 27s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 5s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 10s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 0s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 0s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 27s 
{color} | {color:red} root: The patch generated 1 new + 18 unchanged - 0 fixed 
= 19 total (was 18) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 37s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 14s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 30s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 14s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} The patch does not 

[jira] [Comment Edited] (HADOOP-13140) GlobalStorageStatistics should check null FileSystem scheme to avoid NPE

2016-05-13 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283061#comment-15283061
 ] 

Arpit Agarwal edited comment on HADOOP-13140 at 5/13/16 9:40 PM:
-

Yes, as Steve suggested, I think it's better to have a special statistics 
object for the "null"/missing scheme instead of assuming that null means 
"file".

Java maps support a null key, so that part will work fine, but it may require 
non-trivial changes in FileSystemStorageStatistics/GlobalStatistics if the 
scheme is assumed to be non-null everywhere.


was (Author: arpitagarwal):
Yes I think it's better to have a special statistics object for the "null" 
scheme instead of assuming that null means "file" as Steve suggests.

Java maps support a null key so that part will work fine but it may require 
non-trivial changes in FileSystemStorageStatistics/GlobalStatistics if it is 
assumed the scheme is non-null everywhere.

> GlobalStorageStatistics should check null FileSystem scheme to avoid NPE
> 
>
> Key: HADOOP-13140
> URL: https://issues.apache.org/jira/browse/HADOOP-13140
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Brahma Reddy Battula
>Assignee: Mingliang Liu
> Attachments: HADOOP-13140.000.patch
>
>
> {{org.apache.hadoop.fs.GlobalStorageStatistics#put}} does not check for a 
> null scheme, so the internal map will throw an NPE. This was reported by a 
> flaky test, {{TestFileSystemApplicationHistoryStore}}. Thanks [~brahmareddy] 
> for reporting.
> To address this,
> # Fix the test by providing a valid URI, e.g. {{file:///}}
> # Guard against a null scheme in {{GlobalStorageStatistics#put}}






[jira] [Updated] (HADOOP-12723) S3A: Add ability to plug in any AWSCredentialsProvider

2016-05-13 Thread Steven Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Wong updated HADOOP-12723:
-
Attachment: HADOOP-12723.3.patch

Rebasing the patch. Also, fixed two toString calls inside 
S3AFileSystem.toString, one of which caused an NPE when there is no canned ACL.

> S3A: Add ability to plug in any AWSCredentialsProvider
> --
>
> Key: HADOOP-12723
> URL: https://issues.apache.org/jira/browse/HADOOP-12723
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: 2.7.1
>Reporter: Steven Wong
>Assignee: Steven Wong
> Attachments: HADOOP-12723.0.patch, HADOOP-12723.1.patch, 
> HADOOP-12723.2.patch, HADOOP-12723.3.patch
>
>
> Although S3A currently has built-in support for 
> {{org.apache.hadoop.fs.s3a.BasicAWSCredentialsProvider}}, 
> {{com.amazonaws.auth.InstanceProfileCredentialsProvider}}, and 
> {{org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider}}, it does not 
> support any other credentials provider that implements the 
> {{com.amazonaws.auth.AWSCredentialsProvider}} interface. Supporting the 
> ability to plug in any {{com.amazonaws.auth.AWSCredentialsProvider}} instance 
> will expand the options for S3 credentials, such as:
> * temporary credentials from STS, e.g. via 
> {{com.amazonaws.auth.STSSessionCredentialsProvider}}
> * IAM role-based credentials, e.g. via 
> {{com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider}}
> * a custom credentials provider that satisfies one's own needs, e.g. 
> bucket-specific credentials, user-specific credentials, etc.
> To support this, we can add a configuration for the fully qualified class 
> name of a credentials provider, to be loaded by 
> {{S3AFileSystem.initialize(URI, Configuration)}}.
> The configured credentials provider should implement 
> {{com.amazonaws.auth.AWSCredentialsProvider}} and have a constructor that 
> accepts {{(URI uri, Configuration conf)}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13083) The number of javadocs warnings is limited to 100

2016-05-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283142#comment-15283142
 ] 

Hudson commented on HADOOP-13083:
-

FAILURE: Integrated in Hadoop-trunk-Commit #9761 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9761/])
HADOOP-13083. The number of javadocs warnings is limited to 100. (gtcarrera9: 
rev 3fa1380c221b9d659fb82c42284505ef19da38d0)
* hadoop-project/pom.xml


> The number of javadocs warnings is limited to 100 
> --
>
> Key: HADOOP-13083
> URL: https://issues.apache.org/jira/browse/HADOOP-13083
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Li Lu
>Assignee: Gergely Novák
>Priority: Critical
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: YARN-4978.001.patch
>
>
> We are generating a lot of javadoc warnings with jdk 1.8. Right now the 
> number is limited to 100. Raising this limit can probably reveal more 
> problems in one batch for our javadoc generation process. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13083) The number of javadocs warnings is limited to 100

2016-05-13 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated HADOOP-13083:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha1
   2.8.0
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-2, and branch-2.8. Thanks [~GergelyNovak] for the 
patch and [~andrew.wang] for the review! 

> The number of javadocs warnings is limited to 100 
> --
>
> Key: HADOOP-13083
> URL: https://issues.apache.org/jira/browse/HADOOP-13083
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Li Lu
>Assignee: Gergely Novák
>Priority: Critical
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: YARN-4978.001.patch
>
>
> We are generating a lot of javadoc warnings with jdk 1.8. Right now the 
> number is limited to 100. Raising this limit can probably reveal more 
> problems in one batch for our javadoc generation process. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-10942) Globbing optimizations and regression fix

2016-05-13 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283108#comment-15283108
 ] 

Colin Patrick McCabe edited comment on HADOOP-10942 at 5/13/16 8:31 PM:


The regression referred to here was fixed in HADOOP-10957.  The optimizations 
are already implemented (we don't perform an RPC on each path component, only 
when we need to do so to implement a wildcard).


was (Author: cmccabe):
The regression fix referred to here was fixed in HADOOP-10957.  The 
optimizations are already implemented (we don't perform an RPC on each path 
component, only when we need to do so to implement a wildcard.)

> Globbing optimizations and regression fix
> -
>
> Key: HADOOP-10942
> URL: https://issues.apache.org/jira/browse/HADOOP-10942
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.1.0-beta, 3.0.0-alpha1
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10942.patch
>
>
> When globbing was commonized to support both filesystem and filecontext, it 
> regressed a fix that prevents an intermediate glob that matches a file from 
> throwing a confusing permissions exception.  The hdfs traverse check requires 
> the exec bit which a file does not have.
> Additional optimizations to reduce RPCs actually increase them if 
> directories contain 1 item.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-10942) Globbing optimizations and regression fix

2016-05-13 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HADOOP-10942:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

The regression referred to here was fixed in HADOOP-10957.  The optimizations 
are already implemented (we don't perform an RPC on each path component, only 
when we need to do so to implement a wildcard).

> Globbing optimizations and regression fix
> -
>
> Key: HADOOP-10942
> URL: https://issues.apache.org/jira/browse/HADOOP-10942
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.1.0-beta, 3.0.0-alpha1
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10942.patch
>
>
> When globbing was commonized to support both filesystem and filecontext, it 
> regressed a fix that prevents an intermediate glob that matches a file from 
> throwing a confusing permissions exception.  The hdfs traverse check requires 
> the exec bit which a file does not have.
> Additional optimizations to reduce RPCs actually increase them if 
> directories contain 1 item.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13083) The number of javadocs warnings is limited to 100

2016-05-13 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283105#comment-15283105
 ] 

Li Lu commented on HADOOP-13083:


+1. Will commit shortly. 

> The number of javadocs warnings is limited to 100 
> --
>
> Key: HADOOP-13083
> URL: https://issues.apache.org/jira/browse/HADOOP-13083
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Li Lu
>Assignee: Gergely Novák
>Priority: Critical
> Attachments: YARN-4978.001.patch
>
>
> We are generating a lot of javadoc warnings with jdk 1.8. Right now the 
> number is limited to 100. Raising this limit can probably reveal more 
> problems in one batch for our javadoc generation process. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13138) Unable to append to a SequenceFile with Compression.NONE.

2016-05-13 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283096#comment-15283096
 ] 

Chris Nauroth commented on HADOOP-13138:


[~vinayrpet], thank you for the patch.  This looks good to me.  I think it will 
be ready to go after addressing the Checkstyle nitpicks.
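
For readers following along, the essence of the fix is a null-safe codec 
comparison, roughly as sketched below (illustrative, not the exact patch):

{code}
// Sketch: treat "both codecs absent" as a match, so appending with
// CompressionType.NONE (codec == null) no longer NPEs.
CompressionCodec readerCodec = readerCompressionOption.codec;
CompressionCodec optionCodec = compressionTypeOption.codec;
boolean sameCodec =
    (readerCodec == null && optionCodec == null)
    || (readerCodec != null && optionCodec != null
        && readerCodec.getClass().getName()
            .equals(optionCodec.getClass().getName()));
if (readerCompressionOption.value != compressionTypeOption.value
    || !sameCodec) {
  throw new IllegalArgumentException(
      "Compression option provided does not match the file");
}
{code}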

> Unable to append to a SequenceFile with Compression.NONE.
> -
>
> Key: HADOOP-13138
> URL: https://issues.apache.org/jira/browse/HADOOP-13138
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Gervais Mickaël
>Assignee: Vinayakumar B
>Priority: Critical
> Attachments: HADOOP-13138-01.patch
>
>
> Hi,
> I'm trying to use the append functionality on an existing _SequenceFile_.
> If I set _Compression.NONE_, it works when the file is created, but when the 
> file already exists I get a _NullPointerException_; it does work if I 
> specify a compression codec.
> {code:title=Failing code|borderStyle=solid}
> Option compression = compression(CompressionType.NONE);
> Option keyClass = keyClass(LongWritable.class);
> Option valueClass = valueClass(BytesWritable.class);
> Option out = file(dfs);
> Option append = appendIfExists(true);
> writer = createWriter(conf,
>  out,
>  append,
>  compression,
>  keyClass,
>  valueClass);
> {code}
> The following exception is thrown when the file exists, because the 
> compression option is checked:
> {code}
> Exception in thread "main" java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.&lt;init&gt;(SequenceFile.java:1119)
>   at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:273)
> {code}
> This is due to the *codec* which is _null_:
> {code:title=SequenceFile.java|borderStyle=solid}
>  if (readerCompressionOption.value != compressionTypeOption.value
> || !readerCompressionOption.codec.getClass().getName()
> 
> .equals(compressionTypeOption.codec.getClass().getName())) {
>   throw new IllegalArgumentException(
>   "Compression option provided does not match the file");
> }
> {code}
> Thanks,
> Mickaël



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13130) s3a failures can surface as RTEs, not IOEs

2016-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283094#comment-15283094
 ] 

Hadoop QA commented on HADOOP-13130:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 53s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
1s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
10s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 51s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 41s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
54s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 40s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 32s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 49s 
{color} | {color:red} root: The patch generated 31 new + 29 unchanged - 0 fixed 
= 60 total (was 29) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 6s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s 
{color} | {color:red} hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 35s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 26s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 18s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 57s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 17s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
28s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | 

[jira] [Commented] (HADOOP-12718) Incorrect error message by fs -put local dir without permission

2016-05-13 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283082#comment-15283082
 ] 

Chris Nauroth commented on HADOOP-12718:


I'm considering reverting this patch for a few reasons.

First, the patch was not coded to handle Windows correctly.  Changing it to use 
{{FileUtil#canRead}} would solve that.

However, the larger issue is that this might be backwards-incompatible.  Prior 
to the patch, lack of access would return {{null}}.  After the patch, lack of 
access throws an exception.  Although this matches HDFS semantics, applications 
often have different expectations of the local file system.  If an application 
was coded to check for {{null}}, but not handle {{AccessDeniedException}}, then 
there is a risk of breaking that application.
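
To make the hazard concrete, a hypothetical caller coded against the old 
behavior might look like the sketch below; after the patch the null branch is 
dead and the new exception escapes instead:

{code}
// Hypothetical application code written against the old local-FS
// behavior, where lack of access returned null from listStatus().
FileStatus[] children = fs.listStatus(dir);
if (children == null) {
  // Old handling path. After the patch this branch is unreachable:
  // an AccessDeniedException (an IOException) propagates instead.
  children = new FileStatus[0];
}
{code}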

Cc [~ste...@apache.org] for a second opinion from the file system contract 
perspective.  The current spec for {{listStatus}} does not explicitly require a 
pre-condition check that lack of access results in {{AccessDeniedException}}.

http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/filesystem/filesystem.html

If we revert this, then something will have to be done about YARN-4842, which 
included tests that depended on the new exception message.  Cc [~xgong].

I won't revert immediately.  I'll wait until next week so that others get time 
to consider and comment.

> Incorrect error message by fs -put local dir without permission
> ---
>
> Key: HADOOP-12718
> URL: https://issues.apache.org/jira/browse/HADOOP-12718
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
> Attachments: HADOOP-12718.001.patch, HADOOP-12718.002.patch, 
> TestFsShellCopyPermission-output.001.txt, 
> TestFsShellCopyPermission-output.002.txt, TestFsShellCopyPermission.001.patch
>
>
> When the user doesn't have access permission to the local directory, the 
> "hadoop fs -put" command prints a confusing error message "No such file or 
> directory".
> {noformat}
> $ whoami
> systest
> $ cd /home/systest
> $ ls -ld .
> drwx--. 4 systest systest 4096 Jan 13 14:21 .
> $ mkdir d1
> $ sudo -u hdfs hadoop fs -put d1 /tmp
> put: `d1': No such file or directory
> {noformat}
> It will be more informative if the message is:
> {noformat}
> put: d1 (Permission denied)
> {noformat}
> If the source is a local file, the error message is ok:
> {noformat}
> put: f1 (Permission denied)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13114) DistCp should have option to compress data on write

2016-05-13 Thread Suraj Nayak (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283079#comment-15283079
 ] 

Suraj Nayak commented on HADOOP-13114:
--

JIRA was not accepting comments when I uploaded the latest patch with 
{{CodecPool}} changes. Adding the details of the Jenkins build in this 
comment:
Jenkins Console output Link : 
[https://builds.apache.org/job/PreCommit-HADOOP-Build/9414/console]
Jenkins output : 



+1 overall

| Vote |  Subsystem |  Runtime   | Comment

|   0  |reexec  |  0m 13s| Docker mode activated. 
|  +1  |   @author  |  0m 0s | The patch does not contain any @author 
|  ||| tags.
|  +1  |test4tests  |  0m 0s | The patch appears to include 2 new or 
|  ||| modified test files.
|  +1  |mvninstall  |  7m 1s | trunk passed 
|  +1  |   compile  |  0m 14s| trunk passed with JDK v1.8.0_91 
|  +1  |   compile  |  0m 17s| trunk passed with JDK v1.7.0_95 
|  +1  |checkstyle  |  0m 17s| trunk passed 
|  +1  |   mvnsite  |  0m 22s| trunk passed 
|  +1  |mvneclipse  |  0m 15s| trunk passed 
|  +1  |  findbugs  |  0m 28s| trunk passed 
|  +1  |   javadoc  |  0m 12s| trunk passed with JDK v1.8.0_91 
|  +1  |   javadoc  |  0m 15s| trunk passed with JDK v1.7.0_95 
|  +1  |mvninstall  |  0m 17s| the patch passed 
|  +1  |   compile  |  0m 13s| the patch passed with JDK v1.8.0_91 
|  +1  | javac  |  0m 13s| the patch passed 
|  +1  |   compile  |  0m 15s| the patch passed with JDK v1.7.0_95 
|  +1  | javac  |  0m 15s| the patch passed 
|  +1  |checkstyle  |  0m 14s| the patch passed 
|  +1  |   mvnsite  |  0m 20s| the patch passed 
|  +1  |mvneclipse  |  0m 11s| the patch passed 
|  +1  |whitespace  |  0m 0s | The patch has no whitespace issues. 
|  +1  |  findbugs  |  0m 36s| the patch passed 
|  +1  |   javadoc  |  0m 10s| the patch passed with JDK v1.8.0_91 
|  +1  |   javadoc  |  0m 12s| the patch passed with JDK v1.7.0_95 
|  +1  |  unit  |  8m 40s| hadoop-distcp in the patch passed with 
|  ||| JDK v1.8.0_91.
|  +1  |  unit  |  7m 55s| hadoop-distcp in the patch passed with 
|  ||| JDK v1.7.0_95.
|  +1  |asflicense  |  0m 17s| The patch does not generate ASF License 
|  ||| warnings.
|  ||  29m 51s   | 


|| Subsystem || Report/Notes ||

| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12803827/HADOOP-13114-trunk_2016-05-12-1.patch
 |
| JIRA Issue | HADOOP-13114 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 62e2be2ea3c4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / fa440a3 |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_91 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
| findbugs | v3.0.0 |
| JDK v1.7.0_95  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/9414/testReport/ |
| modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/9414/console |
| Powered by | Apache Yetus 0.3.0-SNAPSHOT   http://yetus.apache.org |

> DistCp should have option to compress data on write
> ---
>
> Key: HADOOP-13114
> URL: https://issues.apache.org/jira/browse/HADOOP-13114
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Suraj Nayak
>Assignee: Suraj Nayak
>Priority: Minor
>  Labels: distcp
> Fix For: 3.0.0-alpha1
>
> Attachments: HADOOP-13114-trunk_2016-05-07-1.patch, 
> HADOOP-13114-trunk_2016-05-08-1.patch, HADOOP-13114-trunk_2016-05-10-1.patch, 
> HADOOP-13114-trunk_2016-05-12-1.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> DistCp utility should have the capability to store data in a user-specified 
> compression format. This avoids one hop of compressing data after transfer. 
> Backup strategies to a different cluster also benefit from saving one IO 
> operation to and from HDFS, thus saving resources, time and effort.
> * Create an option -compressOutput 

[jira] [Commented] (HADOOP-13140) GlobalStorageStatistics should check null FileSystem scheme to avoid NPE

2016-05-13 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283061#comment-15283061
 ] 

Arpit Agarwal commented on HADOOP-13140:


Yes I think it's better to have a special statistics object for the "null" 
scheme instead of assuming that null means "file" as Steve suggests.

Java maps support a null key, so that part will work fine, but it may require 
non-trivial changes in FileSystemStorageStatistics/GlobalStatistics if it is 
assumed everywhere that the scheme is non-null.
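
As a sketch of that sentinel approach (the key string and the surrounding 
names are placeholders, not from any actual patch):

{code}
// Sketch inside GlobalStorageStatistics#put: map a missing scheme to a
// dedicated sentinel entry instead of passing null to the map.
String key = (scheme == null) ? "(null)" : scheme;
StorageStatistics stats = map.get(key);
if (stats == null) {
  stats = provider.provide();  // provider supplies the new statistics
  map.put(key, stats);
}
return stats;
{code}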

> GlobalStorageStatistics should check null FileSystem scheme to avoid NPE
> 
>
> Key: HADOOP-13140
> URL: https://issues.apache.org/jira/browse/HADOOP-13140
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Brahma Reddy Battula
>Assignee: Mingliang Liu
> Attachments: HADOOP-13140.000.patch
>
>
> {{org.apache.hadoop.fs.GlobalStorageStatistics#put}} is not checking the null 
> scheme, and the internal map will throw an NPE. This was reported by a flaky 
> test {{TestFileSystemApplicationHistoryStore}}. Thanks [~brahmareddy] for 
> reporting.
> To address this,
> # Fix the test by providing a valid URI, e.g. {{file:///}}
> # Guard the null scheme in {{GlobalStorageStatistics#put}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12893) Verify LICENSE.txt and NOTICE.txt

2016-05-13 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283060#comment-15283060
 ] 

Sean Busbey commented on HADOOP-12893:
--

An important note: it looks like y'all are planning on a single LICENSE/NOTICE 
pair for all artifacts. ASF policy requires that the LICENSE/NOTICE file cover 
exactly the contents of each distributed artifact. That likely means that we 
need different ones for the source tarball, the convenience binary tarball, and 
different ones depending on the specific contents of jar files (not just 
per-module, but for -source.jar, -javadoc.jar, -test.jar).

> Verify LICENSE.txt and NOTICE.txt
> -
>
> Key: HADOOP-12893
> URL: https://issues.apache.org/jira/browse/HADOOP-12893
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
>Reporter: Allen Wittenauer
>Assignee: Xiao Chen
>Priority: Blocker
> Attachments: HADOOP-12893.01.patch
>
>
> We have many bundled dependencies in both the source and the binary artifacts 
> that are not in LICENSE.txt and NOTICE.txt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-12893) Verify LICENSE.txt and NOTICE.txt

2016-05-13 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-12893:
---
Attachment: HADOOP-12893.01.patch

Thanks [~wangda] for offering help! I was blocked by jira...
Attached a patch 1 for the work against trunk.
I think for now the most helpful thing is to review this patch and get trunk 
done. After that I guess we'll need to work on every release branch.

The way we ([~andrew.wang] + [~ajisakaa] + myself) did this:
# Have Whisker generate all dependencies, consolidated into a spreadsheet.
# Manually find the license/notice for each of them. The same license used by 
different dependencies is merged per LEGAL-247.
# Parse the result into new LICENSE and NOTICE files.
# Manually merge the new L&N into the current ones.
# Add a new module {{hadoop-resource-bundle}} to easily patch the L&N into 
jars' META-INF section. (Thank you HBase for showing an example.)
# Verify that all jars contain the L&N by grepping the generated jars, e.g. 
{{for f in $(find ./target/hadoop-3.0.0-alpha1-SNAPSHOT/share -name 
hadoop*jar); do echo $f;jar -tf $f|grep "LICENSE\|NOTICE";done}}

> Verify LICENSE.txt and NOTICE.txt
> -
>
> Key: HADOOP-12893
> URL: https://issues.apache.org/jira/browse/HADOOP-12893
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
>Reporter: Allen Wittenauer
>Assignee: Xiao Chen
>Priority: Blocker
> Attachments: HADOOP-12893.01.patch
>
>
> We have many bundled dependencies in both the source and the binary artifacts 
> that are not in LICENSE.txt and NOTICE.txt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13105) Support timeouts in LDAP queries in LdapGroupsMapping.

2016-05-13 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283007#comment-15283007
 ] 

Chris Nauroth commented on HADOOP-13105:


[~liuml07], thank you for the patch.

# I agree with handling both connection timeout and read timeout within the 
scope of this patch.
# I agree with the suggestion to make the timeout settings configurable, and 
use something pretty long, like 60 seconds, as the default.  A value of "0" 
could mean "don't set the timeout".  That way, there is minimal impact to 
existing LDAP deployments that experience long latency, and in case anyone 
really wants the old unbounded wait behavior, they can set it to 0.
# For testing, I suggest looking at {{TestWebHdfsTimeouts}}, which uses 
techniques similar to what your test here does.  To cover read timeout, it 
starts a TCP server that accepts connections but never responds, like what your 
patch already does.  To simulate connect timeout, it spams a bunch of 
connections at that server to consume the TCP connection backlog before running 
the test.

bq. Out of curiosity, doesn't the property 
{{hadoop.security.group.mapping.ldap.directory.search.timeout}} work for this 
purpose?

[~jojochuang], I'm pretty sure this is something different.  This is an 
application layer control, passed in the LDAP query, to give the LDAP server a 
hint that it should expect the query to complete in this amount of time.  An 
LDAP server may choose to abort its query if it cannot complete within this 
timeout.  This does not control timeouts at the TCP layer.  It would not catch 
connection timeouts due to an overloaded LDAP server that has exhausted its 
listen backlog.  It also would not catch timeouts if the LDAP server 
implementation chooses not to respect the search timeout.  It also wouldn't 
cover cases like firewall misconfigurations that accept the client's SYN packet 
for connection establishment, but then drop subsequent packets.

At least that's my recollection of what the search timeout does.  
Unfortunately, I can't find a definitive reference for that on the web right 
now to back up my claim.  :-)  I definitely have seen LDAP connection timeouts 
and read timeouts despite having the search timeout configured correctly.

If you were thinking of overloading 
{{hadoop.security.group.mapping.ldap.directory.search.timeout}} to also pass 
that same value for these new connect and read timeout settings, I'd instead 
prefer new properties for greater flexibility.
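
To make the distinction concrete, the TCP-level timeouts under discussion 
would be wired in through JNDI environment properties, roughly as sketched 
below. The two {{com.sun.jndi.ldap.*}} property names are the standard JNDI 
knobs; the Hadoop config key names here are made-up placeholders, not settled 
names:

{code}
// Sketch: separate, configurable connect and read timeouts for LDAP.
// A value of 0 keeps the old unbounded behavior.
int connectTimeoutMs = conf.getInt(
    "hadoop.security.group.mapping.ldap.connection.timeout.ms", 60 * 1000);
int readTimeoutMs = conf.getInt(
    "hadoop.security.group.mapping.ldap.read.timeout.ms", 60 * 1000);
Hashtable<String, String> env = new Hashtable<>();
if (connectTimeoutMs > 0) {
  env.put("com.sun.jndi.ldap.connect.timeout",
      String.valueOf(connectTimeoutMs));
}
if (readTimeoutMs > 0) {
  env.put("com.sun.jndi.ldap.read.timeout",
      String.valueOf(readTimeoutMs));
}
{code}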

> Support timeouts in LDAP queries in LdapGroupsMapping.
> --
>
> Key: HADOOP-13105
> URL: https://issues.apache.org/jira/browse/HADOOP-13105
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13105.000.patch
>
>
> {{LdapGroupsMapping}} currently does not set timeouts on the LDAP queries.  
> This can create a risk of a very long/infinite wait on a connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13139) Branch-2: S3a to use thread pool that blocks clients

2016-05-13 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282995#comment-15282995
 ] 

Andrew Wang commented on HADOOP-13139:
--

Possibly, but I thought deprecated keys forwarded to a new config key? Since 
this value is being removed, there's no new config key to forward to.

> Branch-2: S3a to use thread pool that blocks clients
> 
>
> Key: HADOOP-13139
> URL: https://issues.apache.org/jira/browse/HADOOP-13139
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Pieter Reuse
>Assignee: Pieter Reuse
> Attachments: HADOOP-13139-001.patch, HADOOP-13139-branch-2.001.patch, 
> HADOOP-13139-branch-2.002.patch
>
>
> HADOOP-11684 is accepted into trunk, but was not applied to branch-2. I will 
> attach a patch applicable to branch-2.
> It should be noted in CHANGES-2.8.0.txt that the config parameter 
> 'fs.s3a.threads.core' has been removed and the behavior of the 
> ThreadPool for s3a has been changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13139) Branch-2: S3a to use thread pool that blocks clients

2016-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282989#comment-15282989
 ] 

Steve Loughran commented on HADOOP-13139:
-

do you think maybe putting it in as a deprecated key would be enough?

> Branch-2: S3a to use thread pool that blocks clients
> 
>
> Key: HADOOP-13139
> URL: https://issues.apache.org/jira/browse/HADOOP-13139
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Pieter Reuse
>Assignee: Pieter Reuse
> Attachments: HADOOP-13139-001.patch, HADOOP-13139-branch-2.001.patch, 
> HADOOP-13139-branch-2.002.patch
>
>
> HADOOP-11684 is accepted into trunk, but was not applied to branch-2. I will 
> attach a patch applicable to branch-2.
> It should be noted in CHANGES-2.8.0.txt that the config parameter 
> 'fs.s3a.threads.core' has been removed and the behavior of the 
> ThreadPool for s3a has been changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13130) s3a failures can surface as RTEs, not IOEs

2016-05-13 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13130:

Target Version/s: 2.8.0
  Status: Patch Available  (was: Open)

> s3a failures can surface as RTEs, not IOEs
> --
>
> Key: HADOOP-13130
> URL: https://issues.apache.org/jira/browse/HADOOP-13130
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13130-001.patch
>
>
> S3A failures happening in the AWS library surface as 
> {{AmazonClientException}} derivatives, rather than IOEs. As the amazon 
> exceptions are runtime exceptions, any code which catches IOEs for error 
> handling breaks.
> The fix will be to catch and wrap. The hard thing will be to wrap it with 
> meaningful exceptions rather than a generic IOE. Furthermore, if anyone has 
> been catching AWS exceptions, they are going to be disappointed. That means 
> that fixing this situation could be considered "incompatible" —but only for 
> code which contains assumptions about the underlying FS and the exceptions 
> they raise.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13130) s3a failures can surface as RTEs, not IOEs

2016-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282954#comment-15282954
 ] 

Steve Loughran commented on HADOOP-13130:
-

I forgot to mention: patch 001 does make a couple of changes where the 
file-open-in-lazy-seek logic was raising EOF exceptions in non-positioned 
read() calls, where a -1 return was expected instead.

> s3a failures can surface as RTEs, not IOEs
> --
>
> Key: HADOOP-13130
> URL: https://issues.apache.org/jira/browse/HADOOP-13130
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13130-001.patch
>
>
> S3A failures happening in the AWS library surface as 
> {{AmazonClientException}} derivatives, rather than IOEs. As the amazon 
> exceptions are runtime exceptions, any code which catches IOEs for error 
> handling breaks.
> The fix will be to catch and wrap. The hard thing will be to wrap it with 
> meaningful exceptions rather than a generic IOE. Furthermore, if anyone has 
> been catching AWS exceptions, they are going to be disappointed. That means 
> that fixing this situation could be considered "incompatible" —but only for 
> code which contains assumptions about the underlying FS and the exceptions 
> they raise.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13130) s3a failures can surface as RTEs, not IOEs

2016-05-13 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13130:

Attachment: HADOOP-13130-001.patch

Amazon S3 service and client exceptions are caught and wrapped into IOEs.

If they map to standard exceptions (e.g. 404 -> not found, 416 -> EOF) then 
that is done... I opened up some of the constructors on the existing hadoop.fs 
exceptions to ease wrapping the Amazon ones here.

If they aren't known, there are two new IOEs, {{AwsServiceIOException}} and 
{{AwsS3IOException}}, which wrap {{AmazonServiceException}} and 
{{AmazonS3Exception}} respectively. These relay all the getters to the wrapped 
cause, such as {{getStatusCode()}}, {{getRawResponseContent()}}, etc.

I've gone through all the code to make sure that all invocations of the S3 
object are, ultimately, caught and translated to IOEs. For the main FS 
operations, I've done this by splitting each public operation X (rename, 
delete, ...) into an innerX and a public X, with the outer one doing the catch 
and translate. Some operations do exception handling more internally 
({{getFileStatus}} in particular), so that's more complex. The 
{{S3AFastOutputStream}} is also somewhat convoluted. Reviews there are welcome.

It's hard to test the codepaths without fault injection or knowledge of 
specific buckets which don't exist, files you can't read or write, etc. We 
could get away with that for AWS S3, but it wouldn't work against other 
endpoints. What I have done is one test:
# create an 8K file
# seek to near the end
# overwrite with a 4K file
# seek to 6K
# attempt a read(), expect -1
# attempt a readFully at 5K, expect an EOF exception
# attempt a read(byte[]), expect EOF as well

This shows that the logic for catching the situation of an InputStream whose 
underlying file has been shortened works reliably everywhere; I also check 
that deletion results in {{FileNotFoundException}} being passed up.
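
The inner/outer split looks roughly like this sketch; {{translateException}} 
is a stand-in name for the wrapping helper, and the surrounding details are 
simplified:

{code}
// Sketch of the outer/inner split: the public method is the single
// place where AWS SDK runtime exceptions become IOExceptions.
public boolean rename(Path src, Path dst) throws IOException {
  try {
    return innerRename(src, dst);  // the original logic lives here
  } catch (AmazonClientException e) {
    // Stand-in for the helper that maps SDK failures (404, 416, ...)
    // onto the matching IOException subclasses.
    throw translateException("rename", src.toString(), e);
  }
}
{code}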

> s3a failures can surface as RTEs, not IOEs
> --
>
> Key: HADOOP-13130
> URL: https://issues.apache.org/jira/browse/HADOOP-13130
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13130-001.patch
>
>
> S3A failures happening in the AWS library surface as 
> {{AmazonClientException}} derivatives, rather than IOEs. As the amazon 
> exceptions are runtime exceptions, any code which catches IOEs for error 
> handling breaks.
> The fix will be to catch and wrap. The hard thing will be to wrap it with 
> meaningful exceptions rather than a generic IOE. Furthermore, if anyone has 
> been catching AWS exceptions, they are going to be disappointed. That means 
> that fixing this situation could be considered "incompatible" —but only for 
> code which contains assumptions about the underlying FS and the exceptions 
> they raise.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-11505) Various native parts use bswap incorrectly and unportably

2016-05-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-11505:
-
Target Version/s: 3.0.0-alpha1
   Fix Version/s: (was: 3.0.0-alpha1)

> Various native parts use bswap incorrectly and unportably
> -
>
> Key: HADOOP-11505
> URL: https://issues.apache.org/jira/browse/HADOOP-11505
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Colin Patrick McCabe
>Assignee: Alan Burlison
>Priority: Critical
> Attachments: HADOOP-11505.001.patch, HADOOP-11505.003.patch, 
> HADOOP-11505.004.patch, HADOOP-11505.005.patch, HADOOP-11505.006.patch, 
> HADOOP-11505.007.patch, HADOOP-11505.008.patch
>
>
> hadoop-mapreduce-client-nativetask fails to use x86 optimizations in some 
> cases.  Also, on some alternate, non-x86, non-ARM architectures the generated 
> code is incorrect.  Thanks to Steve Loughran and Edward Nevill for finding 
> this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13139) Branch-2: S3a to use thread pool that blocks clients

2016-05-13 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282893#comment-15282893
 ] 

Andrew Wang commented on HADOOP-13139:
--

Given this is an incompatible change we're targeting for branch-2 (and we're 
sure we really want to go through with this), I'd also like to see a log WARN 
if the removed config key is set, explaining what to configure instead. That'd 
be a good improvement even for trunk too.
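
Something like the sketch below would do it (the warning text is 
illustrative):

{code}
// Sketch: warn when the removed key is still set in the configuration.
if (conf.get("fs.s3a.threads.core") != null) {
  LOG.warn("Config option fs.s3a.threads.core is no longer used; "
      + "the S3A thread pool is sized by fs.s3a.threads.max and now "
      + "blocks callers when all threads are busy.");
}
{code}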

> Branch-2: S3a to use thread pool that blocks clients
> 
>
> Key: HADOOP-13139
> URL: https://issues.apache.org/jira/browse/HADOOP-13139
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Pieter Reuse
>Assignee: Pieter Reuse
> Attachments: HADOOP-13139-001.patch, HADOOP-13139-branch-2.001.patch, 
> HADOOP-13139-branch-2.002.patch
>
>
> HADOOP-11684 is accepted into trunk, but was not applied to branch-2. I will 
> attach a patch applicable to branch-2.
> It should be noted in CHANGES-2.8.0.txt that the config parameter 
> 'fs.s3a.threads.core' has been removed and the behavior of the 
> ThreadPool for s3a has been changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13139) Branch-2: S3a to use thread pool that blocks clients

2016-05-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-13139:
-
Fix Version/s: (was: 2.8.0)

Fix version shouldn't be set until the patch is committed.

Could you also add whatever user information is required in the release note 
field?

> Branch-2: S3a to use thread pool that blocks clients
> 
>
> Key: HADOOP-13139
> URL: https://issues.apache.org/jira/browse/HADOOP-13139
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Pieter Reuse
>Assignee: Pieter Reuse
> Attachments: HADOOP-13139-001.patch, HADOOP-13139-branch-2.001.patch, 
> HADOOP-13139-branch-2.002.patch
>
>
> HADOOP-11684 is accepted into trunk, but was not applied to branch-2. I will 
> attach a patch applicable to branch-2.
> It should be noted in CHANGES-2.8.0.txt that the config parameter 
> 'fs.s3a.threads.core' has been removed and the behavior of the 
> ThreadPool for s3a has been changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-12971) FileSystemShell doc should explain relative path

2016-05-13 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated HADOOP-12971:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.0
   2.8.0
   Status: Resolved  (was: Patch Available)

Committed the patch. Thanks for your contribution [~jzhuge]!

> FileSystemShell doc should explain relative path
> 
>
> Key: HADOOP-12971
> URL: https://issues.apache.org/jira/browse/HADOOP-12971
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.7.2
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Critical
> Fix For: 2.8.0, 2.9.0
>
> Attachments: HADOOP-12971.001.patch
>
>
> Update 
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html
>  with information about relative path and current working directory, as 
> suggested by [~yzhangal] during HADOOP-10965 discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12971) FileSystemShell doc should explain relative path

2016-05-13 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282820#comment-15282820
 ] 

Sangjin Lee commented on HADOOP-12971:
--

+1 overall

| Vote |  Subsystem |  Runtime   | Comment

|   0  |reexec  |  0m 12s| Docker mode activated. 
|  +1  |   @author  |  0m 0s | The patch does not contain any @author 
|  ||| tags.
|  +1  |mvninstall  |  6m 35s| trunk passed 
|  +1  |   mvnsite  |  0m 59s| trunk passed 
|  +1  |   mvnsite  |  0m 54s| the patch passed 
|  +1  |whitespace  |  0m 0s | The patch has no whitespace issues. 
|  +1  |asflicense  |  0m 17s| The patch does not generate ASF License 
|  ||| warnings.
|  ||  9m 12s| 


|| Subsystem || Report/Notes ||

| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12803807/HADOOP-12971.001.patch
 |
| JIRA Issue | HADOOP-12971 |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux e903f2f8ccc1 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 3c5c57a |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/9412/console |
| Powered by | Apache Yetus 0.3.0-SNAPSHOT   http://yetus.apache.org |


> FileSystemShell doc should explain relative path
> 
>
> Key: HADOOP-12971
> URL: https://issues.apache.org/jira/browse/HADOOP-12971
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.7.2
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Critical
> Attachments: HADOOP-12971.001.patch
>
>
> Update 
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html
>  with information about relative path and current working directory, as 
> suggested by [~yzhangal] during HADOOP-10965 discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12971) FileSystemShell doc should explain relative path

2016-05-13 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282821#comment-15282821
 ] 

Sangjin Lee commented on HADOOP-12971:
--

+1. Will commit it today. 

> FileSystemShell doc should explain relative path
> 
>
> Key: HADOOP-12971
> URL: https://issues.apache.org/jira/browse/HADOOP-12971
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.7.2
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Critical
> Attachments: HADOOP-12971.001.patch
>
>
> Update 
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html
>  with information about relative path and current working directory, as 
> suggested by [~yzhangal] during HADOOP-10965 discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13130) s3a failures can surface as RTEs, not IOEs

2016-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282766#comment-15282766
 ] 

Steve Loughran commented on HADOOP-13130:
-

One aspect of this is that I've managed to create a test which triggers an EOF 
exception in read()... the {{lazySeek()}} logic is outside the try/catch clause 
here. If someone overwrites a file with a shorter one, and then you seek to 
somewhere which you think is in range but actually isn't, a 416 comes back, 
which I'm not mapping to EOF. Fix: move the {{lazySeek()}} call inside the 
try/catch.
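
Roughly (a sketch; {{nextReadPos}} and {{wrappedStream}} are assumed field 
names):

{code}
// Sketch: do the lazy seek inside the try/catch that already turns
// end-of-stream conditions into a -1 return.
public synchronized int read() throws IOException {
  int byteRead;
  try {
    lazySeek(nextReadPos, 1);  // a 416 on reopen now maps to EOF here
    byteRead = wrappedStream.read();
  } catch (EOFException e) {
    return -1;
  }
  if (byteRead >= 0) {
    nextReadPos++;
  }
  return byteRead;
}
{code}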

> s3a failures can surface as RTEs, not IOEs
> --
>
> Key: HADOOP-13130
> URL: https://issues.apache.org/jira/browse/HADOOP-13130
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>
> S3A failures happening in the AWS library surface as 
> {{AmazonClientException}} derivatives, rather than IOEs. As the amazon 
> exceptions are runtime exceptions, any code which catches IOEs for error 
> handling breaks.
> The fix will be to catch and wrap. The hard thing will be to wrap it with 
> meaningful exceptions rather than a generic IOE. Furthermore, if anyone has 
> been catching AWS exceptions, they are going to be disappointed. That means 
> that fixing this situation could be considered "incompatible" —but only for 
> code which contains assumptions about the underlying FS and the exceptions 
> they raise.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-13130) s3a failures can surface as RTEs, not IOEs

2016-05-13 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-13130:
---

Assignee: Steve Loughran

> s3a failures can surface as RTEs, not IOEs
> --
>
> Key: HADOOP-13130
> URL: https://issues.apache.org/jira/browse/HADOOP-13130
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>
> S3A failures happening in the AWS library surface as 
> {{AmazonClientException}} derivatives, rather than IOEs. As the amazon 
> exceptions are runtime exceptions, any code which catches IOEs for error 
> handling breaks.
> The fix will be to catch and wrap. The hard thing will be to wrap it with 
> meaningful exceptions rather than a generic IOE. Furthermore, if anyone has 
> been catching AWS exceptions, they are going to be disappointed. That means 
> that fixing this situation could be considered "incompatible" —but only for 
> code which contains assumptions about the underlying FS and the exceptions 
> they raise.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-12291) Add support for nested groups in LdapGroupsMapping

2016-05-13 Thread Esther Kundin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esther Kundin updated HADOOP-12291:
---
Status: In Progress  (was: Patch Available)

> Add support for nested groups in LdapGroupsMapping
> --
>
> Key: HADOOP-12291
> URL: https://issues.apache.org/jira/browse/HADOOP-12291
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.8.0
>Reporter: Gautam Gopalakrishnan
>Assignee: Esther Kundin
>  Labels: features, patch
> Fix For: 2.8.0
>
> Attachments: HADOOP-12291.001.patch, HADOOP-12291.002.patch, 
> HADOOP-12291.003.patch, HADOOP-12291.004.patch, HADOOP-12291.005.patch, 
> HADOOP-12291.006.patch
>
>
> When using {{LdapGroupsMapping}} with Hadoop, nested groups are not 
> supported. So for example if user {{jdoe}} is part of group A which is a 
> member of group B, the group mapping currently returns only group A.
> Currently this facility is available with {{ShellBasedUnixGroupsMapping}} and 
> SSSD (or similar tools), but it would be good to have this feature as part of 
> {{LdapGroupsMapping}} directly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-12291) Add support for nested groups in LdapGroupsMapping

2016-05-13 Thread Esther Kundin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esther Kundin updated HADOOP-12291:
---
Status: Patch Available  (was: In Progress)

> Add support for nested groups in LdapGroupsMapping
> --
>
> Key: HADOOP-12291
> URL: https://issues.apache.org/jira/browse/HADOOP-12291
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.8.0
>Reporter: Gautam Gopalakrishnan
>Assignee: Esther Kundin
>  Labels: features, patch
> Fix For: 2.8.0
>
> Attachments: HADOOP-12291.001.patch, HADOOP-12291.002.patch, 
> HADOOP-12291.003.patch, HADOOP-12291.004.patch, HADOOP-12291.005.patch, 
> HADOOP-12291.006.patch
>
>
> When using {{LdapGroupsMapping}} with Hadoop, nested groups are not 
> supported. So for example if user {{jdoe}} is part of group A which is a 
> member of group B, the group mapping currently returns only group A.
> Currently this facility is available with {{ShellBasedUnixGroupsMapping}} and 
> SSSD (or similar tools), but it would be good to have this feature as part of 
> {{LdapGroupsMapping}} directly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-12291) Add support for nested groups in LdapGroupsMapping

2016-05-13 Thread Esther Kundin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esther Kundin updated HADOOP-12291:
---
Attachment: HADOOP-12291.006.patch

> Add support for nested groups in LdapGroupsMapping
> --
>
> Key: HADOOP-12291
> URL: https://issues.apache.org/jira/browse/HADOOP-12291
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.8.0
>Reporter: Gautam Gopalakrishnan
>Assignee: Esther Kundin
>  Labels: features, patch
> Fix For: 2.8.0
>
> Attachments: HADOOP-12291.001.patch, HADOOP-12291.002.patch, 
> HADOOP-12291.003.patch, HADOOP-12291.004.patch, HADOOP-12291.005.patch, 
> HADOOP-12291.006.patch
>
>
> When using {{LdapGroupsMapping}} with Hadoop, nested groups are not 
> supported. So for example if user {{jdoe}} is part of group A which is a 
> member of group B, the group mapping currently returns only group A.
> Currently this facility is available with {{ShellBasedUnixGroupsMapping}} and 
> SSSD (or similar tools), but it would be good to have this feature as part of 
> {{LdapGroupsMapping}} directly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13139) Branch-2: S3a to use thread pool that blocks clients

2016-05-13 Thread Pieter Reuse (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pieter Reuse updated HADOOP-13139:
--
Status: Patch Available  (was: In Progress)

> Branch-2: S3a to use thread pool that blocks clients
> 
>
> Key: HADOOP-13139
> URL: https://issues.apache.org/jira/browse/HADOOP-13139
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Pieter Reuse
>Assignee: Pieter Reuse
> Fix For: 2.8.0
>
> Attachments: HADOOP-13139-001.patch, HADOOP-13139-branch-2.001.patch, 
> HADOOP-13139-branch-2.002.patch
>
>
> HADOOP-11684 is accepted into trunk, but was not applied to branch-2. I will 
> attach a patch applicable to branch-2.
> It should be noted in CHANGES-2.8.0.txt that the config parameter 
> 'fs.s3a.threads.core' has been removed and the behavior of the 
> ThreadPool for s3a has been changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13139) Branch-2: S3a to use thread pool that blocks clients

2016-05-13 Thread Pieter Reuse (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pieter Reuse updated HADOOP-13139:
--
Attachment: HADOOP-13139-branch-2.002.patch

Uploaded patch 002: updated the patch w.r.t. HADOOP-13028, fixed the checkstyle 
issues, and copied the HADOOP-12553 approach to fix the javadoc error.

> Branch-2: S3a to use thread pool that blocks clients
> 
>
> Key: HADOOP-13139
> URL: https://issues.apache.org/jira/browse/HADOOP-13139
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Pieter Reuse
>Assignee: Pieter Reuse
> Fix For: 2.8.0
>
> Attachments: HADOOP-13139-001.patch, HADOOP-13139-branch-2.001.patch, 
> HADOOP-13139-branch-2.002.patch
>
>
> HADOOP-11684 was accepted into trunk, but was not applied to branch-2. I will 
> attach a patch applicable to branch-2.
> It should be noted in CHANGES-2.8.0.txt that the config parameter 
> 'fs.s3a.threads.core' has been removed and the behavior of the 
> ThreadPool for s3a has changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13059) S3a over-reacts to potentially transient network problems in its init() logic

2016-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282679#comment-15282679
 ] 

Steve Loughran commented on HADOOP-13059:
-

If the exception raised is an {{AmazonClientException}}, its {{isRetryable()}} 
flag can perhaps be queried: if the answer is false, then the failure is 
presumably not transient.
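
A minimal sketch of that check (editor's illustration of the idea above; 
{{verifyBucketExists()}} and {{LOG}} are hypothetical names, 
{{AmazonClientException.isRetryable()}} is from the AWS SDK):

{code}
try {
  verifyBucketExists();
} catch (AmazonClientException e) {
  if (e.isRetryable()) {
    // Possibly transient (network blip, throttling): log and continue;
    // later operations may succeed, or fail with a clearer error.
    LOG.warn("Unable to verify bucket during init, continuing: " + e, e);
  } else {
    // Definitely not transient: fail fast in the constructor.
    throw new IOException("Cannot initialize S3A against bucket", e);
  }
}
{code}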

> S3a over-reacts to potentially transient network problems in its init() logic
> -
>
> Key: HADOOP-13059
> URL: https://issues.apache.org/jira/browse/HADOOP-13059
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Minor
> Attachments: HADOOP-13059-001.patch
>
>
> If s3a cannot connect to AWS for any reason, the constructor fails, even if 
> the problem is a potentially transient event.
> This happens because the code that checks whether the bucket exists relays 
> the exceptions.
> The constructor should catch IOEs against the remote FS, downgrade them to a 
> warning, and let the code continue; it may fail later, but it may also recover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13143) harden S3AFileSystem.copyFromLocalFile()

2016-05-13 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-13143:
---

 Summary: harden S3AFileSystem.copyFromLocalFile()
 Key: HADOOP-13143
 URL: https://issues.apache.org/jira/browse/HADOOP-13143
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 2.8.0
Reporter: Steve Loughran
Priority: Minor


{{S3AFileSystem.copyFromLocalFile()}} lacks some basic argument checks 
(does the source file exist, do the parent directories of dest exist, ...).

The original {{FileSystem.copyFromLocalFile()}} codepaths need review to see 
how they handle such situations, and the {{S3AFileSystem}} implementation 
should be made consistent with them. This is best done with an update to the 
FS Specification and a contract test.
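
The kind of checks meant here might look roughly like this (editor's sketch 
only, simplified to the single-file case; the exact checks must follow 
whatever the FS Specification ends up mandating):

{code}
@Override
public void copyFromLocalFile(boolean delSrc, boolean overwrite,
    Path src, Path dst) throws IOException {
  LocalFileSystem local = FileSystem.getLocal(getConf());
  // Throws FileNotFoundException if the source is missing.
  FileStatus srcStatus = local.getFileStatus(src);
  if (srcStatus.isDirectory()) {
    // Simplified: directory trees would need their own handling.
    throw new PathIsDirectoryException(src.toString());
  }
  if (!overwrite && exists(dst)) {
    throw new FileAlreadyExistsException(dst.toString());
  }
  // ...proceed with the upload as before...
}
{code}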



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13132) LoadBalancingKMSClientProvider ClassCastException on AuthenticationException

2016-05-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HADOOP-13132:
-
Attachment: HADOOP-13132.001.patch

> LoadBalancingKMSClientProvider ClassCastException on AuthenticationException
> 
>
> Key: HADOOP-13132
> URL: https://issues.apache.org/jira/browse/HADOOP-13132
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Reporter: Miklos Szurap
>Assignee: Wei-Chiu Chuang
> Attachments: HADOOP-13132.001.patch
>
>
> An Oozie job with a single shell action fails (this may not be important, but 
> if you need the exact details I can provide them) with an error message 
> coming from the NodeManager:
> {code}
> 2016-05-10 11:10:14,290 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[LogAggregationService #652,5,main] threw an Exception.
> java.lang.ClassCastException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException 
> cannot be cast to java.security.GeneralSecurityException
> at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:189)
> at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
> at 
> org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1419)
> at 
> org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1521)
> at org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:108)
> at org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:59)
> at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:577)
> at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:683)
> at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:679)
> at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
> at org.apache.hadoop.fs.FileContext.create(FileContext.java:679)
> at 
> org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter$1.run(AggregatedLogFormat.java:382)
> at 
> org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter$1.run(AggregatedLogFormat.java:377)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at 
> org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.<init>(AggregatedLogFormat.java:376)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:246)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:456)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:421)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$2.run(LogAggregationService.java:384)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> The unsafe cast is here:
> https://github.com/apache/hadoop/blob/2e1d0ff4e901b8313c8d71869735b94ed8bc40a0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/LoadBalancingKMSClientProvider.java#L189
> Because of this ClassCastException:
> - an uncaught exception is raised
> - we do not see the exact "caused by" exception/message
> - the Oozie job fails
> - YARN logs are not reported/saved
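
One way to avoid the unsafe cast linked above (editor's sketch, not the 
attached patch): only cast when the runtime type matches, and wrap anything 
else so the real cause survives.

{code}
// Fragment: replace the blind (GeneralSecurityException) cast at line 189
// with a type check, so an AuthenticationException is wrapped rather than
// triggering a ClassCastException.
try {
  // ... the provider operation being retried across KMS instances ...
} catch (Exception e) {
  if (e instanceof IOException) {
    throw (IOException) e;
  } else if (e instanceof GeneralSecurityException) {
    throw (GeneralSecurityException) e;
  } else {
    // e.g. AuthenticationException: preserve it as the cause.
    throw new IOException("decryptEncryptedKey failed", e);
  }
}
{code}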



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12878) Impersonate hosts in s3a for better data locality handling

2016-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282631#comment-15282631
 ] 

Steve Loughran commented on HADOOP-12878:
-

Link to the azure code here: 
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java#L2550-L2589

> Impersonate hosts in s3a for better data locality handling
> --
>
> Key: HADOOP-12878
> URL: https://issues.apache.org/jira/browse/HADOOP-12878
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Thomas Demoor
>Assignee: Thomas Demoor
>
> Currently, {{localhost}} is passed as the locality for each block, causing all 
> blocks involved in a job to initially target the same node (the RM), before 
> being moved by the scheduler (to a rack-local node). This reduces parallelism 
> for jobs (with short-lived mappers). 
> We should mimic Azure's implementation: a config setting 
> {{fs.s3a.block.location.impersonatedhost}} where the user can enter the list 
> of hostnames in the cluster to return from {{getFileBlockLocations}}. 
> Possible optimization: for larger systems, it might be better to return N 
> (5?) random hostnames to prevent passing a huge array (the downstream code 
> assumes size = O(3)).
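
The proposed behaviour might look roughly like this (editor's sketch; only the 
config key comes from the description above, everything else is hypothetical):

{code}
@Override
public BlockLocation[] getFileBlockLocations(FileStatus file, long start,
    long len) throws IOException {
  if (file == null || file.getLen() == 0) {
    return super.getFileBlockLocations(file, start, len);
  }
  String[] hosts = getConf().getStrings(
      "fs.s3a.block.location.impersonatedhost", "localhost");
  // Claim the whole file lives on the configured hosts; the scheduler can
  // then spread work across them instead of piling onto one node.
  return new BlockLocation[] {
      new BlockLocation(hosts, hosts, 0, file.getLen())
  };
}
{code}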



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-11873) Include disk read/write time in FileSystem.Statistics

2016-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282629#comment-15282629
 ] 

Steve Loughran commented on HADOOP-11873:
-

This could be implemented as a FilterFileSystem; it would then be available to 
collect metrics on any FS, rather than just HDFS.
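
A minimal sketch of that FilterFileSystem idea (editor's illustration; the 
class and field names are invented, and only the read paths are instrumented 
for brevity):

{code}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.hadoop.fs.*;

public class TimedFileSystem extends FilterFileSystem {
  private final AtomicLong readNanos = new AtomicLong();

  public TimedFileSystem(FileSystem fs) {
    super(fs);
  }

  public long getReadNanos() {
    return readNanos.get();
  }

  @Override
  public FSDataInputStream open(Path f, int bufferSize) throws IOException {
    final FSDataInputStream in = super.open(f, bufferSize);
    // Wrap the stream so every read is timed before delegating.
    return new FSDataInputStream(new FSInputStream() {
      @Override
      public int read() throws IOException {
        long t0 = System.nanoTime();
        try {
          return in.read();
        } finally {
          readNanos.addAndGet(System.nanoTime() - t0);
        }
      }

      @Override
      public int read(byte[] b, int off, int len) throws IOException {
        long t0 = System.nanoTime();
        try {
          return in.read(b, off, len);
        } finally {
          readNanos.addAndGet(System.nanoTime() - t0);
        }
      }

      @Override
      public void seek(long pos) throws IOException {
        in.seek(pos);
      }

      @Override
      public long getPos() throws IOException {
        return in.getPos();
      }

      @Override
      public boolean seekToNewSource(long target) throws IOException {
        return in.seekToNewSource(target);
      }
    });
  }
}
{code}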

> Include disk read/write time in FileSystem.Statistics
> -
>
> Key: HADOOP-11873
> URL: https://issues.apache.org/jira/browse/HADOOP-11873
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: metrics
>Reporter: Kay Ousterhout
>Priority: Minor
>
> Measuring the time spent blocking on reading / writing data from / to disk is 
> very useful for debugging performance problems in applications that read data 
> from Hadoop, and can give much more information (e.g., to reflect disk 
> contention) than just knowing the total amount of data read.  I'd like to add 
> something like "diskMillis" to FileSystem#Statistics to track this.
> For data read from HDFS, this can be done with very low overhead by adding 
> logging around calls to RemoteBlockReader2.readNextPacket (because this reads 
> larger chunks of data, the time added by the instrumentation is very small 
> relative to the time to actually read the data).  For data written to HDFS, 
> this can be done in DFSOutputStream.waitAndQueueCurrentPacket.
> As far as I know, if you want this information today, it is only currently 
> accessible by turning on HTrace. It looks like HTrace can't be selectively 
> enabled, so a user can't just turn on the tracing on 
> RemoteBlockReader2.readNextPacket for example, and instead needs to turn on 
> tracing everywhere (which then introduces a bunch of overhead -- so sampling 
> is necessary).  It would be hugely helpful to have native metrics for time 
> reading / writing to disk that are sufficiently low-overhead to be always on. 
> (Please correct me if I'm wrong here about what's possible today!)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-11601) Enhance FS spec & tests to mandate FileStatus.getBlocksize() >0 for non-empty files

2016-05-13 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-11601:

Status: Patch Available  (was: Open)

> Enhance FS spec & tests to mandate FileStatus.getBlocksize() >0 for non-empty 
> files
> ---
>
> Key: HADOOP-11601
> URL: https://issues.apache.org/jira/browse/HADOOP-11601
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-11601-001.patch, HADOOP-11601-002.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> HADOOP-11584 has shown that the contract tests are not validating that 
> {{FileStatus.getBlocksize()}} must be >0 for analytics jobs to partition 
> their workload correctly. 
> Clarify this in the spec text and add a test for it. The test MUST be designed 
> to work against eventually consistent filesystems, where the result of 
> {{getFileStatus()}} may not be immediately visible, by retrying the operation 
> if the FS declares it is an object store.
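
Such a contract test might look roughly like this (editor's sketch, assuming 
the existing AbstractFSContractTestBase / ContractTestUtils helpers; the 
object-store retry logic is reduced to a comment):

{code}
@Test
public void testNonEmptyFileBlocksizePositive() throws Throwable {
  Path path = path("testNonEmptyFileBlocksizePositive");
  createFile(getFileSystem(), path, true, dataset(256, 'a', 'z'));
  // On an eventually consistent object store, getFileStatus() may need to
  // be retried here until the file becomes visible.
  FileStatus status = getFileSystem().getFileStatus(path);
  assertTrue("Expected blocksize > 0 in " + status,
      status.getBlockSize() > 0);
}
{code}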



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-11601) Enhance FS spec & tests to mandate FileStatus.getBlocksize() >0 for non-empty files

2016-05-13 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-11601:

Attachment: HADOOP-11601-002.patch

Synced with trunk.

> Enhance FS spec & tests to mandate FileStatus.getBlocksize() >0 for non-empty 
> files
> ---
>
> Key: HADOOP-11601
> URL: https://issues.apache.org/jira/browse/HADOOP-11601
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-11601-001.patch, HADOOP-11601-002.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> HADOOP-11584 has shown that the contract tests are not validating that 
> {{FileStatus.getBlocksize()}} must be >0 for analytics jobs to partition 
> their workload correctly. 
> Clarify this in the spec text and add a test for it. The test MUST be designed 
> to work against eventually consistent filesystems, where the result of 
> {{getFileStatus()}} may not be immediately visible, by retrying the operation 
> if the FS declares it is an object store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-11601) Enhance FS spec & tests to mandate FileStatus.getBlocksize() >0 for non-empty files

2016-05-13 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-11601:

Status: Open  (was: Patch Available)

> Enhance FS spec & tests to mandate FileStatus.getBlocksize() >0 for non-empty 
> files
> ---
>
> Key: HADOOP-11601
> URL: https://issues.apache.org/jira/browse/HADOOP-11601
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-11601-001.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> HADOOP-11584 has shown that the contract tests are not validating that 
> {{FileStatus.getBlocksize()}} must be >0 for analytics jobs to partition 
> their workload correctly. 
> Clarify this in the spec text and add a test for it. The test MUST be designed 
> to work against eventually consistent filesystems, where the result of 
> {{getFileStatus()}} may not be immediately visible, by retrying the operation 
> if the FS declares it is an object store.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13140) GlobalStorageStatistics should check null FileSystem scheme to avoid NPE

2016-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282579#comment-15282579
 ] 

Steve Loughran commented on HADOOP-13140:
-

It's probably something just done in tests. Why not have a special scheme for 
those without schemes, say "schemeless", which they can all share?

> GlobalStorageStatistics should check null FileSystem scheme to avoid NPE
> 
>
> Key: HADOOP-13140
> URL: https://issues.apache.org/jira/browse/HADOOP-13140
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Brahma Reddy Battula
>Assignee: Mingliang Liu
> Attachments: HADOOP-13140.000.patch
>
>
> {{org.apache.hadoop.fs.GlobalStorageStatistics#put}} does not check for a null 
> scheme, so the internal map throws an NPE. This was reported by the flaky 
> test {{TestFileSystemApplicationHistoryStore}}. Thanks [~brahmareddy] for 
> reporting.
> To address this,
> # Fix the test by providing a valid URI, e.g. {{file:///}}
> # Guard against a null scheme in {{GlobalStorageStatistics#put}}
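
A minimal sketch of both fixes (editor's illustration; exact signatures 
elided, and Guava's Preconditions is assumed for the guard):

{code}
// 1. In the test, construct the filesystem with a URI that has a scheme:
FileSystem fs = FileSystem.get(new URI("file:///"), conf);

// 2. In GlobalStorageStatistics#put, fail fast with a clear message instead
//    of letting the internal map throw a bare NPE:
Preconditions.checkNotNull(scheme, "null storage statistics scheme/name");
{code}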



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13113) Enable parallel test execution for hadoop-aws.

2016-05-13 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13113:

   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

> Enable parallel test execution for hadoop-aws.
> --
>
> Key: HADOOP-13113
> URL: https://issues.apache.org/jira/browse/HADOOP-13113
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: test
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 2.8.0
>
> Attachments: HADOOP-13113.001.patch, HADOOP-13113.002.patch, 
> HADOOP-13113.003.patch, HADOOP-13113.004.patch
>
>
> The full hadoop-aws test suite takes ~30 minutes to execute.  The tests spend 
> most of their time blocked on network I/O with the S3 back-end, but they 
> don't saturate the bandwidth of the NIC.  We can improve overall execution 
> time by enabling parallel test execution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13113) Enable parallel test execution for hadoop-aws.

2016-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282571#comment-15282571
 ] 

Steve Loughran commented on HADOOP-13113:
-

LGTM

+1

> Enable parallel test execution for hadoop-aws.
> --
>
> Key: HADOOP-13113
> URL: https://issues.apache.org/jira/browse/HADOOP-13113
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: test
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HADOOP-13113.001.patch, HADOOP-13113.002.patch, 
> HADOOP-13113.003.patch, HADOOP-13113.004.patch
>
>
> The full hadoop-aws test suite takes ~30 minutes to execute.  The tests spend 
> most of their time blocked on network I/O with the S3 back-end, but they 
> don't saturate the bandwidth of the NIC.  We can improve overall execution 
> time by enabling parallel test execution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12079) Make 'X-Newest' header a configurable

2016-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282570#comment-15282570
 ] 

Steve Loughran commented on HADOOP-12079:
-

Couple of comments here:

# Can a lower-case option go in, the way all other Hadoop options are named, 
using "." as the separator? Simply {{.newest}} should suffice, as the "X-" bit 
is an implementation detail users don't need to worry about.
# It needs a test to verify that reads work with the flag == false. Presumably 
a test which recognises that things *may* be inconsistent and spins slowly. I 
recommend something which does a write, an initial read, then an overwrite 
with a larger file, then a new open and read at an offset beyond the EOF of 
the original file. That's essentially where I saw inconsistencies in the past.

> Make 'X-Newest' header a configurable
> -
>
> Key: HADOOP-12079
> URL: https://issues.apache.org/jira/browse/HADOOP-12079
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/swift
>Affects Versions: 2.6.0, 3.0.0-alpha1
>Reporter: Gil Vernik
>Assignee: Gil Vernik
> Attachments: x-newest-optional0001.patch, 
> x-newest-optional0002.patch, x-newest-optional0003.patch, 
> x-newest-optional0004.patch, x-newest-optional0005.patch
>
>
> The current code always sends the X-Newest header to Swift. While it's true 
> that Swift is eventually consistent and X-Newest will always get the newest 
> version from Swift, in practice this header makes Swift responses very slow. 
> This header should be optional, so that it is possible to access Swift 
> without it and get much better performance. 
> This patch doesn't modify current behavior. Everything works as is, but there 
> is an option to set fs.swift.service.useXNewest = false. 
> Some background on Swift and X-Newest: 
> When a GET or HEAD request is made to an object, the default behavior is to 
> get the data from one of the replicas (it could be any of them). The downside 
> is that if there are older versions of the object (due to eventual 
> consistency) it is possible to get an older version of the object. The upside 
> is that for the majority of use cases, this isn't an issue. For the small 
> subset of use cases that need to make sure they get the latest version of the 
> object, they can set the "X-Newest" header to "True". If this is set, the 
> proxy server will check all replicas of the object and only return the newest 
> one. The downside is that the request can take longer, since it has to 
> contact all the replicas. It is also more expensive for the backend, so it is 
> only recommended when absolutely needed.
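
For illustration, making the header optional might look roughly like this 
(editor's sketch; the config key is the one named above, the field and helper 
method are invented, and commons-httpclient's {{HttpMethod}} is assumed):

{code}
private boolean useXNewest;

public void initialize(URI uri, Configuration conf) {
  // Default true preserves the current always-send behavior.
  useXNewest = conf.getBoolean("fs.swift.service.useXNewest", true);
}

private void maybeAddNewestHeader(HttpMethod method) {
  if (useXNewest) {
    // Strongly consistent read: slower, but always the newest replica.
    method.setRequestHeader("X-Newest", "true");
  }
}
{code}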



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-6208) Block loss in S3FS due to S3 inconsistency on file rename

2016-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282565#comment-15282565
 ] 

Steve Loughran commented on HADOOP-6208:


I'm going to close this as a WONTFIX due to the imminent demise of the original 
S3FS.

> Block loss in S3FS due to S3 inconsistency on file rename
> -
>
> Key: HADOOP-6208
> URL: https://issues.apache.org/jira/browse/HADOOP-6208
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 0.20.0, 0.20.1
> Environment: Ubuntu Linux 8.04 on EC2, Mac OS X 10.5, likely to 
> affect any Hadoop environment
>Reporter: Bradley Buda
>Assignee: Bradley Buda
> Attachments: HADOOP-6208.patch, S3FSConsistencyPollingTest.java, 
> S3FSConsistencyTest.java
>
>
> Under certain S3 consistency scenarios, Hadoop's S3FileSystem can 'truncate' 
> files, especially when writing reduce outputs.  We've noticed this at 
> tracksimple where we use the S3FS as the direct input and output of our 
> MapReduce jobs.  The symptom of this problem is a file in the filesystem that 
> is an exact multiple of the FS block size - exactly 32MB, 64MB, 96MB, etc. in 
> length.
> The issue appears to be caused by renaming a file that has recently been 
> written, and getting a stale INode read from S3.  When a reducer is writing 
> job output to the S3FS, the normal series of S3 key writes for a 3-block file 
> looks something like this:
> Task Output:
> 1) Write the first block (block_99)
> 2) Write an INode 
> (/myjob/_temporary/_attempt_200907142159_0306_r_000133_0/part-00133.gz) 
> containing [block_99]
> 3) Write the second block (block_81)
> 4) Rewrite the INode with new contents [block_99, block_81]
> 5) Write the last block (block_-101)
> 6) Rewrite the INode with the final contents [block_99, block_81, block_-101]
> Copy Output to Final Location (ReduceTask#copyOutput):
> 1) Read the INode contents from 
> /myjob/_temporary/_attempt_200907142159_0306_r_000133_0/part-00133.gz, which 
> gives [block_99, block_81, block_-101]
> 2) Write the data from #1 to the final location, /myjob/part-00133.gz
> 3) Delete the old INode 
> The output file is truncated if S3 serves a stale copy of the temporary 
> INode.  In copyOutput, step 1 above, it is possible for S3 to return a 
> version of the temporary INode that contains just [block_99, block_81].  In 
> this case, we write this new data to the final output location, and 'lose' 
> block_-101 in the process.  Since we then delete the temporary INode, we've 
> lost all references to the final block of this file and it's orphaned in the 
> S3 bucket.
> This type of consistency error is infrequent but not impossible. We've 
> observed these failures about once a week for one of our large jobs which 
> runs daily and has 200 reduce outputs; so we're seeing an error rate of 
> something like 0.07% per reduce.
> These kinds of errors are generally difficult to handle in a system like S3.  
> We have a few ideas about how to fix this:
> 1) HACK! Sleep during S3OutputStream#close or #flush to wait for S3 to catch 
> up and make these less likely.
> 2) Poll for updated MD5 or INode data in Jets3tFileSystemStore#storeINode 
> until S3 says the INode contents are the same as our local copy.  This could 
> be a config option - "fs.s3.verifyInodeWrites" or something like that.
> 3) Cache INode contents in-process, so we don't have to go back to S3 to ask 
> for the current version of an INode.
> 4) Only write INodes once, when the output stream is closed.  This would 
> basically make S3OutputStream#flush() a no-op.
> 5) Modify the S3FS to somehow version INodes (unclear how we would do this, 
> need some design work).
> 6) Avoid using the S3FS for temporary task attempt files.
> 7) Avoid using the S3FS completely.
> We wanted to get some guidance from the community before we went down any of 
> these paths.  Has anyone seen this issue?  Any other suggested workarounds?  
> We at tracksimple are willing to invest some time in fixing this and (of 
> course) contributing our fix back, but we wanted to get an 'ack' from others 
> before we try anything crazy :-).
> I've attached a test app if anyone wants to try and reproduce this 
> themselves.  It takes a while to run (depending on the 'weather' in S3 right 
> now), but should eventually detect a consistency 'error' that manifests 
> itself as a truncated file.
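
Of the fix ideas listed above, idea 2 might look roughly like this (editor's 
sketch only; the config key is the one suggested in the description, while the 
method, constants and the block comparison are hypothetical):

{code}
private void storeINodeVerified(Path path, INode inode) throws IOException {
  store.storeINode(path, inode);
  if (!getConf().getBoolean("fs.s3.verifyInodeWrites", false)) {
    return;
  }
  for (int attempt = 0; attempt < MAX_VERIFY_ATTEMPTS; attempt++) {
    INode stored = store.retrieveINode(path);
    // Assumes blocks compare meaningfully; otherwise compare ids/lengths.
    if (stored != null
        && Arrays.equals(stored.getBlocks(), inode.getBlocks())) {
      return;  // S3 is now serving the copy we just wrote
    }
    try {
      Thread.sleep(VERIFY_POLL_MILLIS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new InterruptedIOException("interrupted verifying " + path);
    }
  }
  throw new IOException("INode for " + path + " still inconsistent in S3");
}
{code}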



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-6208) Block loss in S3FS due to S3 inconsistency on file rename

2016-05-13 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-6208.

Resolution: Won't Fix

> Block loss in S3FS due to S3 inconsistency on file rename
> -
>
> Key: HADOOP-6208
> URL: https://issues.apache.org/jira/browse/HADOOP-6208
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 0.20.0, 0.20.1
> Environment: Ubuntu Linux 8.04 on EC2, Mac OS X 10.5, likely to 
> affect any Hadoop environment
>Reporter: Bradley Buda
>Assignee: Bradley Buda
> Attachments: HADOOP-6208.patch, S3FSConsistencyPollingTest.java, 
> S3FSConsistencyTest.java
>
>
> Under certain S3 consistency scenarios, Hadoop's S3FileSystem can 'truncate' 
> files, especially when writing reduce outputs.  We've noticed this at 
> tracksimple where we use the S3FS as the direct input and output of our 
> MapReduce jobs.  The symptom of this problem is a file in the filesystem that 
> is an exact multiple of the FS block size - exactly 32MB, 64MB, 96MB, etc. in 
> length.
> The issue appears to be caused by renaming a file that has recently been 
> written, and getting a stale INode read from S3.  When a reducer is writing 
> job output to the S3FS, the normal series of S3 key writes for a 3-block file 
> looks something like this:
> Task Output:
> 1) Write the first block (block_99)
> 2) Write an INode 
> (/myjob/_temporary/_attempt_200907142159_0306_r_000133_0/part-00133.gz) 
> containing [block_99]
> 3) Write the second block (block_81)
> 4) Rewrite the INode with new contents [block_99, block_81]
> 5) Write the last block (block_-101)
> 6) Rewrite the INode with the final contents [block_99, block_81, block_-101]
> Copy Output to Final Location (ReduceTask#copyOutput):
> 1) Read the INode contents from 
> /myjob/_temporary/_attempt_200907142159_0306_r_000133_0/part-00133.gz, which 
> gives [block_99, block_81, block_-101]
> 2) Write the data from #1 to the final location, /myjob/part-00133.gz
> 3) Delete the old INode 
> The output file is truncated if S3 serves a stale copy of the temporary 
> INode.  In copyOutput, step 1 above, it is possible for S3 to return a 
> version of the temporary INode that contains just [block_99, block_81].  In 
> this case, we write this new data to the final output location, and 'lose' 
> block_-101 in the process.  Since we then delete the temporary INode, we've 
> lost all references to the final block of this file and it's orphaned in the 
> S3 bucket.
> This type of consistency error is infrequent but not impossible. We've 
> observed these failures about once a week for one of our large jobs which 
> runs daily and has 200 reduce outputs; so we're seeing an error rate of 
> something like 0.07% per reduce.
> These kinds of errors are generally difficult to handle in a system like S3.  
> We have a few ideas about how to fix this:
> 1) HACK! Sleep during S3OutputStream#close or #flush to wait for S3 to catch 
> up and make these less likely.
> 2) Poll for updated MD5 or INode data in Jets3tFileSystemStore#storeINode 
> until S3 says the INode contents are the same as our local copy.  This could 
> be a config option - "fs.s3.verifyInodeWrites" or something like that.
> 3) Cache INode contents in-process, so we don't have to go back to S3 to ask 
> for the current version of an INode.
> 4) Only write INodes once, when the output stream is closed.  This would 
> basically make S3OutputStream#flush() a no-op.
> 5) Modify the S3FS to somehow version INodes (unclear how we would do this, 
> need some design work).
> 6) Avoid using the S3FS for temporary task attempt files.
> 7) Avoid using the S3FS completely.
> We wanted to get some guidance from the community before we went down any of 
> these paths.  Has anyone seen this issue?  Any other suggested workarounds?  
> We at tracksimple are willing to invest some time in fixing this and (of 
> course) contributing our fix back, but we wanted to get an 'ack' from others 
> before we try anything crazy :-).
> I've attached a test app if anyone wants to try and reproduce this 
> themselves.  It takes a while to run (depending on the 'weather' in S3 right 
> now), but should eventually detect a consistency 'error' that manifests 
> itself as a truncated file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13114) DistCp should have option to compress data on write

2016-05-13 Thread Suraj Nayak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suraj Nayak updated HADOOP-13114:
-
Attachment: HADOOP-13114-trunk_2016-05-12-1.patch

> DistCp should have option to compress data on write
> ---
>
> Key: HADOOP-13114
> URL: https://issues.apache.org/jira/browse/HADOOP-13114
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Suraj Nayak
>Assignee: Suraj Nayak
>Priority: Minor
>  Labels: distcp
> Fix For: 3.0.0-alpha1
>
> Attachments: HADOOP-13114-trunk_2016-05-07-1.patch, 
> HADOOP-13114-trunk_2016-05-08-1.patch, HADOOP-13114-trunk_2016-05-10-1.patch, 
> HADOOP-13114-trunk_2016-05-12-1.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The DistCp utility should have the capability to store data in a 
> user-specified compression format. This avoids a separate pass to compress 
> the data after the transfer. Backup strategies to a different cluster also 
> benefit from saving one IO operation to and from HDFS, thus saving resources, 
> time, and effort.
> * Create an option -compressOutput defaulting to 
> {{org.apache.hadoop.io.compress.BZip2Codec}}. 
> * Users will be able to change the codec with {{-D 
> mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec}}
> * If distcp compression is enabled, suffix the filenames with the default 
> codec extension to indicate the file is compressed, so users can tell which 
> codec was used to compress the data (see the sketch below).
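
A rough sketch of that suffixing step (editor's illustration; the config key 
and default come from the description, the variable names are invented, and 
the codec lookup uses the standard CompressionCodecFactory):

{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.*;

CompressionCodec codec = new CompressionCodecFactory(conf).getCodecByClassName(
    conf.get("mapreduce.output.fileoutputformat.compress.codec",
        BZip2Codec.class.getName()));
// e.g. part-00000 becomes part-00000.bz2, so consumers can tell from the
// name which codec compressed the data.
Path target = new Path(targetDir, srcName + codec.getDefaultExtension());
{code}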



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13114) DistCp should have option to compress data on write

2016-05-13 Thread Suraj Nayak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suraj Nayak updated HADOOP-13114:
-
Attachment: (was: HADOOP-13114-trunk_2016-05-12-1.patch)

> DistCp should have option to compress data on write
> ---
>
> Key: HADOOP-13114
> URL: https://issues.apache.org/jira/browse/HADOOP-13114
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Suraj Nayak
>Assignee: Suraj Nayak
>Priority: Minor
>  Labels: distcp
> Fix For: 3.0.0-alpha1
>
> Attachments: HADOOP-13114-trunk_2016-05-07-1.patch, 
> HADOOP-13114-trunk_2016-05-08-1.patch, HADOOP-13114-trunk_2016-05-10-1.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The DistCp utility should have the capability to store data in a 
> user-specified compression format. This avoids a separate pass to compress 
> the data after the transfer. Backup strategies to a different cluster also 
> benefit from saving one IO operation to and from HDFS, thus saving resources, 
> time, and effort.
> * Create an option -compressOutput defaulting to 
> {{org.apache.hadoop.io.compress.BZip2Codec}}. 
> * Users will be able to change the codec with {{-D 
> mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec}}
> * If distcp compression is enabled, suffix the filenames with the default 
> codec extension to indicate the file is compressed, so users can tell which 
> codec was used to compress the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13139) Branch-2: S3a to use thread pool that blocks clients

2016-05-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282538#comment-15282538
 ] 

Steve Loughran commented on HADOOP-13139:
-

I suspect I've just broken your patch with the User Agent and metrics patches. 
Can you do a new one against branch-2, also looking at the checkstyle and 
javadoc warnings? I think the first checkstyle warning is because the field 
should be marked as {{final}}.

> Branch-2: S3a to use thread pool that blocks clients
> 
>
> Key: HADOOP-13139
> URL: https://issues.apache.org/jira/browse/HADOOP-13139
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Pieter Reuse
>Assignee: Pieter Reuse
> Fix For: 2.8.0
>
> Attachments: HADOOP-13139-001.patch, HADOOP-13139-branch-2.001.patch
>
>
> HADOOP-11684 was accepted into trunk, but was not applied to branch-2. I will 
> attach a patch applicable to branch-2.
> It should be noted in CHANGES-2.8.0.txt that the config parameter 
> 'fs.s3a.threads.core' has been removed and the behavior of the 
> ThreadPool for s3a has changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13127) Correctly cache delegation tokens in KMSClientProvider

2016-05-13 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-13127:
---
Status: Open  (was: Patch Available)

> Correctly cache delegation tokens in KMSClientProvider
> --
>
> Key: HADOOP-13127
> URL: https://issues.apache.org/jira/browse/HADOOP-13127
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.6.1
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HADOOP-13127.01.patch
>
>
> In the initial implementation of HADOOP-10770, the authToken is updated with 
> delegation tokens during {{KMSClientProvider#addDelegationTokens}} in the 
> following line:
> {code}
> Token token = authUrl.getDelegationToken(url, authToken, renewer);
> {code}
> HADOOP-11482 is a good fix to handle the UGI issue, but it has a side effect 
> in the following code:
> {code}
> public Token run() throws Exception {
>   // Not using the cached token here.. Creating a new token here
>   // everytime.
>   return authUrl.getDelegationToken(url,
> new DelegationTokenAuthenticatedURL.Token(), renewer, doAsUser);
> }
> {code}
> IIUC, we should do {{setDelegationToken}} on the authToken here to cache it.
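
A minimal sketch of that caching (editor's illustration of the suggestion, 
not the eventual fix; a raw cast is used to keep the sketch short):

{code}
public Token<?> run() throws Exception {
  Token<?> token = authUrl.getDelegationToken(url,
      new DelegationTokenAuthenticatedURL.Token(), renewer, doAsUser);
  // Cache it so later calls reuse the delegation token instead of
  // re-authenticating every time.
  authToken.setDelegationToken((Token) token);
  return token;
}
{code}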



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-12868) Fix hadoop-openstack undeclared and unused dependencies

2016-05-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-12868:
-
Assignee: Masatake Iwasaki

> Fix hadoop-openstack undeclared and unused dependencies
> ---
>
> Key: HADOOP-12868
> URL: https://issues.apache.org/jira/browse/HADOOP-12868
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.0.0-alpha1
>Reporter: Allen Wittenauer
>Assignee: Masatake Iwasaki
> Attachments: HADOOP-12868.001.patch
>
>
> Attempting to compile openstack on a fairly fresh maven repo fails due to 
> commons-httpclient not being a declared dependency.  After that is fixed, 
> doing a maven dependency:analyze shows other problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-12868) Fix hadoop-openstack undeclared and unused dependencies

2016-05-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HADOOP-12868.
--
  Resolution: Fixed
   Fix Version/s: 2.8.0
Target Version/s:   (was: )

Committed to trunk, branch-2, branch-2.8. Thanks for the patch [~iwasakims]!

> Fix hadoop-openstack undeclared and unused dependencies
> ---
>
> Key: HADOOP-12868
> URL: https://issues.apache.org/jira/browse/HADOOP-12868
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.0.0-alpha1
>Reporter: Allen Wittenauer
>Assignee: Masatake Iwasaki
> Fix For: 2.8.0
>
> Attachments: HADOOP-12868.001.patch
>
>
> Attempting to compile openstack on a fairly fresh maven repo fails due to 
> commons-httpclient not being a declared dependency.  After that is fixed, 
> doing a maven dependency:analyze shows other problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-12868) Fix hadoop-openstack undeclared and unused dependencies

2016-05-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-12868:
-
Summary: Fix hadoop-openstack undeclared and unused dependencies  (was: 
hadoop-openstack's pom has missing and unused dependencies)

> Fix hadoop-openstack undeclared and unused dependencies
> ---
>
> Key: HADOOP-12868
> URL: https://issues.apache.org/jira/browse/HADOOP-12868
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.0.0-alpha1
>Reporter: Allen Wittenauer
> Attachments: HADOOP-12868.001.patch
>
>
> Attempting to compile openstack on a fairly fresh maven repo fails due to 
> commons-httpclient not being a declared dependency.  After that is fixed, 
> doing a maven dependency:analyze shows other problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-12868) hadoop-openstack's pom has missing and unused dependencies

2016-05-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-12868:
-
Priority: Major  (was: Blocker)

> hadoop-openstack's pom has missing and unused dependencies
> --
>
> Key: HADOOP-12868
> URL: https://issues.apache.org/jira/browse/HADOOP-12868
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.0.0-alpha1
>Reporter: Allen Wittenauer
> Attachments: HADOOP-12868.001.patch
>
>
> Attempting to compile openstack on a fairly fresh maven repo fails due to 
> commons-httpclient not being a declared dependency.  After that is fixed, 
> doing a maven dependency:analyze shows other problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-11180) Change log message "token.Token: Cannot find class for token kind kms-dt" to debug

2016-05-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-11180:
-
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed based on Steve's +1, thanks [~hitliuyi] for contributing this!

> Change log message "token.Token: Cannot find class for token kind kms-dt" to 
> debug
> --
>
> Key: HADOOP-11180
> URL: https://issues.apache.org/jira/browse/HADOOP-11180
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms, security
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>  Labels: BB2015-05-TBR
> Fix For: 2.8.0
>
> Attachments: HADOOP-11180.001.patch
>
>
> This issue is produced when running a MapReduce job with encryption zones 
> configured.
> {quote}
> 14/10/09 05:06:02 INFO security.TokenCache: Got dt for 
> hdfs://hnode1.sh.intel.com:9000; Kind: HDFS_DELEGATION_TOKEN, Service: 
> 10.239.47.8:9000, Ident: (HDFS_DELEGATION_TOKEN token 21 for user)
> 14/10/09 05:06:02 WARN token.Token: Cannot find class for token kind kms-dt
> 14/10/09 05:06:02 INFO security.TokenCache: Got dt for 
> hdfs://hnode1.sh.intel.com:9000; Kind: kms-dt, Service: 10.239.47.8:16000, 
> Ident: 00 04 75 73 65 72 04 79 61 72 6e 00 8a 01 48 f1 8e 85 07 8a 01 49 15 
> 9b 09 07 04 02
> 14/10/09 05:06:03 INFO input.FileInputFormat: Total input paths to process : 1
> 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: number of splits:1
> 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
> job_141272197_0004
> 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt
> 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-11180) Change log message "token.Token: Cannot find class for token kind kms-dt" to debug

2016-05-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-11180:
-
Summary: Change log message "token.Token: Cannot find class for token kind 
kms-dt" to debug  (was: Fix warning of "token.Token: Cannot find class for 
token kind kms-dt" for KMS when running jobs on Encryption zones)

> Change log message "token.Token: Cannot find class for token kind kms-dt" to 
> debug
> --
>
> Key: HADOOP-11180
> URL: https://issues.apache.org/jira/browse/HADOOP-11180
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms, security
>Affects Versions: 2.6.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-11180.001.patch
>
>
> This issue is produced when running a MapReduce job with encryption zones 
> configured.
> {quote}
> 14/10/09 05:06:02 INFO security.TokenCache: Got dt for 
> hdfs://hnode1.sh.intel.com:9000; Kind: HDFS_DELEGATION_TOKEN, Service: 
> 10.239.47.8:9000, Ident: (HDFS_DELEGATION_TOKEN token 21 for user)
> 14/10/09 05:06:02 WARN token.Token: Cannot find class for token kind kms-dt
> 14/10/09 05:06:02 INFO security.TokenCache: Got dt for 
> hdfs://hnode1.sh.intel.com:9000; Kind: kms-dt, Service: 10.239.47.8:16000, 
> Ident: 00 04 75 73 65 72 04 79 61 72 6e 00 8a 01 48 f1 8e 85 07 8a 01 49 15 
> 9b 09 07 04 02
> 14/10/09 05:06:03 INFO input.FileInputFormat: Total input paths to process : 1
> 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: number of splits:1
> 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
> job_141272197_0004
> 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt
> 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12868) hadoop-openstack's pom has missing and unused dependencies

2016-05-13 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282466#comment-15282466
 ] 

Andrew Wang commented on HADOOP-12868:
--

LGTM +1 then, thanks for working on this [~iwasakims].

> hadoop-openstack's pom has missing and unused dependencies
> --
>
> Key: HADOOP-12868
> URL: https://issues.apache.org/jira/browse/HADOOP-12868
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.0.0-alpha1
>Reporter: Allen Wittenauer
>Priority: Blocker
> Attachments: HADOOP-12868.001.patch
>
>
> Attempting to compile openstack on a fairly fresh maven repo fails due to 
> commons-httpclient not being a declared dependency.  After that is fixed, 
> doing a maven dependency:analyze shows other problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-12971) FileSystemShell doc should explain relative path

2016-05-13 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HADOOP-12971:

Status: Patch Available  (was: Open)

> FileSystemShell doc should explain relative path
> 
>
> Key: HADOOP-12971
> URL: https://issues.apache.org/jira/browse/HADOOP-12971
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.7.2
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Critical
> Attachments: HADOOP-12971.001.patch
>
>
> Update 
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html
>  with information about relative path and current working directory, as 
> suggested by [~yzhangal] during HADOOP-10965 discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-12971) FileSystemShell doc should explain relative path

2016-05-13 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HADOOP-12971:

Attachment: HADOOP-12971.001.patch

> FileSystemShell doc should explain relative path
> 
>
> Key: HADOOP-12971
> URL: https://issues.apache.org/jira/browse/HADOOP-12971
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.7.2
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Critical
> Attachments: HADOOP-12971.001.patch
>
>
> Update 
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html
>  with information about relative path and current working directory, as 
> suggested by [~yzhangal] during HADOOP-10965 discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org