[jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893862#comment-15893862
 ] 

Hadoop QA commented on HADOOP-13914:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 12 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
38s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  9m 
36s{color} | {color:red} root in HADOOP-13345 failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 4s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
40s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
39s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  9m  
7s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  9m  7s{color} 
| {color:red} root in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m  4s{color} | {color:orange} root: The patch generated 11 new + 77 unchanged 
- 2 fixed = 88 total (was 79) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
50s{color} | {color:red} hadoop-tools/hadoop-aws generated 1 new + 0 unchanged 
- 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  9m 27s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
49s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-tools/hadoop-aws |
|  |  Redundant nullcheck of 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(Path, String), which is 
known to be non-null in org.apache.hadoop.fs.s3a.S3AFileSystem.s3Exists(Path)  
Redundant null check at S3AFileSystem.java:which is known to be non-null in 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3Exists(Path)  Redundant null check at 
S3AFileSystem.java:[line 1851] |
| Failed junit tests | hadoop.security.TestKDiag |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13914 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855779/HADOOP-13914-HADOOP-13345.004.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2d73d524a8bd 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/pr

[jira] [Commented] (HADOOP-6801) io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893857#comment-15893857
 ] 

Hadoop QA commented on HADOOP-6801:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
35s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  9m 
22s{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  8m 
34s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  8m 34s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} hadoop-common-project/hadoop-common: The patch 
generated 0 new + 462 unchanged - 2 fixed = 462 total (was 464) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
18s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-6801 |
| GITHUB PR | https://github.com/apache/hadoop/pull/146 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 4ef2ddc374bc 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 
10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 3749152 |
| Default Java | 1.8.0_121 |
| compile | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11756/artifact/patchprocess/branch-compile-root.txt
 |
| findbugs | v3.0.0 |
| compile | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11756/artifact/patchprocess/patch-compile-root.txt
 |
| javac | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11756/artifact/patchprocess/patch-compile-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11756/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11756/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> io.sort.mb and io.sort.factor were renamed an

[jira] [Commented] (HADOOP-14104) Client should always ask namenode for kms provider path.

2017-03-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893802#comment-15893802
 ] 

Yongjun Zhang commented on HADOOP-14104:


Thanks [~andrew.wang], good comments.

Hi [~daryn],

I like the sound of your proposal too:
{quote}
I think the cleanest/most-compatible way is leveraging the Credentials instead 
of the config. We could inject a mapping of filesystem uri to kms uri via the 
secrets map. So now when the client needs to talk to the kms it can check the 
map, else fall back to getServerDefaults.
{quote}

Did you mean to use the following UserProvider method
{code}
  @Override
  public synchronized CredentialEntry createCredentialEntry(String name,
      char[] credential) throws IOException {
    Text nameT = new Text(name);
    if (credentials.getSecretKey(nameT) != null) {
      throw new IOException("Credential " + name +
          " already exists in " + this);
    }
    credentials.addSecretKey(new Text(name),
        new String(credential).getBytes("UTF-8"));
    return new CredentialEntry(name, credential);
  }
{code}
to add the mapping to the credential map? This mapping info for a remote 
cluster needs to come from either the remote cluster conf or the NN of the 
remote cluster; what's your thinking here?

Would you please elaborate on this approach? Is there any incompatibility 
here?

Thanks.
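
For what it's worth, a minimal sketch of how such a filesystem-to-KMS mapping 
could ride in the {{Credentials}} secret map, assuming a hypothetical alias 
prefix (the {{KmsUriMapping}} helper and its naming scheme below are 
illustrative, not from any patch):
{code}
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;

public class KmsUriMapping {
  // Hypothetical alias prefix; the proposal does not fix a naming scheme.
  private static final String PREFIX = "kms.uri.mapping/";

  /** Record that data under fsUri is protected by the KMS at kmsUri. */
  public static void put(Credentials creds, URI fsUri, URI kmsUri) {
    creds.addSecretKey(new Text(PREFIX + fsUri),
        kmsUri.toString().getBytes(StandardCharsets.UTF_8));
  }

  /** Look up the KMS for fsUri; null means fall back to getServerDefaults. */
  public static URI get(Credentials creds, URI fsUri) {
    byte[] secret = creds.getSecretKey(new Text(PREFIX + fsUri));
    return secret == null
        ? null : URI.create(new String(secret, StandardCharsets.UTF_8));
  }
}
{code}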


> Client should always ask namenode for kms provider path.
> 
>
> Key: HADOOP-14104
> URL: https://issues.apache.org/jira/browse/HADOOP-14104
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch
>
>
> According to the current implementation of the kms provider in client conf, 
> there can only be one kms.
> In a multi-cluster environment, if a client is reading encrypted data from 
> multiple clusters, it will only get the kms token for the local cluster.
> Not sure whether the target version is correct or not.






[jira] [Updated] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling

2017-03-02 Thread Aaron Fabbri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Fabbri updated HADOOP-13914:
--
Attachment: HADOOP-13914-HADOOP-13345.004.patch

Attaching v4 patch. Changes from the previous patch:

- Add three test cases to MetadataStoreTestBase for {known empty, known 
non-empty, unknown} directory behavior with {{MetadataStore#get()}} (sketched 
below).

- Replace the assertion that [~mackrorysd] mentioned in the root dir test for 
{{TestDynamoDBMetadataStore}} with equivalent logic using the new empty-dir 
API.
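
A rough sketch of what those cases might assert, assuming the new API 
surfaces the empty-directory state as a {{Tristate}} on {{PathMetadata}} 
({{ms}} and {{strToPath}} stand in for the test base's fixtures; the exact 
names are illustrative, not necessarily the patch's API):
{code}
// Illustrative only: assumes PathMetadata#isEmptyDirectory() returns a
// Tristate of TRUE (known empty), FALSE (known non-empty), or UNKNOWN.
PathMetadata emptyDir = ms.get(strToPath("/knownEmptyDir"));
assertEquals(Tristate.TRUE, emptyDir.isEmptyDirectory());

PathMetadata nonEmptyDir = ms.get(strToPath("/dirWithChild"));
assertEquals(Tristate.FALSE, nonEmptyDir.isEmptyDirectory());

// A directory whose listing was never fully recorded cannot be classified.
PathMetadata unknownDir = ms.get(strToPath("/unlistedDir"));
assertEquals(Tristate.UNKNOWN, unknownDir.isEmptyDirectory());
{code}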

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> 
>
> Key: HADOOP-13914
> URL: https://issues.apache.org/jira/browse/HADOOP-13914
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Aaron Fabbri
>Assignee: Aaron Fabbri
> Attachments: HADOOP-13914-HADOOP-13345.000.patch, 
> HADOOP-13914-HADOOP-13345.002.patch, HADOOP-13914-HADOOP-13345.003.patch, 
> HADOOP-13914-HADOOP-13345.004.patch, s3guard-empty-dirs.md, 
> test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag 
> stored in S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB 
> implementation, and also sacrifices good code separation to minimize 
> S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and 
> preferred solution.  I suggest we do this work after merging the HADOOP-13345 
> branch to trunk, but am open to suggestions.
> I can also attach a patch of an integration test that exercises the missing 
> case and demonstrates a failure with DynamoDBMetadataStore.






[jira] [Commented] (HADOOP-14056) Update maven-javadoc-plugin to 2.10.4

2017-03-02 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893756#comment-15893756
 ] 

Akira Ajisaka commented on HADOOP-14056:


Hi [~ste...@apache.org], would you review this patch?

> Update maven-javadoc-plugin to 2.10.4
> -
>
> Key: HADOOP-14056
> URL: https://issues.apache.org/jira/browse/HADOOP-14056
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Attachments: HADOOP-14056.01.patch
>
>
> I'm seeing the following warning in OpenJDK 9.
> {noformat}
> [INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ hadoop-minikdc 
> ---
> [WARNING] Unable to find the javadoc version: Unrecognized version of 
> Javadoc: 'java version "9-ea"
> Java(TM) SE Runtime Environment (build 9-ea+154)
> Java HotSpot(TM) 64-Bit Server VM (build 9-ea+154, mixed mode)
> ' near index 37
> (?s).*?([0-9]+\.[0-9]+)(\.([0-9]+))?.*
>  ^
> [WARNING] Using the Java the version instead of, i.e. 0.0
> {noformat}
> Need to update this to 2.10.4. (MJAVADOC-441)






[jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling

2017-03-02 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893755#comment-15893755
 ] 

Aaron Fabbri commented on HADOOP-13914:
---

Yep, those two failures are HADOOP-14129 and HADOOP-14036.

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> 
>
> Key: HADOOP-13914
> URL: https://issues.apache.org/jira/browse/HADOOP-13914
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Aaron Fabbri
>Assignee: Aaron Fabbri
> Attachments: HADOOP-13914-HADOOP-13345.000.patch, 
> HADOOP-13914-HADOOP-13345.002.patch, HADOOP-13914-HADOOP-13345.003.patch, 
> s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag 
> stored in S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB 
> implementation, and also sacrifices good code separation to minimize 
> S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and 
> preferred solution.  I suggest we do this work after merging the HADOOP-13345 
> branch to trunk, but am open to suggestions.
> I can also attach a patch of an integration test that exercises the missing 
> case and demonstrates a failure with DynamoDBMetadataStore.






[jira] [Commented] (HADOOP-14094) Rethink S3GuardTool options

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893727#comment-15893727
 ] 

Hadoop QA commented on HADOOP-14094:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
 4s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
32s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
11s{color} | {color:green} The patch generated 0 new + 98 unchanged - 1 fixed = 
98 total (was 99) {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
18s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14094 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855769/HADOOP-14094-HADOOP-13345.006.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  shellcheck  shelldocs  |
| uname | Linux 89b8cd0f8b4b 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / 0942c9f |
| Default Java | 1.8.0_121 |
| shellcheck | v0.4.5 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11754/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11754/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Rethink S3GuardTool options
> ---
>
> Key: HADOOP-14094
> URL: https://issues.apache.org/jira/browse/HADOOP-14094
>   

[jira] [Created] (HADOOP-14145) Ensure GenericOptionParser is used for S3Guard CLI

2017-03-02 Thread Sean Mackrory (JIRA)
Sean Mackrory created HADOOP-14145:
--

 Summary: Ensure GenericOptionParser is used for S3Guard CLI
 Key: HADOOP-14145
 URL: https://issues.apache.org/jira/browse/HADOOP-14145
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sean Mackrory
Assignee: Sean Mackrory


As discussed in HADOOP-14094.






[jira] [Updated] (HADOOP-14094) Rethink S3GuardTool options

2017-03-02 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-14094:
---
Attachment: HADOOP-14094-HADOOP-13345.006.patch

> Rethink S3GuardTool options
> ---
>
> Key: HADOOP-14094
> URL: https://issues.apache.org/jira/browse/HADOOP-14094
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14094-HADOOP-13345.001.patch, 
> HADOOP-14094-HADOOP-13345.002.patch, HADOOP-14094-HADOOP-13345.003.patch, 
> HADOOP-14094-HADOOP-13345.003.patch, HADOOP-14094-HADOOP-13345.004.patch, 
> HADOOP-14094-HADOOP-13345.005.patch, HADOOP-14094-HADOOP-13345.006.patch
>
>
> I think we need to rework the S3GuardTool options. A couple of problems I've 
> observed in the patches I've done on top of it, and in watching other 
> developers try it out:
> * We should probably wrap the current commands in an S3Guard-specific 
> command, since 'init', 'destroy', etc. don't touch the buckets at all.
> * Convert to whole-word options, as the single-letter options are already 
> getting overloaded. Some patches I've submitted have added functionality 
> where the obvious flag is already in use (e.g. -r for both region and read 
> throughput, -m for both minutes and the metadata store URI). I may do this 
> early as part of HADOOP-14090.
> * We have some options that must be in the config in some cases and can be 
> on the command line in other cases. But I've seen someone try to specify the 
> table name in the config and leave out the -m option, with no luck. Also, 
> since the commands hard-code table auto-creation, you might have configured 
> table auto-creation, try to import to a non-existent table, and be told that 
> table auto-creation is off.
> We need a more consistent policy for how things should get configured, one 
> that addresses these problems and future-proofs the command a bit more.






[jira] [Commented] (HADOOP-14141) Store KMS SSL keystore password in catalina.properties

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893690#comment-15893690
 ] 

Hadoop QA commented on HADOOP-14141:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
35s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
18s{color} | {color:green} branch-2 passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
55s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} branch-2 passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
48s{color} | {color:green} the patch passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
38s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 8s{color} | {color:green} The patch generated 0 new + 508 unchanged - 3 fixed 
= 508 total (was 511) {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m  
9s{color} | {color:green} The patch generated 0 new + 46 unchanged - 2 fixed = 
46 total (was 48) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
44s{color} | {color:green} hadoop-kms in the patch passed with JDK v1.7.0_121. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 41m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | HADOOP-14141 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855762/HADOOP-14141.branch-2.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  shellcheck  shelldocs  |
| uname | Linux 1a5579ca536c 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2 / d737a26 |
| Default Java | 1.7.0_121 |
| Multi-JDK versions |  /

[jira] [Commented] (HADOOP-14094) Rethink S3GuardTool options

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893689#comment-15893689
 ] 

Hadoop QA commented on HADOOP-14094:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
 9s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
33s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 14s{color} | {color:orange} hadoop-tools/hadoop-aws: The patch generated 3 
new + 4 unchanged - 0 fixed = 7 total (was 4) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
12s{color} | {color:green} The patch generated 0 new + 98 unchanged - 1 fixed = 
98 total (was 99) {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
10s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
39s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14094 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855767/HADOOP-14094-HADOOP-13345.005.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  shellcheck  shelldocs  |
| uname | Linux 2c2bd2f97ec0 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 
10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / 0942c9f |
| Default Java | 1.8.0_121 |
| shellcheck | v0.4.5 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11753/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-aws.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11753/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11753/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org

[jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling

2017-03-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893676#comment-15893676
 ] 

Sean Mackrory commented on HADOOP-13914:


That all sounds reasonable. By the way, I had a few test failures. A couple of 
tests fail in TestDynamoDBMetadataStore because I have configured a table name 
different from my bucket (which I've been hoping to address in HADOOP-14068), 
ITestS3ACredentialsInURL failed (also being addressed by a separate JIRA), and 
I got the following failure, which I believe we've seen before:
{code}
testRenameToDirWithSamePrefixAllowed(org.apache.hadoop.fs.s3a.ITestS3AFileSystemContract)
  Time elapsed: 4.834 sec  <<< ERROR!
org.apache.hadoop.fs.s3a.AWSServiceIOException: move: 
com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: Provided list 
of item keys contains duplicates (Service: AmazonDynamoDBv2; Status Code: 400; 
Error Code: ValidationException; Request ID: 
KLU3MEVJVD2RAE269JD6SCLP9VVV4KQNSO5AEMVJF66Q9ASUAAJG): Provided list of item 
keys contains duplicates (Service: AmazonDynamoDBv2; Status Code: 400; Error 
Code: ValidationException; Request ID: 
KLU3MEVJVD2RAE269JD6SCLP9VVV4KQNSO5AEMVJF66Q9ASUAAJG)
{code}

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> 
>
> Key: HADOOP-13914
> URL: https://issues.apache.org/jira/browse/HADOOP-13914
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Aaron Fabbri
>Assignee: Aaron Fabbri
> Attachments: HADOOP-13914-HADOOP-13345.000.patch, 
> HADOOP-13914-HADOOP-13345.002.patch, HADOOP-13914-HADOOP-13345.003.patch, 
> s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag 
> stored in S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB 
> implementation, and also sacrifices good code separation to minimize 
> S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and 
> preferred solution.  I suggest we do this work after merging the HADOOP-13345 
> branch to trunk, but am open to suggestions.
> I can also attach a patch of an integration test that exercises the missing 
> case and demonstrates a failure with DynamoDBMetadataStore.






[jira] [Commented] (HADOOP-14135) Remove URI parameter in AWSCredentialProvider constructors

2017-03-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893667#comment-15893667
 ] 

Sean Mackrory commented on HADOOP-14135:


I'll review this first thing tomorrow, [~liuml07]!

> Remove URI parameter in AWSCredentialProvider constructors
> --
>
> Key: HADOOP-14135
> URL: https://issues.apache.org/jira/browse/HADOOP-14135
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HADOOP-14135.000.patch, HADOOP-14135.001.patch, 
> HADOOP-14135.002.patch, HADOOP-14135.003.patch
>
>
> This was from a comment in [HADOOP-13252].
> It looks like the URI parameter is not needed for our AWSCredentialProvider 
> constructors. It was useful when we relied on the URI parameter for 
> retrieving user:pass. Now, in binding URIs, we have
> {code}
> S3xLoginHelper.Login creds = getAWSAccessKeys(binding, conf);
> credentials.add(new BasicAWSCredentialsProvider(
>     creds.getUser(), creds.getPassword()));
> {code}
> This way, we only need a configuration object (if necessary) for all 
> AWSCredentialProvider implementations. The benefit is that, if we create an 
> AWSCredentialProvider list for DynamoDB, we don't have to pass down the 
> associated file system URI. This might be useful to S3Guard tools.
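
To illustrate the constructor shape this change enables, a hypothetical 
provider that takes only a {{Configuration}} (the class name and the config 
keys below are made up for illustration):
{code}
import org.apache.hadoop.conf.Configuration;

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;

// Hypothetical example of a URI-free credential provider.
public class ConfOnlyCredentialsProvider implements AWSCredentialsProvider {
  private final Configuration conf;

  // Only the Configuration is needed; no file system URI parameter.
  public ConfOnlyCredentialsProvider(Configuration conf) {
    this.conf = conf;
  }

  @Override
  public AWSCredentials getCredentials() {
    // "example.access.key" and "example.secret.key" are made-up key names.
    return new BasicAWSCredentials(
        conf.getTrimmed("example.access.key"),
        conf.getTrimmed("example.secret.key"));
  }

  @Override
  public void refresh() {
    // Credentials are re-read from the Configuration on each call; no-op.
  }
}
{code}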






[jira] [Updated] (HADOOP-14094) Rethink S3GuardTool options

2017-03-02 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-14094:
---
Attachment: HADOOP-14094-HADOOP-13345.005.patch

Ah thanks - I hadn't even noticed the checkstyle issue since Yetus still gave 
+1 overall. Fixed in .005 and run through Yetus locally.

I do think we should use GenericOptionsParser (although I had a quick browse 
of the code that uses it, and it seemed like every Tool was executed by a 
wrapper that ran the parser and no tools were using it directly - so I'll need 
to dig to see what exactly this Tool does differently). I'd like to get the 
rest of the changes in so I don't have to keep rebasing this on top of other 
things, though - I'll file a separate JIRA for that particular improvement.

> Rethink S3GuardTool options
> ---
>
> Key: HADOOP-14094
> URL: https://issues.apache.org/jira/browse/HADOOP-14094
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14094-HADOOP-13345.001.patch, 
> HADOOP-14094-HADOOP-13345.002.patch, HADOOP-14094-HADOOP-13345.003.patch, 
> HADOOP-14094-HADOOP-13345.003.patch, HADOOP-14094-HADOOP-13345.004.patch, 
> HADOOP-14094-HADOOP-13345.005.patch
>
>
> I think we need to rework the S3GuardTool options. A couple of problems I've 
> observed in the patches I've done on top of it, and in watching other 
> developers try it out:
> * We should probably wrap the current commands in an S3Guard-specific 
> command, since 'init', 'destroy', etc. don't touch the buckets at all.
> * Convert to whole-word options, as the single-letter options are already 
> getting overloaded. Some patches I've submitted have added functionality 
> where the obvious flag is already in use (e.g. -r for both region and read 
> throughput, -m for both minutes and the metadata store URI). I may do this 
> early as part of HADOOP-14090.
> * We have some options that must be in the config in some cases and can be 
> on the command line in other cases. But I've seen someone try to specify the 
> table name in the config and leave out the -m option, with no luck. Also, 
> since the commands hard-code table auto-creation, you might have configured 
> table auto-creation, try to import to a non-existent table, and be told that 
> table auto-creation is off.
> We need a more consistent policy for how things should get configured, one 
> that addresses these problems and future-proofs the command a bit more.






[jira] [Commented] (HADOOP-14132) Filesystem discovery to stop loading implementation classes

2017-03-02 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893664#comment-15893664
 ] 

John Zhuge commented on HADOOP-14132:
-

There are FS service loader metafiles for common, hdfs, aws, wasb, and swift.

For adls, just abandon HADOOP-14123.
For wasb, hdfs, and common, shall we create JIRAs similar to HADOOP-14138?
For swift, revert HADOOP-13606?


> Filesystem discovery to stop loading implementation classes
> ---
>
> Key: HADOOP-14132
> URL: https://issues.apache.org/jira/browse/HADOOP-14132
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/adl, fs/azure, fs/oss, fs/s3, fs/swift
>Affects Versions: 2.7.3
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>
> Integration testing of Hadoop with HADOOP-14040 has shown that the move to 
> a shaded AWS JAR is slowing all hadoop client code down.
> I believe this is due to how we use service discovery to identify FS 
> implementations: the implementation classes themselves are instantiated.
> This has known problems today with classloading, but clearly impacts 
> performance too, especially with complex transitive dependencies unique to 
> the loaded class.
> Proposed: have lightweight service declaration classes which implement an 
> interface declaring
> # schema
> # classname of FileSystem impl
> # classname of AbstractFS impl
> # homepage (for third party code, support, etc.)
> These are what we register and scan for in the FS when looking for services.
> This leaves the question of what to do for existing filesystems. I think 
> we'll need to retain the old code for external ones, while moving the 
> hadoop modules to the new ones.
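
A minimal sketch of what such a declaration interface might look like, built 
only from the four fields enumerated in the description above (the interface 
and method names are illustrative, not from any patch):
{code}
// Illustrative sketch of the proposed lightweight service declaration.
// Discovery would instantiate only this small class; the heavyweight
// FileSystem implementation is named here, not loaded.
public interface FileSystemServiceDeclaration {
  /** URI scheme this filesystem serves, e.g. "s3a". */
  String getScheme();

  /** Fully qualified classname of the FileSystem implementation. */
  String getFileSystemClassName();

  /** Fully qualified classname of the AbstractFileSystem implementation. */
  String getAbstractFileSystemClassName();

  /** Homepage, for third-party code, support, etc. */
  String getHomepage();
}
{code}
Under this scheme, the expensive implementation class would be loaded only 
when its URI scheme is actually used.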






[jira] [Comment Edited] (HADOOP-14123) Make AdlFileSystem a service provider for FileSystem

2017-03-02 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893660#comment-15893660
 ] 

John Zhuge edited comment on HADOOP-14123 at 3/3/17 3:19 AM:
-

Found the existing service loader metafile for adls, but in the wrong path 
(missing {{services}} after {{META-INF}}):
{noformat}
hadoop-tools/hadoop-azure-datalake/src/main/resources/META-INF/org.apache.hadoop.fs.FileSystem
{noformat}


was (Author: jzhuge):
Found the existing service loader metafile for adls, but in the wrong path:
{noformat}
hadoop-tools/hadoop-azure-datalake/src/main/resources/META-INF/org.apache.hadoop.fs.FileSystem
{noformat}

> Make AdlFileSystem a service provider for FileSystem
> 
>
> Key: HADOOP-14123
> URL: https://issues.apache.org/jira/browse/HADOOP-14123
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/adl
>Affects Versions: 3.0.0-alpha3
>Reporter: John Zhuge
>Assignee: John Zhuge
> Attachments: HADOOP-14123.001.patch
>
>
> Add a provider-configuration file giving the FS impl of {{AdlFileSystem}}; 
> remove the entry from core-default.xml






[jira] [Updated] (HADOOP-14043) Shade netty 4 dependency in hadoop-hdfs

2017-03-02 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-14043:

Priority: Critical  (was: Major)

> Shade netty 4 dependency in hadoop-hdfs
> ---
>
> Key: HADOOP-14043
> URL: https://issues.apache.org/jira/browse/HADOOP-14043
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 2.8.0
>Reporter: Ted Yu
>Priority: Critical
>
> During review of HADOOP-13866, [~andrew.wang] mentioned considering  shading 
> netty before putting the fix into branch-2.
> This would give users better experience when upgrading hadoop.






[jira] [Commented] (HADOOP-14123) Make AdlFileSystem a service provider for FileSystem

2017-03-02 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893660#comment-15893660
 ] 

John Zhuge commented on HADOOP-14123:
-

Found the existing service loader metafile for adls, but in the wrong path:
{noformat}
hadoop-tools/hadoop-azure-datalake/src/main/resources/META-INF/org.apache.hadoop.fs.FileSystem
{noformat}
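
For reference, the ServiceLoader convention would place that file at:
{noformat}
hadoop-tools/hadoop-azure-datalake/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem
{noformat}
with the implementation classname, e.g. 
{{org.apache.hadoop.fs.adl.AdlFileSystem}}, as its contents.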

> Make AdlFileSystem a service provider for FileSystem
> 
>
> Key: HADOOP-14123
> URL: https://issues.apache.org/jira/browse/HADOOP-14123
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/adl
>Affects Versions: 3.0.0-alpha3
>Reporter: John Zhuge
>Assignee: John Zhuge
> Attachments: HADOOP-14123.001.patch
>
>
> Add a provider-configuration file giving the FS impl of {{AdlFileSystem}}; 
> remove the entry from core-default.xml






[jira] [Updated] (HADOOP-14141) Store KMS SSL keystore password in catalina.properties

2017-03-02 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HADOOP-14141:

Attachment: HADOOP-14141.branch-2.001.patch

Patch branch-2.001
* Store SSL keystore password and truststore password in catalina.properties
* Remove old code related to {{sed}} method
* Rename ssl-server.xml.conf to ssl-server.xml

Testing done
- Run https://github.com/jzhuge/hadoop-bats-tests/blob/master/kms.bats in 
insecure and SSL single node setup
- Test keystore password with special characters, e.g., {{}}


> Store KMS SSL keystore password in catalina.properties
> --
>
> Key: HADOOP-14141
> URL: https://issues.apache.org/jira/browse/HADOOP-14141
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HADOOP-14141.branch-2.001.patch
>
>
> HADOOP-14083 stores SSL ciphers in catalina.properties. We can do the same 
> for SSL keystore password, thus no longer need the current {{sed}} method:
> {noformat}
> # If ssl, the populate the passwords into ssl-server.xml before starting 
> tomcat
> if [ ! "${KMS_SSL_KEYSTORE_PASS}" = "" ] || [ ! "${KMS_SSL_TRUSTSTORE_PASS}" 
> = "" ]; then
>   # Set a KEYSTORE_PASS if not already set
>   KMS_SSL_KEYSTORE_PASS=${KMS_SSL_KEYSTORE_PASS:-password}
>   KMS_SSL_KEYSTORE_PASS_ESCAPED=$(hadoop_escape "$KMS_SSL_KEYSTORE_PASS")
>   KMS_SSL_TRUSTSTORE_PASS_ESCAPED=$(hadoop_escape "$KMS_SSL_TRUSTSTORE_PASS")
>   cat ${CATALINA_BASE}/conf/ssl-server.xml.conf \
> | sed 
> 's/"_kms_ssl_keystore_pass_"/'"\"${KMS_SSL_KEYSTORE_PASS_ESCAPED}\""'/g' \
> | sed 
> 's/"_kms_ssl_truststore_pass_"/'"\"${KMS_SSL_TRUSTSTORE_PASS_ESCAPED}\""'/g' 
> > ${CATALINA_BASE}/conf/ssl-server.xml
> fi
> {noformat}






[jira] [Updated] (HADOOP-14141) Store KMS SSL keystore password in catalina.properties

2017-03-02 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HADOOP-14141:

Status: Patch Available  (was: Open)

> Store KMS SSL keystore password in catalina.properties
> --
>
> Key: HADOOP-14141
> URL: https://issues.apache.org/jira/browse/HADOOP-14141
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Attachments: HADOOP-14141.branch-2.001.patch
>
>
> HADOOP-14083 stores SSL ciphers in catalina.properties. We can do the same 
> for SSL keystore password, thus no longer need the current {{sed}} method:
> {noformat}
> # If ssl, the populate the passwords into ssl-server.xml before starting 
> tomcat
> if [ ! "${KMS_SSL_KEYSTORE_PASS}" = "" ] || [ ! "${KMS_SSL_TRUSTSTORE_PASS}" 
> = "" ]; then
>   # Set a KEYSTORE_PASS if not already set
>   KMS_SSL_KEYSTORE_PASS=${KMS_SSL_KEYSTORE_PASS:-password}
>   KMS_SSL_KEYSTORE_PASS_ESCAPED=$(hadoop_escape "$KMS_SSL_KEYSTORE_PASS")
>   KMS_SSL_TRUSTSTORE_PASS_ESCAPED=$(hadoop_escape "$KMS_SSL_TRUSTSTORE_PASS")
>   cat ${CATALINA_BASE}/conf/ssl-server.xml.conf \
> | sed 
> 's/"_kms_ssl_keystore_pass_"/'"\"${KMS_SSL_KEYSTORE_PASS_ESCAPED}\""'/g' \
> | sed 
> 's/"_kms_ssl_truststore_pass_"/'"\"${KMS_SSL_TRUSTSTORE_PASS_ESCAPED}\""'/g' 
> > ${CATALINA_BASE}/conf/ssl-server.xml
> fi
> {noformat}






[jira] [Commented] (HADOOP-14094) Rethink S3GuardTool options

2017-03-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893553#comment-15893553
 ] 

Mingliang Liu commented on HADOOP-14094:


The v4 patch looks good to me overall.

I'm wondering whether we should use {{GenericOptionsParser}}, as the Tool 
interface requires in its javadoc. That way, {{-D key=value}} options will be 
propagated to the configuration, and we can pass the required command 
parameter {{s3://BUCKET}} as the value of the file system URI ({{-fs 
s3://BUCKET}} will override the defaultFS from the configuration file). 
Perhaps this has been discussed, or belongs in a separate JIRA? I may have 
missed the major conclusion.
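
For context, a minimal sketch of that standard pattern, in which 
{{ToolRunner}} runs {{GenericOptionsParser}} over the generic options before 
handing the remaining arguments to the tool (the {{S3GuardCli}} skeleton 
below is made up for illustration and is not the actual S3GuardTool code):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class S3GuardCli extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    // Generic options (-D key=value, -fs <uri>, ...) have already been
    // parsed by ToolRunner and applied to getConf(); args holds the rest.
    Configuration conf = getConf();
    // ... dispatch to init/destroy/import/etc. based on args ...
    return 0;
  }

  public static void main(String[] args) throws Exception {
    // ToolRunner.run uses GenericOptionsParser under the hood.
    System.exit(ToolRunner.run(new Configuration(), new S3GuardCli(), args));
  }
}
{code}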

> Rethink S3GuardTool options
> ---
>
> Key: HADOOP-14094
> URL: https://issues.apache.org/jira/browse/HADOOP-14094
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14094-HADOOP-13345.001.patch, 
> HADOOP-14094-HADOOP-13345.002.patch, HADOOP-14094-HADOOP-13345.003.patch, 
> HADOOP-14094-HADOOP-13345.003.patch, HADOOP-14094-HADOOP-13345.004.patch
>
>
> I think we need to rework the S3GuardTool options. A couple of problems I've 
> observed in the patches I've done on top of it, and in watching other 
> developers try it out:
> * We should probably wrap the current commands in an S3Guard-specific 
> command, since 'init', 'destroy', etc. don't touch the buckets at all.
> * Convert to whole-word options, as the single-letter options are already 
> getting overloaded. Some patches I've submitted have added functionality 
> where the obvious flag is already in use (e.g. -r for both region and read 
> throughput, -m for both minutes and the metadata store URI). I may do this 
> early as part of HADOOP-14090.
> * We have some options that must be in the config in some cases and can be 
> on the command line in other cases. But I've seen someone try to specify the 
> table name in the config and leave out the -m option, with no luck. Also, 
> since the commands hard-code table auto-creation, you might have configured 
> table auto-creation, try to import to a non-existent table, and be told that 
> table auto-creation is off.
> We need a more consistent policy for how things should get configured, one 
> that addresses these problems and future-proofs the command a bit more.






[jira] [Commented] (HADOOP-14135) Remove URI parameter in AWSCredentialProvider constructors

2017-03-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893538#comment-15893538
 ] 

Mingliang Liu commented on HADOOP-14135:


Checkstyle warnings are false positives: we have to make the constructors 
public (as they were).

> Remove URI parameter in AWSCredentialProvider constructors
> --
>
> Key: HADOOP-14135
> URL: https://issues.apache.org/jira/browse/HADOOP-14135
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HADOOP-14135.000.patch, HADOOP-14135.001.patch, 
> HADOOP-14135.002.patch, HADOOP-14135.003.patch
>
>
> This was from a comment in [HADOOP-13252].
> It looks like the URI parameter is not needed for our AWSCredentialProvider 
> constructors. It was useful when we relied on the URI parameter for 
> retrieving user:pass. Now, in binding URIs, we have
> {code}
> S3xLoginHelper.Login creds = getAWSAccessKeys(binding, conf);
> credentials.add(new BasicAWSCredentialsProvider(
>     creds.getUser(), creds.getPassword()));
> {code}
> This way, we only need a configuration object (if necessary) for all 
> AWSCredentialProvider implementations. The benefit is that, if we create an 
> AWSCredentialProvider list for DynamoDB, we don't have to pass down the 
> associated file system URI. This might be useful to S3Guard tools.






[jira] [Commented] (HADOOP-14135) Remove URI parameter in AWSCredentialProvider constructors

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893534#comment-15893534
 ] 

Hadoop QA commented on HADOOP-14135:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 11s{color} | {color:orange} hadoop-tools/hadoop-aws: The patch generated 3 
new + 9 unchanged - 2 fixed = 12 total (was 11) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
20s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14135 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855748/HADOOP-14135.003.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 381f78aaae05 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 
10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 8f4817f |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11751/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-aws.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11751/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11751/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Remove URI parameter in AWSCredentialProvider constructors
> --
>
> Key: HADOOP-14135
> URL: https://issues.apache.org/jira/browse/HADOOP-14135
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>

[jira] [Commented] (HADOOP-13037) Refactor Azure Data Lake Store as an independent FileSystem

2017-03-02 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893529#comment-15893529
 ] 

Hitesh Shah commented on HADOOP-13037:
--

[~chris.douglas] [~cnauroth] [~vishwajeet.dusane] [~steve_l] Given that 
HADOOP-13687 and HADOOP-13257 are resolved, can this be backported to branch-2?

> Refactor Azure Data Lake Store as an independent FileSystem
> ---
>
> Key: HADOOP-13037
> URL: https://issues.apache.org/jira/browse/HADOOP-13037
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/adl
>Reporter: Shrikant Naidu
>Assignee: Vishwajeet Dusane
> Fix For: 3.0.0-alpha2
>
> Attachments: HADOOP-13037-001.patch, HADOOP-13037-002.patch, 
> HADOOP-13037-003.patch, HADOOP-13037-004.patch, HADOOP-13037.005.patch, 
> HADOOP-13037.006.patch, HADOOP-13037 Proposal.pdf
>
>
> The jira proposes an improvement over HADOOP-12666 to remove webhdfs 
> dependencies from the ADL file system client and build out a standalone 
> client. At a high level, this approach would extend the Hadoop file system 
> class to provide an implementation for accessing Azure Data Lake. The scheme 
> used for accessing the file system will continue to be 
> adl://<accountname>.azuredatalake.net/path/to/file. 
> The Azure Data Lake Cloud Store will continue to provide a WebHDFS REST 
> interface, and the client will access the ADLS store using the WebHDFS REST 
> APIs it provides.






[jira] [Updated] (HADOOP-14135) Remove URI parameter in AWSCredentialProvider constructors

2017-03-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-14135:
---
Attachment: HADOOP-14135.003.patch

{code}
$ mvn -Dit.test='ITestS3A*' -Dtest=none -Dscale -q clean verify

Results :

Tests run: 346, Failures: 0, Errors: 0, Skipped: 12
{code}
Tested against us-west-1.

> Remove URI parameter in AWSCredentialProvider constructors
> --
>
> Key: HADOOP-14135
> URL: https://issues.apache.org/jira/browse/HADOOP-14135
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HADOOP-14135.000.patch, HADOOP-14135.001.patch, 
> HADOOP-14135.002.patch, HADOOP-14135.003.patch
>
>
> This came from a comment in [HADOOP-13252].
> It looks like the URI parameter is not needed for our AWSCredentialProvider 
> constructors. It was useful when we relied on the URI parameter for 
> retrieving user:pass. Now, when binding URIs, we have
> {code}
> S3xLoginHelper.Login creds = getAWSAccessKeys(binding, conf);
> credentials.add(new BasicAWSCredentialsProvider(
>     creds.getUser(), creds.getPassword()));
> {code}
> This way, we only need the configuration object (if necessary) for all 
> AWSCredentialProvider implementations. The benefit is that, if we create 
> an AWSCredentialProvider list for DynamoDB, we don't have to pass down the 
> associated file system URI. This might be useful to S3Guard tools.






[jira] [Commented] (HADOOP-13055) Implement linkMergeSlash for ViewFileSystem

2017-03-02 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893509#comment-15893509
 ] 

Manoj Govindassamy commented on HADOOP-13055:
-

[~andrew.wang],

HADOOP-14136 requests a fallback to the DefaultFS whenever mounts are not 
resolvable: a kind of hierarchical mount FS support for ViewFS, but only at the 
root level. This brings in a whole new set of cases to consider for all other 
supported FS operations on ViewFS, but I am going to investigate what it takes 
to get a fallback DefaultFS for ViewFS. Would your concern on this JIRA 
(HADOOP-13055) and the patch be addressed if we club this in with the fix for 
HADOOP-14136? Please share your thoughts.
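One possible configuration shape for such a fallback, sketched in the style of the existing mount table keys (the {{linkFallback}} key name is hypothetical; no such key exists yet):
{noformat}
fs.viewfs.mounttable.ClusterX./data = hdfs://nn1/data
fs.viewfs.mounttable.ClusterX./remote = hdfs://nn2/remote
fs.viewfs.mounttable.ClusterX.linkFallback = hdfs://nn1/
{noformat}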

> Implement linkMergeSlash for ViewFileSystem
> ---
>
> Key: HADOOP-13055
> URL: https://issues.apache.org/jira/browse/HADOOP-13055
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, viewfs
>Reporter: Zhe Zhang
>Assignee: Manoj Govindassamy
> Attachments: HADOOP-13055.00.patch, HADOOP-13055.01.patch, 
> HADOOP-13055.02.patch, HADOOP-13055.03.patch, HADOOP-13055.04.patch
>
>
> In a multi-cluster environment it is sometimes useful to operate on the root 
> / slash directory of an HDFS cluster. E.g., list all top level directories. 
> Quoting the comment in {{ViewFs}}:
> {code}
>  *   A special case of the merge mount is where mount table's root is merged
>  *   with the root (slash) of another file system:
>  *   
>  *   fs.viewfs.mounttable.default.linkMergeSlash=hdfs://nn99/
>  *   
>  *   In this cases the root of the mount table is merged with the root of
>  *hdfs://nn99/  
> {code}






[jira] [Commented] (HADOOP-14135) Remove URI parameter in AWSCredentialProvider constructors

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893451#comment-15893451
 ] 

Hadoop QA commented on HADOOP-14135:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} hadoop-tools/hadoop-aws: The patch generated 0 new + 
9 unchanged - 2 fixed = 9 total (was 11) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
25s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 32s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14135 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855740/HADOOP-14135.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 629477c39bed 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 8f4817f |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11750/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11750/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Remove URI parameter in AWSCredentialProvider constructors
> --
>
> Key: HADOOP-14135
> URL: https://issues.apache.org/jira/browse/HADOOP-14135
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HADOOP-14135.000.patch, HADOOP-14135.001.patch, 
> HADOOP-14135.002.patch
>
>
> This came from a comment in [HADOOP-13252].
> It lo

[jira] [Updated] (HADOOP-14144) s3guard: CLI diff non-empty after import on new table

2017-03-02 Thread Aaron Fabbri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Fabbri updated HADOOP-14144:
--
Description: 
I expected the following steps to yield zero diff from the `hadoop s3guard diff` 
command.

(1) hadoop s3guard init ... (create fresh table)
(2) hadoop s3guard import (fresh table, existing bucket with data in it)
(3) hadoop s3guard diff ..

Instead I still get a non-zero diff on step #3.  I also noticed some entries 
are printed twice.

{noformat}
dude@computer:~/Code/hadoop$ hadoop s3guard diff -meta dynamodb://dude-dev 
-region us-west-2 s3a://dude-dev
S3  D   s3a://dude-dev/user/fabbri/test/parentdirdest
S3  D   s3a://dude-dev/user/fabbri/test/parentdirdest
{noformat}

  was:
I expected the following steps to yield zero diff from the `hadoop s3guard diff` 
command.

(1) hadoop s3guard init ... (create fresh table)
(2) hadoop s3guard import (fresh table, existing bucket with data in it)
(3) hadoop s3guard diff ..

Instead I still get a non-zero diff on step #3, and I also noticed some entries 
are printed twice.

{noformat}
dude@computer:~/Code/hadoop$ hadoop s3guard diff -meta dynamodb://dude-dev 
-region us-west-2 s3a://dude-dev
S3  D   s3a://fabbri-dev/user/fabbri/test/parentdirdest
S3  D   s3a://fabbri-dev/user/fabbri/test/parentdirdest
{noformat}


> s3guard: CLI diff non-empty after import on new table
> -
>
> Key: HADOOP-14144
> URL: https://issues.apache.org/jira/browse/HADOOP-14144
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Aaron Fabbri
>Priority: Minor
>
> I expected the following steps to yield zero diff from the `hadoop s3guard 
> diff` command.
> (1) hadoop s3guard init ... (create fresh table)
> (2) hadoop s3guard import (fresh table, existing bucket with data in it)
> (3) hadoop s3guard diff ..
> Instead I still get a non-zero diff on step #3.  I also noticed some entries 
> are printed twice.
> {noformat}
> dude@computer:~/Code/hadoop$ hadoop s3guard diff -meta dynamodb://dude-dev 
> -region us-west-2 s3a://dude-dev
> S3  D   s3a://dude-dev/user/fabbri/test/parentdirdest
> S3  D   s3a://dude-dev/user/fabbri/test/parentdirdest
> {noformat}






[jira] [Updated] (HADOOP-14144) s3guard: CLI diff non-empty after import on new table

2017-03-02 Thread Aaron Fabbri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Fabbri updated HADOOP-14144:
--
Summary: s3guard: CLI diff non-empty after import on new table  (was: 
s3guard: CLI import does not yield an empty diff.)

> s3guard: CLI diff non-empty after import on new table
> -
>
> Key: HADOOP-14144
> URL: https://issues.apache.org/jira/browse/HADOOP-14144
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Aaron Fabbri
>Priority: Minor
>
> I expected the following steps to yield zero diff from the `hadoop s3guard 
> diff` command.
> (1) hadoop s3guard init ... (create fresh table)
> (2) hadoop s3guard import (fresh table, existing bucket with data in it)
> (3) hadoop s3guard diff ..
> Instead I still get a non-zero diff on step #3, and I also noticed some 
> entries are printed twice.
> {noformat}
> dude@computer:~/Code/hadoop$ hadoop s3guard diff -meta dynamodb://dude-dev 
> -region us-west-2 s3a://dude-dev
> S3  D   s3a://fabbri-dev/user/fabbri/test/parentdirdest
> S3  D   s3a://fabbri-dev/user/fabbri/test/parentdirdest
> {noformat}






[jira] [Created] (HADOOP-14144) s3guard: CLI import does not yield an empty diff.

2017-03-02 Thread Aaron Fabbri (JIRA)
Aaron Fabbri created HADOOP-14144:
-

 Summary: s3guard: CLI import does not yield an empty diff.
 Key: HADOOP-14144
 URL: https://issues.apache.org/jira/browse/HADOOP-14144
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Aaron Fabbri
Priority: Minor


I expected the following steps to yield zero diff from the `hadoop s3guard diff` 
command.

(1) hadoop s3guard init ... (create fresh table)
(2) hadoop s3guard import (fresh table, existing bucket with data in it)
(3) hadoop s3guard diff ..

Instead I still get a non-zero diff on step #3, and I also noticed some entries 
are printed twice.

{noformat}
dude@computer:~/Code/hadoop$ hadoop s3guard diff -meta dynamodb://dude-dev 
-region us-west-2 s3a://dude-dev
S3  D   s3a://fabbri-dev/user/fabbri/test/parentdirdest
S3  D   s3a://fabbri-dev/user/fabbri/test/parentdirdest
{noformat}






[jira] [Commented] (HADOOP-13665) Erasure Coding codec should support fallback coder

2017-03-02 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893387#comment-15893387
 ] 

Kai Sasaki commented on HADOOP-13665:
-

[~jojochuang] Thanks for letting me know. Will check and update accordingly. 

> Erasure Coding codec should support fallback coder
> --
>
> Key: HADOOP-13665
> URL: https://issues.apache.org/jira/browse/HADOOP-13665
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: io
>Reporter: Wei-Chiu Chuang
>Assignee: Kai Sasaki
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HADOOP-13665.01.patch, HADOOP-13665.02.patch, 
> HADOOP-13665.03.patch, HADOOP-13665.04.patch
>
>
> The current EC codec supports a single coder only (by default the pure Java 
> implementation). If the native coder is specified but is unavailable, it 
> should fall back to the pure Java implementation.
> One possible solution is to follow the convention of existing Hadoop native 
> codecs, such as transport encryption (see {{CryptoCodec.java}}), which support 
> fallback by specifying two or more coders as the value of a property and 
> loading them in order; a sketch of that convention follows below.
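For reference, the CryptoCodec convention mentioned above lists implementations in fallback order as the value of a single key (this key and these classes exist today; the eventual EC property name is still to be decided):
{noformat}
hadoop.security.crypto.codec.classes.aes.ctr.nopadding =
    org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec,
    org.apache.hadoop.crypto.JceAesCtrCryptoCodec
{noformat}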






[jira] [Assigned] (HADOOP-14136) Default FS For ViewFS

2017-03-02 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy reassigned HADOOP-14136:
---

Assignee: Manoj Govindassamy

> Default FS For ViewFS
> -
>
> Key: HADOOP-14136
> URL: https://issues.apache.org/jira/browse/HADOOP-14136
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, viewfs
>Reporter: Erik Krogen
>Assignee: Manoj Govindassamy
>
> It would be useful if ViewFS had the ability to designate one FileSystem as a 
> "primary"/"default" to fall back to in the case that an entry in the mount 
> table is not found. Consider the situation when you have a mount table that 
> looks like:
> {code}
> /data -> hdfs://nn1/data
> /logs -> hdfs://nn1/logs
> /user -> hdfs://nn1/user
> /remote -> hdfs://nn2/remote
> {code}
> {{nn1}} here is being used as the primary, with a specific directory 'remote' 
> being offloaded to another namenode. This works, but if we want to add 
> another top-level directory to {{nn1}}, we have to update all of our 
> client-side mount tables. Merge links (HADOOP-8298) could be used to achieve 
> this, but they do not seem to be coming soon; this special case of a default 
> FS is much simpler: try to resolve through the ViewFS mount table and, if not 
> found, resolve to the defaultFS.
> There is a good discussion of this at 
> https://issues.apache.org/jira/browse/HADOOP-13055?focusedCommentId=15733822&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15733822






[jira] [Commented] (HADOOP-14136) Default FS For ViewFS

2017-03-02 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893381#comment-15893381
 ] 

Erik Krogen commented on HADOOP-14136:
--

Yes, supporting full hierarchical mounts would be ideal but we think the 
fallback FS approach is a stopgap solution with a good effort-to-reward ratio.

> Default FS For ViewFS
> -
>
> Key: HADOOP-14136
> URL: https://issues.apache.org/jira/browse/HADOOP-14136
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, viewfs
>Reporter: Erik Krogen
>
> It would be useful if ViewFS had the ability to designate one FileSystem as a 
> "primary"/"default" to fall back to in the case that an entry in the mount 
> table is not found. Consider the situation when you have a mount table that 
> looks like:
> {code}
> /data -> hdfs://nn1/data
> /logs -> hdfs://nn1/logs
> /user -> hdfs://nn1/user
> /remote -> hdfs://nn2/remote
> {code}
> {{nn1}} here is being used as the primary, with a specific directory 'remote' 
> being offloaded to another namenode. This works, but if we want to add 
> another top-level directory to {{nn1}}, we have to update all of our 
> client-side mount tables. Merge links (HADOOP-8298) could be used to achieve 
> this, but they do not seem to be coming soon; this special case of a default 
> FS is much simpler: try to resolve through the ViewFS mount table and, if not 
> found, resolve to the defaultFS.
> There is a good discussion of this at 
> https://issues.apache.org/jira/browse/HADOOP-13055?focusedCommentId=15733822&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15733822






[jira] [Commented] (HADOOP-14136) Default FS For ViewFS

2017-03-02 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893378#comment-15893378
 ] 

Manoj Govindassamy commented on HADOOP-14136:
-

[~xkrogen], So, this boils down to having one unified global namespace which 
can have nested directory mounts, just like the root in *nix systems with 
various other filesystems mounted (hierarchically) under it? I am not opposed 
to this approach. The current ViewFS mount resolution is very primitive and 
doesn't support hierarchical mounts. Maybe we don't need to support full-fledged 
hierarchical mounts as long as we provide a fallback FS.
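To make the resolution order concrete, a toy sketch with plain strings standing in for the ViewFS internals (all names here are illustrative):
{code}
import java.util.Map;

// Sketch of the proposed order: consult the mount table first, and only on a
// miss delegate to the configured default file system.
public final class FallbackResolver {
  private final Map<String, String> mountTable; // mount point -> target URI
  private final String defaultFsUri;            // e.g. "hdfs://nn1"

  public FallbackResolver(Map<String, String> mountTable, String defaultFsUri) {
    this.mountTable = mountTable;
    this.defaultFsUri = defaultFsUri;
  }

  public String resolve(String path) {
    for (Map.Entry<String, String> e : mountTable.entrySet()) {
      if (path.startsWith(e.getKey())) {
        return e.getValue() + path.substring(e.getKey().length());
      }
    }
    // No mount entry matched: fall back to the default file system.
    return defaultFsUri + path;
  }
}
{code}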

> Default FS For ViewFS
> -
>
> Key: HADOOP-14136
> URL: https://issues.apache.org/jira/browse/HADOOP-14136
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, viewfs
>Reporter: Erik Krogen
>
> It would be useful if ViewFS had the ability to designate one FileSystem as a 
> "primary"/"default" to fall back to in the case that an entry in the mount 
> table is not found. Consider the situation when you have a mount table that 
> looks like:
> {code}
> /data -> hdfs://nn1/data
> /logs -> hdfs://nn1/logs
> /user -> hdfs://nn1/user
> /remote -> hdfs://nn2/remote
> {code}
> {{nn1}} here is being used as the primary, with a specific directory 'remote' 
> being offloaded to another namenode. This works, but if we want to add 
> another top-level directory to {{nn1}}, we have to update all of our 
> client-side mount tables. Merge links (HADOOP-8298) could be used to achieve 
> this, but they do not seem to be coming soon; this special case of a default 
> FS is much simpler: try to resolve through the ViewFS mount table and, if not 
> found, resolve to the defaultFS.
> There is a good discussion of this at 
> https://issues.apache.org/jira/browse/HADOOP-13055?focusedCommentId=15733822&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15733822






[jira] [Updated] (HADOOP-14135) Remove URI parameter in AWSCredentialProvider constructors

2017-03-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-14135:
---
Attachment: HADOOP-14135.002.patch

> Remove URI parameter in AWSCredentialProvider constructors
> --
>
> Key: HADOOP-14135
> URL: https://issues.apache.org/jira/browse/HADOOP-14135
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HADOOP-14135.000.patch, HADOOP-14135.001.patch, 
> HADOOP-14135.002.patch
>
>
> This came from a comment in [HADOOP-13252].
> It looks like the URI parameter is not needed for our AWSCredentialProvider 
> constructors. It was useful when we relied on the URI parameter for 
> retrieving user:pass. Now, when binding URIs, we have
> {code}
> S3xLoginHelper.Login creds = getAWSAccessKeys(binding, conf);
> credentials.add(new BasicAWSCredentialsProvider(
>     creds.getUser(), creds.getPassword()));
> {code}
> This way, we only need the configuration object (if necessary) for all 
> AWSCredentialProvider implementations. The benefit is that, if we create 
> an AWSCredentialProvider list for DynamoDB, we don't have to pass down the 
> associated file system URI. This might be useful to S3Guard tools.






[jira] [Commented] (HADOOP-14094) Rethink S3GuardTool options

2017-03-02 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893363#comment-15893363
 ] 

Aaron Fabbri commented on HADOOP-14094:
---

Ok.. I think we should fix the indent levels that checkstyle is flagging (USAGE 
string literals, etc).

After that I am +1.

I built the tool and tested it.  Ran diff / import.   Also ran all integration 
tests in us-west-2 w/ DDB. Only failure I saw was HADOOP-14036.

BTW, I may have found another bug in the CLI: I ran import then diff and saw 
(1) diff was non-empty after import, and (2) diff printed everything twice.  
I'll open a new JIRA as it appears to be unrelated to this change.

> Rethink S3GuardTool options
> ---
>
> Key: HADOOP-14094
> URL: https://issues.apache.org/jira/browse/HADOOP-14094
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14094-HADOOP-13345.001.patch, 
> HADOOP-14094-HADOOP-13345.002.patch, HADOOP-14094-HADOOP-13345.003.patch, 
> HADOOP-14094-HADOOP-13345.003.patch, HADOOP-14094-HADOOP-13345.004.patch
>
>
> I think we need to rework the S3GuardTool options. A couple of problems I've 
> observed in the patches I've done on top of it, and in watching other developers 
> try it out:
> * We should probably wrap the current commands in an S3Guard-specific 
> command, since 'init', 'destroy', etc. don't touch the buckets at all.
> * Convert to whole-word options, as the single-letter options are already 
> getting overloaded. Some patches I've submitted have added functionality 
> where the obvious flag is already in use (e.g. -r for both region and read 
> throughput, -m for both minutes and metadata store URI).  I may do this early 
> as part of HADOOP-14090.
> * We have some options that must be in the config in some cases, and can be 
> in the command in other cases. But I've seen someone try to specify the table 
> name in the config and leave out the -m option, with no luck. Also, since 
> commands hard-code table auto-creation, you might have configured table 
> auto-creation, try to import to a non-existent table, and be told that table 
> auto-creation is off.
> We need a more consistent policy for how things should get configured that 
> addresses these problems and future-proofs the command a bit more.






[jira] [Commented] (HADOOP-14135) Remove URI parameter in AWSCredentialProvider constructors

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893357#comment-15893357
 ] 

Hadoop QA commented on HADOOP-14135:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  3m 
48s{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 11s{color} | {color:orange} hadoop-tools/hadoop-aws: The patch generated 7 
new + 9 unchanged - 2 fixed = 16 total (was 11) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
19s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 10m 17s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14135 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855736/HADOOP-14135.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux ae22cb66a3c0 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / e61491d |
| Default Java | 1.8.0_121 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11749/artifact/patchprocess/branch-mvninstall-root.txt
 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11749/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-aws.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11749/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11749/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Remove URI parameter in AWSCredentialProvider constructors
> --
>
> Key: HADOOP-14135
> URL: https://issues.apache.org/jira/browse/HADOOP-14135
> Project: Hadoop Common
>  

[jira] [Commented] (HADOOP-14136) Default FS For ViewFS

2017-03-02 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893347#comment-15893347
 ] 

Erik Krogen commented on HADOOP-14136:
--

[~manojg], so you would access the two via {{viewfs://ClusterX/}} and 
{{viewfs://ClusterY/remote}}, correct? The issue is that we want to hide from 
the client the detail of which cluster the physical data lies on. We want 
those locations to be managed by cluster admins, who can update the mount table 
(in a shared config), as opposed to clients having to update their URLs and be 
aware of which cluster data resides on.

> Default FS For ViewFS
> -
>
> Key: HADOOP-14136
> URL: https://issues.apache.org/jira/browse/HADOOP-14136
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, viewfs
>Reporter: Erik Krogen
>
> It would be useful if ViewFS had the ability to designate one FileSystem as a 
> "primary"/"default" to fall back to in the case that an entry in the mount 
> table is not found. Consider the situation when you have a mount table that 
> looks like:
> {code}
> /data -> hdfs://nn1/data
> /logs -> hdfs://nn1/logs
> /user -> hdfs://nn1/user
> /remote -> hdfs://nn2/remote
> {code}
> {{nn1}} here is being used as the primary, with a specific directory 'remote' 
> being offloaded to another namenode. This works, but if we want to add 
> another top-level directory to {{nn1}}, we have to update all of our 
> client-side mount tables. Merge links (HADOOP-8298) could be used to achieve 
> this, but they do not seem to be coming soon; this special case of a default 
> FS is much simpler: try to resolve through the ViewFS mount table and, if not 
> found, resolve to the defaultFS.
> There is a good discussion of this at 
> https://issues.apache.org/jira/browse/HADOOP-13055?focusedCommentId=15733822&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15733822






[jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling

2017-03-02 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893332#comment-15893332
 ] 

Aaron Fabbri commented on HADOOP-13914:
---

Thanks for the review [~mackrorysd].

{quote}
I'm a bit concerned about all the S3AFileStatus -> FileStatus changes (although 
I could've sworn that we already removed that assertion in 
PathMetadataTranslation...). I think it's definitely a change for the better, 
but I'm a little worried there is some application out there that this change 
will break, despite being @Private and @Evolving... 
{quote}

Yes, this is a fundamental challenge for this patch.  I explicitly want to 
remove isEmptyDirectory from the public API.  It should only be used internally, 
because we don't want the cost of always computing it when it is rarely 
needed.

{quote}
Along similar lines, I wondered if a `public S3AFileStatus getFileStatus(Path, 
bool)` function might be in order in S3AFileSystem? No idea how much it'll get 
used, but if anyone IS depending on the current S3AFileStatus return type and 
wants that work done it'd be useful.
{quote}

My understanding is that having isEmptyDirectory() on the S3AFileStatus is an 
internal optimization to save a round trip for delete and rename.  Without a 
compelling need for this shortcut in the public API, I would suggest people 
just use {{listStatus(dir)}} to determine emptiness.
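To spell that out, a minimal sketch of an emptiness check against the public API only:
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class EmptinessCheck {
  // Determine directory emptiness via listStatus() rather than the
  // private S3AFileStatus#isEmptyDirectory() flag.
  public static boolean isEmptyDirectory(FileSystem fs, Path dir)
      throws IOException {
    FileStatus[] children = fs.listStatus(dir);
    return children.length == 0;
  }
}
{code}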

{quote}
There's an assertion in TestDynamoDBMetadataStore.java that isEmptyDirectory() 
is what we expect. Why is that removed? Is it a Boolean -> Tristate issue? If 
so, shouldn't we modify the logic to accept either the expected one or UNKNOWN?
{quote}

After this patch, MetadataStore implementations are not required to return 
S3AFileStatus with isEmptyDirectory set properly (which was broken anyway for 
DDB).  So that assertion no longer makes sense.  I should probably:

- Replace that assert with one that checks that PathMetadata#isEmptyDirectory() 
does not conflict with the expected value (i.e. UNKNOWN is always acceptable, 
but false versus true would be a failure); see the sketch after this list.
- Add some new MetadataStore contract test cases around 
{{PathMetadata#isEmptyDirectory}}.  I can add one in the next patch.
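A sketch of that relaxed assertion as a test-body fragment (the Tristate-returning PathMetadata#isEmptyDirectory() shape is assumed from this patch):
{code}
// UNKNOWN never fails; a definite value must match the expectation.
Tristate actual = pathMetadata.isEmptyDirectory();
if (actual != Tristate.UNKNOWN) {
  assertEquals("isEmptyDirectory conflicts with expected value",
      expectedEmptyDir ? Tristate.TRUE : Tristate.FALSE, actual);
}
{code}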

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> 
>
> Key: HADOOP-13914
> URL: https://issues.apache.org/jira/browse/HADOOP-13914
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Aaron Fabbri
>Assignee: Aaron Fabbri
> Attachments: HADOOP-13914-HADOOP-13345.000.patch, 
> HADOOP-13914-HADOOP-13345.002.patch, HADOOP-13914-HADOOP-13345.003.patch, 
> s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag 
> stored in S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB 
> implementation, and also sacrifices good code separation to minimize 
> S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and 
> preferred solution.  I suggest we do this work after merging the HADOOP-13345 
> branch to trunk, but am open to suggestions.
> I can also attach a patch of an integration test that exercises the missing 
> case and demonstrates a failure with DynamoDBMetadataStore.






[jira] [Commented] (HADOOP-14135) Remove URI parameter in AWSCredentialProvider constructors

2017-03-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893327#comment-15893327
 ] 

Mingliang Liu commented on HADOOP-14135:


Ping [~mackrorysd] for review. Thanks,

> Remove URI parameter in AWSCredentialProvider constructors
> --
>
> Key: HADOOP-14135
> URL: https://issues.apache.org/jira/browse/HADOOP-14135
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HADOOP-14135.000.patch, HADOOP-14135.001.patch
>
>
> This came from a comment in [HADOOP-13252].
> It looks like the URI parameter is not needed for our AWSCredentialProvider 
> constructors. It was useful when we relied on the URI parameter for 
> retrieving user:pass. Now, when binding URIs, we have
> {code}
> S3xLoginHelper.Login creds = getAWSAccessKeys(binding, conf);
> credentials.add(new BasicAWSCredentialsProvider(
>     creds.getUser(), creds.getPassword()));
> {code}
> This way, we only need the configuration object (if necessary) for all 
> AWSCredentialProvider implementations. The benefit is that, if we create 
> an AWSCredentialProvider list for DynamoDB, we don't have to pass down the 
> associated file system URI. This might be useful to S3Guard tools.






[jira] [Updated] (HADOOP-14135) Remove URI parameter in AWSCredentialProvider constructors

2017-03-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-14135:
---
Attachment: HADOOP-14135.001.patch

Thanks Steve. You're right that we should be very careful about this change.

For logging: 1) we use {{S3xLoginHelper.toString(binding)}} to remove 
credentials; 2) we log all the credential providers used for the S3A URI at 
once in {{createAWSCredentialProviderSet()}}. So logging is kept, without 
credentials and without the help of the fs URI.

> Remove URI parameter in AWSCredentialProvider constructors
> --
>
> Key: HADOOP-14135
> URL: https://issues.apache.org/jira/browse/HADOOP-14135
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HADOOP-14135.000.patch, HADOOP-14135.001.patch
>
>
> This came from a comment in [HADOOP-13252].
> It looks like the URI parameter is not needed for our AWSCredentialProvider 
> constructors. It was useful when we relied on the URI parameter for 
> retrieving user:pass. Now, when binding URIs, we have
> {code}
> S3xLoginHelper.Login creds = getAWSAccessKeys(binding, conf);
> credentials.add(new BasicAWSCredentialsProvider(
>     creds.getUser(), creds.getPassword()));
> {code}
> This way, we only need the configuration object (if necessary) for all 
> AWSCredentialProvider implementations. The benefit is that, if we create 
> an AWSCredentialProvider list for DynamoDB, we don't have to pass down the 
> associated file system URI. This might be useful to S3Guard tools.






[jira] [Commented] (HADOOP-14129) ITestS3ACredentialsInURL sometimes fails

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893307#comment-15893307
 ] 

Hadoop QA commented on HADOOP-14129:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
20s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
32s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
41s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14129 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855724/HADOOP-14129-HADOOP-13345.005.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 7c4f8bd16e44 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / 0942c9f |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11748/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11748/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> ITestS3ACredentialsInURL sometimes fails
> 
>
> Key: HADOOP-14129
> URL: https://issues.apache.org/jira/browse/HADOOP-14129
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-14129-HADOOP-13345.001.patch, 
> HADOOP-14129-HADOOP-13345.002.patch, HADOOP-14129-HADOOP-13

[jira] [Commented] (HADOOP-14136) Default FS For ViewFS

2017-03-02 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893304#comment-15893304
 ] 

Manoj Govindassamy commented on HADOOP-14136:
-

[~xkrogen],

HADOOP-13055 brings in linkMergeSlash support, whereby the root directory of 
any NN can be mounted. We can't mix a root mount and its subdirectory mounts 
from the same cluster, though. But we can always have another cluster's 
subdirectories mounted in the normal way in a different mount table. So your 
use case example would turn into something like the below. Wouldn't this be 
sufficient? Please share your thoughts.

{noformat}
fs.viewfs.mounttable.ClusterX.linkMergeSlash = hdfs://nn1/
fs.viewfs.mounttable.ClusterY./remote = hdfs://nn2/remote
{noformat}

> Default FS For ViewFS
> -
>
> Key: HADOOP-14136
> URL: https://issues.apache.org/jira/browse/HADOOP-14136
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, viewfs
>Reporter: Erik Krogen
>
> It would be useful if ViewFS had the ability to designate one FileSystem as a 
> "primary"/"default" to fall back to in the case that an entry in the mount 
> table is not found. Consider the situation when you have a mount table that 
> looks like:
> {code}
> /data -> hdfs://nn1/data
> /logs -> hdfs://nn1/logs
> /user -> hdfs://nn1/user
> /remote -> hdfs://nn2/remote
> {code}
> {{nn1}} here is being used as the primary, with a specific directory 'remote' 
> being offloaded to another namenode. This works, but if we want to add 
> another top-level directory to {{nn1}}, we have to update all of our 
> client-side mount tables. Merge links (HADOOP-8298) could be used to achieve 
> this, but they do not seem to be coming soon; this special case of a default 
> FS is much simpler: try to resolve through the ViewFS mount table and, if not 
> found, resolve to the defaultFS.
> There is a good discussion of this at 
> https://issues.apache.org/jira/browse/HADOOP-13055?focusedCommentId=15733822&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15733822






[jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling

2017-03-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893273#comment-15893273
 ] 

Sean Mackrory commented on HADOOP-13914:


Already discussed this a bit with you while reviewing, but sharing all thoughts 
for everyone else's sake:
* In case other reviewers get confused by it, the hints from Git in each hunk 
about which function a change is in are misleading in S3AFileSystem, and in the 
places where putAndReturn is removed, it is only removed in the S3-only version 
of the function that does no metadata store operation. putAndReturn now happens 
higher in the call stack, only when there is a metadata store context.
* I'm a bit concerned about all the S3AFileStatus -> FileStatus changes 
(although I could've sworn that we already removed that assertion in 
PathMetadataTranslation...). I think it's definitely a change for the better, 
but I'm a little worried there is some application out there that this change 
will break, despite being @Private and @Evolving... Let's definitely at least 
call this out in a release note or something.
* Along similar lines, I wondered if a `public S3AFileStatus 
getFileStatus(Path, bool)` function might be in order in S3AFileSystem? No idea 
how much it'll get used, but if anyone IS depending on the current 
S3AFileStatus return type and wants that work done it'd be useful.
* I also thought about an S3AFileStatus.setIsEmptyDirectory(bool) wrapper to 
avoid any unnecessary change in compatibility, but again, very unlikely to be 
used publicly, so feel free to ignore. isEmptyDirectory is probably going to 
break the same set of applications that might use it anyway (which for all I 
know is an empty set), so there is no perfectly compatible way to do this anyway...
* The JavaDoc comment on ITestS3GuardEmptyDirs recommends a refactoring after 
HADOOP-13345 is merged - we should file a JIRA dependent on HADOOP-13345 for 
that when this is committed.
* There's an assertion in TestDynamoDBMetadataStore.java that 
isEmptyDirectory() is what we expect. Why is that removed? Is it a Boolean -> 
Tristate issue? If so, shouldn't we modify the logic to accept either the 
expected value or UNKNOWN?
* +1 to ignoring the findbugs issue that way.
Tests are running now - almost done, with no related failures so far; I will 
report back if the end result says otherwise.
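For reference, minimal sketches of the two compatibility shims floated above 
- hypothetical signatures, not part of the current patch:
{code}
// 1. Typed overload in S3AFileSystem for callers that depend on the
//    S3AFileStatus return type (hypothetical; delegates to a hypothetical
//    internal lookup that can also compute the empty-directory flag).
public S3AFileStatus getFileStatus(Path path, boolean needEmptyDirectoryFlag)
    throws IOException {
  return innerGetFileStatus(path, needEmptyDirectoryFlag);
}

// 2. Boolean wrapper on S3AFileStatus, keeping the old-style setter next to
//    the new Tristate-valued one.
public void setIsEmptyDirectory(boolean isEmpty) {
  setIsEmptyDirectory(isEmpty ? Tristate.TRUE : Tristate.FALSE);
}
{code}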

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> 
>
> Key: HADOOP-13914
> URL: https://issues.apache.org/jira/browse/HADOOP-13914
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Aaron Fabbri
>Assignee: Aaron Fabbri
> Attachments: HADOOP-13914-HADOOP-13345.000.patch, 
> HADOOP-13914-HADOOP-13345.002.patch, HADOOP-13914-HADOOP-13345.003.patch, 
> s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag 
> stored in S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB 
> implementation, and also sacrifices good code separation to minimize 
> S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and 
> preferred solution.  I suggest we do this work after merging the HADOOP-13345 
> branch to trunk, but am open to suggestions.
> I can also attach a patch of a integration test that exercises the missing 
> case and demonstrates a failure with DynamoDBMetadataStore.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14104) Client should always ask namenode for kms provider path.

2017-03-02 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893268#comment-15893268
 ] 

Andrew Wang commented on HADOOP-14104:
--

Hi Daryn, thanks for commenting! It looks like you have a more ambitious 
implementation in mind, since your use cases include dynamic configuration 
changes without client restarts (something not possible with the current 
config-based approach).

Generally speaking, I think it's pretty rare to change the KMS URI. I think the 
two situations are:

* Enabling HDFS encryption for the first time. This currently requires a client 
restart.
* Enabling KMS HA. As long as the old KMS is part of the HA group, then clients 
with the old value will still work.

Since the KMS is just a proxy, you can swap out the backing KeyProvider 
implementation without changing the URI.

I'm not familiar with the Credentials APIs, but I like the sound of your 
proposal. It lets most clients avoid calling getServerDefaults, which was my 
main concern about the current patch.

We're very interested in an NN-specified KMS URI but less interested in 
dynamic refresh, so if it's reasonable to do refresh as a follow-on JIRA, 
that'd be optimal from our perspective.
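To make that flow concrete, a rough sketch of the NN-specified lookup under 
discussion - {{getKeyProviderUri()}} and the fallback helper are illustrative 
names, not necessarily the patch's API:
{code}
// Hypothetical sketch: ask the NN for the KMS URI once, and fall back to
// the client-side configuration only when the NN does not supply one.
KeyProvider resolveKeyProvider(DistributedFileSystem dfs, Configuration conf)
    throws IOException, URISyntaxException {
  String kmsUri = dfs.getServerDefaults().getKeyProviderUri(); // one NN RPC
  if (kmsUri == null || kmsUri.isEmpty()) {
    return providerFromClientConf(conf); // illustrative helper: old behavior
  }
  return KeyProviderFactory.get(new URI(kmsUri), conf);
}
{code}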

> Client should always ask namenode for kms provider path.
> 
>
> Key: HADOOP-14104
> URL: https://issues.apache.org/jira/browse/HADOOP-14104
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch
>
>
> According to the current implementation of the kms provider in client 
> conf, there can only be one kms.
> In a multi-cluster environment, if a client is reading encrypted data from 
> multiple clusters it will only get a kms token for the local cluster.
> Not sure whether the target version is correct or not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14129) ITestS3ACredentialsInURL sometimes fails

2017-03-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-14129:
---
Attachment: HADOOP-14129-HADOOP-13345.005.patch

@Steve I think the configuration is isolated in each test case, but the 
sysprop is not. So the v5 patch will simply assume (reading the conf/sysprop 
instead of writing it) that S3Guard is not enabled for the test. I tested 
this (I can now reproduce the test failure consistently):
# {{-Ds3guard -Ddynamo}} from the mvn command line
# set config {{fs.s3a.metadatastore.impl}} in configuration file 
{{hadoop-tools/hadoop-aws/src/test/resources/auth-keys.xml}}; *with and 
without* -Ds3guard from the mvn command line
# set per-bucket config {{fs.s3a.bucket.mliu-s3guard.metadatastore.impl}} in 
configuration file 
{{hadoop-tools/hadoop-aws/src/test/resources/auth-keys.xml}}; *with and 
without* -Ds3guard from the mvn command line

All 5 cases passed with the v5 patch.
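Roughly, the read-only guard looks like this (a sketch built from the 
existing test helpers; the message string is illustrative):
{code}
// Read (never write) the effective setting and skip when S3Guard is enabled;
// the FS instance has already resolved any per-bucket overrides.
S3AFileSystem fs = S3ATestUtils.createTestFileSystem(conf);
Assume.assumeFalse("S3Guard is enabled; skipping test",
    fs.hasMetadataStore());
{code}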

> ITestS3ACredentialsInURL sometimes fails
> 
>
> Key: HADOOP-14129
> URL: https://issues.apache.org/jira/browse/HADOOP-14129
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-14129-HADOOP-13345.001.patch, 
> HADOOP-14129-HADOOP-13345.002.patch, HADOOP-14129-HADOOP-13345.003.patch, 
> HADOOP-14129-HADOOP-13345.004.patch, HADOOP-14129-HADOOP-13345.005.patch
>
>
> This test sometimes fails. I believe it's expected that DynamoDB doesn't have 
> access to the credentials if they're embedded in the URL instead of the 
> configuration (and IMO that's fine - since the functionality hasn't been in 
> previous releases and since we want to discourage this practice especially 
> now that there are better alternatives). Weirdly, I only sometimes get this 
> failure on the HADOOP-13345 branch. But if the problem turns out to be what I 
> think it is, a simple Assume should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14123) Make AdlFileSystem a service provider for FileSystem

2017-03-02 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893211#comment-15893211
 ] 

John Zhuge commented on HADOOP-14123:
-

Sure

> Make AdlFileSystem a service provider for FileSystem
> 
>
> Key: HADOOP-14123
> URL: https://issues.apache.org/jira/browse/HADOOP-14123
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/adl
>Affects Versions: 3.0.0-alpha3
>Reporter: John Zhuge
>Assignee: John Zhuge
> Attachments: HADOOP-14123.001.patch
>
>
> Add a provider-configuration file giving the FS impl of {{AdlFileSystem}}; 
> remove the entry from core-default.xml



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14129) ITestS3ACredentialsInURL sometimes fails

2017-03-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893205#comment-15893205
 ] 

Mingliang Liu commented on HADOOP-14129:


Thanks [~fabbri] for your review. I believe not. I'll post a v5 patch soon. 
I also found that the current v4 patch is not able to skip the test with 
per-bucket configurations; the v5 patch will address that as well.

> ITestS3ACredentialsInURL sometimes fails
> 
>
> Key: HADOOP-14129
> URL: https://issues.apache.org/jira/browse/HADOOP-14129
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-14129-HADOOP-13345.001.patch, 
> HADOOP-14129-HADOOP-13345.002.patch, HADOOP-14129-HADOOP-13345.003.patch, 
> HADOOP-14129-HADOOP-13345.004.patch
>
>
> This test sometimes fails. I believe it's expected that DynamoDB doesn't have 
> access to the credentials if they're embedded in the URL instead of the 
> configuration (and IMO that's fine - since the functionality hasn't been in 
> previous releases and since we want to discourage this practice especially 
> now that there are better alternatives). Weirdly, I only sometimes get this 
> failure on the HADOOP-13345 branch. But if the problem turns out to be what I 
> think it is, a simple Assume should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14129) ITestS3ACredentialsInURL sometimes fails

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893217#comment-15893217
 ] 

Hadoop QA commented on HADOOP-14129:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 
32s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
19s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
50s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
27s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
34s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 38m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14129 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855705/HADOOP-14129-HADOOP-13345.004.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux c9c7c927801f 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / 0942c9f |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11747/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11747/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> ITestS3ACredentialsInURL sometimes fails
> 
>
> Key: HADOOP-14129
> URL: https://issues.apache.org/jira/browse/HADOOP-14129
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-14129-HADOOP-13345.001.patch, 
> HADOOP-14129-HADOOP-13345.002.patch, HADOOP-14129-HADOOP-1334

[jira] [Created] (HADOOP-14143) S3A Path Style Being Ignored

2017-03-02 Thread Vishnu Vardhan (JIRA)
Vishnu Vardhan created HADOOP-14143:
---

 Summary: S3A Path Style Being Ignored
 Key: HADOOP-14143
 URL: https://issues.apache.org/jira/browse/HADOOP-14143
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Vishnu Vardhan


Hi:

In the following example, the path style specification is being ignored:

scala> :paste
sc.setLogLevel("DEBUG")
sc.hadoopConfiguration.set("fs.s3a.impl","org.apache.hadoop.fs.s3a.S3AFileSystem")
sc.hadoopConfiguration.set("fs.s3a.endpoint","webscaledemo.netapp.com:8082")
sc.hadoopConfiguration.set("fs.s3a.access.key","")
sc.hadoopConfiguration.set("fs.s3a.secret.key","")
sc.hadoopConfiguration.set("fs.s3a.path.style.access","false")
val s3Rdd = sc.textFile("s3a://myBkt8")
s3Rdd.count()





Debug Log:


application/x-www-form-urlencoded; charset=utf-8
Thu, 02 Mar 2017 22:46:56 GMT
/myBkt8/"
17/03/02 14:46:56 DEBUG request: Sending Request: GET 
https://webscaledemo.netapp.com:8082 /myBkt8/ Parameters: (max-keys: 1, prefix: 
user/vardhan/, delimiter: /, ) Headers: (Authorization: AWS 
2SNAJYEMQU45YPVYC89D:PIQqLcr6FV61H0+Ay7tw3WygGFo=, User-Agent: 
aws-sdk-java/1.7.4 Mac_OS_X/10.12.3 
Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60, Date: Thu, 02 Mar 2017 
22:46:56 GMT, Content-Type: application/x-www-form-urlencoded; charset=utf-8, ) 
17/03/02 14:46:56 DEBUG PoolingClientConnectionManager: Connection request: 
[route: {s}->https://webscaledemo.netapp.com:8082][total kept alive: 0; route 
allocated: 0 of 15; total allocated: 0 of 15]
17/03/02 14:46:56 DEBUG PoolingClientConnectionManager: Connection leased: [id: 
2][route: {s}->https://webscaledemo.netapp.com:8082][total kept alive: 0; route 
allocated: 1 of 15; total allocated: 1 of 15]
17/03/02 14:46:56 DEBUG DefaultClientConnectionOperator: Connecting to 
webscaledemo.netapp.com:8082
17/03/02 14:46:57 DEBUG RequestAddCookies: CookieSpec selected: default
17/03/02 14:46:57 DEBUG RequestAuthCache: Auth cache not set in the context
17/03/02 14:46:57 DEBUG RequestProxyAuthentication: Proxy auth state: 
UNCHALLENGED
17/03/02 14:46:57 DEBUG SdkHttpClient: Attempt



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14142) S3A - Adding unexpected prefix

2017-03-02 Thread Vishnu Vardhan (JIRA)
Vishnu Vardhan created HADOOP-14142:
---

 Summary: S3A - Adding unexpected prefix
 Key: HADOOP-14142
 URL: https://issues.apache.org/jira/browse/HADOOP-14142
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Vishnu Vardhan
Priority: Critical


Hi:

S3A seems to add an unexpected prefix to my s3 path

Specifically, in the debug log below the following line is unexpected

>  GET /myBkt8/?max-keys=1&prefix=user%2Fvardhan%2F&delimiter=%2F HTTP/1.1

It is not clear where the "prefix" is coming from and why.


I executed the following commands

sc.setLogLevel("DEBUG")
sc.hadoopConfiguration.set("fs.s3a.impl","org.apache.hadoop.fs.s3a.S3AFileSystem")
sc.hadoopConfiguration.set("fs.s3a.endpoint","webscaledemo.netapp.com:8082")
sc.hadoopConfiguration.set("fs.s3a.access.key","")
sc.hadoopConfiguration.set("fs.s3a.secret.key","")
sc.hadoopConfiguration.set("fs.s3a.path.style.access","false")
val s3Rdd = sc.textFile("s3a://myBkt98")
s3Rdd.count()





debug log is below


application/x-www-form-urlencoded; charset=utf-8
Thu, 02 Mar 2017 22:40:25 GMT
/myBkt8/"
17/03/02 14:40:25 DEBUG request: Sending Request: GET 
https://webscaledemo.netapp.com:8082 /myBkt8/ Parameters: (max-keys: 1, prefix: 
user/vardhan/, delimiter: /, ) Headers: (Authorization: AWS 
2SNAJYEMQU45YPVYC89D:M8GbLXUuAJ2w5pGx4WJ6hJF3324=, User-Agent: 
aws-sdk-java/1.7.4 Mac_OS_X/10.12.3 
Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60, Date: Thu, 02 Mar 2017 
22:40:25 GMT, Content-Type: application/x-www-form-urlencoded; charset=utf-8, ) 
17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Connection request: 
[route: {s}->https://webscaledemo.netapp.com:8082][total kept alive: 0; route 
allocated: 0 of 15; total allocated: 0 of 15]
17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Connection leased: [id: 
10][route: {s}->https://webscaledemo.netapp.com:8082][total kept alive: 0; 
route allocated: 1 of 15; total allocated: 1 of 15]
17/03/02 14:40:25 DEBUG DefaultClientConnectionOperator: Connecting to 
webscaledemo.netapp.com:8082
17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Closing connections 
idle longer than 60 SECONDS
17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Closing connections 
idle longer than 60 SECONDS
17/03/02 14:40:26 DEBUG RequestAddCookies: CookieSpec selected: default
17/03/02 14:40:26 DEBUG RequestAuthCache: Auth cache not set in the context
17/03/02 14:40:26 DEBUG RequestProxyAuthentication: Proxy auth state: 
UNCHALLENGED
17/03/02 14:40:26 DEBUG SdkHttpClient: Attempt 1 to execute request
17/03/02 14:40:26 DEBUG DefaultClientConnection: Sending request: GET 
/myBkt8/?max-keys=1&prefix=user%2Fvardhan%2F&delimiter=%2F HTTP/1.1
17/03/02 14:40:26 DEBUG wire:  >> "GET 
/myBkt8/?max-keys=1&prefix=user%2Fvardhan%2F&delimiter=%2F HTTP/1.1[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  >> "Host: webscaledemo.netapp.com:8082[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  >> "Authorization: AWS 
2SNAJYEMQU45YPVYC89D:M8GbLXUuAJ2w5pGx4WJ6hJF3324=[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  >> "User-Agent: aws-sdk-java/1.7.4 
Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  >> "Date: Thu, 02 Mar 2017 22:40:25 GMT[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  >> "Content-Type: 
application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  >> "Connection: Keep-Alive[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  >> "[\r][\n]"
17/03/02 14:40:26 DEBUG headers: >> GET 
/myBkt8/?max-keys=1&prefix=user%2Fvardhan%2F&delimiter=%2F HTTP/1.1
17/03/02 14:40:26 DEBUG headers: >> Host: webscaledemo.netapp.com:8082
17/03/02 14:40:26 DEBUG headers: >> Authorization: AWS 
2SNAJYEMQU45YPVYC89D:M8GbLXUuAJ2w5pGx4WJ6hJF3324=
17/03/02 14:40:26 DEBUG headers: >> User-Agent: aws-sdk-java/1.7.4 
Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60
17/03/02 14:40:26 DEBUG headers: >> Date: Thu, 02 Mar 2017 22:40:25 GMT
17/03/02 14:40:26 DEBUG headers: >> Content-Type: 
application/x-www-form-urlencoded; charset=utf-8
17/03/02 14:40:26 DEBUG headers: >> Connection: Keep-Alive
17/03/02 14:40:26 DEBUG wire:  << "HTTP/1.1 200 OK[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  << "Date: Thu, 02 Mar 2017 22:40:26 GMT[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  << "Connection: KEEP-ALIVE[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  << "Server: StorageGRID/10.3.0.1[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  << "x-amz-request-id: 563477649[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  << "Content-Length: 266[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  << "Content-Type: application/xml[\r][\n]"
17/03/02 14:40:26 DEBUG wire:  << "[\r][\n]"
17/03/02 14:40:26 DEBUG DefaultClientConnection: Receiving response: HTTP/1.1 
200 OK
17/03/02 14:40:26 DEBUG headers: << HTTP/1.1 200 OK
17/03/02 14:40:26 DEBUG headers: << Date: Thu, 02 Mar 2017 22:40:26 GMT
17/03/02 14:40:26 DEBUG header

[jira] [Commented] (HADOOP-14129) ITestS3ACredentialsInURL sometimes fails

2017-03-02 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893177#comment-15893177
 ] 

Aaron Fabbri commented on HADOOP-14129:
---

+1 on v4 patch.  One minor question:

{code}
@@ -85,8 +91,6 @@ public void testInstantiateFromURL() throws Throwable {
 conf.unset(Constants.SECRET_KEY);
 fs = S3ATestUtils.createTestFileSystem(conf);
  
-// Skip in the case of S3Guard with DynamoDB because it cannot get
-// credentials for its own use if they're only in S3 URLs
 Assume.assumeFalse(fs.hasMetadataStore());
  
 String fsURI = fs.getUri().toString();
{code}
Do we still need this assumeFalse() too? 


> ITestS3ACredentialsInURL sometimes fails
> 
>
> Key: HADOOP-14129
> URL: https://issues.apache.org/jira/browse/HADOOP-14129
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-14129-HADOOP-13345.001.patch, 
> HADOOP-14129-HADOOP-13345.002.patch, HADOOP-14129-HADOOP-13345.003.patch, 
> HADOOP-14129-HADOOP-13345.004.patch
>
>
> This test sometimes fails. I believe it's expected that DynamoDB doesn't have 
> access to the credentials if they're embedded in the URL instead of the 
> configuration (and IMO that's fine - since the functionality hasn't been in 
> previous releases and since we want to discourage this practice especially 
> now that there are better alternatives). Weirdly, I only sometimes get this 
> failure on the HADOOP-13345 branch. But if the problem turns out to be what I 
> think it is, a simple Assume should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14140) S3A Not Working 3rd party S3 Interface

2017-03-02 Thread Vishnu Vardhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishnu Vardhan updated HADOOP-14140:

Description: 
Hi:


UPDATE: this stack trace is caused by there being 0 objects in the target 
directory; S3A is unable to handle 0 objects.

---
Connecting S3A to a 3rd party object store does not work. This is a publicly 
hosted grid and I can provide credentials if required. Please see the debug 
log below.

There are two problems -
1. Path Style setting is ignored, and S3A always uses host style addressing
2. Even when host style is specified, it is unable to proceed, see debug log



17/03/02 13:35:03 DEBUG HadoopRDD: Creating new JobConf and caching it for 
later re-use
17/03/02 13:35:03 DEBUG InternalConfig: Configuration override 
awssdk_config_override.json not found.
17/03/02 13:35:03 DEBUG AWSCredentialsProviderChain: Loading credentials from 
BasicAWSCredentialsProvider
17/03/02 13:35:03 DEBUG S3Signer: Calculated string to sign:
"HEAD

application/x-www-form-urlencoded; charset=utf-8
Thu, 02 Mar 2017 21:35:03 GMT
/solidfire/"
17/03/02 13:35:03 DEBUG request: Sending Request: HEAD 
https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082 / Headers: 
(Authorization: AWS 2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=, 
User-Agent: aws-sdk-java/1.7.4 Mac_OS_X/10.12.3 
Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60, Date: Thu, 02 Mar 2017 
21:35:03 GMT, Content-Type: application/x-www-form-urlencoded; charset=utf-8, ) 
17/03/02 13:35:03 DEBUG PoolingClientConnectionManager: Connection request: 
[route: 
{s}->https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082][total 
kept alive: 0; route allocated: 0 of 15; total allocated: 0 of 15]
17/03/02 13:35:03 DEBUG PoolingClientConnectionManager: Connection leased: [id: 
0][route: 
{s}->https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082][total 
kept alive: 0; route allocated: 1 of 15; total allocated: 1 of 15]
17/03/02 13:35:03 DEBUG DefaultClientConnectionOperator: Connecting to 
solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082
17/03/02 13:35:03 DEBUG RequestAddCookies: CookieSpec selected: default
17/03/02 13:35:03 DEBUG RequestAuthCache: Auth cache not set in the context
17/03/02 13:35:03 DEBUG RequestProxyAuthentication: Proxy auth state: 
UNCHALLENGED
17/03/02 13:35:03 DEBUG SdkHttpClient: Attempt 1 to execute request
17/03/02 13:35:03 DEBUG DefaultClientConnection: Sending request: HEAD / 
HTTP/1.1
17/03/02 13:35:03 DEBUG wire:  >> "HEAD / HTTP/1.1[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "Host: 
solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "Authorization: AWS 
2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "User-Agent: aws-sdk-java/1.7.4 
Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "Date: Thu, 02 Mar 2017 21:35:03 GMT[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "Content-Type: 
application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "Connection: Keep-Alive[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "[\r][\n]"
17/03/02 13:35:03 DEBUG headers: >> HEAD / HTTP/1.1
17/03/02 13:35:03 DEBUG headers: >> Host: 
solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082
17/03/02 13:35:03 DEBUG headers: >> Authorization: AWS 
2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=
17/03/02 13:35:03 DEBUG headers: >> User-Agent: aws-sdk-java/1.7.4 
Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60
17/03/02 13:35:03 DEBUG headers: >> Date: Thu, 02 Mar 2017 21:35:03 GMT
17/03/02 13:35:03 DEBUG headers: >> Content-Type: 
application/x-www-form-urlencoded; charset=utf-8
17/03/02 13:35:03 DEBUG headers: >> Connection: Keep-Alive
17/03/02 13:35:03 DEBUG wire:  << "HTTP/1.1 200 OK[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "Date: Thu, 02 Mar 2017 21:35:03 GMT[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "Connection: KEEP-ALIVE[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "Server: StorageGRID/10.3.0.1[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "x-amz-request-id: 640939184[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "Content-Length: 0[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "[\r][\n]"
17/03/02 13:35:03 DEBUG DefaultClientConnection: Receiving response: HTTP/1.1 
200 OK
17/03/02 13:35:03 DEBUG headers: << HTTP/1.1 200 OK
17/03/02 13:35:03 DEBUG headers: << Date: Thu, 02 Mar 2017 21:35:03 GMT
17/03/02 13:35:03 DEBUG headers: << Connection: KEEP-ALIVE
17/03/02 13:35:03 DEBUG headers: << Server: StorageGRID/10.3.0.1
17/03/02 13:35:03 DEBUG headers: << x-amz-request-id: 640939184
17/03/02 13:35:03 DEBUG headers: << Content-Length: 0
17/03/02 13:35:03 DEBUG SdkHttpClient: Connection can be kept alive indefinitely
17/03/02 13:35:04 DEBUG PoolingClientConnectionManager: Connection [id: 
0][route: {s}->https://s

[jira] [Updated] (HADOOP-14068) Add integration test version of TestMetadataStore for DynamoDB

2017-03-02 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-14068:
---
Attachment: HADOOP-14068-HADOOP-13345.007.patch

> Add integration test version of TestMetadataStore for DynamoDB
> --
>
> Key: HADOOP-14068
> URL: https://issues.apache.org/jira/browse/HADOOP-14068
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14068-HADOOP-13345.001.patch, 
> HADOOP-14068-HADOOP-13345.002.patch, HADOOP-14068-HADOOP-13345.003.patch, 
> HADOOP-14068-HADOOP-13345.004.patch, HADOOP-14068-HADOOP-13345.005.patch, 
> HADOOP-14068-HADOOP-13345.006.patch, HADOOP-14068-HADOOP-13345.007.patch
>
>
> I tweaked TestDynamoDBMetadataStore to run against the actual Amazon 
> DynamoDB service (as opposed to the "local" edition). Several tests failed 
> because of minor variations in behavior. I think the behavioral differences 
> that are clearly possible are enough to warrant extending that class as an 
> ITest (but obviously keeping the existing test, so 99% of the coverage 
> remains even when not configured for actual DynamoDB usage).
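> A minimal sketch of that split - class names illustrative:
> {code}
> // Hypothetical: inherit every assertion from the unit test, but point the
> // MetadataStore at the live DynamoDB service instead of the local edition.
> public class ITestDynamoDBMetadataStore extends TestDynamoDBMetadataStore {
>   // override the client/endpoint setup here to target the real service
> }
> {code}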



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14068) Add integration test version of TestMetadataStore for DynamoDB

2017-03-02 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-14068:
---
Status: Open  (was: Patch Available)

Withdrawing patch. I've updated it so that it compiles and *mostly* works on 
top of the recent ClientFactory refactor, but there are still 5-6 failing 
tests that need to be investigated.

> Add integration test version of TestMetadataStore for DynamoDB
> --
>
> Key: HADOOP-14068
> URL: https://issues.apache.org/jira/browse/HADOOP-14068
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14068-HADOOP-13345.001.patch, 
> HADOOP-14068-HADOOP-13345.002.patch, HADOOP-14068-HADOOP-13345.003.patch, 
> HADOOP-14068-HADOOP-13345.004.patch, HADOOP-14068-HADOOP-13345.005.patch, 
> HADOOP-14068-HADOOP-13345.006.patch
>
>
> I tweaked TestDynamoDBMetadataStore to run against the actual Amazon 
> DynamoDB service (as opposed to the "local" edition). Several tests failed 
> because of minor variations in behavior. I think the behavioral differences 
> that are clearly possible are enough to warrant extending that class as an 
> ITest (but obviously keeping the existing test, so 99% of the coverage 
> remains even when not configured for actual DynamoDB usage).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14132) Filesystem discovery to stop loading implementation classes

2017-03-02 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893148#comment-15893148
 ] 

Daryn Sharp commented on HADOOP-14132:
--

Sounds generally reasonable.  More than anything, reducing the absurd class 
load times will be very welcome!

I'd caution against bundling other features like auto-config loading. There 
are often unforeseen complications with what should be simple changes. I 
could envision bugs due to unexpected resource loading, order of loading, 
etc. If that happened, I wouldn't want the core classloader speedup to be 
reverted.

> Filesystem discovery to stop loading implementation classes
> ---
>
> Key: HADOOP-14132
> URL: https://issues.apache.org/jira/browse/HADOOP-14132
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/adl, fs/azure, fs/oss, fs/s3, fs/swift
>Affects Versions: 2.7.3
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>
> Integration testing of Hadoop with HADOOP-14040 has shown that the move to 
> a shaded AWS JAR is slowing all hadoop client code down.
> I believe this is due to how we use service discovery to identify FS 
> implementations: the implementation classes themselves are instantiated.
> This has known problems today with classloading, but it clearly impacts 
> performance too, especially with complex transitive dependencies unique to 
> the loaded class.
> Proposed: have lightweight service declaration classes which implement an 
> interface declaring
> # schema
> # classname of FileSystem impl
> # classname of AbstractFS impl
> # homepage (for third party code, support, etc)
> These are what we register and scan in the FS to look for services.
> This leaves the question of what to do for existing filesystems. I think 
> we'll need to retain the old code for external ones while moving the hadoop 
> modules to the new mechanism (see the sketch below).
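> A rough sketch of such a declaration interface - names illustrative:
> {code}
> // Hypothetical lightweight declaration: cheap to load, carries only the
> // metadata needed for discovery; no FileSystem class is instantiated.
> public interface FileSystemServiceInfo {
>   String getScheme();                // e.g. "s3a"
>   String getFileSystemClassName();   // FileSystem implementation
>   String getAbstractFsClassName();   // AbstractFileSystem implementation
>   String getHomepage();              // support link for third-party code
> }
> {code}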



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14129) ITestS3ACredentialsInURL sometimes fails

2017-03-02 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-14129:
---
Attachment: HADOOP-14129-HADOOP-13345.004.patch

> ITestS3ACredentialsInURL sometimes fails
> 
>
> Key: HADOOP-14129
> URL: https://issues.apache.org/jira/browse/HADOOP-14129
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-14129-HADOOP-13345.001.patch, 
> HADOOP-14129-HADOOP-13345.002.patch, HADOOP-14129-HADOOP-13345.003.patch, 
> HADOOP-14129-HADOOP-13345.004.patch
>
>
> This test sometimes fails. I believe it's expected that DynamoDB doesn't have 
> access to the credentials if they're embedded in the URL instead of the 
> configuration (and IMO that's fine - since the functionality hasn't been in 
> previous releases and since we want to discourage this practice especially 
> now that there are better alternatives). Weirdly, I only sometimes get this 
> failure on the HADOOP-13345 branch. But if the problem turns out to be what I 
> think it is, a simple Assume should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893132#comment-15893132
 ] 

Hudson commented on HADOOP-14138:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11333 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11333/])
HADOOP-14138. Remove S3A ref from META-INF service discovery, rely on (stevel: 
rev a97833e0ed4b31f0403ee3d789163615c7cdd9af)
* (edit) 
hadoop-tools/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem


> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha3
>
> Attachments: HADOOP-14138.001.patch, HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing performance 
> when starting all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there 
> too



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14138:

Description: 
As discussed in HADOOP-14132, the shaded AWS library is killing performance 
when starting all hadoop operations, due to classloading on FS service 
discovery.

This is despite the fact that there is an entry for fs.s3a.impl in 
core-default.xml; *we don't need service discovery here*

Proposed:
# cut the entry from 
{{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
# when HADOOP-14132 is in, move to that, including declaring an XML file 
exclusively for s3a entries

I want this one in first as it's a major performance regression, and one we 
could actually backport to 2.7.x, just to improve load time slightly there too

  was:
As discussed in HADOOP-14132, the shaded AWS library is killing performance 
when starting all hadoop operations, due to classloading on FS service 
discovery.

This is despite the fact that there is an entry for fs.s3a.impl in 
core-default.xml; *we don't need service discovery here*

Proposed:
# cut the entry from 
{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
# when HADOOP-14132 is in, move to that, including declaring an XML file 
exclusively for s3a entries

I want this one in first as it's a major performance regression, and one we 
could actually backport to 2.7.x, just to improve load time slightly there too


> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha3
>
> Attachments: HADOOP-14138.001.patch, HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing performance 
> when starting all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there 
> too



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-14083) KMS should support old SSL clients

2017-03-02 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893116#comment-15893116
 ] 

John Zhuge edited comment on HADOOP-14083 at 3/2/17 10:06 PM:
--

Filed a follow-up HADOOP-14141 Store KMS SSL keystore password in 
catalina.properties.


was (Author: jzhuge):
Filed a follow-up HDFS-11490 Store KMS SSL keystore password in 
catalina.properties.

> KMS should support old SSL clients
> --
>
> Key: HADOOP-14083
> URL: https://issues.apache.org/jira/browse/HADOOP-14083
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.8.0, 2.7.4, 2.6.6
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: HADOOP-14083.branch-2.001.patch, 
> HADOOP-14083.branch-2.002.patch
>
>
> HADOOP-13812 upgraded Tomcat to 6.0.48 which filters weak ciphers. Old SSL 
> clients such as curl stop working. The symptom is {{NSS error -12286}} when 
> running {{curl -v}}.
> Instead of forcing the SSL clients to upgrade, we can configure Tomcat to 
> explicitly allow enough weak ciphers so that old SSL clients can work.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Moved] (HADOOP-14141) Store KMS SSL keystore password in catalina.properties

2017-03-02 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge moved HDFS-11490 to HADOOP-14141:


Affects Version/s: (was: 2.9.0)
   2.9.0
 Target Version/s: 2.9.0  (was: 2.9.0)
  Component/s: (was: kms)
   kms
  Key: HADOOP-14141  (was: HDFS-11490)
  Project: Hadoop Common  (was: Hadoop HDFS)

> Store KMS SSL keystore password in catalina.properties
> --
>
> Key: HADOOP-14141
> URL: https://issues.apache.org/jira/browse/HADOOP-14141
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.9.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>
> HADOOP-14083 stores SSL ciphers in catalina.properties. We can do the same 
> for the SSL keystore password, and thus no longer need the current {{sed}} 
> method:
> {noformat}
> # If ssl, then populate the passwords into ssl-server.xml before starting 
> tomcat
> if [ ! "${KMS_SSL_KEYSTORE_PASS}" = "" ] || [ ! "${KMS_SSL_TRUSTSTORE_PASS}" 
> = "" ]; then
>   # Set a KEYSTORE_PASS if not already set
>   KMS_SSL_KEYSTORE_PASS=${KMS_SSL_KEYSTORE_PASS:-password}
>   KMS_SSL_KEYSTORE_PASS_ESCAPED=$(hadoop_escape "$KMS_SSL_KEYSTORE_PASS")
>   KMS_SSL_TRUSTSTORE_PASS_ESCAPED=$(hadoop_escape "$KMS_SSL_TRUSTSTORE_PASS")
>   cat ${CATALINA_BASE}/conf/ssl-server.xml.conf \
> | sed 
> 's/"_kms_ssl_keystore_pass_"/'"\"${KMS_SSL_KEYSTORE_PASS_ESCAPED}\""'/g' \
> | sed 
> 's/"_kms_ssl_truststore_pass_"/'"\"${KMS_SSL_TRUSTSTORE_PASS_ESCAPED}\""'/g' 
> > ${CATALINA_BASE}/conf/ssl-server.xml
> fi
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14083) KMS should support old SSL clients

2017-03-02 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893116#comment-15893116
 ] 

John Zhuge commented on HADOOP-14083:
-

Filed a follow-up HDFS-11490 Store KMS SSL keystore password in 
catalina.properties.

> KMS should support old SSL clients
> --
>
> Key: HADOOP-14083
> URL: https://issues.apache.org/jira/browse/HADOOP-14083
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.8.0, 2.7.4, 2.6.6
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: HADOOP-14083.branch-2.001.patch, 
> HADOOP-14083.branch-2.002.patch
>
>
> HADOOP-13812 upgraded Tomcat to 6.0.48 which filters weak ciphers. Old SSL 
> clients such as curl stop working. The symptom is {{NSS error -12286}} when 
> running {{curl -v}}.
> Instead of forcing the SSL clients to upgrade, we can configure Tomcat to 
> explicitly allow enough weak ciphers so that old SSL clients can work.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated HADOOP-14138:

Fix Version/s: 2.7.4

I committed this to branch-2.7 as well.

> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha3
>
> Attachments: HADOOP-14138.001.patch, HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing performance 
> when starting all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*
> Proposed:
> # cut the entry from 
> {/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there 
> too



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14140) S3A Not Working 3rd party S3 Interface

2017-03-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893109#comment-15893109
 ] 

Steve Loughran commented on HADOOP-14140:
-

Vishnu, we ought to be handling this; I think [~Thomas Demoor] tests this way 
against their in-house endpoint.

What's interesting here is that a 200 is coming back, yet the XML result is 
triggering an interpretation as a 404.

# Can you check out and build Hadoop branch-2.8.0? This is what is about to 
ship, so it is the last chance to get a fix in.
# Don't bother with building Spark; just see if you can set up the hadoop-aws 
tests (only bother with the s3a ones). See: 
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md
# On the failure, it'd be good to have the full stack trace; Spark appears to 
have dropped the interesting bits (i.e. we don't know exactly which operation 
failed).

There are major changes between 2.7 and 2.8 here (everything in HADOOP-11694); 
hopefully this will have been addressed too. We just need your assistance in 
testing and debugging the problem.



> S3A Not Working 3rd party S3 Interface
> --
>
> Key: HADOOP-14140
> URL: https://issues.apache.org/jira/browse/HADOOP-14140
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.3
>Reporter: Vishnu Vardhan
>
> Hi:
> Connecting S3A to a 3rd party object store does not work. This is a 
> publicly hosted grid and I can provide credentials if required. Please see 
> the debug log below.
> There are two problems -
> 1. Path Style setting is ignored, and S3A always uses host style addressing
> 2. Even when host style is specified, it is unable to proceed, see debug log
> 17/03/02 13:35:03 DEBUG HadoopRDD: Creating new JobConf and caching it for 
> later re-use
> 17/03/02 13:35:03 DEBUG InternalConfig: Configuration override 
> awssdk_config_override.json not found.
> 17/03/02 13:35:03 DEBUG AWSCredentialsProviderChain: Loading credentials from 
> BasicAWSCredentialsProvider
> 17/03/02 13:35:03 DEBUG S3Signer: Calculated string to sign:
> "HEAD
> application/x-www-form-urlencoded; charset=utf-8
> Thu, 02 Mar 2017 21:35:03 GMT
> /solidfire/"
> 17/03/02 13:35:03 DEBUG request: Sending Request: HEAD 
> https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082 / Headers: 
> (Authorization: AWS 2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=, 
> User-Agent: aws-sdk-java/1.7.4 Mac_OS_X/10.12.3 
> Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60, Date: Thu, 02 Mar 2017 
> 21:35:03 GMT, Content-Type: application/x-www-form-urlencoded; charset=utf-8, 
> ) 
> 17/03/02 13:35:03 DEBUG PoolingClientConnectionManager: Connection request: 
> [route: 
> {s}->https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082][total 
> kept alive: 0; route allocated: 0 of 15; total allocated: 0 of 15]
> 17/03/02 13:35:03 DEBUG PoolingClientConnectionManager: Connection leased: 
> [id: 0][route: 
> {s}->https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082][total 
> kept alive: 0; route allocated: 1 of 15; total allocated: 1 of 15]
> 17/03/02 13:35:03 DEBUG DefaultClientConnectionOperator: Connecting to 
> solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082
> 17/03/02 13:35:03 DEBUG RequestAddCookies: CookieSpec selected: default
> 17/03/02 13:35:03 DEBUG RequestAuthCache: Auth cache not set in the context
> 17/03/02 13:35:03 DEBUG RequestProxyAuthentication: Proxy auth state: 
> UNCHALLENGED
> 17/03/02 13:35:03 DEBUG SdkHttpClient: Attempt 1 to execute request
> 17/03/02 13:35:03 DEBUG DefaultClientConnection: Sending request: HEAD / 
> HTTP/1.1
> 17/03/02 13:35:03 DEBUG wire:  >> "HEAD / HTTP/1.1[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Host: 
> solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Authorization: AWS 
> 2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "User-Agent: aws-sdk-java/1.7.4 
> Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Date: Thu, 02 Mar 2017 21:35:03 
> GMT[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Content-Type: 
> application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Connection: Keep-Alive[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "[\r][\n]"
> 17/03/02 13:35:03 DEBUG headers: >> HEAD / HTTP/1.1
> 17/03/02 13:35:03 DEBUG headers: >> Host: 
> solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082
> 17/03/02 13:35:03 DEBUG headers: >> Authorization: AWS 
> 2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=
> 17/03/02 13:35:03 DEBUG headers: >> User-Agent: aws-sdk-java/1.7.4 
> Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/

[jira] [Updated] (HADOOP-14140) S3A Not Working 3rd party S3 Interface

2017-03-02 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14140:

Environment: Non-AWS s3 implementation
Component/s: fs/s3

> S3A Not Working 3rd party S3 Interface
> --
>
> Key: HADOOP-14140
> URL: https://issues.apache.org/jira/browse/HADOOP-14140
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.3
> Environment: Non-AWS s3 implementation
>Reporter: Vishnu Vardhan
>
> Hi:
> Connecting S3A to a 3rd party object store does not work. This is a 
> publicly hosted grid and I can provide credentials if required. Please see 
> the debug log below.
> There are two problems -
> 1. Path Style setting is ignored, and S3A always uses host style addressing
> 2. Even when host style is specified, it is unable to proceed, see debug log
> 17/03/02 13:35:03 DEBUG HadoopRDD: Creating new JobConf and caching it for 
> later re-use
> 17/03/02 13:35:03 DEBUG InternalConfig: Configuration override 
> awssdk_config_override.json not found.
> 17/03/02 13:35:03 DEBUG AWSCredentialsProviderChain: Loading credentials from 
> BasicAWSCredentialsProvider
> 17/03/02 13:35:03 DEBUG S3Signer: Calculated string to sign:
> "HEAD
> application/x-www-form-urlencoded; charset=utf-8
> Thu, 02 Mar 2017 21:35:03 GMT
> /solidfire/"
> 17/03/02 13:35:03 DEBUG request: Sending Request: HEAD 
> https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082 / Headers: 
> (Authorization: AWS 2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=, 
> User-Agent: aws-sdk-java/1.7.4 Mac_OS_X/10.12.3 
> Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60, Date: Thu, 02 Mar 2017 
> 21:35:03 GMT, Content-Type: application/x-www-form-urlencoded; charset=utf-8, 
> ) 
> 17/03/02 13:35:03 DEBUG PoolingClientConnectionManager: Connection request: 
> [route: 
> {s}->https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082][total 
> kept alive: 0; route allocated: 0 of 15; total allocated: 0 of 15]
> 17/03/02 13:35:03 DEBUG PoolingClientConnectionManager: Connection leased: 
> [id: 0][route: 
> {s}->https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082][total 
> kept alive: 0; route allocated: 1 of 15; total allocated: 1 of 15]
> 17/03/02 13:35:03 DEBUG DefaultClientConnectionOperator: Connecting to 
> solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082
> 17/03/02 13:35:03 DEBUG RequestAddCookies: CookieSpec selected: default
> 17/03/02 13:35:03 DEBUG RequestAuthCache: Auth cache not set in the context
> 17/03/02 13:35:03 DEBUG RequestProxyAuthentication: Proxy auth state: 
> UNCHALLENGED
> 17/03/02 13:35:03 DEBUG SdkHttpClient: Attempt 1 to execute request
> 17/03/02 13:35:03 DEBUG DefaultClientConnection: Sending request: HEAD / 
> HTTP/1.1
> 17/03/02 13:35:03 DEBUG wire:  >> "HEAD / HTTP/1.1[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Host: 
> solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Authorization: AWS 
> 2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "User-Agent: aws-sdk-java/1.7.4 
> Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Date: Thu, 02 Mar 2017 21:35:03 
> GMT[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Content-Type: 
> application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Connection: Keep-Alive[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "[\r][\n]"
> 17/03/02 13:35:03 DEBUG headers: >> HEAD / HTTP/1.1
> 17/03/02 13:35:03 DEBUG headers: >> Host: 
> solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082
> 17/03/02 13:35:03 DEBUG headers: >> Authorization: AWS 
> 2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=
> 17/03/02 13:35:03 DEBUG headers: >> User-Agent: aws-sdk-java/1.7.4 
> Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60
> 17/03/02 13:35:03 DEBUG headers: >> Date: Thu, 02 Mar 2017 21:35:03 GMT
> 17/03/02 13:35:03 DEBUG headers: >> Content-Type: 
> application/x-www-form-urlencoded; charset=utf-8
> 17/03/02 13:35:03 DEBUG headers: >> Connection: Keep-Alive
> 17/03/02 13:35:03 DEBUG wire:  << "HTTP/1.1 200 OK[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "Date: Thu, 02 Mar 2017 21:35:03 
> GMT[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "Connection: KEEP-ALIVE[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "Server: StorageGRID/10.3.0.1[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "x-amz-request-id: 640939184[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "Content-Length: 0[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "[\r][\n]"
> 17/03/02 13:35:03 DEBUG DefaultClientConnection: Receiving response: HTTP/1.1 
> 200 OK
> 17/03/02 13:35:03 DEBUG headers: << HTTP/1.1 200 OK
> 17/03/02 13:35:03 DE

[jira] [Commented] (HADOOP-14104) Client should always ask namenode for kms provider path.

2017-03-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893104#comment-15893104
 ] 

Yongjun Zhang commented on HADOOP-14104:


Hm, my JIRA window had not refreshed, so I missed the updates from you all when 
I posted my last comment. Thanks for the further discussion.


> Client should always ask namenode for kms provider path.
> 
>
> Key: HADOOP-14104
> URL: https://issues.apache.org/jira/browse/HADOOP-14104
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch
>
>
> According to the current implementation of the kms provider in client conf, 
> there can only be one kms.
> In a multi-cluster environment, if a client is reading encrypted data from 
> multiple clusters it will only get the kms token for the local cluster.
> Not sure whether the target version is correct or not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14140) S3A Not Working 3rd party S3 Interface

2017-03-02 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14140:

Priority: Major  (was: Blocker)

> S3A Not Working 3rd party S3 Interface
> --
>
> Key: HADOOP-14140
> URL: https://issues.apache.org/jira/browse/HADOOP-14140
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.7.3
>Reporter: Vishnu Vardhan
>
> Hi:
> Connecting S3A to a 3rd party object store does not work. This is a publicly 
> hosted grid and I can provide credentials if required. Please see the debug 
> log below.
> There are two problems:
> 1. The path-style access setting is ignored, and S3A always uses host-style 
> addressing.
> 2. Even when host-style addressing is specified, it is unable to proceed; see 
> the debug log.
> 17/03/02 13:35:03 DEBUG HadoopRDD: Creating new JobConf and caching it for 
> later re-use
> 17/03/02 13:35:03 DEBUG InternalConfig: Configuration override 
> awssdk_config_override.json not found.
> 17/03/02 13:35:03 DEBUG AWSCredentialsProviderChain: Loading credentials from 
> BasicAWSCredentialsProvider
> 17/03/02 13:35:03 DEBUG S3Signer: Calculated string to sign:
> "HEAD
> application/x-www-form-urlencoded; charset=utf-8
> Thu, 02 Mar 2017 21:35:03 GMT
> /solidfire/"
> 17/03/02 13:35:03 DEBUG request: Sending Request: HEAD 
> https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082 / Headers: 
> (Authorization: AWS 2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=, 
> User-Agent: aws-sdk-java/1.7.4 Mac_OS_X/10.12.3 
> Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60, Date: Thu, 02 Mar 2017 
> 21:35:03 GMT, Content-Type: application/x-www-form-urlencoded; charset=utf-8, 
> ) 
> 17/03/02 13:35:03 DEBUG PoolingClientConnectionManager: Connection request: 
> [route: 
> {s}->https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082][total 
> kept alive: 0; route allocated: 0 of 15; total allocated: 0 of 15]
> 17/03/02 13:35:03 DEBUG PoolingClientConnectionManager: Connection leased: 
> [id: 0][route: 
> {s}->https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082][total 
> kept alive: 0; route allocated: 1 of 15; total allocated: 1 of 15]
> 17/03/02 13:35:03 DEBUG DefaultClientConnectionOperator: Connecting to 
> solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082
> 17/03/02 13:35:03 DEBUG RequestAddCookies: CookieSpec selected: default
> 17/03/02 13:35:03 DEBUG RequestAuthCache: Auth cache not set in the context
> 17/03/02 13:35:03 DEBUG RequestProxyAuthentication: Proxy auth state: 
> UNCHALLENGED
> 17/03/02 13:35:03 DEBUG SdkHttpClient: Attempt 1 to execute request
> 17/03/02 13:35:03 DEBUG DefaultClientConnection: Sending request: HEAD / 
> HTTP/1.1
> 17/03/02 13:35:03 DEBUG wire:  >> "HEAD / HTTP/1.1[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Host: 
> solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Authorization: AWS 
> 2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "User-Agent: aws-sdk-java/1.7.4 
> Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Date: Thu, 02 Mar 2017 21:35:03 
> GMT[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Content-Type: 
> application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "Connection: Keep-Alive[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  >> "[\r][\n]"
> 17/03/02 13:35:03 DEBUG headers: >> HEAD / HTTP/1.1
> 17/03/02 13:35:03 DEBUG headers: >> Host: 
> solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082
> 17/03/02 13:35:03 DEBUG headers: >> Authorization: AWS 
> 2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=
> 17/03/02 13:35:03 DEBUG headers: >> User-Agent: aws-sdk-java/1.7.4 
> Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60
> 17/03/02 13:35:03 DEBUG headers: >> Date: Thu, 02 Mar 2017 21:35:03 GMT
> 17/03/02 13:35:03 DEBUG headers: >> Content-Type: 
> application/x-www-form-urlencoded; charset=utf-8
> 17/03/02 13:35:03 DEBUG headers: >> Connection: Keep-Alive
> 17/03/02 13:35:03 DEBUG wire:  << "HTTP/1.1 200 OK[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "Date: Thu, 02 Mar 2017 21:35:03 
> GMT[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "Connection: KEEP-ALIVE[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "Server: StorageGRID/10.3.0.1[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "x-amz-request-id: 640939184[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "Content-Length: 0[\r][\n]"
> 17/03/02 13:35:03 DEBUG wire:  << "[\r][\n]"
> 17/03/02 13:35:03 DEBUG DefaultClientConnection: Receiving response: HTTP/1.1 
> 200 OK
> 17/03/02 13:35:03 DEBUG headers: << HTTP/1.1 200 OK
> 17/03/02 13:35:03 DEBUG headers: << Date: Thu, 02 Mar 2017 21:35:03 GMT
> 17/03/02 13:35:03 DEBUG headers: << Connection: KEEP-A

[jira] [Updated] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated HADOOP-14138:

Fix Version/s: 3.0.0-alpha3

> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Fix For: 2.8.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14138.001.patch, HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing performance 
> when starting all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there too
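For reference, this is the shape of the existing core-default.xml entry the 
description relies on; it is quoted from memory for illustration, so check it 
against the release you build:

{code}
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  <description>The implementation class of the S3A Filesystem</description>
</property>
{code}

With that entry present, s3a:// URLs can be resolved from the configuration 
alone, without scanning META-INF/services and eagerly loading the 
implementation class.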



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14140) S3A Not Working 3rd party S3 Interface

2017-03-02 Thread Vishnu Vardhan (JIRA)
Vishnu Vardhan created HADOOP-14140:
---

 Summary: S3A Not Working 3rd party S3 Interface
 Key: HADOOP-14140
 URL: https://issues.apache.org/jira/browse/HADOOP-14140
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.7.3
Reporter: Vishnu Vardhan
Priority: Blocker


Hi:

Connecting S3A to a 3rd party object store does not work. This is a publicly 
hosted grid and I can provide credentials if required. Please see the debug 
log below.

There are two problems:
1. The path-style access setting is ignored, and S3A always uses host-style 
addressing (see the configuration sketch after this list).
2. Even when host-style addressing is specified, it is unable to proceed; see 
the debug log.
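For readers triaging similar reports, a minimal sketch of the client-side 
settings involved, using the endpoint from the log below. Note that the 
fs.s3a.path.style.access switch was added after the 2.7 line, which may be why 
it appears to be ignored on the reporter's 2.7.3 build:

{code}
<property>
  <name>fs.s3a.endpoint</name>
  <value>solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082</value>
</property>
<property>
  <name>fs.s3a.path.style.access</name>
  <value>true</value>
</property>
{code}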



17/03/02 13:35:03 DEBUG HadoopRDD: Creating new JobConf and caching it for 
later re-use
17/03/02 13:35:03 DEBUG InternalConfig: Configuration override 
awssdk_config_override.json not found.
17/03/02 13:35:03 DEBUG AWSCredentialsProviderChain: Loading credentials from 
BasicAWSCredentialsProvider
17/03/02 13:35:03 DEBUG S3Signer: Calculated string to sign:
"HEAD

application/x-www-form-urlencoded; charset=utf-8
Thu, 02 Mar 2017 21:35:03 GMT
/solidfire/"
17/03/02 13:35:03 DEBUG request: Sending Request: HEAD 
https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082 / Headers: 
(Authorization: AWS 2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=, 
User-Agent: aws-sdk-java/1.7.4 Mac_OS_X/10.12.3 
Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60, Date: Thu, 02 Mar 2017 
21:35:03 GMT, Content-Type: application/x-www-form-urlencoded; charset=utf-8, ) 
17/03/02 13:35:03 DEBUG PoolingClientConnectionManager: Connection request: 
[route: 
{s}->https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082][total 
kept alive: 0; route allocated: 0 of 15; total allocated: 0 of 15]
17/03/02 13:35:03 DEBUG PoolingClientConnectionManager: Connection leased: [id: 
0][route: 
{s}->https://solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082][total 
kept alive: 0; route allocated: 1 of 15; total allocated: 1 of 15]
17/03/02 13:35:03 DEBUG DefaultClientConnectionOperator: Connecting to 
solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082
17/03/02 13:35:03 DEBUG RequestAddCookies: CookieSpec selected: default
17/03/02 13:35:03 DEBUG RequestAuthCache: Auth cache not set in the context
17/03/02 13:35:03 DEBUG RequestProxyAuthentication: Proxy auth state: 
UNCHALLENGED
17/03/02 13:35:03 DEBUG SdkHttpClient: Attempt 1 to execute request
17/03/02 13:35:03 DEBUG DefaultClientConnection: Sending request: HEAD / 
HTTP/1.1
17/03/02 13:35:03 DEBUG wire:  >> "HEAD / HTTP/1.1[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "Host: 
solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "Authorization: AWS 
2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "User-Agent: aws-sdk-java/1.7.4 
Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "Date: Thu, 02 Mar 2017 21:35:03 GMT[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "Content-Type: 
application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "Connection: Keep-Alive[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  >> "[\r][\n]"
17/03/02 13:35:03 DEBUG headers: >> HEAD / HTTP/1.1
17/03/02 13:35:03 DEBUG headers: >> Host: 
solidfire.vmasgwwebg01-tst.webscaledemo.netapp.com:8082
17/03/02 13:35:03 DEBUG headers: >> Authorization: AWS 
2SNAJYEMQU45YPVYC89D:WO0R+mPeYoQ2V29L4dMUJSSSVsQ=
17/03/02 13:35:03 DEBUG headers: >> User-Agent: aws-sdk-java/1.7.4 
Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60
17/03/02 13:35:03 DEBUG headers: >> Date: Thu, 02 Mar 2017 21:35:03 GMT
17/03/02 13:35:03 DEBUG headers: >> Content-Type: 
application/x-www-form-urlencoded; charset=utf-8
17/03/02 13:35:03 DEBUG headers: >> Connection: Keep-Alive
17/03/02 13:35:03 DEBUG wire:  << "HTTP/1.1 200 OK[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "Date: Thu, 02 Mar 2017 21:35:03 GMT[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "Connection: KEEP-ALIVE[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "Server: StorageGRID/10.3.0.1[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "x-amz-request-id: 640939184[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "Content-Length: 0[\r][\n]"
17/03/02 13:35:03 DEBUG wire:  << "[\r][\n]"
17/03/02 13:35:03 DEBUG DefaultClientConnection: Receiving response: HTTP/1.1 
200 OK
17/03/02 13:35:03 DEBUG headers: << HTTP/1.1 200 OK
17/03/02 13:35:03 DEBUG headers: << Date: Thu, 02 Mar 2017 21:35:03 GMT
17/03/02 13:35:03 DEBUG headers: << Connection: KEEP-ALIVE
17/03/02 13:35:03 DEBUG headers: << Server: StorageGRID/10.3.0.1
17/03/02 13:35:03 DEBUG headers: << x-amz-request-id: 640939184
17/03/02 13:35:03 DEBUG headers: << Content-Length: 0
17/03/02 13:35:03 DEBUG SdkHttpClient: Connection can be kept alive indefinitely
17/03/02 13:35:04 DEBUG PoolingClientConnectionM

[jira] [Updated] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14138:

   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Thanks, committed to 2.8.0+, just to save some trouble there. There's no 
penalty for backporting to 2.7 if you want to do this, Jason.

> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: HADOOP-14138.001.patch, HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing performance 
> when starting all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there too



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893065#comment-15893065
 ] 

Hadoop QA commented on HADOOP-14138:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
19s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 16m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14138 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855695/HADOOP-14138.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  |
| uname | Linux cbc3f557766e 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / b3ec531 |
| Default Java | 1.8.0_121 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11746/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11746/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Attachments: HADOOP-14138.001.patch, HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing performance 
> when starting all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an X

[jira] [Commented] (HADOOP-14062) ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when RPC privacy is enabled

2017-03-02 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893031#comment-15893031
 ] 

Jian He commented on HADOOP-14062:
--

Sorry, I forgot about this. I uploaded a dummy patch for branch-2.8.0; if it 
fails the same way, we can ignore these failures and get this committed.

> ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when 
> RPC privacy is enabled
> --
>
> Key: HADOOP-14062
> URL: https://issues.apache.org/jira/browse/HADOOP-14062
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Steven Rand
>Priority: Critical
> Attachments: HADOOP-14062.001.patch, HADOOP-14062.002.patch, 
> HADOOP-14062.003.patch, HADOOP-14062-branch-2.8.0.004.patch, 
> HADOOP-14062-branch-2.8.0.005.patch, HADOOP-14062-branch-2.8.0.005.patch, 
> HADOOP-14062-branch-2.8.0.dummy.patch, yarn-rm-log.txt
>
>
> When privacy is enabled for RPC (hadoop.rpc.protection = privacy), 
> {{ApplicationMasterProtocolPBClientImpl.allocate}} sometimes (but not always) 
> fails with an EOFException. I've reproduced this with Spark 2.0.2 built 
> against latest branch-2.8 and with a simple distcp job on latest branch-2.8.
> Steps to reproduce using distcp:
> 1. Set hadoop.rpc.protection equal to privacy
> 2. Write data to HDFS. I did this with Spark as follows: 
> {code}
> sc.parallelize(1 to (5*1024*1024)).map(k => Seq(k, 
> org.apache.commons.lang.RandomStringUtils.random(1024, 
> "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWxyZ0123456789")).mkString("|")).toDF().repartition(100).write.parquet("hdfs:///tmp/testData")
> {code}
> 3. Attempt to distcp that data to another location in HDFS. For example:
> {code}
> hadoop distcp -Dmapreduce.framework.name=yarn hdfs:///tmp/testData 
> hdfs:///tmp/testDataCopy
> {code}
> I observed this error in the ApplicationMaster's syslog:
> {code}
> 2016-12-19 19:13:50,097 INFO [eventHandlingThread] 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer 
> setup for JobId: job_1482189777425_0004, File: 
> hdfs://:8020/tmp/hadoop-yarn/staging//.staging/job_1482189777425_0004/job_1482189777425_0004_1.jhist
> 2016-12-19 19:13:51,004 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 
> AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 
> HostLocal:0 RackLocal:0
> 2016-12-19 19:13:51,031 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() 
> for application_1482189777425_0004: ask=1 release= 0 newContainers=0 
> finishedContainers=0 resourcelimit= knownNMs=3
> 2016-12-19 19:13:52,043 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking 
> ApplicationMasterProtocolPBClientImpl.allocate over null. Retrying after 
> sleeping for 3ms.
> java.io.EOFException: End of File Exception between local host is: 
> "/"; destination host is: "":8030; 
> : java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1486)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1428)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1338)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy80.allocate(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
>   at 
> org

[jira] [Updated] (HADOOP-14062) ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when RPC privacy is enabled

2017-03-02 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated HADOOP-14062:
-
Status: Open  (was: Patch Available)

> ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when 
> RPC privacy is enabled
> --
>
> Key: HADOOP-14062
> URL: https://issues.apache.org/jira/browse/HADOOP-14062
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Steven Rand
>Priority: Critical
> Attachments: HADOOP-14062.001.patch, HADOOP-14062.002.patch, 
> HADOOP-14062.003.patch, HADOOP-14062-branch-2.8.0.004.patch, 
> HADOOP-14062-branch-2.8.0.005.patch, HADOOP-14062-branch-2.8.0.005.patch, 
> HADOOP-14062-branch-2.8.0.dummy.patch, yarn-rm-log.txt
>
>
> When privacy is enabled for RPC (hadoop.rpc.protection = privacy), 
> {{ApplicationMasterProtocolPBClientImpl.allocate}} sometimes (but not always) 
> fails with an EOFException. I've reproduced this with Spark 2.0.2 built 
> against latest branch-2.8 and with a simple distcp job on latest branch-2.8.
> Steps to reproduce using distcp:
> 1. Set hadoop.rpc.protection equal to privacy
> 2. Write data to HDFS. I did this with Spark as follows: 
> {code}
> sc.parallelize(1 to (5*1024*1024)).map(k => Seq(k, 
> org.apache.commons.lang.RandomStringUtils.random(1024, 
> "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWxyZ0123456789")).mkString("|")).toDF().repartition(100).write.parquet("hdfs:///tmp/testData")
> {code}
> 3. Attempt to distcp that data to another location in HDFS. For example:
> {code}
> hadoop distcp -Dmapreduce.framework.name=yarn hdfs:///tmp/testData 
> hdfs:///tmp/testDataCopy
> {code}
> I observed this error in the ApplicationMaster's syslog:
> {code}
> 2016-12-19 19:13:50,097 INFO [eventHandlingThread] 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer 
> setup for JobId: job_1482189777425_0004, File: 
> hdfs://:8020/tmp/hadoop-yarn/staging//.staging/job_1482189777425_0004/job_1482189777425_0004_1.jhist
> 2016-12-19 19:13:51,004 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 
> AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 
> HostLocal:0 RackLocal:0
> 2016-12-19 19:13:51,031 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() 
> for application_1482189777425_0004: ask=1 release= 0 newContainers=0 
> finishedContainers=0 resourcelimit= knownNMs=3
> 2016-12-19 19:13:52,043 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking 
> ApplicationMasterProtocolPBClientImpl.allocate over null. Retrying after 
> sleeping for 3ms.
> java.io.EOFException: End of File Exception between local host is: 
> "/"; destination host is: "":8030; 
> : java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1486)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1428)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1338)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy80.allocate(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Ca

[jira] [Updated] (HADOOP-14062) ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when RPC privacy is enabled

2017-03-02 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated HADOOP-14062:
-
Attachment: HADOOP-14062-branch-2.8.0.dummy.patch

> ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when 
> RPC privacy is enabled
> --
>
> Key: HADOOP-14062
> URL: https://issues.apache.org/jira/browse/HADOOP-14062
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Steven Rand
>Priority: Critical
> Attachments: HADOOP-14062.001.patch, HADOOP-14062.002.patch, 
> HADOOP-14062.003.patch, HADOOP-14062-branch-2.8.0.004.patch, 
> HADOOP-14062-branch-2.8.0.005.patch, HADOOP-14062-branch-2.8.0.005.patch, 
> HADOOP-14062-branch-2.8.0.dummy.patch, yarn-rm-log.txt
>
>
> When privacy is enabled for RPC (hadoop.rpc.protection = privacy), 
> {{ApplicationMasterProtocolPBClientImpl.allocate}} sometimes (but not always) 
> fails with an EOFException. I've reproduced this with Spark 2.0.2 built 
> against latest branch-2.8 and with a simple distcp job on latest branch-2.8.
> Steps to reproduce using distcp:
> 1. Set hadoop.rpc.protection equal to privacy
> 2. Write data to HDFS. I did this with Spark as follows: 
> {code}
> sc.parallelize(1 to (5*1024*1024)).map(k => Seq(k, 
> org.apache.commons.lang.RandomStringUtils.random(1024, 
> "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWxyZ0123456789")).mkString("|")).toDF().repartition(100).write.parquet("hdfs:///tmp/testData")
> {code}
> 3. Attempt to distcp that data to another location in HDFS. For example:
> {code}
> hadoop distcp -Dmapreduce.framework.name=yarn hdfs:///tmp/testData 
> hdfs:///tmp/testDataCopy
> {code}
> I observed this error in the ApplicationMaster's syslog:
> {code}
> 2016-12-19 19:13:50,097 INFO [eventHandlingThread] 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer 
> setup for JobId: job_1482189777425_0004, File: 
> hdfs://:8020/tmp/hadoop-yarn/staging//.staging/job_1482189777425_0004/job_1482189777425_0004_1.jhist
> 2016-12-19 19:13:51,004 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 
> AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 
> HostLocal:0 RackLocal:0
> 2016-12-19 19:13:51,031 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() 
> for application_1482189777425_0004: ask=1 release= 0 newContainers=0 
> finishedContainers=0 resourcelimit= knownNMs=3
> 2016-12-19 19:13:52,043 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking 
> ApplicationMasterProtocolPBClientImpl.allocate over null. Retrying after 
> sleeping for 3ms.
> java.io.EOFException: End of File Exception between local host is: 
> "/"; destination host is: "":8030; 
> : java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1486)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1428)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1338)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy80.allocate(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
>   at 
> org.apache.hadoop.io.retry.RetryInvocat

[jira] [Commented] (HADOOP-14104) Client should always ask namenode for kms provider path.

2017-03-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893027#comment-15893027
 ] 

Yongjun Zhang commented on HADOOP-14104:


Discussed with [~jojochuang] and [~jzhuge]; we looked at the code together and 
saw that the KMS provider already supports multiple key provider servers in 
its configuration, for example:

{code}
<property>
  <name>hadoop.security.key.provider.path</name>
  <value>kms://ht...@kms01.example.com;kms02.example.com:16000/kms</value>
</property>
{code}

and
{code}
 * If multiple hosts are provider, the Factory will create a
 * {@link LoadBalancingKMSClientProvider} that round-robins requests
 * across the provided list of hosts.
{code}
This is a form of KeyProvider HA handling (also handles load balancing).  

[~andrew.wang], 
{quote}
I like that getServerDefaults is lock-free, but I'm still worried about the 
overhead. MR tasks are short lived and thus don't benefit from the caching. 
This also affects all clients, on both encrypted and unencrypted clusters. I 
think getServerDefault is also currently only called when SASL is enabled. Have 
you done any performance testing of this RPC?
{quote}
The getServerDefaults mechanism has been there for a while, and the patch here 
just adds an additional field. Calling it once an hour should not be a problem 
to me; at least it's not a regression. It's just that if things change within 
the hour (for example, all the old key provider hosts are replaced with new 
ones), the errors may not be handled correctly, but that's less of a concern 
assuming such changes are rare. I don't see that SASL is checked when 
getServerDefaults is called.

Thanks.











> Client should always ask namenode for kms provider path.
> 
>
> Key: HADOOP-14104
> URL: https://issues.apache.org/jira/browse/HADOOP-14104
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch
>
>
> According to the current implementation of the kms provider in client conf, 
> there can only be one kms.
> In a multi-cluster environment, if a client is reading encrypted data from 
> multiple clusters it will only get the kms token for the local cluster.
> Not sure whether the target version is correct or not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated HADOOP-14138:

Attachment: HADOOP-14138.001.patch

Attaching essentially the same patch for trunk to get a Jenkins run against 
trunk before the commit tomorrow.

> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Attachments: HADOOP-14138.001.patch, HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing performance 
> when starting all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there too



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14104) Client should always ask namenode for kms provider path.

2017-03-02 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892979#comment-15892979
 ] 

Daryn Sharp commented on HADOOP-14104:
--

bq. ... how client handle namenode HA ...  we specify keyproviderservice in 
config file ...

The need for configs is the problem, so more configs is not the answer (aside: 
the NN HA token handling is an example of exactly what not to do).

bq. One thing I might recommend is that we don't query getServerDefaults after 
we get the KP initially. 

Enabling EZ on a cluster must not require a restart of all daemon and proxy 
services that communicate with said cluster.  It can't be cached forever.

––

I reviewed Rushabh's approach with him this morning.  The main goal should be 
config-free token acquisition and selection.  How do we get there?

The first challenge is: how does a client intelligently request a kms token, 
when needed, and from the right kms?  The NN is the authoritative and dynamic 
source for the correct kms, à la this patch.  Token acquisition should use the 
kp uri provided by the NN, and I'm not too worried about caching when a typical 
cluster has a few dozen app submits/sec (equaling token requests) vs tens of 
thousands of NN ops/sec.  This is only a small part of the problem.

The second challenge is: how does a client select the correct kms for a given 
NN?  The client could again ask the NN, but then you stumble into the morass of 
caching.  However, as soon as the NN reports a different key provider than when 
a job launched, clients won't be able to find a token for the new kms - even 
when the old one is still legit.  Now jobs fail that should/could have 
completed.  It's very messy.  The simpler answer is that a client should always 
use the key provider for a given NN as it existed when the token was acquired 
(i.e. job submit).

So how do we implement a config-free mapping of NN to key provider?  When the 
hdfs and kms tokens are acquired, we need a way to later associate them as a 
pair.  I think the cleanest/most-compatible way is leveraging the Credentials 
instead of the config.  We could inject a mapping of filesystem uri to kms uri 
via the secrets map.  Then, when the client needs to talk to the kms, it can 
check the map, else fall back to getServerDefaults.
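To make the secrets-map idea concrete, here is a rough sketch, not the 
committed design; the class name and the alias scheme are illustrative only, 
though Credentials.addSecretKey/getSecretKey are the real API:

{code}
import java.net.URI;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;

// Hypothetical helper: record at job submit which KMS the NN reported for a
// filesystem, so tasks can later pair the hdfs and kms tokens without config.
public final class KmsUriMapping {
  private static Text alias(URI fsUri) {
    return new Text("kms-mapping/" + fsUri);   // illustrative alias scheme
  }

  // Job-submit side: remember the fs-uri -> kms-uri association.
  public static void record(Credentials creds, URI fsUri, URI kmsUri) {
    creds.addSecretKey(alias(fsUri),
        kmsUri.toString().getBytes(StandardCharsets.UTF_8));
  }

  // Task side: prefer the recorded mapping; a null return means the caller
  // falls back to asking the NN via getServerDefaults.
  public static URI lookup(Credentials creds, URI fsUri) {
    byte[] bytes = creds.getSecretKey(alias(fsUri));
    return bytes == null
        ? null
        : URI.create(new String(bytes, StandardCharsets.UTF_8));
  }
}
{code}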


> Client should always ask namenode for kms provider path.
> 
>
> Key: HADOOP-14104
> URL: https://issues.apache.org/jira/browse/HADOOP-14104
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch
>
>
> According to the current implementation of the kms provider in client conf, 
> there can only be one kms.
> In a multi-cluster environment, if a client is reading encrypted data from 
> multiple clusters it will only get the kms token for the local cluster.
> Not sure whether the target version is correct or not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892955#comment-15892955
 ] 

Mingliang Liu commented on HADOOP-14138:


+1

> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Attachments: HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing performance 
> when starting all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there too



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14062) ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when RPC privacy is enabled

2017-03-02 Thread Steven Rand (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892915#comment-15892915
 ] 

Steven Rand commented on HADOOP-14062:
--

[~jianhe], I'm not really sure what to do about this since we can't get the 
tests to pass, but the failures don't seem related to the patch as far as I can 
tell. Do you have any suggestions for how to proceed?

> ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when 
> RPC privacy is enabled
> --
>
> Key: HADOOP-14062
> URL: https://issues.apache.org/jira/browse/HADOOP-14062
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Steven Rand
>Priority: Critical
> Attachments: HADOOP-14062.001.patch, HADOOP-14062.002.patch, 
> HADOOP-14062.003.patch, HADOOP-14062-branch-2.8.0.004.patch, 
> HADOOP-14062-branch-2.8.0.005.patch, HADOOP-14062-branch-2.8.0.005.patch, 
> yarn-rm-log.txt
>
>
> When privacy is enabled for RPC (hadoop.rpc.protection = privacy), 
> {{ApplicationMasterProtocolPBClientImpl.allocate}} sometimes (but not always) 
> fails with an EOFException. I've reproduced this with Spark 2.0.2 built 
> against latest branch-2.8 and with a simple distcp job on latest branch-2.8.
> Steps to reproduce using distcp:
> 1. Set hadoop.rpc.protection equal to privacy
> 2. Write data to HDFS. I did this with Spark as follows: 
> {code}
> sc.parallelize(1 to (5*1024*1024)).map(k => Seq(k, 
> org.apache.commons.lang.RandomStringUtils.random(1024, 
> "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWxyZ0123456789")).mkString("|")).toDF().repartition(100).write.parquet("hdfs:///tmp/testData")
> {code}
> 3. Attempt to distcp that data to another location in HDFS. For example:
> {code}
> hadoop distcp -Dmapreduce.framework.name=yarn hdfs:///tmp/testData 
> hdfs:///tmp/testDataCopy
> {code}
> I observed this error in the ApplicationMaster's syslog:
> {code}
> 2016-12-19 19:13:50,097 INFO [eventHandlingThread] 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer 
> setup for JobId: job_1482189777425_0004, File: 
> hdfs://:8020/tmp/hadoop-yarn/staging//.staging/job_1482189777425_0004/job_1482189777425_0004_1.jhist
> 2016-12-19 19:13:51,004 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 
> AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 
> HostLocal:0 RackLocal:0
> 2016-12-19 19:13:51,031 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() 
> for application_1482189777425_0004: ask=1 release= 0 newContainers=0 
> finishedContainers=0 resourcelimit= knownNMs=3
> 2016-12-19 19:13:52,043 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking 
> ApplicationMasterProtocolPBClientImpl.allocate over null. Retrying after 
> sleeping for 3ms.
> java.io.EOFException: End of File Exception between local host is: 
> "/"; destination host is: "":8030; 
> : java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1486)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1428)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1338)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy80.allocate(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationH

[jira] [Commented] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

2017-03-02 Thread Mike Percy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892879#comment-15892879
 ] 

Mike Percy commented on HADOOP-8522:


I got some review feedback on this offline; I think the patch needs to be 
updated again. This was the feedback:

Why does resetState() write a new header to the stream, versus, say, doing it 
lazily if and when more data is written?
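A hypothetical sketch of the lazy alternative being asked about; the field and 
helper names are invented for illustration and are not from the patch:

{code}
// Inside a ResetableGzipOutputStream-like class: defer the new gzip header
// until data is actually written after a reset, instead of emitting it
// eagerly in resetState().
private boolean headerPending = false;

public void resetState() throws IOException {
  deflater.reset();        // reset the DEFLATE state for the next member
  headerPending = true;    // do not write the header yet
}

@Override
public void write(byte[] b, int off, int len) throws IOException {
  if (headerPending && len > 0) {
    writeGzipHeader();     // assumed helper emitting the fixed 10-byte header
    headerPending = false;
  }
  super.write(b, off, len);
}
{code}

This would avoid emitting a dangling header (and thus an empty trailing gzip 
member) when resetState() is called but no further data arrives.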

> ResetableGzipOutputStream creates invalid gzip files when finish() and 
> resetState() are used
> 
>
> Key: HADOOP-8522
> URL: https://issues.apache.org/jira/browse/HADOOP-8522
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Mike Percy
>Assignee: Mike Percy
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-8522-4.patch
>
>
> ResetableGzipOutputStream creates invalid gzip files when finish() and 
> resetState() are used. The issue is that finish() flushes the compressor 
> buffer and writes the gzip CRC32 + data length trailer. After that, 
> resetState() does not repeat the gzip header, but simply starts writing more 
> deflate-compressed data. The resultant files are not readable by the Linux 
> "gunzip" tool. ResetableGzipOutputStream should write valid multi-member gzip 
> files.
> The gzip format is specified in [RFC 
> 1952|https://tools.ietf.org/html/rfc1952].



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892850#comment-15892850
 ] 

Steve Loughran commented on HADOOP-14138:
-

Thanks.

Why against branch-2? The reason I went for branch-2 is that branch-2 registers 
s3:// as a URL; we've dropped that from trunk, so the patches won't backport. A 
branch-2 patch works against what I'm trying to do internally, and I can change 
the trunk patch as it goes in.

> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Attachments: HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing performance 
> when starting all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there too



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14104) Client should always ask namenode for kms provider path.

2017-03-02 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892840#comment-15892840
 ] 

Andrew Wang commented on HADOOP-14104:
--

Hi Yongjun, what you describe is how the existing KMS HA already works.

One thing I might recommend is that we don't query getServerDefaults after we 
get the KP initially. This way we don't need to worry about the value changing 
later (or an exception being thrown).

> Client should always ask namenode for kms provider path.
> 
>
> Key: HADOOP-14104
> URL: https://issues.apache.org/jira/browse/HADOOP-14104
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch
>
>
> According to the current implementation of the kms provider in client conf, 
> there can only be one kms.
> In a multi-cluster environment, if a client is reading encrypted data from 
> multiple clusters it will only get the kms token for the local cluster.
> Not sure whether the target version is correct or not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14104) Client should always ask namenode for kms provider path.

2017-03-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892797#comment-15892797
 ] 

Yongjun Zhang commented on HADOOP-14104:


Thanks [~andrew.wang].

I gave this some more thought. Instead of the periodic polling, I think a 
better solution is to do for the KeyProvider what the client already does for 
namenode HA. Say we specify a keyproviderservice in the config file to 
associate it with a list of KeyProviders; if one KeyProvider is down, the 
client can try the next one in the list (client failover). This is essentially 
KeyProvider HA (see the sketch below).  But this would be a larger-scope 
solution. Does this make sense to you?
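Purely as an illustration of that idea, a hypothetical configuration shape; 
none of these property names exist today:

{code}
<property>
  <name>hadoop.security.key.provider.service</name>
  <value>kmsservice1</value>
</property>
<property>
  <name>hadoop.security.key.providers.kmsservice1</name>
  <value>kms://http@kms01.example.com:16000/kms,kms://http@kms02.example.com:16000/kms</value>
</property>
{code}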






> Client should always ask namenode for kms provider path.
> 
>
> Key: HADOOP-14104
> URL: https://issues.apache.org/jira/browse/HADOOP-14104
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch
>
>
> According to the current implementation of the kms provider in client conf, 
> there can only be one kms.
> In a multi-cluster environment, if a client is reading encrypted data from 
> multiple clusters it will only get the kms token for the local cluster.
> Not sure whether the target version is correct or not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14139) Tracing canonized server name from HTTP request during SPNEGO

2017-03-02 Thread Xiaoyu Yao (JIRA)
Xiaoyu Yao created HADOOP-14139:
---

 Summary: Tracing canonized server name from HTTP request during 
SPNEGO
 Key: HADOOP-14139
 URL: https://issues.apache.org/jira/browse/HADOOP-14139
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Reporter: Xiaoyu Yao
Assignee: Hanisha Koneru
Priority: Minor


The serverName can be helpful to troubleshoot SPNEGO-related authentication 
issues.

{code}
 final String serverName = InetAddress.getByName(request.getServerName())
   .getCanonicalHostName();
{code}
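A sketch of the kind of trace logging being proposed; the log message is 
illustrative, not the committed change:

{code}
// Sketch only: assumes an slf4j-style LOG in the handler class.
final String serverName = InetAddress.getByName(request.getServerName())
    .getCanonicalHostName();
LOG.debug("SPNEGO: request server name {} canonicalized to {}",
    request.getServerName(), serverName);
{code}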




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13665) Erasure Coding codec should support fallback coder

2017-03-02 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892751#comment-15892751
 ] 

Wei-Chiu Chuang commented on HADOOP-13665:
--

FYI [~lewuathe],
see the comment at 
https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=15890254&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15890254
There are folks who are implementing a new EC codec. Let's make sure the patch 
is pluggable so that adding a new EC codec is easy.

Thanks!

> Erasure Coding codec should support fallback coder
> --
>
> Key: HADOOP-13665
> URL: https://issues.apache.org/jira/browse/HADOOP-13665
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: io
>Reporter: Wei-Chiu Chuang
>Assignee: Kai Sasaki
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HADOOP-13665.01.patch, HADOOP-13665.02.patch, 
> HADOOP-13665.03.patch, HADOOP-13665.04.patch
>
>
> The current EC codec supports a single coder only (by default the pure Java 
> implementation). If the native coder is specified but unavailable, it should 
> fall back to the pure Java implementation.
> One possible solution is to follow the convention of existing Hadoop native 
> codecs, such as transport encryption (see {{CryptoCodec.java}}), which 
> supports fallback by specifying two or more coders as the value of the 
> property and loading coders in order.
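
For reference, the existing CryptoCodec fallback convention in 
core-default.xml looks like this; an EC analogue could list coder factories 
in the same preference order (any EC property name would still have to be 
decided):

{code}
<!-- Existing convention: coders are tried left to right, so the JCE coder is
     the fallback when the OpenSSL-based one cannot be loaded. -->
<property>
  <name>hadoop.security.crypto.codec.classes.aes.ctr.nopadding</name>
  <value>org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec, org.apache.hadoop.crypto.JceAesCtrCryptoCodec</value>
</property>
{code}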






[jira] [Commented] (HADOOP-14132) Filesystem discovery to stop loading implementation classes

2017-03-02 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892604#comment-15892604
 ] 

Jason Lowe commented on HADOOP-14132:
-

Seems reasonable.  If the resource file is specifically intended for injecting 
configuration defaults, then something like 
getDefaultConfigurationResourceName would make what's intended there clearer.

> Filesystem discovery to stop loading implementation classes
> ---
>
> Key: HADOOP-14132
> URL: https://issues.apache.org/jira/browse/HADOOP-14132
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/adl, fs/azure, fs/oss, fs/s3, fs/swift
>Affects Versions: 2.7.3
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>
> Integration testing of Hadoop with HADOOP-14040 has shown that the 
> move to a shaded AWS JAR is slowing all hadoop client code down.
> I believe this is due to how we use service discovery to identify FS 
> implementations: the implementation classes themselves are instantiated.
> This has known problems today with classloading, but clearly impacts 
> performance too, especially with complex transitive dependencies unique to 
> the loaded class.
> Proposed: have lightweight service declaration classes which implement an 
> interface declaring
> # schema
> # classname of the FileSystem impl
> # classname of the AbstractFS impl
> # homepage (for third-party code, support, etc.)
> These are what we register and scan for in the FS when looking for services 
> (see the sketch below).
> This leaves the question of what to do about existing filesystems. I think 
> we'll need to retain the old code for external ones, while moving the 
> hadoop modules to the new mechanism.
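
A minimal sketch of the declaration interface proposed above (all names 
hypothetical):

{code}
// Hypothetical sketch: a declaration class is cheap to instantiate and has no
// transitive dependencies; the real FileSystem class loads only on first use.
public interface FileSystemServiceDeclaration {
  String getScheme();                      // e.g. "s3a"
  String getFileSystemClassName();         // FQCN of the FileSystem impl
  String getAbstractFileSystemClassName(); // FQCN of the AbstractFileSystem impl
  String getHomepage();                    // support link for third-party code
}
{code}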






[jira] [Commented] (HADOOP-14137) Faster distcp by taking file list from fsimage or -lsr result

2017-03-02 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892582#comment-15892582
 ] 

Erik Krogen commented on HADOOP-14137:
--

+1 on this; we have just recently made similar efforts when trying to do a 
DistCp of very large numbers of files, and I think it is useful in general. 
One note: you'll also have to provide a way for the user to specify what 
directory the files should be considered relative to (e.g. if one of the 
listed files is "/user/erik/dir/file", how much of that directory structure 
ends up being replicated on the target; see the illustration below).

Also agreed with Steve that {{--listingFile}} is better than {{-list}}. 
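
As an illustration, using an existing helper (assuming 
{{DistCpUtils.getRelativePath}} keeps its current behaviour):

{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.tools.util.DistCpUtils;

// With source root /user/erik, the listed file lands at <target>/dir/file;
// with root /user it would land at <target>/erik/dir/file instead.
String rel = DistCpUtils.getRelativePath(
    new Path("/user/erik"), new Path("/user/erik/dir/file"));
// rel now identifies dir/file relative to /user/erik
{code}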

> Faster distcp by taking file list from fsimage or -lsr result
> -
>
> Key: HADOOP-14137
> URL: https://issues.apache.org/jira/browse/HADOOP-14137
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Reporter: Zheng Shao
>
> DistCp is very slow to start when the src directory has a huge number of 
> subdirectories.  In our case, we already have the directory listing (via 
> "hdfs oiv -i fsimage" or via nightly "hdfs dfs -ls -R /" dumps), and we would 
> like to use that instead of doing a realtime listing on the NameNode.
> The "-f" option doesn't help in this case because it would try to put 
> everything into a single flat target directory.
> We'd like to introduce a new option "-list <file>" for distcp.  The <file> 
> contains the result of listing the src directory.
> In order to achieve this, we plan to:
> 1. Add a new CopyListing class, PregeneratedCopyListing, similar to 
> SimpleCopyListing, which doesn't "-ls -R" into the directory but takes the 
> listing via "-list"
> 2. Add an option "-list <file>" which will automatically make distcp use the 
> new PregeneratedCopyListing class.
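
A trivial sketch of the listing-file parsing the plan implies, assuming one 
absolute path per line (the real patch would build DistCp's sequence-file 
listing instead):

{code}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.Path;

// Sketch: read a pregenerated listing into Path objects, skipping blank lines.
// PregeneratedCopyListing would consume something along these lines.
static List<Path> readListing(String listingFile) throws IOException {
  List<Path> paths = new ArrayList<>();
  try (BufferedReader reader = new BufferedReader(new FileReader(listingFile))) {
    String line;
    while ((line = reader.readLine()) != null) {
      if (!line.trim().isEmpty()) {
        paths.add(new Path(line.trim()));
      }
    }
  }
  return paths;
}
{code}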






[jira] [Commented] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892569#comment-15892569
 ] 

Jason Lowe commented on HADOOP-14138:
-

+1, patch looks good to me.  I agree that this would be a good candidate for 
2.8 and 2.7.  Will commit this tomorrow if there are no objections.

Curious why the patch wasn't against trunk instead of branch-2.  Is there a 
reason this shouldn't go into trunk?  It applies (mostly) cleanly there, and I 
see the fs.s3a.impl property is still in core-default there.


> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Attachments: HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing the startup 
> performance of all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*.
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there too






[jira] [Commented] (HADOOP-14094) Rethink S3GuardTool options

2017-03-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892367#comment-15892367
 ] 

Sean Mackrory commented on HADOOP-14094:


Correct my comment :) HADOOP-14130.

> Rethink S3GuardTool options
> ---
>
> Key: HADOOP-14094
> URL: https://issues.apache.org/jira/browse/HADOOP-14094
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14094-HADOOP-13345.001.patch, 
> HADOOP-14094-HADOOP-13345.002.patch, HADOOP-14094-HADOOP-13345.003.patch, 
> HADOOP-14094-HADOOP-13345.003.patch, HADOOP-14094-HADOOP-13345.004.patch
>
>
> I think we need to rework the S3GuardTool options. A couple of problems I've 
> observed in the patches I've built on top of it and in watching other 
> developers try it out:
> * We should probably wrap the current commands in an S3Guard-specific 
> command, since 'init', 'destroy', etc. don't touch the buckets at all.
> * Convert to whole-word options, as the single-letter options are already 
> getting overloaded. Some patches I've submitted have added functionality 
> where the obvious flag is already in use (e.g. -r for both region and read 
> throughput; -m for both minutes and metadatastore uri).  I may do this early as 
> part of HADOOP-14090.
> * We have some options that must be in the config in some cases, and can be 
> in the command in other cases. But I've seen someone try to specify the table 
> name in the config and leave out the -m option, with no luck. Also, since 
> commands hard-code table auto-creation, you might have configured table 
> auto-creation, try to import to a non-existent table, and it tells you table 
> auto-creation is off.
> We need a more consistent policy for how things should get configured that 
> addresses these problems and future-proofs the command a bit more.






[jira] [Comment Edited] (HADOOP-14094) Rethink S3GuardTool options

2017-03-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892367#comment-15892367
 ] 

Sean Mackrory edited comment on HADOOP-14094 at 3/2/17 3:01 PM:


Corrected my comment :) HADOOP-14130.


was (Author: mackrorysd):
Correct my comment :) HADOOP-14130.

> Rethink S3GuardTool options
> ---
>
> Key: HADOOP-14094
> URL: https://issues.apache.org/jira/browse/HADOOP-14094
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14094-HADOOP-13345.001.patch, 
> HADOOP-14094-HADOOP-13345.002.patch, HADOOP-14094-HADOOP-13345.003.patch, 
> HADOOP-14094-HADOOP-13345.003.patch, HADOOP-14094-HADOOP-13345.004.patch
>
>
> I think we need to rework the S3GuardTool options. A couple of problems I've 
> observed in the patches I've built on top of it and in watching other 
> developers try it out:
> * We should probably wrap the current commands in an S3Guard-specific 
> command, since 'init', 'destroy', etc. don't touch the buckets at all.
> * Convert to whole-word options, as the single-letter options are already 
> getting overloaded. Some patches I've submitted have added functionality 
> where the obvious flag is already in use (e.g. -r for both region and read 
> throughput; -m for both minutes and metadatastore uri).  I may do this early as 
> part of HADOOP-14090.
> * We have some options that must be in the config in some cases, and can be 
> in the command in other cases. But I've seen someone try to specify the table 
> name in the config and leave out the -m option, with no luck. Also, since 
> commands hard-code table auto-creation, you might have configured table 
> auto-creation, try to import to a non-existent table, and it tells you table 
> auto-creation is off.
> We need a more consistent policy for how things should get configured that 
> addresses these problems and future-proofs the command a bit more.






[jira] [Comment Edited] (HADOOP-14094) Rethink S3GuardTool options

2017-03-02 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890384#comment-15890384
 ] 

Sean Mackrory edited comment on HADOOP-14094 at 3/2/17 3:01 PM:


Now conflicting with HADOOP-14130 changes.


was (Author: mackrorysd):
Now conflicting with HADOOP-13130 changes.

> Rethink S3GuardTool options
> ---
>
> Key: HADOOP-14094
> URL: https://issues.apache.org/jira/browse/HADOOP-14094
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14094-HADOOP-13345.001.patch, 
> HADOOP-14094-HADOOP-13345.002.patch, HADOOP-14094-HADOOP-13345.003.patch, 
> HADOOP-14094-HADOOP-13345.003.patch, HADOOP-14094-HADOOP-13345.004.patch
>
>
> I think we need to rework the S3GuardTool options. A couple of problems I've 
> observed in the patches I've built on top of it and in watching other 
> developers try it out:
> * We should probably wrap the current commands in an S3Guard-specific 
> command, since 'init', 'destroy', etc. don't touch the buckets at all.
> * Convert to whole-word options, as the single-letter options are already 
> getting overloaded. Some patches I've submitted have added functionality 
> where the obvious flag is already in use (e.g. -r for both region and read 
> throughput; -m for both minutes and metadatastore uri).  I may do this early as 
> part of HADOOP-14090.
> * We have some options that must be in the config in some cases, and can be 
> in the command in other cases. But I've seen someone try to specify the table 
> name in the config and leave out the -m option, with no luck. Also, since 
> commands hard-code table auto-creation, you might have configured table 
> auto-creation, try to import to a non-existent table, and it tells you table 
> auto-creation is off.
> We need a more consistent policy for how things should get configured that 
> addresses these problems and future-proofs the command a bit more.






[jira] [Commented] (HADOOP-14137) Faster distcp by taking file list from fsimage or -lsr result

2017-03-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892319#comment-15892319
 ] 

Steve Loughran commented on HADOOP-14137:
-

A few more thoughts:

# the listing should be randomized before the copy begins. We've seen 
performance benefits there related to hotspots in object stores.
# if distcp could also generate the listing file of everything it copies, 
including the path at the far end and the checksums of both source and dest, 
then it could be used for incremental copying between any two filesystems, 
each of which supported any checksum mechanism, even when the mechanisms 
differed between the two filesystems. Instead of verifying that dest checksum 
== src checksum, we could verify that src == cached source checksum and that 
dest == cached dest value; any difference would trigger a copy (see the 
sketch below).
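
A sketch of that comparison rule (types simplified to strings; a real version 
would compare FileChecksum objects and decide what to do when a store cannot 
return a checksum at all):

{code}
// Copy unless BOTH ends still match what the cached listing recorded;
// a missing live checksum forces a copy, erring on the safe side.
static boolean needsCopy(String cachedSrc, String cachedDst,
                         String liveSrc, String liveDst) {
  boolean srcUnchanged = liveSrc != null && liveSrc.equals(cachedSrc);
  boolean dstUnchanged = liveDst != null && liveDst.equals(cachedDst);
  return !(srcUnchanged && dstUnchanged);
}
{code}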

> Faster distcp by taking file list from fsimage or -lsr result
> -
>
> Key: HADOOP-14137
> URL: https://issues.apache.org/jira/browse/HADOOP-14137
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Reporter: Zheng Shao
>
> DistCp is very slow to start when the src directory has a huge number of 
> subdirectories.  In our case, we already have the directory listing (via 
> "hdfs oiv -i fsimage" or via nightly "hdfs dfs -ls -R /" dumps), and we would 
> like to use that instead of doing a realtime listing on the NameNode.
> The "-f" option doesn't help in this case because it would try to put 
> everything into a single flat target directory.
> We'd like to introduce a new option "-list <file>" for distcp.  The <file> 
> contains the result of listing the src directory.
> In order to achieve this, we plan to:
> 1. Add a new CopyListing class, PregeneratedCopyListing, similar to 
> SimpleCopyListing, which doesn't "-ls -R" into the directory but takes the 
> listing via "-list"
> 2. Add an option "-list <file>" which will automatically make distcp use the 
> new PregeneratedCopyListing class.






[jira] [Commented] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892196#comment-15892196
 ] 

Steve Loughran commented on HADOOP-14138:
-

I did run one test with the FileSystem log set to DEBUG, so it prints what 
goes on with discovery.

s3 and s3n are coming in from the service loader; s3a is coming in from the 
core-default file. It probably always did, given that the entry existed ... that 
service load has always been unneeded:
{code}
2017-03-02 12:54:07,050 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3147)) - Loading filesystems
2017-03-02 12:54:07,057 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3159)) - s3:// = class 
org.apache.hadoop.fs.s3.S3FileSystem from null
2017-03-02 12:54:07,062 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3159)) - s3n:// = class 
org.apache.hadoop.fs.s3native.NativeS3FileSystem from null
2017-03-02 12:54:07,070 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3159)) - file:// = class 
org.apache.hadoop.fs.LocalFileSystem from 
/Users/stevel/.m2/repository/org/apache/hadoop/hadoop-common/2.9.0-SNAPSHOT/hadoop-common-2.9.0-SNAPSHOT.jar
2017-03-02 12:54:07,075 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3159)) - viewfs:// = class 
org.apache.hadoop.fs.viewfs.ViewFileSystem from 
/Users/stevel/.m2/repository/org/apache/hadoop/hadoop-common/2.9.0-SNAPSHOT/hadoop-common-2.9.0-SNAPSHOT.jar
2017-03-02 12:54:07,077 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3159)) - ftp:// = class 
org.apache.hadoop.fs.ftp.FTPFileSystem from 
/Users/stevel/.m2/repository/org/apache/hadoop/hadoop-common/2.9.0-SNAPSHOT/hadoop-common-2.9.0-SNAPSHOT.jar
2017-03-02 12:54:07,079 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3159)) - har:// = class 
org.apache.hadoop.fs.HarFileSystem from 
/Users/stevel/.m2/repository/org/apache/hadoop/hadoop-common/2.9.0-SNAPSHOT/hadoop-common-2.9.0-SNAPSHOT.jar
2017-03-02 12:54:07,085 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3159)) - hdfs:// = class 
org.apache.hadoop.hdfs.DistributedFileSystem from 
/Users/stevel/.m2/repository/org/apache/hadoop/hadoop-hdfs-client/2.9.0-SNAPSHOT/hadoop-hdfs-client-2.9.0-SNAPSHOT.jar
2017-03-02 12:54:07,201 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3159)) - webhdfs:// = class 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem from 
/Users/stevel/.m2/repository/org/apache/hadoop/hadoop-hdfs-client/2.9.0-SNAPSHOT/hadoop-hdfs-client-2.9.0-SNAPSHOT.jar
2017-03-02 12:54:07,202 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3159)) - swebhdfs:// = class 
org.apache.hadoop.hdfs.web.SWebHdfsFileSystem from 
/Users/stevel/.m2/repository/org/apache/hadoop/hadoop-hdfs-client/2.9.0-SNAPSHOT/hadoop-hdfs-client-2.9.0-SNAPSHOT.jar
2017-03-02 12:54:07,206 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3159)) - hftp:// = class 
org.apache.hadoop.hdfs.web.HftpFileSystem from 
/Users/stevel/.m2/repository/org/apache/hadoop/hadoop-hdfs-client/2.9.0-SNAPSHOT/hadoop-hdfs-client-2.9.0-SNAPSHOT.jar
2017-03-02 12:54:07,206 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:loadFileSystems(3159)) - hsftp:// = class 
org.apache.hadoop.hdfs.web.HsftpFileSystem from 
/Users/stevel/.m2/repository/org/apache/hadoop/hadoop-hdfs-client/2.9.0-SNAPSHOT/hadoop-hdfs-client-2.9.0-SNAPSHOT.jar
2017-03-02 12:54:07,206 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:getFileSystemClass(3202)) - Looking for FS supporting s3a
2017-03-02 12:54:07,206 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:getFileSystemClass(3206)) - looking for configuration option 
fs.s3a.impl
2017-03-02 12:54:07,250 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:getFileSystemClass(3216)) - Filesystem s3a defined in 
configuration option
2017-03-02 12:54:07,251 [Thread-0] DEBUG fs.FileSystem 
(FileSystem.java:getFileSystemClass(3222)) - FS for s3a is class 
org.apache.hadoop.fs.s3a.S3AFileSystem

// and the actual test itself

2017-03-02 12:54:08,231 [Thread-0] INFO  contract.AbstractFSContractTestBase 
(AbstractFSContractTestBase.java:setup(184)) - Test filesystem = 
s3a://hwdev-steve-ireland-new implemented by 
S3AFileSystem{uri=s3a://hwdev-steve-ireland-new, 
workingDir=s3a://hwdev-steve-ireland-new/user/stevel, inputPolicy=normal, 
partSize=800, enableMultiObjectsDelete=true, maxKeys=5000, readAhead=65536, 
blockSize=33554432, multiPartThreshold=2147483647, 
serverSideEncryptionAlgorithm='SSE_S3', 
blockFactory=org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory@be1e9c6, 
boundedExecutor=BlockingThreadPoolExecutorService{SemaphoredDelegatingExecutor{permitCount=30,
 available=30, waiting=0}, activeCount=0}, 
unboundedExecutor=java.util.concurrent.ThreadPoolExecutor@503f1b5a[Running, 
pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0], 
statistics 
{code}

[jira] [Commented] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892115#comment-15892115
 ] 

Steve Loughran commented on HADOOP-14138:
-

No tests needed: removing the service entry means the only way the s3a fs 
will be found is through on-demand loading of the class when the s3a URL 
scheme is encountered. The s3a tests do exactly that.

> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Attachments: HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing the startup 
> performance of all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*.
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there too






[jira] [Commented] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892100#comment-15892100
 ] 

Hadoop QA commented on HADOOP-14138:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
37s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} branch-2 passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} branch-2 passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-aws in the patch passed with JDK v1.7.0_121. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 31s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | HADOOP-14138 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12855613/HADOOP-14138-branch-2-001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  |
| uname | Linux d5cc96dcf8eb 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2 / cacaa29 |
| Default Java | 1.7.0_121 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_121 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_121 |
| JDK v1.7.0_121  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11744/testReport/ |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11744/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---

[jira] [Commented] (HADOOP-14129) ITestS3ACredentialsInURL sometimes fails

2017-03-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892081#comment-15892081
 ] 

Steve Loughran commented on HADOOP-14129:
-

Does that sysprop need to be restored after?



> ITestS3ACredentialsInURL sometimes fails
> 
>
> Key: HADOOP-14129
> URL: https://issues.apache.org/jira/browse/HADOOP-14129
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-14129-HADOOP-13345.001.patch, 
> HADOOP-14129-HADOOP-13345.002.patch, HADOOP-14129-HADOOP-13345.003.patch
>
>
> This test sometimes fails. I believe it's expected that DynamoDB doesn't have 
> access to the credentials if they're embedded in the URL instead of the 
> configuration (and IMO that's fine - since the functionality hasn't been in 
> previous releases and since we want to discourage this practice especially 
> now that there are better alternatives). Weirdly, I only sometimes get this 
> failure on the HADOOP-13345 branch. But if the problem turns out to be what I 
> think it is, a simple Assume should fix it.






[jira] [Commented] (HADOOP-14137) Faster distcp by taking file list from fsimage or -lsr result

2017-03-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892077#comment-15892077
 ] 

Steve Loughran commented on HADOOP-14137:
-

I'd actually like the listing to be done via listFiles(path, recursive=true). 
Why: on object stores we can do the recursive listing as a flat operation, 
rather than a treewalk. That wouldn't help here, as HDFS does do the walk, but 
moving onto an fsimage would reduce NN load, so make for happy people all 
round (see the example below).

Not sure about the option name; something to make clear it's a file, e.g. 
{{--listingFile <path>}}? Supporting hdfs:// as well as file:// paths could be 
useful in future: you could store the listing in HDFS, ready for next time.
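
For reference, the flat listing comes from the standard FileSystem API (the 
writer below is just a stand-in for wherever the listing gets stored):

{code}
import java.io.IOException;
import java.io.Writer;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

// listFiles(path, true) is a flat recursive enumeration; object stores can
// serve it as one paged listing instead of a directory treewalk.
static void dumpListing(FileSystem fs, Path src, Writer out) throws IOException {
  RemoteIterator<LocatedFileStatus> it = fs.listFiles(src, true);
  while (it.hasNext()) {
    out.write(it.next().getPath().toString() + "\n");
  }
}
{code}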

> Faster distcp by taking file list from fsimage or -lsr result
> -
>
> Key: HADOOP-14137
> URL: https://issues.apache.org/jira/browse/HADOOP-14137
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Reporter: Zheng Shao
>
> DistCp is very slow to start when the src directory has a huge number of 
> subdirectories.  In our case, we already have the directory listing (via 
> "hdfs oiv -i fsimage" or via nightly "hdfs dfs -ls -R /" dumps), and we would 
> like to use that instead of doing a realtime listing on the NameNode.
> The "-f" option doesn't help in this case because it would try to put 
> everything into a single flat target directory.
> We'd like to introduce a new option "-list <file>" for distcp.  The <file> 
> contains the result of listing the src directory.
> In order to achieve this, we plan to:
> 1. Add a new CopyListing class, PregeneratedCopyListing, similar to 
> SimpleCopyListing, which doesn't "-ls -R" into the directory but takes the 
> listing via "-list"
> 2. Add an option "-list <file>" which will automatically make distcp use the 
> new PregeneratedCopyListing class.






[jira] [Updated] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14138:

Attachment: HADOOP-14138-branch-2-001.patch

Patch 001: cuts the service entry. This just moves to on-demand FS creation 
from the core-default entry.

testing: s3 ireland
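
For context, this is the core-default.xml entry the loader now relies on (it 
already exists; shown here for reference only):

{code}
<!-- With the META-INF/services line for S3AFileSystem cut, this mapping is
     what getFileSystemClass() falls back to. -->
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
{code}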

> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Attachments: HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing the startup 
> performance of all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*.
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there too






[jira] [Updated] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14138:

Status: Patch Available  (was: Open)

> Remove S3A ref from META-INF service discovery, rely on existing core-default 
> entry
> ---
>
> Key: HADOOP-14138
> URL: https://issues.apache.org/jira/browse/HADOOP-14138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Attachments: HADOOP-14138-branch-2-001.patch
>
>
> As discussed in HADOOP-14132, the shaded AWS library is killing the startup 
> performance of all hadoop operations, due to classloading on FS service 
> discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in 
> core-default.xml; *we don't need service discovery here*.
> Proposed:
> # cut the entry from 
> {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file 
> exclusively for s3a entries
> I want this one in first as it's a major performance regression, and one we 
> could actually backport to 2.7.x, just to improve load time slightly there too






[jira] [Created] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry

2017-03-02 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-14138:
---

 Summary: Remove S3A ref from META-INF service discovery, rely on 
existing core-default entry
 Key: HADOOP-14138
 URL: https://issues.apache.org/jira/browse/HADOOP-14138
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 2.9.0
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Critical


As discussed in HADOOP-14132, the shaded AWS library is killing the startup 
performance of all hadoop operations, due to classloading on FS service 
discovery.

This is despite the fact that there is an entry for fs.s3a.impl in 
core-default.xml; *we don't need service discovery here*.

Proposed:
# cut the entry from 
{{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
# when HADOOP-14132 is in, move to that, including declaring an XML file 
exclusively for s3a entries

I want this one in first as it's a major performance regression, and one we 
could actually backport to 2.7.x, just to improve load time slightly there too





