[jira] [Commented] (HADOOP-13321) Deprecate FileSystem APIs that promote inefficient call patterns.

2017-02-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872962#comment-15872962
 ] 

Hadoop QA commented on HADOOP-13321:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
28s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
30s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 12m 30s{color} 
| {color:red} root generated 87 new + 707 unchanged - 0 fixed = 794 total (was 
707) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 12s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
25s{color} | {color:green} hadoop-openstack in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 94m 51s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestKDiag |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13321 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12853402/HADOOP-13321.003.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux e6aaf270e136 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / dbbfcf7 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| javac | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11654/artifact/patchprocess/diff-compile-javac-root.txt
 |
| unit | 
https://builds.apache.org/job/PreCommi

[jira] [Commented] (HADOOP-13321) Deprecate FileSystem APIs that promote inefficient call patterns.

2017-02-17 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872945#comment-15872945
 ] 

Mingliang Liu commented on HADOOP-13321:


Thanks Steve, that's a very valid point. I've updated the patch.

> Deprecate FileSystem APIs that promote inefficient call patterns.
> -
>
> Key: HADOOP-13321
> URL: https://issues.apache.org/jira/browse/HADOOP-13321
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.0.0-alpha3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13321.000.patch, HADOOP-13321.001.patch, 
> HADOOP-13321.002.patch, HADOOP-13321.003.patch
>
>
> {{FileSystem}} contains several methods that act as convenience wrappers over 
> calling {{getFileStatus}} and retrieving a single property of the returned 
> {{FileStatus}}.  These methods have a habit of fostering inefficient call 
> patterns in applications, resulting in multiple redundant {{getFileStatus}} 
> calls.  For HDFS, this translates into wasteful NameNode RPC traffic.  For 
> file systems backed by cloud object stores, this translates into wasteful 
> HTTP traffic.  This issue proposes to deprecate these methods and instead 
> encourage applications to call {{getFileStatus}} and then reuse the same 
> {{FileStatus}} instance as needed.
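
For illustration, a hedged sketch of the call pattern this deprecation steers applications toward ({{fs}} and {{path}} are assumed to be an existing {{FileSystem}} and {{Path}}; the wrapper methods shown are examples, the exact deprecated set is defined by the patch):
{code}
// Inefficient: each convenience wrapper may issue its own getFileStatus(),
// i.e. a separate NameNode RPC or object-store HTTP request.
if (fs.exists(path) && fs.isFile(path)) {
  long len = fs.getFileStatus(path).getLen();
  process(path, len);  // hypothetical application logic
}

// Preferred: one getFileStatus() call, then reuse the returned FileStatus.
FileStatus st = fs.getFileStatus(path);  // throws FileNotFoundException if absent
if (st.isFile()) {
  process(path, st.getLen());
}
{code}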






[jira] [Commented] (HADOOP-13345) S3Guard: Improved Consistency for S3A

2017-02-17 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872936#comment-15872936
 ] 

Mingliang Liu commented on HADOOP-13345:


Thanks [~fabbri] for promptly reviewing the test report!

{quote}
That's OK. It does miss ITestS3Guard\{ListConsistency, ToolDynamoDB\}
{quote}
That's a good catch; I just learned about these two tests. However, 
{{org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency}} fails both before and 
after the merge. Do I need to configure something special?
{code}
mvn -Dit.test='ITestS3Guard*' -Dtest=none -Dscale -Ds3guard -Ddynamo -q clean verify

---
 T E S T S
---
Running org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 14.992 sec <<< 
FAILURE! - in org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
testListStatusWriteBack(org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency)  
Time elapsed: 13.552 sec  <<< FAILURE!
java.lang.AssertionError: Unexpected number of results from metastore. 
Metastore should only know about /XYZ: 
DirListingMetadata{path=s3a://mliu-s3guard/test/ListStatusWriteBack, 
listMap={s3a://mliu-s3guard/test/ListStatusWriteBack/XYZ=PathMetadata{fileStatus=S3AFileStatus{path=s3a://mliu-s3guard/test/ListStatusWriteBack/XYZ;
 isDirectory=true; modification_time=0; access_time=0; owner=mliu; group=mliu; 
permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}, 
s3a://mliu-s3guard/test/ListStatusWriteBack/123=PathMetadata{fileStatus=S3AFileStatus{path=s3a://mliu-s3guard/test/ListStatusWriteBack/123;
 isDirectory=true; modification_time=0; access_time=0; owner=mliu; group=mliu; 
permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}}, 
isAuthoritative=false}
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency.testListStatusWriteBack(ITestS3GuardListConsistency.java:127)
{code}

{quote}
Curious, what is our difference in HADOOP-13345 that changes this? Is our 
feature branch exception behavior different?
{quote}
We don't really change anything in that part. My guess is that, with S3Guard 
enabled, the code path that fails in S3AFileSystem changes for that test 
somehow. For example (to be confirmed), the request without S3Guard called 
{{getFileStatus()}} and failed with an access-denied exception containing the 
keyword "Forbidden", while the request with S3Guard is able to complete 
{{getFileStatus()}} and fails later during the read operations, with an 
access-denied exception containing the keyword "Access Denied". So I think 
relaxing the exception message assertion in the test should work just fine.
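
A hedged sketch of the relaxed assertion, assuming 
{{LambdaTestUtils.intercept(Class, String, Callable)}} as seen in the test's 
stack trace elsewhere in this thread ({{fs}}, {{path}} and {{len}} come from 
the test fixture):
{code}
// Match only the exception class and the stable "Status Code: 403" fragment,
// not the "Forbidden" vs "Access Denied" wording that differs with S3Guard.
intercept(AccessDeniedException.class, "Status Code: 403",
    () -> ContractTestUtils.readDataset(fs, path, len));
{code}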

> S3Guard: Improved Consistency for S3A
> -
>
> Key: HADOOP-13345
> URL: https://issues.apache.org/jira/browse/HADOOP-13345
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HADOOP-13345.prototype1.patch, s3c.001.patch, 
> S3C-ConsistentListingonS3-Design.pdf, S3GuardImprovedConsistencyforS3A.pdf, 
> S3GuardImprovedConsistencyforS3AV2.pdf
>
>
> This issue proposes S3Guard, a new feature of S3A, to provide an option for a 
> stronger consistency model than what is currently offered.  The solution 
> coordinates with a strongly consistent external store to resolve 
> inconsistencies caused by the S3 eventual consistency model.






[jira] [Updated] (HADOOP-13321) Deprecate FileSystem APIs that promote inefficient call patterns.

2017-02-17 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-13321:
---
Attachment: HADOOP-13321.003.patch

> Deprecate FileSystem APIs that promote inefficient call patterns.
> -
>
> Key: HADOOP-13321
> URL: https://issues.apache.org/jira/browse/HADOOP-13321
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.0.0-alpha3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13321.000.patch, HADOOP-13321.001.patch, 
> HADOOP-13321.002.patch, HADOOP-13321.003.patch
>
>
> {{FileSystem}} contains several methods that act as convenience wrappers over 
> calling {{getFileStatus}} and retrieving a single property of the returned 
> {{FileStatus}}.  These methods have a habit of fostering inefficient call 
> patterns in applications, resulting in multiple redundant {{getFileStatus}} 
> calls.  For HDFS, this translates into wasteful NameNode RPC traffic.  For 
> file systems backed by cloud object stores, this translates into wasteful 
> HTTP traffic.  This issue proposes to deprecate these methods and instead 
> encourage applications to call {{getFileStatus}} and then reuse the same 
> {{FileStatus}} instance as needed.






[jira] [Commented] (HADOOP-13345) S3Guard: Improved Consistency for S3A

2017-02-17 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872867#comment-15872867
 ] 

Aaron Fabbri commented on HADOOP-13345:
---

Thanks [~liuml07]!

{quote}
$ mvn -Dit.test='ITestS3A*' -Dscale -Dtest=none -Ds3guard -Ddynamo -q clean 
verify
{quote}
That's OK. It does miss ITestS3Guard{ListConsistency, ToolDynamoDB}, FYI, but 
you got most of the tests.

{quote}
2. ITestS3ACredentialsInURL#testInstantiateFromURL is not supported. Should we 
simply skip this test if the metadata store is enabled (in a separate JIRA)?
{quote}

Yes. Nothing new here, and we do need to fix it.
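
e.g., a one-line guard (a sketch; {{hasMetadataStore()}} is assumed to be 
available on the test's {{S3AFileSystem}} instance):
{code}
// Skip the credentials-in-URL test when S3Guard's metadata store is enabled.
Assume.assumeFalse("Test unsupported with a metadata store",
    fs.hasMetadataStore());
{code}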

{quote}
3. ITestS3AEncryptionSSEC started failing after the merge because of the strict 
exception message assertion; it is fine in trunk. The only change is to remove 
the word "Forbidden", as it would sometimes be "Access Denied" along with the 
same exception class java.nio.file.AccessDeniedException and the message 
"Service: Amazon S3; Status Code: 403; Error Code: AccessDenied;". I made this 
change when merging.
{quote}

Sounds ok.  Curious, what is our difference in HADOOP-13345 that changes this?  
Is our feature branch exception behavior different?

> S3Guard: Improved Consistency for S3A
> -
>
> Key: HADOOP-13345
> URL: https://issues.apache.org/jira/browse/HADOOP-13345
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HADOOP-13345.prototype1.patch, s3c.001.patch, 
> S3C-ConsistentListingonS3-Design.pdf, S3GuardImprovedConsistencyforS3A.pdf, 
> S3GuardImprovedConsistencyforS3AV2.pdf
>
>
> This issue proposes S3Guard, a new feature of S3A, to provide an option for a 
> stronger consistency model than what is currently offered.  The solution 
> coordinates with a strongly consistent external store to resolve 
> inconsistencies caused by the S3 eventual consistency model.






[jira] [Commented] (HADOOP-13345) S3Guard: Improved Consistency for S3A

2017-02-17 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872859#comment-15872859
 ] 

Mingliang Liu commented on HADOOP-13345:


Thanks for your comments.

The summary of the test report is:
{code}
$ mvn -Dit.test='ITestS3A*' -Dscale -Dtest=none -Ds3guard -Ddynamo -q clean 
verify

Results :

Failed tests:
  ITestS3AEncryptionSSEC.testCreateFileAndReadWithDifferentEncryptionKey:60 
Expected to find 'Forbidden (Service: Amazon S3; Status Code: 403;' but got 
unexpected exception:java.nio.file.AccessDeniedException: 
s3a://mliu-s3guard/test/testCreateFileAndReadWithDifferentEncryptionKey-0800: 
Reopen at position 0 on 
s3a://mliu-s3guard/test/testCreateFileAndReadWithDifferentEncryptionKey-0800: 
com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: 
Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 
8A23739237751886), S3 Extended Request ID: 
BEDP2iHUuZXjZTnU/s1f/8+kHM7F+czV2CAGJm3FEpzxxxo37nb+OqbswYsM7vUpWd682RP+4iY=
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
at 
org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:165)
at 
org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:291)
at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:374)
at java.io.DataInputStream.read(DataInputStream.java:149)
at 
org.apache.hadoop.fs.contract.ContractTestUtils.readDataset(ContractTestUtils.java:180)
at 
org.apache.hadoop.fs.contract.ContractTestUtils.verifyFileContents(ContractTestUtils.java:204)
at 
org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEC.lambda$testCreateFileAndReadWithDifferentEncryptionKey$4(ITestS3AEncryptionSSEC.java:80)
at 
org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:346)
at 
org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:418)
at 
org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEC.testCreateFileAndReadWithDifferentEncryptionKey(ITestS3AEncryptionSSEC.java:60)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied 
(Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 
8A23739237751886), S3 Extended Request ID: 
BEDP2iHUuZXjZTnU/s1f/8+kHM7F+czV2CAGJm3FEpzxxxo37nb+OqbswYsM7vUpWd682RP+4iY=
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1586)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1254)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1035)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:747)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:721)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:704)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:672)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:654)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:518)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4185)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4132)
at 
com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1373)
at 
org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:158)
... 21 more


Tests in error:
  ITestS3ACredentialsInURL.testInstantiateFromURL:86 » InterruptedIO initTable: 
...
  
ITestS3AFileSystemContract>FileSystemContractBaseTest.testRenameToDirWithSamePrefixAllowed:669->

[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-17 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872846#comment-15872846
 ] 

Aaron Fabbri commented on HADOOP-14041:
---

Thanks for the follow-up patch, [~mackrorysd]. Looks good. Of the comments 
below, I think the important ones are the {{prune()}} method prototype and 
errors going to stderr.

{noformat}
+  public void testPruneDirs() throws Exception {
+// This test does not necessarily define required behavior: directories
+// that become empty after a prune operation could be cleaned up, but
+// currently they don't because if a file was created in that directory
+// mid-prune, it would violate the invariant that all ancestors of a file
{noformat}

Tiny nit: this invariant is an implementation detail of the Dynamo 
MetadataStore, not a MetadataStore invariant per se. Could mention the word 
"dynamo" here.

{noformat}
+// exist in the metastore. If an implementation could satisfy this, it
+// would be okay for this test not to pass.
+Assume.assumeFalse(ms instanceof NullMetadataStore);
+createNewDirs("/pruneDirs/dir");
{noformat}

Did you mean to change this Assume to call {{supportsPruning()}}?
Technically, it seems like you should use that, and maybe {{allowMissing()}} 
too. Basically, when {{allowMissing()}} returns true, the metadata store may 
not return results you just put into it (like a cache where something got 
evicted before you asked for it again); a sketch of the combined guard follows 
below.
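
A hedged sketch of the combined guard (the listing assertion is a hypothetical 
helper, not part of the patch):
{code}
// Skip stores that don't implement prune(); for stores that may "forget"
// entries (allowMissing() == true), don't assert on exact contents.
Assume.assumeTrue(supportsPruning());
ms.prune(cutoffMillis);
if (!allowMissing()) {
  assertDirListingContainsOnly(ms, "/pruneDirs/dir");  // hypothetical helper
}
{code}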

{noformat}
--- 
a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStore.java
+++ 
b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStore.java
@@ -165,4 +165,15 @@ void move(Collection pathsToDelete, 
Collection
* @throws IOException if there is an error
*/
   void destroy() throws IOException;
+
+  /**
+   * Clear any metadata older than a specified time from the repository. Note
+   * that modification times should be in UTC, as returned by System
+   * .currentTimeMillis at the time of modification.
+   *
+   * @param modTime Oldest modification time to allow
+   * @throws IOException if there is an error
+   * @throws InterruptedException if the process is interrupted
+   */
+  void prune(long modTime) throws InterruptedException, IOException;
 }
{noformat}
A couple of things:
1. We should mention here that implementations *must* clear any file metadata 
older than modTime, *may* clear any directory metadata older than modTime, and 
should throw an UnsupportedOperationException(*) otherwise?
2. Instead of declaring a checked exception (InterruptedException), IMO that 
should always be wrapped in an IOException, so this should be {{throws 
IOException}} only (see the sketch below).

(*) [~ste...@apache.org] is this the idiomatic thing to do here in Hadoop?
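
For item 2, a minimal sketch of the wrapping, assuming a blocking internal 
helper (the helper name is hypothetical):
{code}
@Override
public void prune(long modTime) throws IOException {
  try {
    pruneOlderThan(modTime);  // hypothetical helper; may throw InterruptedException
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();  // restore the interrupt flag
    // java.io.InterruptedIOException is an IOException subclass
    throw (InterruptedIOException) new InterruptedIOException(
        "prune() interrupted").initCause(e);
  }
}
{code}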

{noformat}
--- 
a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/NullMetadataStore.java
+++ 
b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/NullMetadataStore.java
@@ -87,6 +87,10 @@ public void destroy() throws IOException {
   }
  
   @Override
+  public void prune(long modTime) throws IOException {
+  }
+
{noformat}
Love the algorithm here.   Classic no-op, my fave.

{noformat}
+  if (confDelta <= 0 && cliDelta <= 0) {
+System.out.println(
+"You must specify a positive age for metadata to prune.");
+  }
+
{noformat}
I think this should go to stderr (search for "stderr" 
[here|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html]).
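
i.e., something like (a sketch; the non-zero return value is illustrative):
{code}
if (confDelta <= 0 && cliDelta <= 0) {
  System.err.println("You must specify a positive age for metadata to prune.");
  return 1;  // non-zero exit code for the usage error
}
{code}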

{noformat}
--- 
a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestNullMetadataStore.java
+++ 
b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestNullMetadataStore.java
@@ -51,6 +51,12 @@ public boolean allowMissing() {
 return true;
   }
  
+  /** This MetadataStore won't store anything, so there's nothing to prune. */
+  @Override
+  public boolean supportsPruning() {
+return false;
+  }
{noformat}

This part of the change could be left out, I think? NullMetadataStore always 
prunes! Where prune is defined as removing anything older than X, that's 
vacuously true for the empty set. :-)

> CLI command to prune old metadata
> -
>
> Key: HADOOP-14041
> URL: https://issues.apache.org/jira/browse/HADOOP-14041
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14041-HADOOP-13345.001.patch, 
> HADOOP-14041-HADOOP-13345.002.patch, HADOOP-14041-HADOOP-13345.003.patch, 
> HADOOP-14041-HADOOP-13345.004.patch, HADOOP-14041-HADOOP-13345.005.patch, 
> HADOOP-14041-HADOOP-13345.006.patch
>
>
> Add a CLI command that allows users to specify an age at which to prune 
> metadata that hasn't been modified for an extended period of time.

[jira] [Updated] (HADOOP-13923) Allow changing password on JavaKeyStoreProvider generated keystores

2017-02-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-13923:
---
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Created HADOOP-14095 for the doc follow-up. Resolving this. Thanks, Larry!

> Allow changing password on JavaKeyStoreProvider generated keystores 
> 
>
> Key: HADOOP-13923
> URL: https://issues.apache.org/jira/browse/HADOOP-13923
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HADOOP-13923.01.patch
>
>
> {{JavaKeyStoreProvider}} generates a jceks keystore file for key storage. 
> Although we have different fall backs in {{ProviderUtils#locatePassword}} to 
> specify the keystore password, it appears the password itself can never be 
> changed after generation.
> This jira is to make it possible to change the keystore password.






[jira] [Created] (HADOOP-14095) Document caveats about the default JavaKeyStoreProvider in KMS

2017-02-17 Thread Xiao Chen (JIRA)
Xiao Chen created HADOOP-14095:
--

 Summary: Document caveats about the default JavaKeyStoreProvider 
in KMS
 Key: HADOOP-14095
 URL: https://issues.apache.org/jira/browse/HADOOP-14095
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation, kms
Affects Versions: 2.6.0
Reporter: Xiao Chen
Assignee: Xiao Chen


The KMS doc provides an example of using JavaKeyStoreProvider, but we should 
document the caveats of using it and setting it up, specifically around 
keystore passwords.






[jira] [Commented] (HADOOP-13923) Allow changing password on JavaKeyStoreProvider generated keystores

2017-02-17 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872820#comment-15872820
 ] 

Larry McCay commented on HADOOP-13923:
--

That sounds reasonable to me - either way works in my mind.


> Allow changing password on JavaKeyStoreProvider generated keystores 
> 
>
> Key: HADOOP-13923
> URL: https://issues.apache.org/jira/browse/HADOOP-13923
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HADOOP-13923.01.patch
>
>
> {{JavaKeyStoreProvider}} generates a jceks keystore file for key storage. 
> Although we have different fall backs in {{ProviderUtils#locatePassword}} to 
> specify the keystore password, it appears the password itself can never be 
> changed after generation.
> This jira is to make it possible to change the keystore password.






[jira] [Commented] (HADOOP-13923) Allow changing password on JavaKeyStoreProvider generated keystores

2017-02-17 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872813#comment-15872813
 ] 

Xiao Chen commented on HADOOP-13923:


Thanks for the prompt response, Larry.

The Keystore Passwords section reads well and makes sense. The intention of 
this jira was more KMS-focused, since KMS can use the java keystore as the 
backing key provider. I now understand the more general concern about changing 
the JKSP.

So, how about closing this jira out as {{Won't Fix}}, so it's findable in 
future searches, and filing a new jira to document this reasoning?

> Allow changing password on JavaKeyStoreProvider generated keystores 
> 
>
> Key: HADOOP-13923
> URL: https://issues.apache.org/jira/browse/HADOOP-13923
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HADOOP-13923.01.patch
>
>
> {{JavaKeyStoreProvider}} generates a jceks keystore file for key storage. 
> Although we have different fall backs in {{ProviderUtils#locatePassword}} to 
> specify the keystore password, it appears the password itself can never be 
> changed after generation.
> This jira is to make it possible to change the keystore password.






[jira] [Commented] (HADOOP-13923) Allow changing password on JavaKeyStoreProvider generated keystores

2017-02-17 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872788#comment-15872788
 ] 

Larry McCay commented on HADOOP-13923:
--

In general, I agree that it is not worth the trouble to add the change-password 
API. I don't entirely agree with the following statements, though.

bq. The idea of adding a move functionality to migrate key providers works, and 
I like that idea. But it feels like a parallel feature. From an admin's POV, 
changing a keystore password would then require: setting up a new keyprovider 
service, migrating, and changing all client configs to point to the new 
keyprovider.

You don't have to change client configs if you just rename the keystore. :)

bq. I think we can document hard that the JKSP isn't supposed to be used 
anywhere outside of dev/PoC, to discourage its use... and use this patch to let 
whoever is running on the JKSP change their password to something other than 
the default 'none'.

I disagree here. It is perfectly legitimate to use a java keystore provider, 
but folks should be aware of the details of doing so.
Just as with its use for the Credential Provider API, the keystore password is 
only a formality of persistence. The actual protection of the key is in the 
proper use of file permissions. I wouldn't be opposed to describing the use of 
KMS as a stronger option and explaining why this is so in a similar set of 
docs.

The following documentation attempts to communicate these details with enough 
fidelity to make an informed decision for credential provider approaches: 
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html#Credential_Management

See the provider types and then the keystore management sections.

Pursuing proper Key Provider API documentation is certainly worth doing.

> Allow changing password on JavaKeyStoreProvider generated keystores 
> 
>
> Key: HADOOP-13923
> URL: https://issues.apache.org/jira/browse/HADOOP-13923
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HADOOP-13923.01.patch
>
>
> {{JavaKeyStoreProvider}} generates a jceks keystore file for key storage. 
> Although we have different fall backs in {{ProviderUtils#locatePassword}} to 
> specify the keystore password, it appears the password itself can never be 
> changed after generation.
> This jira is to make it possible to change the keystore password.






[jira] [Commented] (HADOOP-13923) Allow changing password on JavaKeyStoreProvider generated keystores

2017-02-17 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872723#comment-15872723
 ] 

Xiao Chen commented on HADOOP-13923:


Hi [~lmccay],
Thanks again for the earlier discussions. Looking at this again 'soon' after 
the last comment, I'm still reluctant to add a {{changePassword}} API, for the 
following reasons.
- Adding such an API to KeyProvider makes sense in general. But to make it work 
with a {{JavaKeyStoreProvider}}, besides {{KeyShell}} we would also need to 
change the KMS, which is what uses it now. For KMS, we would need to add that 
interface all the way from {{KMSClientProvider}} to the {{KMS}} server, where 
the communication is via an HTTP REST API. Although all communications are 
supposed to be encrypted, this poses new security risks.
- We would also need to carefully add a new KMS ACL to control this 
{{changePassword}} operation, complicating the already complex KMS ACLs. KMS 
ACLs currently have two levels: KMS level and key level. This new operation is 
only KMS level, not key level, which further complicates things.
- Real production keystores shouldn't be the JKSP, so the KMS REST API should 
not be used. But simply having it there is confusing, and if some admin 
mistakenly called that API with a password, they might leak sensitive 
information.
- The current patch doesn't have a compatibility issue, because it falls back 
to the old format.
- The idea of adding a {{move}} functionality to migrate key providers works, 
and I like that idea. :) But it feels like a parallel feature. From an admin's 
POV, changing a keystore password would then require: setting up a new 
keyprovider service, migrating, and changing all client configs to point to 
the new keyprovider.

I think we can document hard that the JKSP isn't supposed to be used anywhere 
outside of dev/PoC, to discourage its use... and use this patch to let whoever 
is running on the JKSP change their password to something other than the 
default 'none'.

Thoughts?

> Allow changing password on JavaKeyStoreProvider generated keystores 
> 
>
> Key: HADOOP-13923
> URL: https://issues.apache.org/jira/browse/HADOOP-13923
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HADOOP-13923.01.patch
>
>
> {{JavaKeyStoreProvider}} generates a jceks keystore file for key storage. 
> Although we have different fall backs in {{ProviderUtils#locatePassword}} to 
> specify the keystore password, it appears the password itself can never be 
> changed after generation.
> This jira is to make it possible to change the keystore password.






[jira] [Commented] (HADOOP-14094) Rethink S3GuardTool options

2017-02-17 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872699#comment-15872699
 ] 

Aaron Fabbri commented on HADOOP-14094:
---

Yes, "s3a" as the subcommand is confusing, e.g. {{hadoop s3a init}}, {{hadoop 
s3a destroy}}, fsck, etc., especially since we want some s3a-only commands in 
the near future, e.g. HADOOP-14007.

Three options I thought of:
{{hadoop s3guard <command>}}
{{hadoop s3a s3guard <command>}}
{{hadoop s3a [s3guard-init | s3guard-destroy | ...]}}

Any thoughts on implementation complexity for different options?  I'm fine with 
any of these to be honest.

What do other folks think?

As for the -Dkey=value option not working, I suppose we should look at why we 
don't seem to be getting the generic options parsing. The basic command changes 
feel like a higher priority though, as adding support for generic command 
options later should not break compatibility, AFAICT. So maybe that can be a 
follow-up JIRA if needed?

IMO any syntax / option changes, along with consistent behavior of options 
(endpoint etc.), are the big priority at the moment.


> Rethink S3GuardTool options
> ---
>
> Key: HADOOP-14094
> URL: https://issues.apache.org/jira/browse/HADOOP-14094
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>
> I think we need to rework the S3GuardTool options. A couple of problems I've 
> observed in the patches I've done on top of that and seeing other developers 
> trying it out:
> * We should probably wrap the current commands in an S3Guard-specific 
> command, since 'init', 'destroy', etc. don't touch the buckets at all.
> * Convert to whole-word options, as the single-letter options are already 
> getting overloaded. Some patches I've submitted have added functionality 
> where the obvious flag is already in use (e.g. -r for region, and read 
> throughput, -m for minutes, and metadatastore uri).  I may do this early as 
> part of HADOOP-14090.
> * We have some options that must be in the config in some cases, and can be 
> in the command in other cases. But I've seen someone try to specify the table 
> name in the config and leave out the -m option, with no luck. Also, since 
> commands hard-code table auto-creation, you might have configured table 
> auto-creation, try to import to a non-existent table, and it tells you table 
> auto-creation is off.
> We need a more consistent policy for how things should get configured that 
> addresses these problems and future-proofs the command a bit more.






[jira] [Commented] (HADOOP-14090) Allow users to specify region for DynamoDB table instead of endpoint

2017-02-17 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872689#comment-15872689
 ] 

Aaron Fabbri commented on HADOOP-14090:
---

+1. I'll add that I originally suggested we add support for Region, so thank 
you for wasting your time on my behalf. :-)

Agree the dynamodb client init code could use some refactoring. The higher 
priority is getting settled on good CLI commands and behavior, because that is 
harder to change in the future.

> Allow users to specify region for DynamoDB table instead of endpoint
> 
>
> Key: HADOOP-14090
> URL: https://issues.apache.org/jira/browse/HADOOP-14090
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14090-HADOOP-13345.001.patch
>
>
> Assuming the AWS SDK allows this, I think this would be a better way to 
> configure it for any usage on AWS itself (with endpoint still being an option 
> for AWS-compatible non-AWS use cases). Unless users actually care about a 
> specific endpoint, this is easier. Perhaps less important, HADOOP-14023 shows 
> that inferring the region from the endpoint (which granted, isn't that 
> necessary) doesn't work very well at all.






[jira] [Resolved] (HADOOP-14090) Allow users to specify region for DynamoDB table instead of endpoint

2017-02-17 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory resolved HADOOP-14090.

Resolution: Won't Fix
  Assignee: Sean Mackrory

> Allow users to specify region for DynamoDB table instead of endpoint
> 
>
> Key: HADOOP-14090
> URL: https://issues.apache.org/jira/browse/HADOOP-14090
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14090-HADOOP-13345.001.patch
>
>
> Assuming the AWS SDK allows this, I think this would be a better way to 
> configure it for any usage on AWS itself (with endpoint still being an option 
> for AWS-compatible non-AWS use cases). Unless users actually care about a 
> specific endpoint, this is easier. Perhaps less important, HADOOP-14023 shows 
> that inferring the region from the endpoint (which granted, isn't that 
> necessary) doesn't work very well at all.






[jira] [Commented] (HADOOP-14090) Allow users to specify region for DynamoDB table instead of endpoint

2017-02-17 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872655#comment-15872655
 ] 

Sean Mackrory commented on HADOOP-14090:


Just had a quick offline chat with [~fabbri] about this and changed my thinking 
a little bit. Some key points (not all of which directly relate, but which are 
important to know):
* Initially in the patch I wasn't clear on whether specifying a region and a 
specific endpoint together was ever valid. Having reviewed the docs, it does 
seem that regions are a convenient abstraction for looking up endpoints, so 
they should be treated as mutually exclusive; in the end this change would just 
be adding a very similar option and making the code more complex.
* DynamoDB table names are not globally unique, unlike S3 buckets, so there are 
a few more pain points if you DON'T specify the endpoint or region.
* Regions are slightly more intuitive for many AWS users, but endpoints are not 
very hard to look up, and may be required for more advanced users.

So overall, I'll just withdraw this suggestion / patch, I think. There's 
definitely some cleanup I'd like to do in this code: in the functions that call 
DynamoDBClientFactory.createDynamoDBClient(URI, String), there are several 
layers in which the region name might be extracted from the bucket, which 
appear unnecessary; and I think we need to switch to full-word flags in 
S3GuardTool anyway (see HADOOP-14094). But I think we should just allow the 
endpoint to be set like we currently do for S3 too.

> Allow users to specify region for DynamoDB table instead of endpoint
> 
>
> Key: HADOOP-14090
> URL: https://issues.apache.org/jira/browse/HADOOP-14090
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
> Attachments: HADOOP-14090-HADOOP-13345.001.patch
>
>
> Assuming the AWS SDK allows this, I think this would be a better way to 
> configure it for any usage on AWS itself (with endpoint still being an option 
> for AWS-compatible non-AWS use cases). Unless users actually care about a 
> specific endpoint, this is easier. Perhaps less important, HADOOP-14023 shows 
> that inferring the region from the endpoint (which granted, isn't that 
> necessary) doesn't work very well at all.






[jira] [Commented] (HADOOP-14062) ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when RPC privacy is enabled

2017-02-17 Thread Steven Rand (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872658#comment-15872658
 ] 

Steven Rand commented on HADOOP-14062:
--

For me the tests also succeed if I comment out that code, but I think that's 
only because the new test happens to run last. When I add 
{{@FixMethodOrder(MethodSorters.NAME_ASCENDING)}} to the class, the test that 
runs immediately after the new test ({{testAllocationWithBlacklist}}) fails if 
I comment out that code. I think that's because that test calls 
{{amClient.init(conf)}}, and since we call 
{{conf.unset(CommonConfigurationKeysPublic.HADOOP_RPC_PROTECTION);}} in the new 
test, there's a mismatch between the client and the server.

So I think the two options are:

1. Don't remove the code in your above comment.
2. Do remove that code, but also remove 
{{conf.unset(CommonConfigurationKeysPublic.HADOOP_RPC_PROTECTION);}}. When I do 
this, all tests succeed regardless of order.

[~jianhe] I'll defer to you on which option you prefer. Both seem okay to me. 
The first is a smaller change, since if we do the second, all tests after 
{{testAMRMClientWithSaslEncryption}} will run with SASL RPC.
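
For reference, a hedged sketch of how the ordering was forced to expose the 
leak (JUnit 4 annotations on the test class discussed above):
{code}
// Run test methods in name-ascending order so the test that follows the new
// SASL test inherits its config changes, exposing the client/server mismatch.
@FixMethodOrder(MethodSorters.NAME_ASCENDING)
public class TestAMRMClient {
  // ... existing tests ...
}
{code}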

> ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when 
> RPC privacy is enabled
> --
>
> Key: HADOOP-14062
> URL: https://issues.apache.org/jira/browse/HADOOP-14062
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Steven Rand
>Priority: Critical
> Attachments: HADOOP-14062.001.patch, HADOOP-14062.002.patch, 
> HADOOP-14062.003.patch, HADOOP-14062-branch-2.8.0.004.patch, 
> HADOOP-14062-branch-2.8.0.005.patch, yarn-rm-log.txt
>
>
> When privacy is enabled for RPC (hadoop.rpc.protection = privacy), 
> {{ApplicationMasterProtocolPBClientImpl.allocate}} sometimes (but not always) 
> fails with an EOFException. I've reproduced this with Spark 2.0.2 built 
> against latest branch-2.8 and with a simple distcp job on latest branch-2.8.
> Steps to reproduce using distcp:
> 1. Set hadoop.rpc.protection equal to privacy
> 2. Write data to HDFS. I did this with Spark as follows: 
> {code}
> sc.parallelize(1 to (5*1024*1024)).map(k => Seq(k, 
> org.apache.commons.lang.RandomStringUtils.random(1024, 
> "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWxyZ0123456789")).mkString("|")).toDF().repartition(100).write.parquet("hdfs:///tmp/testData")
> {code}
> 3. Attempt to distcp that data to another location in HDFS. For example:
> {code}
> hadoop distcp -Dmapreduce.framework.name=yarn hdfs:///tmp/testData 
> hdfs:///tmp/testDataCopy
> {code}
> I observed this error in the ApplicationMaster's syslog:
> {code}
> 2016-12-19 19:13:50,097 INFO [eventHandlingThread] 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer 
> setup for JobId: job_1482189777425_0004, File: 
> hdfs://:8020/tmp/hadoop-yarn/staging//.staging/job_1482189777425_0004/job_1482189777425_0004_1.jhist
> 2016-12-19 19:13:51,004 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 
> AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 
> HostLocal:0 RackLocal:0
> 2016-12-19 19:13:51,031 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() 
> for application_1482189777425_0004: ask=1 release= 0 newContainers=0 
> finishedContainers=0 resourcelimit= knownNMs=3
> 2016-12-19 19:13:52,043 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking 
> ApplicationMasterProtocolPBClientImpl.allocate over null. Retrying after 
> sleeping for 3ms.
> java.io.EOFException: End of File Exception between local host is: 
> "/"; destination host is: "":8030; 
> : java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1486)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1428)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1338)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(P

[jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling

2017-02-17 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872640#comment-15872640
 ] 

Mingliang Liu commented on HADOOP-13914:


Thanks for helping me out (and for the doc/test patch). I didn't upload the 
patch before vacation because the test was still failing. I did some analysis 
and will post a comment here soon.

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> 
>
> Key: HADOOP-13914
> URL: https://issues.apache.org/jira/browse/HADOOP-13914
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Aaron Fabbri
>Assignee: Mingliang Liu
> Attachments: HADOOP-13914-HADOOP-13345.000.patch, 
> s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag 
> stored in S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB 
> implementation, and also sacrifices good code separation to minimize 
> S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and 
> preferred solution.  I suggest we do this work after merging the HADOOP-13345 
> branch to trunk, but am open to suggestions.
> I can also attach a patch of an integration test that exercises the missing 
> case and demonstrates a failure with DynamoDBMetadataStore.






[jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling

2017-02-17 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872633#comment-15872633
 ] 

Aaron Fabbri commented on HADOOP-13914:
---

OK, I don't want to step on your toes, [~liuml07]. I was not sure if you were 
working on it or not; I am here to help. I'd like to get this JIRA resolved 
soon.

There are some interesting design decisions here, so I look forward to 
discussing them more. Feel free to post in-progress / RFC patches if you want. 
Thanks, and welcome back!

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> 
>
> Key: HADOOP-13914
> URL: https://issues.apache.org/jira/browse/HADOOP-13914
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Aaron Fabbri
>Assignee: Mingliang Liu
> Attachments: HADOOP-13914-HADOOP-13345.000.patch, 
> s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag 
> stored in S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB 
> implementation, and also sacrifices good code separation to minimize 
> S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and 
> preferred solution.  I suggest we do this work after merging the HADOOP-13345 
> branch to trunk, but am open to suggestions.
> I can also attach a patch of an integration test that exercises the missing 
> case and demonstrates a failure with DynamoDBMetadataStore.






[jira] [Commented] (HADOOP-14062) ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when RPC privacy is enabled

2017-02-17 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872602#comment-15872602
 ] 

Jian He commented on HADOOP-14062:
--

After I removed this code, the tests still seem to pass; do we need it?
{code}
tearDown();
createClientAndCluster(conf);
// unless we start an application the cancelApp() method will fail when
// it runs after this test
startApp();
{code}

> ApplicationMasterProtocolPBClientImpl.allocate fails with EOFException when 
> RPC privacy is enabled
> --
>
> Key: HADOOP-14062
> URL: https://issues.apache.org/jira/browse/HADOOP-14062
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Steven Rand
>Priority: Critical
> Attachments: HADOOP-14062.001.patch, HADOOP-14062.002.patch, 
> HADOOP-14062.003.patch, HADOOP-14062-branch-2.8.0.004.patch, 
> HADOOP-14062-branch-2.8.0.005.patch, yarn-rm-log.txt
>
>
> When privacy is enabled for RPC (hadoop.rpc.protection = privacy), 
> {{ApplicationMasterProtocolPBClientImpl.allocate}} sometimes (but not always) 
> fails with an EOFException. I've reproduced this with Spark 2.0.2 built 
> against latest branch-2.8 and with a simple distcp job on latest branch-2.8.
> Steps to reproduce using distcp:
> 1. Set hadoop.rpc.protection equal to privacy
> 2. Write data to HDFS. I did this with Spark as follows: 
> {code}
> sc.parallelize(1 to (5*1024*1024)).map(k => Seq(k, 
> org.apache.commons.lang.RandomStringUtils.random(1024, 
> "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWxyZ0123456789")).mkString("|")).toDF().repartition(100).write.parquet("hdfs:///tmp/testData")
> {code}
> 3. Attempt to distcp that data to another location in HDFS. For example:
> {code}
> hadoop distcp -Dmapreduce.framework.name=yarn hdfs:///tmp/testData 
> hdfs:///tmp/testDataCopy
> {code}
> I observed this error in the ApplicationMaster's syslog:
> {code}
> 2016-12-19 19:13:50,097 INFO [eventHandlingThread] 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer 
> setup for JobId: job_1482189777425_0004, File: 
> hdfs://:8020/tmp/hadoop-yarn/staging//.staging/job_1482189777425_0004/job_1482189777425_0004_1.jhist
> 2016-12-19 19:13:51,004 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 
> AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 
> HostLocal:0 RackLocal:0
> 2016-12-19 19:13:51,031 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() 
> for application_1482189777425_0004: ask=1 release= 0 newContainers=0 
> finishedContainers=0 resourcelimit= knownNMs=3
> 2016-12-19 19:13:52,043 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking 
> ApplicationMasterProtocolPBClientImpl.allocate over null. Retrying after 
> sleeping for 3ms.
> java.io.EOFException: End of File Exception between local host is: 
> "/"; destination host is: "":8030; 
> : java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1486)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1428)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1338)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy80.allocate(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationH

[jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling

2017-02-17 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872582#comment-15872582
 ] 

Mingliang Liu commented on HADOOP-13914:


[~fabbri] I came back to work this week. I'll wrap up the current 
boolean->tristate patch and upload it before you take it over, if necessary.
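
For context, a boolean->tristate flag could look roughly like this (an 
illustrative sketch only; the enum name and helpers are assumptions, not 
necessarily what the patch will commit):

{code}
// Illustrative sketch: a three-valued flag lets a MetadataStore say
// "I don't know" when it cannot cheaply tell whether a directory is
// empty, instead of being forced into a possibly wrong true/false.
public enum Tristate {
  TRUE, FALSE, UNKNOWN;

  public static Tristate fromBool(boolean v) {
    return v ? TRUE : FALSE;
  }

  // hypothetical helper: resolve UNKNOWN with a fallback check
  public boolean resolve(java.util.function.BooleanSupplier fallback) {
    return this == UNKNOWN ? fallback.getAsBoolean() : this == TRUE;
  }
}
{code}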

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> 
>
> Key: HADOOP-13914
> URL: https://issues.apache.org/jira/browse/HADOOP-13914
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Aaron Fabbri
>Assignee: Mingliang Liu
> Attachments: HADOOP-13914-HADOOP-13345.000.patch, 
> s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag 
> stored in S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB 
> implementation, and also sacrifices good code separation to minimize 
> S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and 
> preferred solution.  I suggest we do this work after merging the HADOOP-13345 
> branch to trunk, but am open to suggestions.
> I can also attach a patch with an integration test that exercises the missing 
> case and demonstrates a failure with DynamoDBMetadataStore.






[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872436#comment-15872436
 ] 

Hadoop QA commented on HADOOP-14041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
10s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m  
0s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
31s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
31s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
47s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} HADOOP-13345 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 37s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
47s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 97m 12s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.sftp.TestSFTPFileSystem |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14041 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12853293/HADOOP-14041-HADOOP-13345.006.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  findbugs  checkstyle  |
| uname | Linux a931b4ee63e0 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HADOOP-13345 / 8b37b6a |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11653/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11653/testReport/ |
| modules | C: hadoop-comm

[jira] [Commented] (HADOOP-13904) DynamoDBMetadataStore to handle DDB throttling failures through retry policy

2017-02-17 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872401#comment-15872401
 ] 

Aaron Fabbri commented on HADOOP-13904:
---

Jenkins above is clean except for HADOOP-14030 (the TestKDiag failure), which 
should be unrelated.

> DynamoDBMetadataStore to handle DDB throttling failures through retry policy
> 
>
> Key: HADOOP-13904
> URL: https://issues.apache.org/jira/browse/HADOOP-13904
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Steve Loughran
>Assignee: Aaron Fabbri
> Attachments: HADOOP-13904-HADOOP-13345.001.patch, 
> HADOOP-13904-HADOOP-13345.002.patch, HADOOP-13904-HADOOP-13345.003.patch, 
> screenshot-1.png
>
>
> When you overload DDB, you get error messages warning of throttling, [as 
> documented by 
> AWS|http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html#Programming.Errors.MessagesAndCodes]
> Reduce load on DDB by doing a table lookup before the create, then, in table 
> create/delete operations and in get/put actions, recognise the error codes 
> and retry using an appropriate retry policy (exponential backoff + ultimate 
> failure) 
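
To make the proposed retry shape concrete, here is a self-contained sketch 
(hedged: this is not the actual DynamoDBMetadataStore code, 
{{isThrottlingError}} is a hypothetical stand-in for checking the documented 
DDB error codes, and Hadoop's existing 
{{org.apache.hadoop.io.retry.RetryPolicies}} could supply the policy instead):

{code}
import java.util.concurrent.Callable;

// Hedged sketch of "exponential backoff + ultimate failure"; the numbers
// and the error check are illustrative assumptions only.
public class BackoffSketch {
  static <T> T withBackoff(Callable<T> op) throws Exception {
    final int maxRetries = 9;
    long sleepMs = 100;                       // initial delay
    for (int attempt = 0; ; attempt++) {
      try {
        return op.call();
      } catch (Exception e) {
        if (attempt >= maxRetries || !isThrottlingError(e)) {
          throw e;                            // ultimate failure
        }
        Thread.sleep(sleepMs);
        sleepMs *= 2;                         // exponential backoff
      }
    }
  }

  // hypothetical stand-in for recognising the documented DDB error codes
  static boolean isThrottlingError(Exception e) {
    String m = e.getMessage();
    return m != null && m.contains("ProvisionedThroughputExceeded");
  }
}
{code}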






[jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling

2017-02-17 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872387#comment-15872387
 ] 

Aaron Fabbri commented on HADOOP-13914:
---

Hi [~liuml07], I've started working on a patch for this. I wanted to make sure 
we don't duplicate effort. I'll reassign this to myself if you are OK with it.

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> 
>
> Key: HADOOP-13914
> URL: https://issues.apache.org/jira/browse/HADOOP-13914
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Aaron Fabbri
>Assignee: Mingliang Liu
> Attachments: HADOOP-13914-HADOOP-13345.000.patch, 
> s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag 
> stored in S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB 
> implementation, and also sacrifices good code separation to minimize 
> S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and 
> preferred solution.  I suggest we do this work after merging the HADOOP-13345 
> branch to trunk, but am open to suggestions.
> I can also attach a patch with an integration test that exercises the missing 
> case and demonstrates a failure with DynamoDBMetadataStore.






[jira] [Commented] (HADOOP-13345) S3Guard: Improved Consistency for S3A

2017-02-17 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872384#comment-15872384
 ] 

Aaron Fabbri commented on HADOOP-13345:
---

Once we figure out the SSE test failure I'm +1 on doing a merge. It looks like 
the exception behavior is different in trunk? Or is the trunk test also broken?

Also, if you still have it handy, pasting a summary of your test runs (CLI 
command used, number of tests run / errors / etc.) would be awesome. I'm 
trying to make sure this branch is stable.

> S3Guard: Improved Consistency for S3A
> -
>
> Key: HADOOP-13345
> URL: https://issues.apache.org/jira/browse/HADOOP-13345
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HADOOP-13345.prototype1.patch, s3c.001.patch, 
> S3C-ConsistentListingonS3-Design.pdf, S3GuardImprovedConsistencyforS3A.pdf, 
> S3GuardImprovedConsistencyforS3AV2.pdf
>
>
> This issue proposes S3Guard, a new feature of S3A, to provide an option for a 
> stronger consistency model than what is currently offered.  The solution 
> coordinates with a strongly consistent external store to resolve 
> inconsistencies caused by the S3 eventual consistency model.






[jira] [Commented] (HADOOP-14094) Rethink S3GuardTool options

2017-02-17 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872326#comment-15872326
 ] 

Sean Mackrory commented on HADOOP-14094:


So long as we're doing that, we should ensure that the outcome of the command 
is printed properly. Often you'll just see the log message that the metastore 
was "initialized", but really that means the client connection was established 
while preparing for the rest of the command, so it seems like nothing really 
got done.

CC'ing [~ste...@apache.org], [~fabbri] and [~liuml07] for any opinions, since 
these issues have come up in other JIRAs we've discussed.

> Rethink S3GuardTool options
> ---
>
> Key: HADOOP-14094
> URL: https://issues.apache.org/jira/browse/HADOOP-14094
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>
> I think we need to rework the S3GuardTool options. Here are a couple of 
> problems I've observed in the patches I've built on top of them and in 
> watching other developers try them out:
> * We should probably wrap the current commands in an S3Guard-specific 
> command, since 'init', 'destroy', etc. don't touch the buckets at all.
> * Convert to whole-word options, as the single-letter options are already 
> getting overloaded. Some patches I've submitted have added functionality 
> where the obvious flag is already in use (e.g. -r for both region and read 
> throughput, and -m for both minutes and the metadatastore URI). I may do 
> this early as part of HADOOP-14090.
> * We have some options that must be in the config in some cases, and can be 
> in the command in other cases. But I've seen someone try to specify the table 
> name in the config and leave out the -m option, with no luck. Also, since 
> commands hard-code table auto-creation, you might have configured table 
> auto-creation, try to import to a non-existent table, and it tells you table 
> auto-creation is off.
> We need a more consistent policy for how things should get configured that 
> addresses these problems and future-proofs the command a bit more.






[jira] [Commented] (HADOOP-14092) Typo in hadoop-aws index.md

2017-02-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872267#comment-15872267
 ] 

Hadoop QA commented on HADOOP-14092:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-14092 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12853292/HADOOP-14092.001.patch
 |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 783dc545bb14 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 4c26c24 |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11652/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Typo in hadoop-aws index.md
> ---
>
> Key: HADOOP-14092
> URL: https://issues.apache.org/jira/browse/HADOOP-14092
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.0.0-alpha3
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: newbie
> Attachments: HADOOP-14092.001.patch
>
>
> In section {{Testing against different regions}}, {{contract-tests.xml}} 
> should be {{contract-test-options.xml}}.






[jira] [Commented] (HADOOP-14090) Allow users to specify region for DynamoDB table instead of endpoint

2017-02-17 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872249#comment-15872249
 ] 

Sean Mackrory commented on HADOOP-14090:


{quote}-1 to TODOs in code. Better to add a new JIRA item than to leave 
comments in the source which will inevitably get forgotten about. JIRA should 
be where the entire TODO list is documented.{quote}

Yeah - sorry I wasn't clearer on this, but I didn't mean for folks to start 
reviewing. [~fabbri] was finding some glaring issues with my prune patch, so I 
wanted to stash this somewhere so someone else could pick it up if they wanted 
while I gave that patch some more due diligence. I agree the patch as I put it 
up is not ready to go :)

> Allow users to specify region for DynamoDB table instead of endpoint
> 
>
> Key: HADOOP-14090
> URL: https://issues.apache.org/jira/browse/HADOOP-14090
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
> Attachments: HADOOP-14090-HADOOP-13345.001.patch
>
>
> Assuming the AWS SDK allows this, I think this would be a better way to 
> configure it for any usage on AWS itself (with endpoint still being an option 
> for AWS-compatible non-AWS use cases). Unless users actually care about a 
> specific endpoint, this is easier. Perhaps less important, HADOOP-14023 shows 
> that inferring the region from the endpoint (which granted, isn't that 
> necessary) doesn't work very well at all.






[jira] [Updated] (HADOOP-14041) CLI command to prune old metadata

2017-02-17 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-14041:
---
Attachment: HADOOP-14041-HADOOP-13345.006.patch

Thanks for the reviews, all - good stuff.

The problems [~fabbri] saw boil down to two things, one of which I've fixed: I 
had not tested this with anything being inferred from an S3 path, and I wasn't 
trying to parse and use that the way the other commands do. That is now fixed 
and covered by the tests. The other thing is that it appears not to be parsing 
generic options (which does indeed seem wrong - according to the docs, if you 
implement Tool you should get that for free - and we do), but the behavior 
wouldn't be what you expect anyway, because it will set the table config based 
on the -m flag or the S3 path you provide. I think the CLI behavior is badly 
defined here in general, so I've filed HADOOP-14094 to really rethink what 
options are exposed and how.

I like [~ste...@apache.org]'s recommendation to just throw the IOException. My 
thinking had been that if there's an issue deleting one row, we could keep 
retrying the others. But an exception that affects one row and not subsequent 
ones is probably unlikely, so it's worth bubbling it up so we know about the 
problem. Also, removing that block highlighted that my batching logic was bad: 
instead of processing complete batches inside the loop and processing whatever 
is left over afterwards, I was effectively always processing whatever contents 
the batch had at the end of each iteration. That's been fixed, and I verified 
that the number of events was correct with several hundred objects getting 
pruned.
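
For illustration, the corrected batching shape looks roughly like this - a 
hedged sketch, not the actual patch; {{deleteBatch}} is a hypothetical helper, 
and 25 is DynamoDB's batch-write limit:

{code}
import java.util.ArrayList;
import java.util.List;

class PruneSketch {
  static final int BATCH_SIZE = 25;   // DynamoDB batch-write limit

  static void prune(Iterable<String> expiredKeys) {
    List<String> batch = new ArrayList<>(BATCH_SIZE);
    for (String key : expiredKeys) {
      batch.add(key);
      if (batch.size() == BATCH_SIZE) { // complete batch: flush inside loop
        deleteBatch(batch);
        batch.clear();
      }
    }
    if (!batch.isEmpty()) {             // flush the leftover partial batch
      deleteBatch(batch);
    }
  }

  static void deleteBatch(List<String> keys) {
    // stand-in for the real batched DynamoDB delete
    System.out.println("deleting " + keys.size() + " items");
  }
}
{code}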

On a related note, I also changed the log message to INFO and had it count 
items and report the batch size rather than just the number of batches. 
Without that, the last message you get out of the box on the CLI is that the 
metastore has been initialized, which is misleading. It will now log when the 
metadatastore connection has been initialized and then finish off by logging 
how many items were deleted and what the batch size was. I think that's 
friendlier, and probably something we want to do more of for the other 
commands if / when we rethink the interface.

> CLI command to prune old metadata
> -
>
> Key: HADOOP-14041
> URL: https://issues.apache.org/jira/browse/HADOOP-14041
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14041-HADOOP-13345.001.patch, 
> HADOOP-14041-HADOOP-13345.002.patch, HADOOP-14041-HADOOP-13345.003.patch, 
> HADOOP-14041-HADOOP-13345.004.patch, HADOOP-14041-HADOOP-13345.005.patch, 
> HADOOP-14041-HADOOP-13345.006.patch
>
>
> Add a CLI command that allows users to specify an age at which to prune 
> metadata that hasn't been modified for an extended period of time. Since the 
> primary use-case targeted at the moment is list consistency, it would make 
> sense (especially when authoritative=false) to prune metadata that is 
> expected to have become consistent a long time ago.






[jira] [Updated] (HADOOP-14092) Typo in hadoop-aws index.md

2017-02-17 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HADOOP-14092:

Assignee: John Zhuge
  Status: Patch Available  (was: Open)

> Typo in hadoop-aws index.md
> ---
>
> Key: HADOOP-14092
> URL: https://issues.apache.org/jira/browse/HADOOP-14092
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.0.0-alpha3
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: newbie
> Attachments: HADOOP-14092.001.patch
>
>
> In section {{Testing against different regions}}, {{contract-tests.xml}} 
> should be {{contract-test-options.xml}}.






[jira] [Updated] (HADOOP-14092) Typo in hadoop-aws index.md

2017-02-17 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HADOOP-14092:

Attachment: HADOOP-14092.001.patch

Patch 001
* Fix the typo

> Typo in hadoop-aws index.md
> ---
>
> Key: HADOOP-14092
> URL: https://issues.apache.org/jira/browse/HADOOP-14092
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.0.0-alpha3
>Reporter: John Zhuge
>Priority: Trivial
>  Labels: newbie
> Attachments: HADOOP-14092.001.patch
>
>
> In section {{Testing against different regions}}, {{contract-tests.xml}} 
> should be {{contract-test-options.xml}}.






[jira] [Commented] (HADOOP-13805) UGI.getCurrentUser() fails if user does not have a keytab associated

2017-02-17 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872212#comment-15872212
 ] 

Xiao Chen commented on HADOOP-13805:


Thanks [~yzhangal] and [~jojochuang] for covering for me and working on this, 
and thanks [~tucu00] for reporting and reviewing!

> UGI.getCurrentUser() fails if user does not have a keytab associated
> 
>
> Key: HADOOP-13805
> URL: https://issues.apache.org/jira/browse/HADOOP-13805
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.8.0, 2.9.0, 3.0.0-alpha2
>Reporter: Alejandro Abdelnur
>Assignee: Xiao Chen
> Fix For: 3.0.0-alpha3
>
> Attachments: HADOOP-13805.006.patch, HADOOP-13805.007.patch, 
> HADOOP-13805.008.patch, HADOOP-13805.009.patch, HADOOP-13805.010.patch, 
> HADOOP-13805.01.patch, HADOOP-13805.02.patch, HADOOP-13805.03.patch, 
> HADOOP-13805.04.patch, HADOOP-13805.05.patch
>
>
> The intention of HADOOP-13558 was to keep UGI from trying to renew the TGT 
> when the UGI is created from an existing Subject, as in that case the keytab 
> is not 'owned' by UGI but by the creator of the Subject.
> In HADOOP-13558 we introduced a new private UGI constructor 
> {{UserGroupInformation(Subject subject, final boolean externalKeyTab)}}, and 
> we use it with TRUE only when doing a {{UGI.loginUserFromSubject()}}.
> The problem is that when we call {{UGI.getCurrentUser()}} and the UGI was 
> created via a Subject (via the {{UGI.loginUserFromSubject()}} method), we 
> call {{new UserGroupInformation(subject)}}, which delegates to 
> {{UserGroupInformation(Subject subject, final boolean externalKeyTab)}} with 
> externalKeyTab == *FALSE*.
> Then the UGI returned by {{UGI.getCurrentUser()}} will attempt to log in 
> using a non-existent keytab if the TGT has expired.
> This problem is experienced in {{KMSClientProvider}} when used by the HDFS 
> filesystem client accessing an encryption zone.
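
A condensed sketch of the delegation just described (illustrative only, not 
the actual Hadoop source):

{code}
import javax.security.auth.Subject;

// Condensed illustration of the bug: the one-arg constructor hard-codes
// externalKeyTab = false, so a UGI rebuilt by getCurrentUser() around the
// same Subject loses the "keytab is externally managed" marker and may
// later attempt a keytab relogin it cannot perform.
class UgiSketch {
  private final boolean externalKeyTab;

  UgiSketch(Subject subject) {
    this(subject, false);            // always FALSE, per the description
  }

  UgiSketch(Subject subject, final boolean externalKeyTab) {
    this.externalKeyTab = externalKeyTab;
  }
}
{code}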






[jira] [Commented] (HADOOP-13805) UGI.getCurrentUser() fails if user does not have a keytab associated

2017-02-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872209#comment-15872209
 ] 

Hudson commented on HADOOP-13805:
-

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11273 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11273/])
HADOOP-13805. UGI.getCurrentUser() fails if user does not have a keytab 
(yzhang: rev 4c26c241ad2b907dc02cecefa9846cbe2b0465ba)
* (edit) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestUserGroupInformation.java
* (edit) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestUGIWithMiniKdc.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java


> UGI.getCurrentUser() fails if user does not have a keytab associated
> 
>
> Key: HADOOP-13805
> URL: https://issues.apache.org/jira/browse/HADOOP-13805
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.8.0, 2.9.0, 3.0.0-alpha2
>Reporter: Alejandro Abdelnur
>Assignee: Xiao Chen
> Fix For: 3.0.0-alpha3
>
> Attachments: HADOOP-13805.006.patch, HADOOP-13805.007.patch, 
> HADOOP-13805.008.patch, HADOOP-13805.009.patch, HADOOP-13805.010.patch, 
> HADOOP-13805.01.patch, HADOOP-13805.02.patch, HADOOP-13805.03.patch, 
> HADOOP-13805.04.patch, HADOOP-13805.05.patch
>
>
> The intention of HADOOP-13558 was to keep UGI from trying to renew the TGT 
> when the UGI is created from an existing Subject, as in that case the keytab 
> is not 'owned' by UGI but by the creator of the Subject.
> In HADOOP-13558 we introduced a new private UGI constructor 
> {{UserGroupInformation(Subject subject, final boolean externalKeyTab)}}, and 
> we use it with TRUE only when doing a {{UGI.loginUserFromSubject()}}.
> The problem is that when we call {{UGI.getCurrentUser()}} and the UGI was 
> created via a Subject (via the {{UGI.loginUserFromSubject()}} method), we 
> call {{new UserGroupInformation(subject)}}, which delegates to 
> {{UserGroupInformation(Subject subject, final boolean externalKeyTab)}} with 
> externalKeyTab == *FALSE*.
> Then the UGI returned by {{UGI.getCurrentUser()}} will attempt to log in 
> using a non-existent keytab if the TGT has expired.
> This problem is experienced in {{KMSClientProvider}} when used by the HDFS 
> filesystem client accessing an encryption zone.






[jira] [Commented] (HADOOP-13805) UGI.getCurrentUser() fails if user does not have a keytab associated

2017-02-17 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872184#comment-15872184
 ] 

Yongjun Zhang commented on HADOOP-13805:


I committed to trunk.

Thanks [~xiaochen] / [~jojochuang] for the earlier work, and [~tucu00] for the 
review.


> UGI.getCurrentUser() fails if user does not have a keytab associated
> 
>
> Key: HADOOP-13805
> URL: https://issues.apache.org/jira/browse/HADOOP-13805
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.8.0, 2.9.0, 3.0.0-alpha2
>Reporter: Alejandro Abdelnur
>Assignee: Xiao Chen
> Fix For: 3.0.0-alpha3
>
> Attachments: HADOOP-13805.006.patch, HADOOP-13805.007.patch, 
> HADOOP-13805.008.patch, HADOOP-13805.009.patch, HADOOP-13805.010.patch, 
> HADOOP-13805.01.patch, HADOOP-13805.02.patch, HADOOP-13805.03.patch, 
> HADOOP-13805.04.patch, HADOOP-13805.05.patch
>
>
> The intention of HADOOP-13558 was to keep UGI from trying to renew the TGT 
> when the UGI is created from an existing Subject, as in that case the keytab 
> is not 'owned' by UGI but by the creator of the Subject.
> In HADOOP-13558 we introduced a new private UGI constructor 
> {{UserGroupInformation(Subject subject, final boolean externalKeyTab)}}, and 
> we use it with TRUE only when doing a {{UGI.loginUserFromSubject()}}.
> The problem is that when we call {{UGI.getCurrentUser()}} and the UGI was 
> created via a Subject (via the {{UGI.loginUserFromSubject()}} method), we 
> call {{new UserGroupInformation(subject)}}, which delegates to 
> {{UserGroupInformation(Subject subject, final boolean externalKeyTab)}} with 
> externalKeyTab == *FALSE*.
> Then the UGI returned by {{UGI.getCurrentUser()}} will attempt to log in 
> using a non-existent keytab if the TGT has expired.
> This problem is experienced in {{KMSClientProvider}} when used by the HDFS 
> filesystem client accessing an encryption zone.






[jira] [Updated] (HADOOP-13805) UGI.getCurrentUser() fails if user does not have a keytab associated

2017-02-17 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HADOOP-13805:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha3
   Status: Resolved  (was: Patch Available)

> UGI.getCurrentUser() fails if user does not have a keytab associated
> 
>
> Key: HADOOP-13805
> URL: https://issues.apache.org/jira/browse/HADOOP-13805
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.8.0, 2.9.0, 3.0.0-alpha2
>Reporter: Alejandro Abdelnur
>Assignee: Xiao Chen
> Fix For: 3.0.0-alpha3
>
> Attachments: HADOOP-13805.006.patch, HADOOP-13805.007.patch, 
> HADOOP-13805.008.patch, HADOOP-13805.009.patch, HADOOP-13805.010.patch, 
> HADOOP-13805.01.patch, HADOOP-13805.02.patch, HADOOP-13805.03.patch, 
> HADOOP-13805.04.patch, HADOOP-13805.05.patch
>
>
> The intention of HADOOP-13558 was to keep UGI from trying to renew the TGT 
> when the UGI is created from an existing Subject, as in that case the keytab 
> is not 'owned' by UGI but by the creator of the Subject.
> In HADOOP-13558 we introduced a new private UGI constructor 
> {{UserGroupInformation(Subject subject, final boolean externalKeyTab)}}, and 
> we use it with TRUE only when doing a {{UGI.loginUserFromSubject()}}.
> The problem is that when we call {{UGI.getCurrentUser()}} and the UGI was 
> created via a Subject (via the {{UGI.loginUserFromSubject()}} method), we 
> call {{new UserGroupInformation(subject)}}, which delegates to 
> {{UserGroupInformation(Subject subject, final boolean externalKeyTab)}} with 
> externalKeyTab == *FALSE*.
> Then the UGI returned by {{UGI.getCurrentUser()}} will attempt to log in 
> using a non-existent keytab if the TGT has expired.
> This problem is experienced in {{KMSClientProvider}} when used by the HDFS 
> filesystem client accessing an encryption zone.






[jira] [Commented] (HADOOP-14094) Rethink S3GuardTool options

2017-02-17 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872161#comment-15872161
 ] 

Sean Mackrory commented on HADOOP-14094:


Another issue is that we accept an S3 path, and it could make sense for some 
commands to operate on subdirectories if they are provided (e.g. import), but 
we don't do that - we just get the filesystem that corresponds to that URL and 
then operate on the entire filesystem. I feel like that could be made clearer 
(or we could actually operate on a subset of the filesystem when relevant).

> Rethink S3GuardTool options
> ---
>
> Key: HADOOP-14094
> URL: https://issues.apache.org/jira/browse/HADOOP-14094
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>
> I think we need to rework the S3GuardTool options. Here are a couple of 
> problems I've observed in the patches I've built on top of them and in 
> watching other developers try them out:
> * We should probably wrap the current commands in an S3Guard-specific 
> command, since 'init', 'destroy', etc. don't touch the buckets at all.
> * Convert to whole-word options, as the single-letter options are already 
> getting overloaded. Some patches I've submitted have added functionality 
> where the obvious flag is already in use (e.g. -r for both region and read 
> throughput, and -m for both minutes and the metadatastore URI). I may do 
> this early as part of HADOOP-14090.
> * We have some options that must be in the config in some cases, and can be 
> in the command in other cases. But I've seen someone try to specify the table 
> name in the config and leave out the -m option, with no luck. Also, since 
> commands hard-code table auto-creation, you might have configured table 
> auto-creation, try to import to a non-existent table, and it tells you table 
> auto-creation is off.
> We need a more consistent policy for how things should get configured that 
> addresses these problems and future-proofs the command a bit more.






[jira] [Commented] (HADOOP-13805) UGI.getCurrentUser() fails if user does not have a keytab associated

2017-02-17 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872142#comment-15872142
 ] 

Yongjun Zhang commented on HADOOP-13805:


The test failures are pre-existing and reported as HADOOP-14030.



> UGI.getCurrentUser() fails if user does not have a keytab associated
> 
>
> Key: HADOOP-13805
> URL: https://issues.apache.org/jira/browse/HADOOP-13805
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.8.0, 2.9.0, 3.0.0-alpha2
>Reporter: Alejandro Abdelnur
>Assignee: Xiao Chen
> Attachments: HADOOP-13805.006.patch, HADOOP-13805.007.patch, 
> HADOOP-13805.008.patch, HADOOP-13805.009.patch, HADOOP-13805.010.patch, 
> HADOOP-13805.01.patch, HADOOP-13805.02.patch, HADOOP-13805.03.patch, 
> HADOOP-13805.04.patch, HADOOP-13805.05.patch
>
>
> The intention of HADOOP-13558 was to keep UGI from trying to renew the TGT 
> when the UGI is created from an existing Subject, as in that case the keytab 
> is not 'owned' by UGI but by the creator of the Subject.
> In HADOOP-13558 we introduced a new private UGI constructor 
> {{UserGroupInformation(Subject subject, final boolean externalKeyTab)}}, and 
> we use it with TRUE only when doing a {{UGI.loginUserFromSubject()}}.
> The problem is that when we call {{UGI.getCurrentUser()}} and the UGI was 
> created via a Subject (via the {{UGI.loginUserFromSubject()}} method), we 
> call {{new UserGroupInformation(subject)}}, which delegates to 
> {{UserGroupInformation(Subject subject, final boolean externalKeyTab)}} with 
> externalKeyTab == *FALSE*.
> Then the UGI returned by {{UGI.getCurrentUser()}} will attempt to log in 
> using a non-existent keytab if the TGT has expired.
> This problem is experienced in {{KMSClientProvider}} when used by the HDFS 
> filesystem client accessing an encryption zone.






[jira] [Commented] (HADOOP-14030) PreCommit TestKDiag failure

2017-02-17 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872137#comment-15872137
 ] 

Yongjun Zhang commented on HADOOP-14030:


I'm also seeing it at 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11651/testReport/

FYI [~steve_l]. Thanks.


> PreCommit TestKDiag failure
> ---
>
> Key: HADOOP-14030
> URL: https://issues.apache.org/jira/browse/HADOOP-14030
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0-alpha3
>Reporter: John Zhuge
>
> https://builds.apache.org/job/PreCommit-HADOOP-Build/11523/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
> {noformat}
> Tests run: 13, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 2.175 sec 
> <<< FAILURE! - in org.apache.hadoop.security.TestKDiag
> testKeytabAndPrincipal(org.apache.hadoop.security.TestKDiag)  Time elapsed: 
> 0.05 sec  <<< ERROR!
> org.apache.hadoop.security.KerberosAuthException: Login failure for user: 
> f...@example.com from keytab 
> /testptch/hadoop/hadoop-common-project/hadoop-common/target/keytab 
> javax.security.auth.login.LoginException: Unable to obtain password from user
>   at 
> com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:897)
>   at 
> com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760)
>   at 
> com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
>   at 
> javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
>   at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
>   at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at 
> javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
>   at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
>   at 
> org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1355)
>   at org.apache.hadoop.security.KDiag.loginFromKeytab(KDiag.java:630)
>   at org.apache.hadoop.security.KDiag.execute(KDiag.java:396)
>   at org.apache.hadoop.security.KDiag.run(KDiag.java:236)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.security.KDiag.exec(KDiag.java:1047)
>   at org.apache.hadoop.security.TestKDiag.kdiag(TestKDiag.java:119)
>   at 
> org.apache.hadoop.security.TestKDiag.testKeytabAndPrincipal(TestKDiag.java:162)
> testFileOutput(org.apache.hadoop.security.TestKDiag)  Time elapsed: 0.033 sec 
>  <<< ERROR!
> org.apache.hadoop.security.KerberosAuthException: Login failure for user: 
> f...@example.com from keytab 
> /testptch/hadoop/hadoop-common-project/hadoop-common/target/keytab 
> javax.security.auth.login.LoginException: Unable to obtain password from user
>   at 
> com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:897)
>   at 
> com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760)
>   at 
> com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
>   at 
> javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
>   at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
>   at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at 
> javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
>   at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
>   at 
> org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1355)
>   at org.apache.hadoop.security.KDiag.loginFromKeytab(KDiag.java:630)
>   at org.apache.hadoop.security.KDiag.execute(KDiag.java:396)
>   at o

[jira] [Commented] (HADOOP-10584) ActiveStandbyElector goes down if ZK quorum become unavailable

2017-02-17 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872106#comment-15872106
 ] 

Daniel Templeton commented on HADOOP-10584:
---

I'm looking at this issue now, and it seems to me that it could be resolved by 
resetting the retry counts when the session is reconnected. If we've lost the 
session, then whatever retry counts we had previously don't really apply 
anymore, so we should reset them on reconnect. It looks like this issue only 
happens when the ZK connection is flaky.
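
As a rough illustration of that idea (a hedged sketch around the ZooKeeper 
watcher API, not ActiveStandbyElector itself):

{code}
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;

// Sketch: treat a successful reconnect as a fresh start by zeroing the
// retry count, so a flaky connection cannot slowly exhaust the budget.
class ReconnectAwareWatcher implements Watcher {
  private final AtomicInteger retryCount = new AtomicInteger();

  @Override
  public void process(WatchedEvent event) {
    if (event.getState() == Event.KeeperState.SyncConnected) {
      // session is healthy again; earlier failures no longer apply
      retryCount.set(0);
    } else if (event.getState() == Event.KeeperState.Disconnected) {
      retryCount.incrementAndGet();
    }
  }
}
{code}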

> ActiveStandbyElector goes down if ZK quorum become unavailable
> --
>
> Key: HADOOP-10584
> URL: https://issues.apache.org/jira/browse/HADOOP-10584
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 2.4.0
>Reporter: Karthik Kambatla
>Priority: Critical
> Attachments: hadoop-10584-prelim.patch, rm.log
>
>
> ActiveStandbyElector retries operations for a few times. If the ZK quorum 
> itself is down, it goes down and the daemons will have to be brought up 
> again. 
> Instead, it should log the fact that it is unable to talk to ZK, call 
> becomeStandby on its client, and continue to attempt connecting to ZK.






[jira] [Updated] (HADOOP-14043) Shade netty 4 dependency in hadoop-hdfs

2017-02-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-14043:

Component/s: build

> Shade netty 4 dependency in hadoop-hdfs
> ---
>
> Key: HADOOP-14043
> URL: https://issues.apache.org/jira/browse/HADOOP-14043
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 2.8.0
>Reporter: Ted Yu
>
> During review of HADOOP-13866, [~andrew.wang] mentioned considering  shading 
> netty before putting the fix into branch-2.
> This would give users better experience when upgrading hadoop.






[jira] [Commented] (HADOOP-14090) Allow users to specify region for DynamoDB table instead of endpoint

2017-02-17 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871985#comment-15871985
 ] 

Sean Mackrory commented on HADOOP-14090:


{quote}the endpoint specifies the s3 endpoint for authentication. If I specify 
DDB region, what does that mean for the s3a bucket? What if they are 
different...does that just run DDB somewhere else?{quote}

There are already distinct configs for the s3 and DynamoDB endpoints, btw, so I 
didn't think that would be any more confusing. I've done this a lot in testing 
and had some requests for that use case in this feature: if users are going to 
take advantage of HADOOP-13876, etc. and use a single table for multiple 
buckets, perhaps one of the buckets occasionally used will be in a different 
region. Not a great idea for that to happen all the time for cost / 
performance, but there's no reason it shouldn't work. And it's easier for AWS 
users to specify a region than an endpoint.

{quote}Unix convention is to use -- for long options, so if an app supports 
single-char options, you can combine them: tar -xvf == tar -x -v -f. So use 
"--" as the prefix here.{quote}

Sounds good - I think we should only have the long options, since none of the 
options so far are combinable anyway - they all take parameters.
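
For concreteness, the split described above would look something like this 
from the client side (a sketch; {{fs.s3a.s3guard.ddb.region}} is the key 
proposed in this patch, and its final name may differ):

{code}
import org.apache.hadoop.conf.Configuration;

public class S3GuardRegionSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // existing, distinct endpoint configs for the s3 and DDB clients
    conf.set("fs.s3a.endpoint", "s3.eu-west-1.amazonaws.com");
    conf.set("fs.s3a.s3guard.ddb.endpoint",
        "dynamodb.us-west-2.amazonaws.com");
    // proposed: AWS users name a DDB region instead of an endpoint,
    // e.g. for a single table shared by buckets in different regions
    conf.set("fs.s3a.s3guard.ddb.region", "us-west-2");
  }
}
{code}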

> Allow users to specify region for DynamoDB table instead of endpoint
> 
>
> Key: HADOOP-14090
> URL: https://issues.apache.org/jira/browse/HADOOP-14090
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
> Attachments: HADOOP-14090-HADOOP-13345.001.patch
>
>
> Assuming the AWS SDK allows this, I think this would be a better way to 
> configure it for any usage on AWS itself (with endpoint still being an option 
> for AWS-compatible non-AWS use cases). Unless users actually care about a 
> specific endpoint, this is easier. Perhaps less important, HADOOP-14023 shows 
> that inferring the region from the endpoint (which granted, isn't that 
> necessary) doesn't work very well at all.






[jira] [Created] (HADOOP-14094) Rethink S3GuardTool options

2017-02-17 Thread Sean Mackrory (JIRA)
Sean Mackrory created HADOOP-14094:
--

 Summary: Rethink S3GuardTool options
 Key: HADOOP-14094
 URL: https://issues.apache.org/jira/browse/HADOOP-14094
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sean Mackrory


I think we need to rework the S3GuardTool options. Here are a couple of 
problems I've observed in the patches I've built on top of them and in 
watching other developers try them out:

* We should probably wrap the current commands in an S3Guard-specific command, 
since 'init', 'destroy', etc. don't touch the buckets at all.
* Convert to whole-word options, as the single-letter options are already 
getting overloaded. Some patches I've submitted have added functionality where 
the obvious flag is already in use (e.g. -r for both region and read 
throughput, and -m for both minutes and the metadatastore URI). I may do this 
early as part of HADOOP-14090.
* We have some options that must be in the config in some cases, and can be in 
the command in other cases. But I've seen someone try to specify the table name 
in the config and leave out the -m option, with no luck. Also, since commands 
hard-code table auto-creation, you might have configured table auto-creation, 
try to import to a non-existent table, and it tells you table auto-creation is 
off.

We need a more consistent policy for how things should get configured that 
addresses these problems and future-proofs the command a bit more.






[jira] [Commented] (HADOOP-13321) Deprecate FileSystem APIs that promote inefficient call patterns.

2017-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871906#comment-15871906
 ] 

Steve Loughran commented on HADOOP-13321:
-

Well, there are a lot more deprecation warnings coming out, but that's a good 
sign: it shows things are working. Once this is in, we can tune stuff.

That said, I think for the production warnings in swift and s3a, we should add 
{{@SuppressWarnings("deprecation")}} annotations to their isDirectory/isFile 
subclassed methods, so that those ones don't raise false alarms.
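
Per store, that would look roughly like this (a hedged illustration with a 
hypothetical class name, not a verbatim excerpt of any patch):

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical subclass standing in for the swift/s3a filesystems
abstract class StoreFileSystemSketch extends FileSystem {
  @SuppressWarnings("deprecation")
  @Override
  public boolean isDirectory(Path f) throws IOException {
    // the override itself is fine; the suppression silences the
    // deprecation warning inherited from FileSystem#isDirectory
    return getFileStatus(f).isDirectory();
  }
}
{code}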

> Deprecate FileSystem APIs that promote inefficient call patterns.
> -
>
> Key: HADOOP-13321
> URL: https://issues.apache.org/jira/browse/HADOOP-13321
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.0.0-alpha3
>Reporter: Chris Nauroth
>Assignee: Mingliang Liu
> Attachments: HADOOP-13321.000.patch, HADOOP-13321.001.patch, 
> HADOOP-13321.002.patch
>
>
> {{FileSystem}} contains several methods that act as convenience wrappers over 
> calling {{getFileStatus}} and retrieving a single property of the returned 
> {{FileStatus}}.  These methods have a habit of fostering inefficient call 
> patterns in applications, resulting in multiple redundant {{getFileStatus}} 
> calls.  For HDFS, this translates into wasteful NameNode RPC traffic.  For 
> file systems backed by cloud object stores, this translates into wasteful 
> HTTP traffic.  This issue proposes to deprecate these methods and instead 
> encourage applications to call {{getFileStatus}} and then reuse the same 
> {{FileStatus}} instance as needed.
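
As a minimal illustration of the call patterns in question (sketch only; the 
method bodies are hypothetical):

{code}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class CallPatternSketch {
  static void inefficient(FileSystem fs, Path p) throws IOException {
    // three wrapper calls -> up to three getFileStatus round trips
    if (fs.exists(p) && fs.isFile(p) && fs.getFileStatus(p).getLen() > 0) {
      // ...
    }
  }

  static void efficient(FileSystem fs, Path p) throws IOException {
    // one round trip; reuse the FileStatus for every property
    try {
      FileStatus st = fs.getFileStatus(p);
      if (st.isFile() && st.getLen() > 0) {
        // ...
      }
    } catch (FileNotFoundException e) {
      // path does not exist
    }
  }
}
{code}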






[jira] [Commented] (HADOOP-14092) Typo in hadoop-aws index.md

2017-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871905#comment-15871905
 ] 

Steve Loughran commented on HADOOP-14092:
-

patch?

> Typo in hadoop-aws index.md
> ---
>
> Key: HADOOP-14092
> URL: https://issues.apache.org/jira/browse/HADOOP-14092
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.0.0-alpha3
>Reporter: John Zhuge
>Priority: Trivial
>  Labels: newbie
>
> In section {{Testing against different regions}}, {{contract-tests.xml}} 
> should be {{contract-test-options.xml}}.






[jira] [Commented] (HADOOP-14041) CLI command to prune old metadata

2017-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871899#comment-15871899
 ] 

Steve Loughran commented on HADOOP-14041:
-

Why, in {{DynamoDBMetadataStore}} line 584, is only the IOE's .getMessage() 
logged, with no details, and the exception not rethrown?

If the IOEs really are to be swallowed, then it should be a full log.warn. 
Though I think it should actually just throw the IOE up to the caller. Why? For 
tests to show something failed, for management tools calling it directly to 
detect the same, and for CLI tools to report and return an error code. 
Something has gone wrong.

> CLI command to prune old metadata
> -
>
> Key: HADOOP-14041
> URL: https://issues.apache.org/jira/browse/HADOOP-14041
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14041-HADOOP-13345.001.patch, 
> HADOOP-14041-HADOOP-13345.002.patch, HADOOP-14041-HADOOP-13345.003.patch, 
> HADOOP-14041-HADOOP-13345.004.patch, HADOOP-14041-HADOOP-13345.005.patch
>
>
> Add a CLI command that allows users to specify an age at which to prune 
> metadata that hasn't been modified for an extended period of time. Since the 
> primary use-case targeted at the moment is list consistency, it would make 
> sense (especially when authoritative=false) to prune metadata that is 
> expected to have become consistent a long time ago.






[jira] [Commented] (HADOOP-14068) Add integration test version of TestMetadataStore for DynamoDB

2017-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871890#comment-15871890
 ] 

Steve Loughran commented on HADOOP-14068:
-

FWIW, for the s3guard committer I've seen diffs too, at least from clients of 
s3a+s3guard. For testing, it may just be best to switch to the real DDB 
endpoint everywhere. That removes any inconsistency (and the tracking down of 
it, working around it, etc.), and increases the likelihood that test runs will 
catch any transient failures due to network problems, giving us more stack 
traces to deal with.

> Add integration test version of TestMetadataStore for DynamoDB
> --
>
> Key: HADOOP-14068
> URL: https://issues.apache.org/jira/browse/HADOOP-14068
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Attachments: HADOOP-14068-HADOOP-13345.001.patch, 
> HADOOP-14068-HADOOP-13345.002.patch
>
>
> I tweaked TestDynamoDBMetadataStore to run against the actual Amazon DynamoDB 
> service (as opposed to the "local" edition). Several tests failed because of 
> minor variations in behavior. I think the differences that are clearly 
> possible are enough to warrant extending that class as an ITest (but 
> obviously keeping the existing test so 99% of the coverage remains even 
> when not configured for actual DynamoDB usage).






[jira] [Commented] (HADOOP-14090) Allow users to specify region for DynamoDB table instead of endpoint

2017-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871884#comment-15871884
 ] 

Steve Loughran commented on HADOOP-14090:
-

Having separate region/endpoint settings just confuses me. The docs will need 
a paragraph on it. Key thing: the endpoint specifies the S3 endpoint for 
authentication. If I specify a DDB region, what does that mean for the s3a 
bucket? What if they are different...does that just run DDB somewhere else?

other than that:

* Unix convention is to use -- for long options, so that if an app supports 
single-char options you can combine them: {{tar \-xvf == tar \-x \-v \-f}}. So 
use "\-\-" as the prefix here.
* please use shared string constants for "region" and "endpoint" in code & 
tests (see the sketch below).
* -1 to TODOs in code. Better to add a new JIRA item than leave comments in 
the source which will inevitably be forgotten about. JIRA should be where the 
entire TODO list is documented.
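
Something like this (an illustrative sketch; class and constant names are not 
prescriptive):

{code}
// Illustrative sketch of shared option-name constants, so production code,
// CLI parsing and tests all reference the same strings.
public final class S3GuardToolOptions {
  public static final String REGION = "region";     // surfaced as --region
  public static final String ENDPOINT = "endpoint"; // surfaced as --endpoint

  private S3GuardToolOptions() {
  }
}
{code}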

 

> Allow users to specify region for DynamoDB table instead of endpoint
> 
>
> Key: HADOOP-14090
> URL: https://issues.apache.org/jira/browse/HADOOP-14090
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
> Attachments: HADOOP-14090-HADOOP-13345.001.patch
>
>
> Assuming the AWS SDK allows this, I think this would be a better way to 
> configure it for any usage on AWS itself (with endpoint still being an option 
> for AWS-compatible non-AWS use cases). Unless users actually care about a 
> specific endpoint, this is easier. Perhaps less important, HADOOP-14023 shows 
> that inferring the region from the endpoint (which granted, isn't that 
> necessary) doesn't work very well at all.






[jira] [Updated] (HADOOP-13826) S3A Deadlock in multipart copy due to thread pool limits.

2017-02-17 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13826:

Attachment: HADOOP-13206-branch-2-005.patch


Production code LGTM (mostly); test code needed some tweaking, though the core 
test algorithm is good.

Hence: HADOOP-13826 patch 005.

Production code:
* thread pool given a name (see the sketch below)
* imports tweaked
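
For the record, the naming trick is roughly this: a sketch using Guava's 
{{ThreadFactoryBuilder}}, which Hadoop already depends on; the name format 
here is illustrative, not the actual patch.

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import com.google.common.util.concurrent.ThreadFactoryBuilder;

class PoolSketch {
  // Named threads make pool deadlocks far easier to diagnose in stack dumps.
  static ExecutorService newNamedPool(int threads) {
    return Executors.newFixedThreadPool(threads,
        new ThreadFactoryBuilder()
            .setNameFormat("s3a-transfer-%d")
            .setDaemon(true)
            .build());
  }
}
{code}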

Test code:
* moved test to org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps to 
emphasise its scale nature & s3a focus
* all constant strings -> refs off Constants, to aid finding & avoid typos
* timeout logic integrated with the S3AScaleTestBase rules
* moved off bare assertTrue/assertFalse to asserts with meaningful 
messages/error diagnostics (example below)
* various other IDE-suggested cleanups of the test code
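
For example (an illustrative fragment, not taken from the patch):

{code}
// A failing assert should explain itself in the test report,
// rather than a bare assertTrue(condition).
assertEquals("Thread pool still has active tasks: " + pool,
    0, pool.getActiveCount());
{code}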

Tested: S3A Ireland @ scale.

Once Yetus is happy I'll put this in.

> S3A Deadlock in multipart copy due to thread pool limits.
> -
>
> Key: HADOOP-13826
> URL: https://issues.apache.org/jira/browse/HADOOP-13826
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>Priority: Critical
> Attachments: HADOOP-13206-branch-2-005.patch, HADOOP-13826.001.patch, 
> HADOOP-13826.002.patch, HADOOP-13826.003.patch, HADOOP-13826.004.patch
>
>
> In testing HIVE-15093 we have encountered deadlocks in the s3a connector. The 
> TransferManager javadocs 
> (http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html)
>  explain how this is possible:
> {quote}It is not recommended to use a single threaded executor or a thread 
> pool with a bounded work queue as control tasks may submit subtasks that 
> can't complete until all sub tasks complete. Using an incorrectly configured 
> thread pool may cause a deadlock (I.E. the work queue is filled with control 
> tasks that can't finish until subtasks complete but subtasks can't execute 
> because the queue is filled).{quote}






[jira] [Updated] (HADOOP-13826) S3A Deadlock in multipart copy due to thread pool limits.

2017-02-17 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13826:

Status: Open  (was: Patch Available)

> S3A Deadlock in multipart copy due to thread pool limits.
> -
>
> Key: HADOOP-13826
> URL: https://issues.apache.org/jira/browse/HADOOP-13826
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>Priority: Critical
> Attachments: HADOOP-13826.001.patch, HADOOP-13826.002.patch, 
> HADOOP-13826.003.patch, HADOOP-13826.004.patch
>
>
> In testing HIVE-15093 we have encountered deadlocks in the s3a connector. The 
> TransferManager javadocs 
> (http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html)
>  explain how this is possible:
> {quote}It is not recommended to use a single threaded executor or a thread 
> pool with a bounded work queue as control tasks may submit subtasks that 
> can't complete until all sub tasks complete. Using an incorrectly configured 
> thread pool may cause a deadlock (I.E. the work queue is filled with control 
> tasks that can't finish until subtasks complete but subtasks can't execute 
> because the queue is filled).{quote}






[jira] [Commented] (HADOOP-13282) S3 blob etags to be made visible in status/getFileChecksum() calls

2017-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871726#comment-15871726
 ] 

Steve Loughran commented on HADOOP-13282:
-

for encryption, we'd have to return some {{FileChecksum}} instance which was 
either random or null: some way to warn distcp that it shouldn't expect the 
etags of encrypted files to be consistent. Having a different value depending 
on the # of parts is even more complex. I think we should leave this alone 
for now.
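
If we ever did expose it, a minimal sketch might look like the following; the 
class name and wire format are illustrative, not a committed design:

{code}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.fs.FileChecksum;

// Illustrative: surface the object's etag as a FileChecksum.
public class EtagChecksum extends FileChecksum {
  private byte[] etag = new byte[0];

  public EtagChecksum() {
  }

  public EtagChecksum(String etag) {
    this.etag = etag.getBytes(StandardCharsets.UTF_8);
  }

  @Override
  public String getAlgorithmName() {
    return "etag";
  }

  @Override
  public int getLength() {
    return etag.length;
  }

  @Override
  public byte[] getBytes() {
    return etag;
  }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeInt(etag.length);
    out.write(etag);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    etag = new byte[in.readInt()];
    in.readFully(etag);
  }
}
{code}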

that said, being able to know the # of parts could be vaguely useful when 
partitioning files, though without block lengths it's not that useful, and 
probably a distraction to work on. You would never get the big speedup which 
comes from scheduling work on the same host as the data, just the smaller 
speedup which could come from using a different block off the S3 filestore, 
and so potentially less contention for the same data.

> S3 blob etags to be made visible in status/getFileChecksum() calls
> --
>
> Key: HADOOP-13282
> URL: https://issues.apache.org/jira/browse/HADOOP-13282
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Priority: Minor
>
> If the etags of blobs were exported via {{getFileChecksum()}}, it'd be 
> possible to probe for a blob being in sync with a local file. Distcp could 
> use this to decide whether to skip a file or not.
> Now, there's a problem there: distcp needs source and dest filesystems to 
> implement the same algorithm. It'd only work out the box if you were copying 
> between S3 instances. There are also quirks with encryption and multipart: 
> [s3 
> docs|http://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html].
>  At the very least, it's something which could be used when indexing the FS, 
> to check for changes later.






[jira] [Commented] (HADOOP-14081) S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)

2017-02-17 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871711#comment-15871711
 ] 

Rajesh Balamohan commented on HADOOP-14081:
---

Thanks [~ste...@apache.org].  Here are the test results (region: S3 bucket in 
US East; tests were run from my laptop). Errors are due to socket timeouts 
(180 seconds). Checked ITestS3AContractGetFileStatus.teardown, which was 
again due to a socket timeout.

{noformat}
Results :

Tests in error:
  
ITestS3ContractOpen>AbstractFSContractTestBase.setup:193->AbstractFSContractTestBase.mkdirs:338
 » SocketTimeout
  
ITestS3AContractGetFileStatus.teardown:40->AbstractFSContractTestBase.teardown:204->AbstractFSContractTestBase.deleteTestDirInTeardown:213
 »
  
ITestS3AContractRootDir>AbstractContractRootDirectoryTest.testRmEmptyRootDirNonRecursive:116
 » PathIO
  
ITestS3NContractOpen>AbstractFSContractTestBase.setup:193->AbstractFSContractTestBase.mkdirs:338
 » SocketTimeout

Tests run: 454, Failures: 0, Errors: 4, Skipped: 56

..
..
[INFO] Total time: 02:11 h 
{noformat}

> S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
> --
>
> Key: HADOOP-14081
> URL: https://issues.apache.org/jira/browse/HADOOP-14081
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HADOOP-14081.001.patch
>
>
> In {{S3ADataBlocks::ByteArrayBlock}}, data is copied whenever {{startUpload}} 
> is called. It might be possible to directly access the byte[] array from 
> ByteArrayOutputStream. 
> We might have to extend ByteArrayOutputStream and add a method like 
> getInputStream() which returns a ByteArrayInputStream. This would avoid an 
> expensive array copy during large uploads.
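
For reference, the no-copy trick the description hints at can be sketched like 
this (class and method names are illustrative; {{buf}} and {{count}} are the 
protected fields of {{ByteArrayOutputStream}}):

{code}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

// Illustrative sketch: expose the internal buffer without copying it.
class DirectByteArrayOutputStream extends ByteArrayOutputStream {

  DirectByteArrayOutputStream(int size) {
    super(size);
  }

  /** InputStream over the internal buffer, with no array copy. */
  synchronized InputStream getInputStream() {
    // buf and count are protected in ByteArrayOutputStream, so a subclass
    // can wrap them directly instead of calling toByteArray().
    return new ByteArrayInputStream(buf, 0, count);
  }
}
{code}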






[jira] [Created] (HADOOP-14093) add tests/docs for HAR files on s3a

2017-02-17 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-14093:
---

 Summary: add tests/docs for HAR files on s3a
 Key: HADOOP-14093
 URL: https://issues.apache.org/jira/browse/HADOOP-14093
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: documentation, fs/s3, test
Affects Versions: 2.8.0
Reporter: Steve Loughran
Priority: Minor


Because many small files are such a performance killer for S3, we could think 
about encouraging people to use HAR files where they need to. Which means: 
tests and docs for using HAR + S3A.
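
Roughly the usage the tests/docs would need to cover; the paths here are 
illustrative, and the exact har:// URI mapping for s3a is one of the things 
the docs would have to pin down:

{noformat}
# build an archive from many small files under one parent directory
hadoop archive -archiveName logs.har -p s3a://bucket/input dir1 dir2 s3a://bucket/archived
# read it back through the har filesystem
hadoop fs -ls -R har://s3a-bucket/archived/logs.har
{noformat}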






[jira] [Commented] (HADOOP-14040) Use shaded aws-sdk uber-JAR 1.11.86

2017-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871671#comment-15871671
 ] 

Steve Loughran commented on HADOOP-14040:
-

thanks for this; it'll help reduce pain downstream.

> Use shaded aws-sdk uber-JAR 1.11.86
> ---
>
> Key: HADOOP-14040
> URL: https://issues.apache.org/jira/browse/HADOOP-14040
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14040.001.patch, HADOOP-14040-branch-2-001.patch, 
> HADOOP-14040-branch-2.002.patch, HADOOP-14040-HADOOP-13345.001.patch
>
>
> AWS SDK now has a (v. large) uberjar shading all dependencies.
> This ensures that AWS dependency changes (e.g. json) don't cause problems 
> downstream in things like HBase, enabling backporting if desired.
> This will let us address the org.json "don't be evil" problem: this SDK 
> version doesn't have those files.
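
Downstream, picking up the shaded SDK then becomes a single dependency. A 
sketch, assuming the {{aws-java-sdk-bundle}} coordinates (version from this 
JIRA's title):

{code}
<!-- Sketch: the single shaded SDK artifact; coordinates assumed,
     version taken from this JIRA's title. -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-bundle</artifactId>
  <version>1.11.86</version>
</dependency>
{code}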






[jira] [Commented] (HADOOP-13345) S3Guard: Improved Consistency for S3A

2017-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871651#comment-15871651
 ] 

Steve Loughran commented on HADOOP-13345:
-

{{ITestS3AEncryptionSSEC}} is a new test from HADOOP-13075; have a look to see 
if it is failing for you on trunk and, if it does, open a JIRA.

Maybe we're just translating the exception more strictly.

I don't think s3guard and credentials in URLs should work together at all; in 
fact, explicitly refusing to work with them could be extra incentive to stop 
using them.

> S3Guard: Improved Consistency for S3A
> -
>
> Key: HADOOP-13345
> URL: https://issues.apache.org/jira/browse/HADOOP-13345
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HADOOP-13345.prototype1.patch, s3c.001.patch, 
> S3C-ConsistentListingonS3-Design.pdf, S3GuardImprovedConsistencyforS3A.pdf, 
> S3GuardImprovedConsistencyforS3AV2.pdf
>
>
> This issue proposes S3Guard, a new feature of S3A, to provide an option for a 
> stronger consistency model than what is currently offered.  The solution 
> coordinates with a strongly consistent external store to resolve 
> inconsistencies caused by the S3 eventual consistency model.






[jira] [Commented] (HADOOP-14088) Java 8 javadoc errors when releasing

2017-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871639#comment-15871639
 ] 

Steve Loughran commented on HADOOP-14088:
-

The patches are all done for Hadoop 2.7+, but not branch-2.6.

Bear in mind that Hadoop 2.6 doesn't, AFAIK, work properly on Java 8; it's 
Kerberos related.

My recommendation: build/release on Java 6 or 7, or pass 
{{-Dmaven.javadoc.skip=true}}.

Looking at BUILDING.txt, we don't mention that standard switch...we ought to.
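
i.e. something like:

{noformat}
mvn package -Pdist -Psrc -Dtar -DskipTests -Dmaven.javadoc.skip=true
{noformat}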

> Java 8 javadoc errors when releasing
> 
>
> Key: HADOOP-14088
> URL: https://issues.apache.org/jira/browse/HADOOP-14088
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.6.5
> Environment: Java 8, CentOS 7
>Reporter: Serhiy Boychenko
>Priority: Minor
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> There are many javadoc errors popping up when releasing Hadoop. After doing 
> some modifications I executed the following command to create a release:
> mvn package -Pdist -Psrc -Dtar -DskipTests
> Many errors concern the usage of self-closed tags; some relate to embedding 
> one tag inside another (which seems to be invalid). There are many HTML 
> errors (incorrectly nested tags, etc.) and problems with generic type 
> parameters in javadoc being treated like tags. Unfortunately I cannot 
> contribute my code since I have made some breaking changes, but I could 
> check out again and try to fix the errors (took me around 8 hours). There 
> are also loads of warnings, but at least those do not block the release.






[jira] [Commented] (HADOOP-13345) S3Guard: Improved Consistency for S3A

2017-02-17 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871450#comment-15871450
 ] 

Mingliang Liu commented on HADOOP-13345:


I propose we merge from trunk again. I have fixed the conflicts, so if you 
vote up, I'll simply push.

{code}
commit a434d50fe4547f32de7b1fafb3c370a7123cda2d
Merge: 8b37b6a96c 02c549484a
Author: Mingliang Liu 
Date:   Thu Feb 16 22:38:55 2017 -0800

Merge branch 'trunk' into HADOOP-13345

After HADOOP-14040, we use shaded aws-sdk uber-JAR so don't have to
bring DynamoDB dependency explicitly. However, for tests we do need the
DynamoDBLocal dependency from its Maven repository.
{code}

I ran the integration tests against us-west-1. Please confirm, as this merge 
is major. Thanks,

{code}
Failed tests:
  ITestS3AEncryptionSSEC.testCreateFileAndReadWithDifferentEncryptionKey:60 
Expected to find 'Forbidden (Service: Amazon S3; Status Code: 403;' but got 
unexpected exception:java.nio.file.AccessDeniedException: 
s3a://mliu-s3guard/test/testCreateFileAndReadWithDifferentEncryptionKey-0800: 
Reopen at position 0 on 
s3a://mliu-s3guard/test/testCreateFileAndReadWithDifferentEncryptionKey-0800: 
com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: 
Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 
4E0A3A7A0B2D8005), S3 Extended Request ID: 
ZKm3w28W57skopifj0wH5p+c8KF1NVzL7ItNG067aK6FNK9dk1kmGrykda/NI4EhtFmN1/bv60c=
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
at 
org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:165)
at 
org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:291)
at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:374)

Tests in error:
  ITestS3ACredentialsInURL.testInstantiateFromURL:86 » InterruptedIO initTable: 
...
  
ITestS3AFileSystemContract>FileSystemContractBaseTest.testRenameToDirWithSamePrefixAllowed:669->FileSystemContractBaseTest.rename:525
 » AWSServiceIO
{code}
For the failing test {{ITestS3AEncryptionSSEC}}, I'm not sure it's caused by 
the merge; {{ITestS3ACredentialsInURL}} is known to be unsupported, as 
credentials in URLs are very unsafe. 
{{ITestS3AFileSystemContract#testRenameToDirWithSamePrefixAllowed}} passed on 
a second run.

> S3Guard: Improved Consistency for S3A
> -
>
> Key: HADOOP-13345
> URL: https://issues.apache.org/jira/browse/HADOOP-13345
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HADOOP-13345.prototype1.patch, s3c.001.patch, 
> S3C-ConsistentListingonS3-Design.pdf, S3GuardImprovedConsistencyforS3A.pdf, 
> S3GuardImprovedConsistencyforS3AV2.pdf
>
>
> This issue proposes S3Guard, a new feature of S3A, to provide an option for a 
> stronger consistency model than what is currently offered.  The solution 
> coordinates with a strongly consistent external store to resolve 
> inconsistencies caused by the S3 eventual consistency model.






[jira] [Commented] (HADOOP-13805) UGI.getCurrentUser() fails if user does not have a keytab associated

2017-02-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871396#comment-15871396
 ] 

Hadoop QA commented on HADOOP-13805:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 58s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 58m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestKDiag |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HADOOP-13805 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12853225/HADOOP-13805.010.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux bdec1aea006f 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 02c5494 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11651/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11651/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/11651/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> UGI.getCurrentUser() fails if user does not have a keytab associated
> 
>
> Key: HADOOP-13805
> URL: https://issues.apache.org/jira/browse/HADOOP-13805
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.8.0, 2.9.0, 3.0.0-

[jira] [Comment Edited] (HADOOP-14088) Java 8 javadoc errors when releasing

2017-02-17 Thread Serhiy Boychenko (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871372#comment-15871372
 ] 

Serhiy Boychenko edited comment on HADOOP-14088 at 2/17/17 7:59 AM:


Unfortunately the changes I made to the Javadoc at the beginning are 
destructive, since I was not initially thinking about the possibility of 
committing the code (and since I am just prototyping, it made no difference 
to me whether the docs were there or not). I think it would be better to 
check out branch-2.6 again and apply the proper changes. I can try to do so 
this weekend (less likely) or next.


was (Author: serhiy):
Unfortunately the changes I have done to Javadoc in the beginning are 
destructive since I was not thinking initially about the possibility of 
committing the code (and since I am just prototyping it made no difference for 
me to have or not to have the docs). I think it would be better to checkout the 
branch-2.6 again and apply the proper changes. I can try to so(less likely) 
this or next weekend.

> Java 8 javadoc errors when releasing
> 
>
> Key: HADOOP-14088
> URL: https://issues.apache.org/jira/browse/HADOOP-14088
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.6.5
> Environment: Java 8, CentOS 7
>Reporter: Serhiy Boychenko
>Priority: Minor
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> There are many javadoc errors popping up when releasing Hadoop. After doing 
> some modifications I executed the following command to create a release:
> mvn package -Pdist -Psrc -Dtar -DskipTests
> Many errors concern the usage of self-closed tags; some relate to embedding 
> one tag inside another (which seems to be invalid). There are many HTML 
> errors (incorrectly nested tags, etc.) and problems with generic type 
> parameters in javadoc being treated like tags. Unfortunately I cannot 
> contribute my code since I have made some breaking changes, but I could 
> check out again and try to fix the errors (took me around 8 hours). There 
> are also loads of warnings, but at least those do not block the release.






[jira] [Commented] (HADOOP-14088) Java 8 javadoc errors when releasing

2017-02-17 Thread Serhiy Boychenko (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871372#comment-15871372
 ] 

Serhiy Boychenko commented on HADOOP-14088:
---

Unfortunately the changes I made to the Javadoc at the beginning are 
destructive, since I was not initially thinking about the possibility of 
committing the code (and since I am just prototyping, it made no difference 
to me whether the docs were there or not). I think it would be better to 
check out branch-2.6 again and apply the proper changes. I can try to do so 
this weekend (less likely) or next.

> Java 8 javadoc errors when releasing
> 
>
> Key: HADOOP-14088
> URL: https://issues.apache.org/jira/browse/HADOOP-14088
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.6.5
> Environment: Java 8, CentOS 7
>Reporter: Serhiy Boychenko
>Priority: Minor
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> There are many javadoc errors popping up when releasing Hadoop. After doing 
> some modifications I executed the following command to create a release:
> mvn package -Pdist -Psrc -Dtar -DskipTests
> Many errors concern the usage of self-closed tags; some relate to embedding 
> one tag inside another (which seems to be invalid). There are many HTML 
> errors (incorrectly nested tags, etc.) and problems with generic type 
> parameters in javadoc being treated like tags. Unfortunately I cannot 
> contribute my code since I have made some breaking changes, but I could 
> check out again and try to fix the errors (took me around 8 hours). There 
> are also loads of warnings, but at least those do not block the release.


