[jira] [Commented] (HADOOP-16085) S3Guard: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803589#comment-16803589
 ] 

Hadoop QA commented on HADOOP-16085:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 22 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 17s{color} | {color:orange} hadoop-tools/hadoop-aws: The patch generated 24 
new + 62 unchanged - 3 fixed = 86 total (was 65) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
30s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/4/artifact/out/Dockerfile
 |
| GITHUB PR | https://github.com/apache/hadoop/pull/646 |
| JIRA Issue | HADOOP-16085 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 98128039d9c4 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 9cd6619 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/4/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
 |
| whitespace | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/4/artifact/out/whitespace-eol.txt
 |
|  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/4/testReport/ |
| Max. process+thread count | 444 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/ha

[GitHub] [hadoop] hadoop-yetus commented on issue #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
hadoop-yetus commented on issue #646: HADOOP-16085: use object version or etags 
to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#issuecomment-477448247
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----:|:----|:----|
   | 0 | reexec | 26 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 22 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 1008 | trunk passed |
   | +1 | compile | 30 | trunk passed |
   | +1 | checkstyle | 24 | trunk passed |
   | +1 | mvnsite | 36 | trunk passed |
   | +1 | shadedclient | 685 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | findbugs | 41 | trunk passed |
   | +1 | javadoc | 20 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | +1 | mvninstall | 29 | the patch passed |
   | +1 | compile | 26 | the patch passed |
   | +1 | javac | 26 | the patch passed |
   | -0 | checkstyle | 17 | hadoop-tools/hadoop-aws: The patch generated 24 new 
+ 62 unchanged - 3 fixed = 86 total (was 65) |
   | +1 | mvnsite | 29 | the patch passed |
   | -1 | whitespace | 0 | The patch has 1 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply |
   | +1 | shadedclient | 687 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | findbugs | 48 | the patch passed |
   | +1 | javadoc | 22 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 270 | hadoop-aws in the patch passed. |
   | +1 | asflicense | 29 | The patch does not generate ASF License warnings. |
   | | | 3096 | |
   
   
   | Subsystem | Report/Notes |
   |----:|:----|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/646 |
   | JIRA Issue | HADOOP-16085 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  |
   | uname | Linux 98128039d9c4 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 9cd6619 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | checkstyle | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/4/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
 |
   | whitespace | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/4/artifact/out/whitespace-eol.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/4/testReport/ |
   | Max. process+thread count | 444 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/4/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] hadoop-yetus commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
hadoop-yetus commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269857280
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -923,6 +923,111 @@ from previous days, and choosing a combination of
 retry counts and an interval which allow for the clients to cope with
 some throttling, but not to time out other applications.
 
+## Read-After-Overwrite Consistency
+
+S3Guard provides read-after-overwrite consistency through ETags (default) or
+object versioning checked either on the server (default) or client. This works
+such that a reader reading a file after an overwrite either sees the new 
version
+of the file or an error. Without S3Guard, new readers may see the original
+version. Once S3 reaches eventual consistency, new readers will see the new
+version.
+
+Readers using S3Guard will usually see the new file version, but may
+in rare cases see `RemoteFileChangedException` instead. This would occur if
+an S3 object read cannot provide the version tracked in S3Guard metadata.
+
+S3Guard achieves this behavior by storing ETags and object version IDs in the
+S3Guard metadata store (e.g. DynamoDB). On opening a file, S3AFileSystem
+will look in S3 for the version of the file indicated by the ETag or object
+version ID stored in the metadata store. If that version is unavailable,
+`RemoteFileChangedException` is thrown. Whether ETag or version ID and 
 
 Review comment:
   whitespace:end of line
   





[GitHub] [hadoop] xiaoyuyao merged pull request #652: HDDS-1346. Remove hard-coded version ozone-0.5.0 from ReadMe of ozone…

2019-03-27 Thread GitBox
xiaoyuyao merged pull request #652: HDDS-1346. Remove hard-coded version 
ozone-0.5.0 from ReadMe of ozone…
URL: https://github.com/apache/hadoop/pull/652
 
 
   





[GitHub] [hadoop] hadoop-yetus commented on issue #648: HDDS-1340. Add List Containers API for Recon

2019-03-27 Thread GitBox
hadoop-yetus commented on issue #648: HDDS-1340. Add List Containers API for 
Recon
URL: https://github.com/apache/hadoop/pull/648#issuecomment-477439946
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----:|:----|:----|
   | 0 | reexec | 26 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 1137 | trunk passed |
   | +1 | compile | 49 | trunk passed |
   | +1 | checkstyle | 15 | trunk passed |
   | +1 | mvnsite | 26 | trunk passed |
   | +1 | shadedclient | 730 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | findbugs | 36 | trunk passed |
   | +1 | javadoc | 21 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | +1 | mvninstall | 38 | the patch passed |
   | +1 | compile | 20 | the patch passed |
   | +1 | javac | 20 | the patch passed |
   | -0 | checkstyle | 13 | hadoop-ozone/ozone-recon: The patch generated 3 new 
+ 0 unchanged - 0 fixed = 3 total (was 0) |
   | +1 | mvnsite | 23 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 848 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | findbugs | 53 | the patch passed |
   | +1 | javadoc | 19 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 34 | ozone-recon in the patch passed. |
   | +1 | asflicense | 27 | The patch does not generate ASF License warnings. |
   | | | 3207 | |
   
   
   | Subsystem | Report/Notes |
   |----:|:----|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-648/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/648 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  |
   | uname | Linux 5257080725b2 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed 
Oct 31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 9cd6619 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | checkstyle | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-648/4/artifact/out/diff-checkstyle-hadoop-ozone_ozone-recon.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-648/4/testReport/ |
   | Max. process+thread count | 311 (vs. ulimit of 5500) |
   | modules | C: hadoop-ozone/ozone-recon U: hadoop-ozone/ozone-recon |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-648/4/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269844283
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestObjectETag.java
 ##
 @@ -0,0 +1,304 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.s3guard;
+
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotNull;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.Mockito.when;
+import static org.mockito.hamcrest.MockitoHamcrest.argThat;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
+import com.amazonaws.services.s3.model.CompleteMultipartUploadResult;
+import com.amazonaws.services.s3.model.GetObjectMetadataRequest;
+import com.amazonaws.services.s3.model.GetObjectRequest;
+import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest;
+import com.amazonaws.services.s3.model.InitiateMultipartUploadResult;
+import com.amazonaws.services.s3.model.ListObjectsV2Request;
+import com.amazonaws.services.s3.model.ListObjectsV2Result;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import com.amazonaws.services.s3.model.PutObjectRequest;
+import com.amazonaws.services.s3.model.PutObjectResult;
+import com.amazonaws.services.s3.model.S3Object;
+import com.amazonaws.services.s3.model.UploadPartRequest;
+import com.amazonaws.services.s3.model.UploadPartResult;
+import java.io.ByteArrayInputStream;
+import java.nio.charset.Charset;
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.AbstractS3AMockTest;
+import org.apache.hadoop.fs.s3a.Constants;
+import org.apache.hadoop.fs.s3a.impl.ChangeDetectionPolicy.Source;
+import org.hamcrest.BaseMatcher;
+import org.hamcrest.Description;
+import org.hamcrest.Matcher;
+import org.junit.Assume;
+import org.junit.Before;
+import org.junit.Test;
+
+/**
+ * Tests to ensure eTag is captured on S3 PUT and used on GET.
+ */
+public class TestObjectETag extends AbstractS3AMockTest {
+  @Before
+  public void before() {
+Assume.assumeTrue("change detection source should be etag",
+fs.getChangeDetectionPolicy().getSource() == Source.ETag);
+  }
+
+  /**
+   * Tests a file uploaded with a single PUT to ensure eTag is captured and 
used
+   * on file read.
 
 Review comment:
   I think this is the last unaddressed comment right now.  I will tackle it 
tomorrow.





[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269843035
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -923,6 +923,102 @@ from previous days, and choosing a combination of
 retry counts and an interval which allow for the clients to cope with
 some throttling, but not to time out other applications.
 
+## Read-After-Overwrite Consistency
+
+S3Guard provides read-after-overwrite consistency through ETags (default) or
+object versioning. This works such that a reader reading a file after an
+overwrite either sees the new version of the file or an error. Without S3Guard,
+new readers may see the original version. Once S3 reaches eventual consistency,
+new readers will see the new version.
+
+Readers using S3Guard will usually see the new file version, but may
+in rare cases see `RemoteFileChangedException` instead. This would occur if
+an S3 object read cannot provide the version tracked in S3Guard metadata.
+
+S3Guard achieves this behavior by storing ETags and object version IDs in the
+S3Guard metadata store (e.g. DynamoDB). On opening a file, S3AFileSystem
+will look in S3 for the version of the file indicated by the ETag or object
+version ID stored in the metadata store. If that version is unavailable,
+`RemoteFileChangedException` is thrown. Whether ETag or version ID is used is
+determined by the
+[fs.s3a.change.detection configuration 
options](./index.html#Handling_Read-During-Overwrite).
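+
+As an illustration (editor's sketch, using the defaults described above:
+ETag as the source, checked on the server), these options would be set in
+`core-site.xml` roughly as follows:
+
+```xml
+<property>
+  <name>fs.s3a.change.detection.source</name>
+  <!-- "etag" (default) or "versionid" -->
+  <value>etag</value>
+</property>
+<property>
+  <name>fs.s3a.change.detection.mode</name>
+  <!-- "server" (default) or "client" -->
+  <value>server</value>
+</property>
+```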
+
+### No Versioning Metadata Available
+
+When the first S3AFileSystem clients are upgraded to the version of
+S3AFileSystem that contains these change tracking features, any existing
+S3Guard metadata will not contain ETags or object version IDs.  Reads of files
+tracked in such S3Guard metadata will access whatever version of the file is
+available in S3 at the time of read.  Only after the file is subsequently
+updated will S3Guard start tracking ETag and object version ID, and only then
+generate `RemoteFileChangedException` if an inconsistency is detected.
+
+Similarly, when S3Guard metadata is pruned, S3Guard will no longer be able to
+detect an inconsistent read.  S3Guard metadata should be retained for at least
+as long as the perceived read-after-overwrite eventual consistency window.
+That window is expected to be short, but there are no guarantees so it is at 
the
+administrator's discretion to weigh the risk.
+
+### Known Limitations
+
+ S3 Select
+
+S3 Select does not provide a capability for server-side ETag or object
+version ID qualification. Whether fs.s3a.change.detection.mode is client or
+server, S3Guard will cause a client-side check of the file version before
+opening the file with S3 Select.  If the current version does not match the
+version tracked in S3Guard, `RemoteFileChangedException` is thrown.
+
+It is still possible that the S3 Select read will access a different version of
+the file, if the visible file version changes between the version check and
+the opening of the file.  This can happen due to eventual consistency or
+an overwrite of the file between the version check and the open of the file.
+
+ Rename
+
+Rename is implemented via copy in S3.  With 
fs.s3a.change.detection.mode=client,
+a fully reliable mechanism for ensuring the copied content is the expected
+content is not possible. This is because there isn't necessarily a way
+to know the expected ETag or version ID to appear on the object resulting from
+the copy.
+
+Furthermore, if fs.s3a.change.detection.mode=server and a third-party S3
+implementation is used that doesn't honor the provided ETag or version ID,
+S3AFileSystem and S3Guard cannot detect it.
+
+With either fs.s3a.change.detection.mode=server or client, a client-side check
+will be performed before the copy to ensure the current version of the file
+matches S3Guard metadata.  If not, `RemoteFileChangedException` is thrown.
+As with S3 Select, this is not sufficient to
+guarantee that the same version is the version copied.
+
+When fs.s3a.change.detection.mode=server, the expected version is also specified
+in the underlying S3 CopyObjectRequest.  As long as the server honors it, the
+copied object will be correct.
+
+All this said, with the defaults of fs.s3a.change.detection.mode=server and
+fs.s3a.change.detection.source=etag against Amazon's S3, copy should in fact
+either copy the expected file version or, in the case of an eventual
+consistency anomaly, generate `RemoteFileChangedException`.  The same should
+be true with fs.s3a.change.detection.source=versionid.
+
+ Out of Sync Metadata
+
+The S3Guard version tracking metadata (ETag or object version ID) could become
+out of sync with the true current object metadata in S3.  For example, S3Guard
+is still tracking v1 of some f

[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269842868
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -923,6 +923,102 @@ from previous days, and choosing a combination of
 retry counts and an interval which allow for the clients to cope with
 some throttling, but not to time out other applications.
 
+## Read-After-Overwrite Consistency
+
+S3Guard provides read-after-overwrite consistency through ETags (default) or
 
 Review comment:
   Updated





[GitHub] [hadoop] hadoop-yetus commented on issue #652: HDDS-1346. Remove hard-coded version ozone-0.5.0 from ReadMe of ozone…

2019-03-27 Thread GitBox
hadoop-yetus commented on issue #652: HDDS-1346. Remove hard-coded version 
ozone-0.5.0 from ReadMe of ozone…
URL: https://github.com/apache/hadoop/pull/652#issuecomment-477421325
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----:|:----|:----|
   | 0 | reexec | 52 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 1273 | trunk passed |
   | +1 | mvnsite | 73 | trunk passed |
   | +1 | shadedclient | 2119 | branch has no errors when building and testing 
our client artifacts. |
   ||| _ Patch Compile Tests _ |
   | -1 | mvninstall | 22 | dist in the patch failed. |
   | +1 | mvnsite | 23 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 868 | patch has no errors when building and testing 
our client artifacts. |
   ||| _ Other Tests _ |
   | +1 | asflicense | 27 | The patch does not generate ASF License warnings. |
   | | | 3228 | |
   
   
   | Subsystem | Report/Notes |
   |----:|:----|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-652/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/652 |
   | Optional Tests |  dupname  asflicense  mvnsite  |
   | uname | Linux 0eda072a74c9 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 9cd6619 |
   | maven | version: Apache Maven 3.3.9 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-652/1/artifact/out/patch-mvninstall-hadoop-ozone_dist.txt
 |
   | Max. process+thread count | 305 (vs. ulimit of 5500) |
   | modules | C: hadoop-ozone/dist U: hadoop-ozone/dist |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-652/1/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[GitHub] [hadoop] xiaoyuyao opened a new pull request #652: HDDS-1346. Remove hard-coded version ozone-0.5.0 from ReadMe of ozone…

2019-03-27 Thread GitBox
xiaoyuyao opened a new pull request #652: HDDS-1346. Remove hard-coded version 
ozone-0.5.0 from ReadMe of ozone…
URL: https://github.com/apache/hadoop/pull/652
 
 
   …secure-mr docker-compose. Contributed by Xiaoyu Yao.





[jira] [Commented] (HADOOP-16181) HadoopExecutors shutdown Cleanup

2019-03-27 Thread David Mollitor (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803514#comment-16803514
 ] 

David Mollitor commented on HADOOP-16181:
-

Just for additional context.  Guava has a similar capability.

https://google.github.io/guava/releases/20.0/api/docs/com/google/common/util/concurrent/MoreExecutors.html#shutdownAndAwaitTermination-java.util.concurrent.ExecutorService-long-java.util.concurrent.TimeUnit-
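The Guava helper linked above implements a two-phase shutdown. A minimal plain-`java.util.concurrent` sketch of that pattern (an editor's illustration of the general technique, not the actual HadoopExecutors code in the attached patches; class and method names are hypothetical):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShutdownDemo {

  // Two-phase shutdown, mirroring the pattern behind Guava's
  // MoreExecutors.shutdownAndAwaitTermination: first a graceful
  // shutdown(), then shutdownNow() if tasks overrun the deadline.
  static boolean shutdownAndAwaitTermination(ExecutorService pool,
      long timeout, TimeUnit unit) {
    pool.shutdown(); // stop accepting new tasks
    try {
      // give running tasks half the timeout to finish on their own
      if (!pool.awaitTermination(timeout / 2, unit)) {
        pool.shutdownNow(); // interrupt lingering tasks
        // give cancelled tasks the remaining time to respond
        pool.awaitTermination(timeout / 2, unit);
      }
    } catch (InterruptedException e) {
      pool.shutdownNow();
      Thread.currentThread().interrupt(); // preserve interrupt status
    }
    return pool.isTerminated();
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    pool.submit(() -> { }); // a trivial task that finishes immediately
    boolean terminated = shutdownAndAwaitTermination(pool, 2, TimeUnit.SECONDS);
    if (!terminated) {
      throw new AssertionError("pool did not terminate");
    }
    System.out.println("terminated=" + terminated);
  }
}
```

Note the pattern logs or returns the outcome rather than throwing from the shutdown path itself, consistent with the "do not log-and-throw" cleanup described in the issue.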

> HadoopExecutors shutdown Cleanup
> 
>
> Key: HADOOP-16181
> URL: https://issues.apache.org/jira/browse/HADOOP-16181
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: util
>Affects Versions: 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Fix For: 3.2.1
>
> Attachments: HADOOP-16181.1.patch, HADOOP-16181.2.patch
>
>
> # Add method description
> # Add additional logging
> # Do not log-and-throw Exception.  Anti-pattern.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[GitHub] [hadoop] xiaoyuyao commented on issue #641: HDDS-1318. Fix MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.

2019-03-27 Thread GitBox
xiaoyuyao commented on issue #641: HDDS-1318. Fix 
MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.
URL: https://github.com/apache/hadoop/pull/641#issuecomment-477407720
 
 
   The single test failure does not repro locally. Seems unrelated to this 
patch. 





[GitHub] [hadoop] hadoop-yetus commented on issue #651: HDDS-1339. Implement ratis snapshots on OM

2019-03-27 Thread GitBox
hadoop-yetus commented on issue #651: HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-477404785
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 97 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 22 | Maven dependency ordering for branch |
   | +1 | mvninstall | 970 | trunk passed |
   | +1 | compile | 959 | trunk passed |
   | +1 | checkstyle | 228 | trunk passed |
   | +1 | mvnsite | 200 | trunk passed |
   | +1 | shadedclient | 1158 | branch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 197 | trunk passed |
   | +1 | javadoc | 151 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 23 | Maven dependency ordering for patch |
   | -1 | mvninstall | 22 | integration-test in the patch failed. |
   | -1 | mvninstall | 21 | ozone-manager in the patch failed. |
   | +1 | compile | 996 | the patch passed |
   | +1 | javac | 996 | the patch passed |
   | +1 | checkstyle | 236 | the patch passed |
   | -1 | mvnsite | 33 | integration-test in the patch failed. |
   | -1 | mvnsite | 31 | ozone-manager in the patch failed. |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | xml | 1 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 621 | patch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | -1 | findbugs | 26 | ozone-manager in the patch failed. |
   | +1 | javadoc | 123 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 67 | common in the patch passed. |
   | +1 | unit | 40 | common in the patch passed. |
   | -1 | unit | 32 | integration-test in the patch failed. |
   | -1 | unit | 30 | ozone-manager in the patch failed. |
   | +1 | asflicense | 38 | The patch does not generate ASF License warnings. |
   | | | 6407 | |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/651 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
   | uname | Linux 06fe1375ad45 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 9cd6619 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-mvninstall-hadoop-ozone_integration-test.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-mvninstall-hadoop-ozone_ozone-manager.txt
 |
   | mvnsite | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-mvnsite-hadoop-ozone_integration-test.txt
 |
   | mvnsite | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-mvnsite-hadoop-ozone_ozone-manager.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-findbugs-hadoop-ozone_ozone-manager.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-unit-hadoop-ozone_integration-test.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-unit-hadoop-ozone_ozone-manager.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/testReport/ |
   | Max. process+thread count | 411 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-ozone/common 
hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



--

[GitHub] [hadoop] hadoop-yetus commented on issue #641: HDDS-1318. Fix MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.

2019-03-27 Thread GitBox
hadoop-yetus commented on issue #641: HDDS-1318. Fix 
MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.
URL: https://github.com/apache/hadoop/pull/641#issuecomment-477399363
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 28 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 10 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 28 | Maven dependency ordering for branch |
   | +1 | mvninstall | 994 | trunk passed |
   | +1 | compile | 957 | trunk passed |
   | +1 | checkstyle | 193 | trunk passed |
   | +1 | mvnsite | 170 | trunk passed |
   | +1 | shadedclient | 1092 | branch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 125 | trunk passed |
   | +1 | javadoc | 117 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 23 | Maven dependency ordering for patch |
   | +1 | mvninstall | 100 | the patch passed |
   | +1 | compile | 904 | the patch passed |
   | +1 | javac | 904 | the patch passed |
   | +1 | checkstyle | 231 | the patch passed |
   | +1 | mvnsite | 150 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 665 | patch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 137 | the patch passed |
   | +1 | javadoc | 116 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 36 | client in the patch passed. |
   | +1 | unit | 77 | common in the patch passed. |
   | -1 | unit | 653 | integration-test in the patch failed. |
   | +1 | asflicense | 52 | The patch does not generate ASF License warnings. |
   | | | 6767 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.ozone.ozShell.TestOzoneShell |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-641/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/641 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  |
   | uname | Linux 21d34faea4eb 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 9cd6619 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-641/3/artifact/out/patch-unit-hadoop-ozone_integration-test.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-641/3/testReport/ |
   | Max. process+thread count | 4532 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/client hadoop-hdds/common 
hadoop-ozone/integration-test U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-641/3/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[GitHub] [hadoop] hadoop-yetus commented on issue #648: HDDS-1340. Add List Containers API for Recon

2019-03-27 Thread GitBox
hadoop-yetus commented on issue #648: HDDS-1340. Add List Containers API for 
Recon
URL: https://github.com/apache/hadoop/pull/648#issuecomment-477395623
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 26 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 982 | trunk passed |
   | +1 | compile | 27 | trunk passed |
   | +1 | checkstyle | 18 | trunk passed |
   | +1 | mvnsite | 28 | trunk passed |
   | +1 | shadedclient | 704 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | findbugs | 58 | trunk passed |
   | +1 | javadoc | 23 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | +1 | mvninstall | 30 | the patch passed |
   | +1 | compile | 21 | the patch passed |
   | +1 | javac | 21 | the patch passed |
   | -0 | checkstyle | 14 | hadoop-ozone/ozone-recon: The patch generated 3 new 
+ 0 unchanged - 0 fixed = 3 total (was 0) |
   | +1 | mvnsite | 24 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 734 | patch has no errors when building and testing 
our client artifacts. |
   | -1 | findbugs | 38 | hadoop-ozone/ozone-recon generated 2 new + 0 
unchanged - 0 fixed = 2 total (was 0) |
   | +1 | javadoc | 19 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 29 | ozone-recon in the patch passed. |
   | +1 | asflicense | 26 | The patch does not generate ASF License warnings. |
   | | | 2899 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | FindBugs | module:hadoop-ozone/ozone-recon |
   |  |  Primitive value is boxed and then immediately unboxed in 
org.apache.hadoop.ozone.recon.spi.impl.ContainerDBServiceProviderImpl.getContainers()
  At ContainerDBServiceProviderImpl.java:then immediately unboxed in 
org.apache.hadoop.ozone.recon.spi.impl.ContainerDBServiceProviderImpl.getContainers()
  At ContainerDBServiceProviderImpl.java:[line 190] |
   |  |  
org.apache.hadoop.ozone.recon.spi.impl.ContainerDBServiceProviderImpl.getContainers()
 invokes inefficient new Long(long) constructor; use Long.valueOf(long) instead 
 At ContainerDBServiceProviderImpl.java:constructor; use Long.valueOf(long) 
instead  At ContainerDBServiceProviderImpl.java:[line 190] |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-648/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/648 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  |
   | uname | Linux b003b1a1e32d 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 9cd6619 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | checkstyle | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-648/3/artifact/out/diff-checkstyle-hadoop-ozone_ozone-recon.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-648/3/artifact/out/new-findbugs-hadoop-ozone_ozone-recon.html
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-648/3/testReport/ |
   | Max. process+thread count | 411 (vs. ulimit of 5500) |
   | modules | C: hadoop-ozone/ozone-recon U: hadoop-ozone/ozone-recon |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-648/3/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[jira] [Commented] (HADOOP-16156) [Clean-up] Remove NULL check before instanceof and fix checkstyle in InnerNodeImpl

2019-03-27 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803472#comment-16803472
 ] 

Daniel Templeton commented on HADOOP-16156:
---

[~daryn], that's true.  I think the intent here is to clean up latent 
checkstyle issues that popped up while trying to remove the extraneous null 
check.  Point taken, though.

[~shwetayakkali], what I had in mind with my last point above was something 
like:

{code}if ((childnode == null) || (path.length == 1)) {
  return childnode;
} else if (childnode instanceof InnerNode) {
  return ((InnerNode)childnode).getLoc(path[1]);
} else {
  return null;
}{code}

> [Clean-up] Remove NULL check before instanceof and fix checkstyle in 
> InnerNodeImpl
> --
>
> Key: HADOOP-16156
> URL: https://issues.apache.org/jira/browse/HADOOP-16156
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Shweta
>Assignee: Shweta
>Priority: Minor
> Attachments: HADOOP-16156.001.patch, HADOOP-16156.002.patch, 
> HADOOP-16156.003.patch
>
>







[jira] [Commented] (HADOOP-16199) KMSLoadBlanceClientProvider does not select token correctly

2019-03-27 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803459#comment-16803459
 ] 

Xiaoyu Yao commented on HADOOP-16199:
-

{quote}The added test is almost the same as 
testTokenServiceCreationWithUriFormat, added in HADOOP-15997, except that it 
configured key provider explicitly.
{quote}
Yes. That's a valid client configuration: the client simply downloaded its 
configuration from an Ambari/CM-managed cluster, where 
hadoop.security.key.provider.path=kms://http@kms1;kms2:9600/kms
{quote}After HADOOP-14445, if configuring the KMS provider path explicitly for 
the client, the expected behavior is: the client gets a KMS delegation token 
whose credential alias is one of the (randomly selected) KMS URIs.
{quote}
The following code in LoadBalancingKMSClientProvider#getDelegationToken was 
added by HADOOP-14445 to set the token service field to the KMS URI so that the 
token can be used across all instances. Per KMSUtil#createKeyProvider and 
HdfsKMSUtil#createKeyProvider, the URI configured above will be the one set into 
the token service field by LoadBalancingKMSClientProvider.

{code}
public Token<?> getDelegationToken(final String renewer) throws IOException {
  return doOp(new ProviderCallable<Token<?>>() {
    @Override
    public Token<?> call(KMSClientProvider provider) throws IOException {
      Token<?> token = provider.getDelegationToken(renewer);
      // override the sub-provider's service with our own so it can be used
      // across all providers.
      token.setService(dtService);
      LOG.debug("New token service set. Token: ({})", token);
      return token;
    }
  });
}
{code}

 

> KMSLoadBlanceClientProvider does not select token correctly
> ---
>
> Key: HADOOP-16199
> URL: https://issues.apache.org/jira/browse/HADOOP-16199
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.2
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
>  Labels: kms
>
> After HADOOP-14445 and HADOOP-15997, there are still cases where 
> KMSLoadBlanceClientProvider does not select token correctly. 
> Here is the use case:
> The new configuration key 
> hadoop.security.kms.client.token.use.uri.format=true is set cross all the 
> cluster, including both Submitter and Yarn RM(renewer), which is not covered 
> in the test matrix in this [HADOOP-14445 
> comment|https://issues.apache.org/jira/browse/HADOOP-14445?focusedCommentId=16505761&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16505761].
> I will post the debug log and the proposed fix shortly, cc: [~xiaochen] and 
> [~jojochuang].






[GitHub] [hadoop] hanishakoneru opened a new pull request #651: HDDS-1339. Implement ratis snapshots on OM

2019-03-27 Thread GitBox
hanishakoneru opened a new pull request #651: HDDS-1339. Implement ratis 
snapshots on OM
URL: https://github.com/apache/hadoop/pull/651
 
 
   





[jira] [Commented] (HADOOP-16214) Kerberos name implementation in Hadoop does not accept principals with more than two components

2019-03-27 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803432#comment-16803432
 ] 

Steve Loughran commented on HADOOP-16214:
-

I am not going near anything related to Kerberos names. I am scared of this 
code and, more importantly, scared of the consequences of getting this stuff wrong.

> Kerberos name implementation in Hadoop does not accept principals with more 
> than two components
> ---
>
> Key: HADOOP-16214
> URL: https://issues.apache.org/jira/browse/HADOOP-16214
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: auth
>Reporter: Issac Buenrostro
>Priority: Major
>
> org.apache.hadoop.security.authentication.util.KerberosName is in charge of 
> converting a Kerberos principal to a user name in Hadoop for all of the 
> services requiring authentication.
> Although the Kerberos spec 
> ([https://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html])
>  allows for an arbitrary number of components in the principal, the Hadoop 
> implementation will throw a "Malformed Kerberos name:" error if the principal 
> has more than two components (because the regex can only read serviceName and 
> hostName).
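
For illustration only (this is not the KerberosName implementation): a Kerberos principal is a series of '/'-separated components with an optional '@REALM' suffix, and the spec places no limit on the number of components, which is exactly what the two-component regex fails to accommodate.

```java
// Hypothetical helper showing principal structure; the real
// org.apache.hadoop.security.authentication.util.KerberosName uses a
// regex that only accepts one or two components before the realm.
public class PrincipalParts {

    // Split "comp1/comp2/.../compN@REALM" into its name components,
    // discarding the realm. A spec-compliant parser must accept any N.
    static String[] components(String principal) {
        String namePart = principal.split("@", 2)[0]; // strip the realm
        return namePart.split("/");
    }

    public static void main(String[] args) {
        // Three components -- valid per the Kerberos spec, but rejected
        // by Hadoop's current two-component regex.
        String p = "nn/host1.example.com/extra@EXAMPLE.COM";
        System.out.println(components(p).length);
    }
}
```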






[jira] [Work started] (HADOOP-15984) Update jersey from 1.19 to 2.x

2019-03-27 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-15984 started by Akira Ajisaka.
--
> Update jersey from 1.19 to 2.x
> --
>
> Key: HADOOP-15984
> URL: https://issues.apache.org/jira/browse/HADOOP-15984
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Critical
>
> jersey-json 1.19 depends on Jackson 1.9.2. Let's upgrade.






[jira] [Assigned] (HADOOP-15984) Update jersey from 1.19 to 2.x

2019-03-27 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka reassigned HADOOP-15984:
--

Assignee: Akira Ajisaka

> Update jersey from 1.19 to 2.x
> --
>
> Key: HADOOP-15984
> URL: https://issues.apache.org/jira/browse/HADOOP-15984
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Critical
>
> jersey-json 1.19 depends on Jackson 1.9.2. Let's upgrade.






[GitHub] [hadoop] xiaoyuyao commented on issue #641: HDDS-1318. Fix MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.

2019-03-27 Thread GitBox
xiaoyuyao commented on issue #641: HDDS-1318. Fix 
MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.
URL: https://github.com/apache/hadoop/pull/641#issuecomment-477369914
 
 
   The current pattern for exporting and importing the traceID over the wire 
(gRPC/Ratis) is via the TraceID in the protocol message. But this changes the 
original traceID, which the tests expect to stay the same; that assumption no 
longer holds with the new Jaeger-based tracing, where the parent:children:... 
segments are appended along the invocation chain. Will remove the invalid 
verification from those tests, cc: @elek .
   





[GitHub] [hadoop] hadoop-yetus commented on issue #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
hadoop-yetus commented on issue #646: HADOOP-16085: use object version or etags 
to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#issuecomment-477365131
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 23 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 21 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 1061 | trunk passed |
   | +1 | compile | 31 | trunk passed |
   | +1 | checkstyle | 22 | trunk passed |
   | +1 | mvnsite | 34 | trunk passed |
   | +1 | shadedclient | 755 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | findbugs | 51 | trunk passed |
   | +1 | javadoc | 22 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | +1 | mvninstall | 33 | the patch passed |
   | +1 | compile | 31 | the patch passed |
   | +1 | javac | 31 | the patch passed |
   | -0 | checkstyle | 19 | hadoop-tools/hadoop-aws: The patch generated 23 new 
+ 57 unchanged - 3 fixed = 80 total (was 60) |
   | +1 | mvnsite | 35 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 846 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | findbugs | 54 | the patch passed |
   | +1 | javadoc | 21 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 270 | hadoop-aws in the patch passed. |
   | +1 | asflicense | 25 | The patch does not generate ASF License warnings. |
   | | | 3427 | |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/646 |
   | JIRA Issue | HADOOP-16085 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  |
   | uname | Linux 2d88dc5cb77e 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed 
Oct 31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 9cd6619 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | checkstyle | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/3/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/3/testReport/ |
   | Max. process+thread count | 340 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/3/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[jira] [Commented] (HADOOP-16085) S3Guard: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803398#comment-16803398
 ] 

Hadoop QA commented on HADOOP-16085:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 21 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 19s{color} | {color:orange} hadoop-tools/hadoop-aws: The patch generated 23 
new + 57 unchanged - 3 fixed = 80 total (was 60) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
30s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 57m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/3/artifact/out/Dockerfile
 |
| GITHUB PR | https://github.com/apache/hadoop/pull/646 |
| JIRA Issue | HADOOP-16085 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 2d88dc5cb77e 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 
31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 9cd6619 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/3/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
 |
|  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/3/testReport/ |
| Max. process+thread count | 340 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-646/3/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269789149
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -923,6 +923,102 @@ from previous days, and choosing a combination of
 retry counts and an interval which allow for the clients to cope with
 some throttling, but not to time out other applications.
 
+## Read-After-Overwrite Consistency
+
+S3Guard provides read-after-overwrite consistency through ETags (default) or
+object versioning. This works such that a reader reading a file after an
+overwrite either sees the new version of the file or an error. Without S3Guard,
+new readers may see the original version. Once S3 becomes consistent,
+new readers will see the new version.
+
+Readers using S3Guard will usually see the new file version, but may
+in rare cases see `RemoteFileChangedException` instead. This would occur if
+an S3 object read cannot provide the version tracked in S3Guard metadata.
+
+S3Guard achieves this behavior by storing ETags and object version IDs in the
+S3Guard metadata store (e.g. DynamoDB). On opening a file, S3AFileSystem
+will look in S3 for the version of the file indicated by the ETag or object
+version ID stored in the metadata store. If that version is unavailable,
+`RemoteFileChangedException` is thrown. Whether ETag or version ID is used is
+determined by the
+[fs.s3a.change.detection configuration options](./index.html#Handling_Read-During-Overwrite).
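
For reference, here is a sketch of how those options could be set; the key names follow the `fs.s3a.change.detection` configuration referenced above, and the values shown (`etag` source, `server` mode) match the defaults named later in this section.

```xml
<!-- Sketch only: fs.s3a.change.detection options referenced above;
     etag/server are the defaults described later in this section. -->
<property>
  <name>fs.s3a.change.detection.source</name>
  <value>etag</value>
</property>
<property>
  <name>fs.s3a.change.detection.mode</name>
  <value>server</value>
</property>
```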
+
+### No Versioning Metadata Available
+
+When the first S3AFileSystem clients are upgraded to the version of
+S3AFileSystem that contains these change tracking features, any existing
+S3Guard metadata will not contain ETags or object version IDs.  Reads of files
+tracked in such S3Guard metadata will access whatever version of the file is
+available in S3 at the time of read.  Only if the file is subsequently updated
+will S3Guard start tracking ETag and object version ID and, as such, generate
+`RemoteFileChangedException` if an inconsistency is detected.
+
+Similarly, when S3Guard metadata is pruned, S3Guard will no longer be able to
+detect an inconsistent read.  S3Guard metadata should be retained for at least
+as long as the perceived read-after-overwrite eventual consistency window.
+That window is expected to be short, but there are no guarantees so it is at the
+administrator's discretion to weigh the risk.
+
+### Known Limitations
+
+#### S3 Select
+
+S3 Select does not provide a capability for server-side ETag or object
+version ID qualification. Whether `fs.s3a.change.detection.mode` is client or
+server, S3Guard will cause a client-side check of the file version before
+opening the file with S3 Select.  If the current version does not match the
+version tracked in S3Guard, `RemoteFileChangedException` is thrown.
+
+It is still possible that the S3 Select read will access a different version of
+the file, if the visible file version changes between the version check and
+the opening of the file.  This can happen due to eventual consistency or
+an overwrite of the file in that window.
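
To make this concrete, here is a minimal, self-contained sketch of the client-side check described above. The class and method names are illustrative only, not the actual S3AFileSystem internals; it simply compares the ETag currently visible in S3 against the ETag tracked in the S3Guard metadata store.

```java
// Illustrative sketch only; names are hypothetical, not S3AFileSystem internals.
class SelectVersionCheckSketch {

  /** Stand-in for org.apache.hadoop.fs.s3a.RemoteFileChangedException. */
  public static class RemoteFileChanged extends RuntimeException {
    public RemoteFileChanged(String message) { super(message); }
  }

  /**
   * Client-side check performed before opening a file with S3 Select:
   * the ETag currently visible in S3 (e.g. from a HEAD request) must match
   * the ETag tracked in the S3Guard metadata store.
   */
  public static void checkBeforeSelect(String trackedETag, String currentETag) {
    if (trackedETag != null && !trackedETag.equals(currentETag)) {
      throw new RemoteFileChanged("Change reported by S3 before S3 Select: ETag "
          + trackedETag + " was unavailable");
    }
    // Note the race described above: the object can still change between
    // this check and the actual S3 Select read.
  }

  /** Convenience wrapper returning true when the check passes. */
  public static boolean matches(String trackedETag, String currentETag) {
    try {
      checkBeforeSelect(trackedETag, currentETag);
      return true;
    } catch (RemoteFileChanged e) {
      return false;
    }
  }
}
```

As the section notes, passing this check does not guarantee the subsequent read sees the same version; it only narrows the window.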
+
+#### Rename
+
+Rename is implemented via copy in S3.  With `fs.s3a.change.detection.mode=client`,
+a fully reliable mechanism for ensuring the copied content is the expected
+content is not possible, since there isn't necessarily a way to know the
+expected ETag or version ID to appear on the object resulting from the copy.
+
+Furthermore, if `fs.s3a.change.detection.mode=server` and a third-party S3
+implementation is used that doesn't honor the provided ETag or version ID,
+S3AFileSystem and S3Guard cannot detect it.
+
+With either `fs.s3a.change.detection.mode=server` or `client`, a client-side
+check will be performed before the copy to ensure the current version of the
+file matches S3Guard metadata.  If not, `RemoteFileChangedException` is thrown.
+As discussed for S3 Select, this is not sufficient to guarantee that the same
+version is the version copied.
+
+When `fs.s3a.change.detection.mode=server`, the expected version is also specified
+in the underlying S3 `CopyObjectRequest`.  As long as the server honors it, the
+copied object will be correct.
+
+All this said, with the defaults of `fs.s3a.change.detection.mode=server` and
+`fs.s3a.change.detection.source=etag` against Amazon's S3, copy should in fact
+either copy the expected file version or, in the case of an eventual consistency
+anomaly, generate `RemoteFileChangedException`.  The same should be true with
+`fs.s3a.change.detection.source=versionid`.
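
A rough, self-contained sketch of how a failed server-side precondition surfaces, assuming (as the section implies) the server rejects a non-matching copy with HTTP 412 Precondition Failed; the class names below are illustrative stand-ins, not the real ChangeTracker or AWS SDK types.

```java
// Illustrative stand-ins only; not the real ChangeTracker or AWS SDK classes.
class CopyPreconditionSketch {

  /** Stand-in for RemoteFileChangedException. */
  public static class RemoteFileChanged extends RuntimeException {
    public RemoteFileChanged(String message) { super(message); }
  }

  /** Minimal stand-in for AmazonServiceException: only the status matters here. */
  public static class ServiceError extends RuntimeException {
    private final int statusCode;
    public ServiceError(int statusCode) { this.statusCode = statusCode; }
    public int getStatusCode() { return statusCode; }
  }

  /**
   * When the copy carries an ETag/version ID constraint and the server
   * rejects it with 412 Precondition Failed, surface that as
   * RemoteFileChanged; any other failure propagates unchanged.
   */
  public static void processException(RuntimeException e, String operation) {
    if (e instanceof ServiceError && ((ServiceError) e).getStatusCode() == 412) {
      throw new RemoteFileChanged(operation
          + ": precondition not met, expected object version was unavailable");
    }
    throw e;
  }

  /** Helper for demonstration: does the status map to a change report? */
  public static boolean raisesChange(int statusCode) {
    try {
      processException(new ServiceError(statusCode), "copy");
      return false;
    } catch (RemoteFileChanged c) {
      return true;
    } catch (ServiceError s) {
      return false;
    }
  }
}
```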
+
+#### Out of Sync Metadata
+
+The S3Guard version tracking metadata (ETag or object version ID) could become
+out of sync with the true current object metadata in S3.  For example, S3Guard
+is still tracking v1 of some f

[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269786517
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -1021,6 +1117,41 @@ java.io.IOException: Invalid region specified "iceland-2":
 
 The region specified in `fs.s3a.s3guard.ddb.region` is invalid.
 
+### Error `RemoteFileChangedException`
+
+An exception like the following could occur for a couple of reasons:
+
+* the S3Guard metadata is out of sync with the true S3 metadata.  For
+example, the S3Guard DynamoDB table is tracking a different ETag than the ETag
+shown in the exception.  This may suggest the object was updated in S3 without
+involvement from S3Guard or there was a transient failure when S3Guard tried to
+write to S3.
+
+* S3 is exhibiting read-after-overwrite eventual consistency.  The S3Guard
+metadata was updated with a new ETag during a recent write, but the current read
+is not seeing that ETag due to S3 eventual consistency.  This exception prevents
+the reader from an inconsistent read where the reader sees an older version of
+the file.
+
+```
+org.apache.hadoop.fs.s3a.RemoteFileChangedException: open 's3a://my-bucket/test/file.txt':
+  ETag change reported by S3 while reading at position 0.
+  Version 4e886e26c072fef250cfaf8037675405 was unavailable
 
 Review comment:
   Edit: I'll take out the word "Version" as you suggest so it just reads like this:
   
   Change reported by S3 while reading at position 0. ETag 4e886e26c072fef250cfaf8037675405 was unavailable


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269781017
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -1021,6 +1117,41 @@ java.io.IOException: Invalid region specified "iceland-2":
 
 The region specified in `fs.s3a.s3guard.ddb.region` is invalid.
 
+### Error `RemoteFileChangedException`
+
+An exception like the following could occur for a couple of reasons:
+
+* the S3Guard metadata is out of sync with the true S3 metadata.  For
+example, the S3Guard DynamoDB table is tracking a different ETag than the ETag
+shown in the exception.  This may suggest the object was updated in S3 without
+involvement from S3Guard or there was a transient failure when S3Guard tried to
+write to S3.
+
+* S3 is exhibiting read-after-overwrite eventual consistency.  The S3Guard
+metadata was updated with a new ETag during a recent write, but the current read
+is not seeing that ETag due to S3 eventual consistency.  This exception prevents
+the reader from an inconsistent read where the reader sees an older version of
+the file.
+
+```
+org.apache.hadoop.fs.s3a.RemoteFileChangedException: open 's3a://my-bucket/test/file.txt':
+  ETag change reported by S3 while reading at position 0.
+  Version 4e886e26c072fef250cfaf8037675405 was unavailable
 
 Review comment:
   This currently comes from:
   
   ```
   String.format("%s change "
   + CHANGE_REPORTED_BY_S3
   + " while reading"
   + " at position %s."
   + " Version %s was unavailable",
   getSource(),
   pos,
   getRevisionId()
   ```
   
   The word "Version" is used generically.  I'll restructure like this:
   
   ```
   String.format("Version change "
   + CHANGE_REPORTED_BY_S3
   + " while reading"
   + " at position %s."
   + " %s %s was unavailable",
   pos,
   getSource(),
   getRevisionId()
   ```
   
   Then, this message will read like this:
   
   Version change reported by S3 while reading at position 0. ETag 4e886e26c072fef250cfaf8037675405 was unavailable
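
The restructured format can be checked in isolation. This sketch assumes `CHANGE_REPORTED_BY_S3` expands to "reported by S3", as the rendered example above implies; the class name is hypothetical.

```java
// Self-contained check of the restructured message; CHANGE_REPORTED_BY_S3 is
// assumed to expand to "reported by S3", matching the rendered example above.
class ChangeMessageSketch {
  static final String CHANGE_REPORTED_BY_S3 = "reported by S3";

  static String message(long pos, String source, String revisionId) {
    return String.format("Version change "
        + CHANGE_REPORTED_BY_S3
        + " while reading"
        + " at position %s."
        + " %s %s was unavailable",
        pos, source, revisionId);
  }
}
```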





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269779333
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -1021,6 +1117,41 @@ java.io.IOException: Invalid region specified "iceland-2":
 
 The region specified in `fs.s3a.s3guard.ddb.region` is invalid.
 
+### Error `RemoteFileChangedException`
+
+An exception like the following could occur for a couple of reasons:
+
+* the S3Guard metadata is out of sync with the true S3 metadata.  For
+example, the S3Guard DynamoDB table is tracking a different ETag than the ETag
+shown in the exception.  This may suggest the object was updated in S3 without
+involvement from S3Guard or there was a transient failure when S3Guard tried to
+write to S3.
+
+* S3 is exhibiting read-after-overwrite eventual consistency.  The S3Guard
+metadata was updated with a new ETag during a recent write, but the current read
+is not seeing that ETag due to S3 eventual consistency.  This exception prevents
+the reader from an inconsistent read where the reader sees an older version of
+the file.
+
+```
+org.apache.hadoop.fs.s3a.RemoteFileChangedException: open 's3a://my-bucket/test/file.txt':
+  ETag change reported by S3 while reading at position 0.
+  Version 4e886e26c072fef250cfaf8037675405 was unavailable
 
 Review comment:
   Looks like the "ETag" part is variable but "Version" isn't, which could be confusing, especially if versioning is not enabled. I would suggest just removing the word "Version" from this error message.





[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269771702
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/ChangeTracker.java
 ##
 @@ -148,16 +177,77 @@ public void processResponse(final S3Object object,
   }
 }
 
-final ObjectMetadata metadata = object.getObjectMetadata();
+processMetadata(object.getObjectMetadata(), operation, pos);
+  }
+
+  /**
+   * Process the response from the server for validation against the
+   * change policy.
+   * @param copyResult result of a copy operation
+   * @throws PathIOException raised on failure
+   * @throws RemoteFileChangedException if the remote file has changed.
+   */
+  public void processResponse(final CopyResult copyResult)
 
 Review comment:
   Updated





[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269771523
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3LocatedFileStatus.java
 ##
 @@ -0,0 +1,55 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import org.apache.hadoop.fs.BlockLocation;
+import org.apache.hadoop.fs.LocatedFileStatus;
+
+/**
+ * {@link LocatedFileStatus} extended to also carry ETag and object version ID.
+ */
+public class S3LocatedFileStatus extends LocatedFileStatus {
+  private final String eTag;
+  private final String versionId;
+
+  public S3LocatedFileStatus(S3AFileStatus status, BlockLocation[] locations,
+  String eTag, String versionId) {
+super(status, locations);
+this.eTag = eTag;
+this.versionId = versionId;
+  }
+
+  public String getETag() {
+return eTag;
+  }
+
+  public String getVersionId() {
+return versionId;
+  }
+
+  @Override
+  public boolean equals(Object o) {
+return super.equals(o);
 
 Review comment:
   Added a non-javadoc comment to explain this.





[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269771209
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
 ##
 @@ -2859,12 +2893,15 @@ public String getCanonicalServiceName() {
* @param srcKey source object path
* @param dstKey destination object path
* @param size object size
+   * @param srcAttributes S3 attributes of the source object
+   * @param readContext the read context
* @throws AmazonClientException on failures inside the AWS SDK
* @throws InterruptedIOException the operation was interrupted
* @throws IOException Other IO problems
*/
   @Retries.RetryMixed
-  private void copyFile(String srcKey, String dstKey, long size)
+  private CopyResult copyFile(String srcKey, String dstKey, long size,
 
 Review comment:
   updated





[GitHub] [hadoop] xiaoyuyao commented on issue #641: HDDS-1318. Fix MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.

2019-03-27 Thread GitBox
xiaoyuyao commented on issue #641: HDDS-1318. Fix 
MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.
URL: https://github.com/apache/hadoop/pull/641#issuecomment-477348325
 
 
   Some of the test failures are related to this change. I'm looking into it.





[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269771042
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
 ##
 @@ -696,7 +698,7 @@ public S3AInputPolicy getInputPolicy() {
* @return the change detection policy
*/
   @VisibleForTesting
 
 Review comment:
   See updated javadoc; it notes the method is only public to allow access from tests in other packages.





[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269762117
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/ChangeTracker.java
 ##
 @@ -148,16 +177,77 @@ public void processResponse(final S3Object object,
   }
 }
 
-final ObjectMetadata metadata = object.getObjectMetadata();
+processMetadata(object.getObjectMetadata(), operation, pos);
+  }
+
+  /**
+   * Process the response from the server for validation against the
+   * change policy.
+   * @param copyResult result of a copy operation
+   * @throws PathIOException raised on failure
+   * @throws RemoteFileChangedException if the remote file has changed.
+   */
+  public void processResponse(final CopyResult copyResult)
+  throws PathIOException {
+// ETag (sometimes, depending on encryption and/or multipart) is not the
+// same on the copied object as the original.  Version Id seems to never
+// be the same on the copy.  As such, there isn't really anything that
+// can be verified on the response.
+  }
+
+  /**
+   * Process an exception generated against the change policy.
+   * If the exception indicates the file has changed, this method throws
+   * {@code RemoteFileChangedException} with the original exception as the
+   * cause.
+   * @param e the exception
+   * @param operation the operation performed when the exception was
+   * generated.
+   * @throws RemoteFileChangedException if the remote file has changed.
+   */
+  public void processException(Exception e, String operation) throws
+  RemoteFileChangedException {
+if (e instanceof AmazonServiceException) {
+  AmazonServiceException serviceException = (AmazonServiceException) e;
+  if (serviceException.getStatusCode() == 412) {
+versionMismatches.incrementAndGet();
+throw new RemoteFileChangedException(uri, operation, String.format(
+RemoteFileChangedException.PRECONDITIONS_NOT_MET
++ " on %s."
++ " Version %s was unavailable",
+getSource(),
 
 Review comment:
   The only operation string currently going through here is "copy".  It does 
not contain the path.  Source is "etag" or "versionid".  The path is included 
in the exception via the uri provided in the constructor which flows down to 
the base PathIOException.
   
   If the "read" or "select" paths threw 412 or 412-like exceptions they could 
flow through here too, but they do not.  Read returns a null S3Object, which 
flows into processResponse().  Select doesn't directly support ETag or 
versionId qualification so that code path is making a call to processMetadata() 
before the read.





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269757427
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3LocatedFileStatus.java
 ##
 @@ -0,0 +1,55 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import org.apache.hadoop.fs.BlockLocation;
+import org.apache.hadoop.fs.LocatedFileStatus;
+
+/**
+ * {@link LocatedFileStatus} extended to also carry ETag and object version ID.
+ */
+public class S3LocatedFileStatus extends LocatedFileStatus {
+  private final String eTag;
+  private final String versionId;
+
+  public S3LocatedFileStatus(S3AFileStatus status, BlockLocation[] locations,
+  String eTag, String versionId) {
+super(status, locations);
+this.eTag = eTag;
+this.versionId = versionId;
+  }
+
+  public String getETag() {
+return eTag;
+  }
+
+  public String getVersionId() {
+return versionId;
+  }
+
+  @Override
+  public boolean equals(Object o) {
+return super.equals(o);
 
 Review comment:
   sounds good





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269756262
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestObjectETag.java
 ##
 @@ -0,0 +1,304 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.s3guard;
+
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotNull;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.Mockito.when;
+import static org.mockito.hamcrest.MockitoHamcrest.argThat;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
+import com.amazonaws.services.s3.model.CompleteMultipartUploadResult;
+import com.amazonaws.services.s3.model.GetObjectMetadataRequest;
+import com.amazonaws.services.s3.model.GetObjectRequest;
+import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest;
+import com.amazonaws.services.s3.model.InitiateMultipartUploadResult;
+import com.amazonaws.services.s3.model.ListObjectsV2Request;
+import com.amazonaws.services.s3.model.ListObjectsV2Result;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import com.amazonaws.services.s3.model.PutObjectRequest;
+import com.amazonaws.services.s3.model.PutObjectResult;
+import com.amazonaws.services.s3.model.S3Object;
+import com.amazonaws.services.s3.model.UploadPartRequest;
+import com.amazonaws.services.s3.model.UploadPartResult;
+import java.io.ByteArrayInputStream;
+import java.nio.charset.Charset;
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.AbstractS3AMockTest;
+import org.apache.hadoop.fs.s3a.Constants;
+import org.apache.hadoop.fs.s3a.impl.ChangeDetectionPolicy.Source;
+import org.hamcrest.BaseMatcher;
+import org.hamcrest.Description;
+import org.hamcrest.Matcher;
+import org.junit.Assume;
+import org.junit.Before;
+import org.junit.Test;
+
+/**
+ * Tests to ensure eTag is captured on S3 PUT and used on GET.
+ */
+public class TestObjectETag extends AbstractS3AMockTest {
+  @Before
+  public void before() {
+Assume.assumeTrue("change detection source should be etag",
+fs.getChangeDetectionPolicy().getSource() == Source.ETag);
+  }
+
+  /**
+   * Tests a file uploaded with a single PUT to ensure eTag is captured and used
+   * on file read.
 
 Review comment:
   Could you add a test for overwriting an existing file?





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269754782
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -1021,6 +1117,41 @@ java.io.IOException: Invalid region specified "iceland-2":
 
 The region specified in `fs.s3a.s3guard.ddb.region` is invalid.
 
+### Error `RemoteFileChangedException`
+
+An exception like the following could occur for a couple of reasons:
+
+* the S3Guard metadata is out of sync with the true S3 metadata.  For
+example, the S3Guard DynamoDB table is tracking a different ETag than the ETag
+shown in the exception.  This may suggest the object was updated in S3 without
+involvement from S3Guard or there was a transient failure when S3Guard tried to
+write to S3.
+
+* S3 is exhibiting read-after-overwrite eventual consistency.  The S3Guard
+metadata was updated with a new ETag during a recent write, but the current read
+is not seeing that ETag due to S3 eventual consistency.  This exception prevents
+the reader from an inconsistent read where the reader sees an older version of
+the file.
+
+```
+org.apache.hadoop.fs.s3a.RemoteFileChangedException: open 's3a://my-bucket/test/file.txt':
+  ETag change reported by S3 while reading at position 0.
+  Version 4e886e26c072fef250cfaf8037675405 was unavailable
 
 Review comment:
   I think there's an ETag/Version mismatch here - shouldn't this say "Version change reported...", or "ETag ... was unavailable"?
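Since the troubleshooting text above notes that the read-after-overwrite inconsistency window is normally short, a caller that hits `RemoteFileChangedException` can often simply retry the read. The sketch below is illustrative only: the exception class is a stand-in for S3A's `org.apache.hadoop.fs.s3a.RemoteFileChangedException`, and the `Read` interface and retry policy are invented for this example, not part of the S3A API.

```java
public class RetryingReader {

  /** Stand-in for org.apache.hadoop.fs.s3a.RemoteFileChangedException. */
  public static class RemoteFileChangedException extends RuntimeException {
    public RemoteFileChangedException(String msg) { super(msg); }
  }

  /** A single read attempt that may fail while S3 is still inconsistent. */
  public interface Read {
    String attempt() throws RemoteFileChangedException;
  }

  /**
   * Retry the read a few times with a fixed pause, on the theory that the
   * read-after-overwrite inconsistency window is usually short.
   */
  public static String readWithRetry(Read read, int maxAttempts, long pauseMillis) {
    RemoteFileChangedException last = null;
    for (int i = 0; i < maxAttempts; i++) {
      try {
        return read.attempt();
      } catch (RemoteFileChangedException e) {
        last = e; // remember the failure and pause before the next attempt
        try {
          Thread.sleep(pauseMillis);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          break; // stop retrying if interrupted
        }
      }
    }
    throw last; // still inconsistent after all attempts
  }

  public static void main(String[] args) {
    // Simulate a read that fails once, then sees the new object version.
    final int[] calls = {0};
    String result = readWithRetry(() -> {
      if (calls[0]++ == 0) {
        throw new RemoteFileChangedException("ETag change reported by S3");
      }
      return "new-version-contents";
    }, 3, 10L);
    System.out.println(result); // prints "new-version-contents"
  }
}
```

If the exception persists across retries, the cause is more likely the first bullet above (metadata out of sync) than transient eventual consistency.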





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269753654
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -1021,6 +1117,41 @@ java.io.IOException: Invalid region specified 
"iceland-2":
 
 The region specified in `fs.s3a.s3guard.ddb.region` is invalid.
 
+### Error `RemoteFileChangedException`
+
+An exception like the following could occur for a couple of reasons:
+
+* the S3Guard metadata is out of sync with the true S3 metadata.  For
+example, the S3Guard DynamoDB table is tracking a different ETag than the ETag
+shown in the exception.  This may suggest the object was updated in S3 without
+involvement from S3Guard or there was a transient failure when S3Guard tried to
+write to S3.
+
+* S3 is exhibiting read-after-overwrite eventual consistency.  The S3Guard
 
 Review comment:
   Might be clearer to say "temporary inconsistency" instead of "eventual consistency" here?





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269753107
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -1021,6 +1117,41 @@ java.io.IOException: Invalid region specified 
"iceland-2":
 
 The region specified in `fs.s3a.s3guard.ddb.region` is invalid.
 
+### Error `RemoteFileChangedException`
+
+An exception like the following could occur for a couple of reasons:
+
+* the S3Guard metadata is out of sync with the true S3 metadata.  For
+example, the S3Guard DynamoDB table is tracking a different ETag than the ETag
+shown in the exception.  This may suggest the object was updated in S3 without
+involvement from S3Guard or there was a transient failure when S3Guard tried to
+write to S3.
 
 Review comment:
   S3 -> DynamoDB





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269752128
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -923,6 +923,102 @@ from previous days, and and choosing a combination of
 retry counts and an interval which allow for the clients to cope with
 some throttling, but not to time out other applications.
 
+## Read-After-Overwrite Consistency
+
+S3Guard provides read-after-overwrite consistency through ETags (default) or
+object versioning. This works such that a reader reading a file after an
+overwrite either sees the new version of the file or an error. Without S3Guard,
+new readers may see the original version. Once S3 reaches eventual consistency,
+new readers will see the new version.
+
+Readers using S3Guard will usually see the new file version, but may
+in rare cases see `RemoteFileChangedException` instead. This would occur if
+an S3 object read cannot provide the version tracked in S3Guard metadata.
+
+S3Guard achieves this behavior by storing ETags and object version IDs in the
+S3Guard metadata store (e.g. DynamoDB). On opening a file, S3AFileSystem
+will look in S3 for the version of the file indicated by the ETag or object
+version ID stored in the metadata store. If that version is unavailable,
+`RemoteFileChangedException` is thrown. Whether ETag or version ID is used is
+determined by the
+[fs.s3a.change.detection configuration options](./index.html#Handling_Read-During-Overwrite).
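The open-time check described in the paragraph above can be pictured with a short sketch. This is illustrative only, not the actual S3AFileSystem/ChangeTracker code: the exception class is a stand-in for S3A's `RemoteFileChangedException`, and the `check` helper is invented for this example. A null tracked value passes the check, which mirrors the "No Versioning Metadata Available" behaviour described below.

```java
public class VersionCheck {

  /** Stand-in for the exception S3A raises when the check fails. */
  public static class RemoteFileChangedException extends RuntimeException {
    public RemoteFileChangedException(String msg) { super(msg); }
  }

  /**
   * Compare the ETag (or version ID) S3 returned on open against the value
   * tracked in the metadata store (e.g. DynamoDB).  A null tracked value
   * means no change-detection data was recorded yet, so the check passes.
   */
  public static void check(String path, String tracked, String fromS3) {
    if (tracked != null && !tracked.equals(fromS3)) {
      throw new RemoteFileChangedException(
          "open '" + path + "': expected " + tracked + " but S3 returned " + fromS3);
    }
  }

  public static void main(String[] args) {
    check("s3a://bucket/file", "etag-v2", "etag-v2"); // matches: read proceeds
    check("s3a://bucket/file", null, "etag-v1");      // nothing tracked: passes
    try {
      check("s3a://bucket/file", "etag-v2", "etag-v1"); // stale read: throws
    } catch (RemoteFileChangedException expected) {
      System.out.println("caught: " + expected.getMessage());
    }
  }
}
```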
+
+### No Versioning Metadata Available
+
+When the first S3AFileSystem clients are upgraded to the version of
+S3AFileSystem that contains these change tracking features, any existing
+S3Guard metadata will not contain ETags or object version IDs.  Reads of files
+tracked in such S3Guard metadata will access whatever version of the file is
+available in S3 at the time of read.  Only if the file is subsequently updated
+will S3Guard start tracking ETag and object version ID and as such generating
+`RemoteFileChangedException` if an inconsistency is detected.
+
+Similarly, when S3Guard metadata is pruned, S3Guard will no longer be able to
+detect an inconsistent read.  S3Guard metadata should be retained for at least
+as long as the perceived read-after-overwrite eventual consistency window.
+That window is expected to be short, but there are no guarantees, so it is at
+the administrator's discretion to weigh the risk.
+
+### Known Limitations
+
+ S3 Select
+
+S3 Select does not provide a capability for server-side ETag or object
+version ID qualification. Whether fs.s3a.change.detection.mode is client or
+server, S3Guard will cause a client-side check of the file version before
+opening the file with S3 Select.  If the current version does not match the
+version tracked in S3Guard, `RemoteFileChangedException` is thrown.
+
+It is still possible that the S3 Select read will access a different version of
+the file, if the visible file version changes between the version check and
+the opening of the file.  This can happen due to eventual consistency or
+an overwrite of the file between the version check and the open of the file.
+
+ Rename
+
+Rename is implemented via copy in S3.  With `fs.s3a.change.detection.mode=client`,
+a fully reliable mechanism for ensuring the copied content is the expected
+content is not possible. This is the case since there isn't necessarily a way
+to know the expected ETag or version ID to appear on the object resulting from
+the copy.
+
+Furthermore, if fs.s3a.change.detection.mode=server and a third-party S3
+implementation is used that doesn't honor the provided ETag or version ID,
+S3AFileSystem and S3Guard cannot detect it.
+
+In either `fs.s3a.change.detection.mode=server` or `client`, a client-side check
+will be performed before the copy to ensure the current version of the file
+matches S3Guard metadata.  If not, `RemoteFileChangedException` is thrown.
+As discussed with regard to S3 Select, this is not sufficient to
+guarantee that the same version is the version copied.
+
+When `fs.s3a.change.detection.mode=server`, the expected version is also specified
+in the underlying S3 `CopyObjectRequest`.  As long as the server honors it, the
+copied object will be correct.
+
+All this said, with the defaults of `fs.s3a.change.detection.mode=server` and
+`fs.s3a.change.detection.source=etag` against Amazon's S3, copy should in fact
+either copy the expected file version or, in the case of an eventual consistency
+anomaly, generate `RemoteFileChangedException`.  The same should be true with
+`fs.s3a.change.detection.source=versionid`.
+
+ Out of Sync Metadata
+
+The S3Guard version tracking metadata (ETag or object version ID) could become
+out of sync with the true current object metadata in S3.  For example, S3Guard
+is still tracking v1 of some

[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269752588
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -923,6 +923,102 @@ from previous days, and and choosing a combination of
 retry counts and an interval which allow for the clients to cope with
 some throttling, but not to time out other applications.
 
+## Read-After-Overwrite Consistency
+
+S3Guard provides read-after-overwrite consistency through ETags (default) or
+object versioning. This works such that a reader reading a file after an
+overwrite either sees the new version of the file or an error. Without S3Guard,
+new readers may see the original version. Once S3 reaches eventual consistency,
+new readers will see the new version.
+
+Readers using S3Guard will usually see the new file version, but may
+in rare cases see `RemoteFileChangedException` instead. This would occur if
+an S3 object read cannot provide the version tracked in S3Guard metadata.
+
+S3Guard achieves this behavior by storing ETags and object version IDs in the
+S3Guard metadata store (e.g. DynamoDB). On opening a file, S3AFileSystem
+will look in S3 for the version of the file indicated by the ETag or object
+version ID stored in the metadata store. If that version is unavailable,
+`RemoteFileChangedException` is thrown. Whether ETag or version ID is used is
+determined by the
+[fs.s3a.change.detection configuration options](./index.html#Handling_Read-During-Overwrite).
+
+### No Versioning Metadata Available
+
+When the first S3AFileSystem clients are upgraded to the version of
+S3AFileSystem that contains these change tracking features, any existing
+S3Guard metadata will not contain ETags or object version IDs.  Reads of files
+tracked in such S3Guard metadata will access whatever version of the file is
+available in S3 at the time of read.  Only if the file is subsequently updated
+will S3Guard start tracking ETag and object version ID and as such generating
+`RemoteFileChangedException` if an inconsistency is detected.
+
+Similarly, when S3Guard metadata is pruned, S3Guard will no longer be able to
+detect an inconsistent read.  S3Guard metadata should be retained for at least
+as long as the perceived read-after-overwrite eventual consistency window.
+That window is expected to be short, but there are no guarantees, so it is at
+the administrator's discretion to weigh the risk.
+
+### Known Limitations
+
+ S3 Select
+
+S3 Select does not provide a capability for server-side ETag or object
+version ID qualification. Whether fs.s3a.change.detection.mode is client or
+server, S3Guard will cause a client-side check of the file version before
+opening the file with S3 Select.  If the current version does not match the
+version tracked in S3Guard, `RemoteFileChangedException` is thrown.
+
+It is still possible that the S3 Select read will access a different version of
+the file, if the visible file version changes between the version check and
+the opening of the file.  This can happen due to eventual consistency or
+an overwrite of the file between the version check and the open of the file.
+
+ Rename
+
+Rename is implemented via copy in S3.  With `fs.s3a.change.detection.mode=client`,
+a fully reliable mechanism for ensuring the copied content is the expected
+content is not possible. This is the case since there isn't necessarily a way
+to know the expected ETag or version ID to appear on the object resulting from
+the copy.
+
+Furthermore, if fs.s3a.change.detection.mode=server and a third-party S3
+implementation is used that doesn't honor the provided ETag or version ID,
+S3AFileSystem and S3Guard cannot detect it.
+
+In either `fs.s3a.change.detection.mode=server` or `client`, a client-side check
+will be performed before the copy to ensure the current version of the file
+matches S3Guard metadata.  If not, `RemoteFileChangedException` is thrown.
+As discussed with regard to S3 Select, this is not sufficient to
+guarantee that the same version is the version copied.
+
+When `fs.s3a.change.detection.mode=server`, the expected version is also specified
+in the underlying S3 `CopyObjectRequest`.  As long as the server honors it, the
+copied object will be correct.
+
+All this said, with the defaults of `fs.s3a.change.detection.mode=server` and
+`fs.s3a.change.detection.source=etag` against Amazon's S3, copy should in fact
+either copy the expected file version or, in the case of an eventual consistency
+anomaly, generate `RemoteFileChangedException`.  The same should be true with
+`fs.s3a.change.detection.source=versionid`.
+
+ Out of Sync Metadata
+
+The S3Guard version tracking metadata (ETag or object version ID) could become
+out of sync with the true current object metadata in S3.  For example, S3Guard
+is still tracking v1 of some

[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269751086
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -923,6 +923,102 @@ from previous days, and and choosing a combination of
 retry counts and an interval which allow for the clients to cope with
 some throttling, but not to time out other applications.
 
+## Read-After-Overwrite Consistency
+
+S3Guard provides read-after-overwrite consistency through ETags (default) or
+object versioning. This works such that a reader reading a file after an
+overwrite either sees the new version of the file or an error. Without S3Guard,
+new readers may see the original version. Once S3 reaches eventual consistency,
+new readers will see the new version.
+
+Readers using S3Guard will usually see the new file version, but may
+in rare cases see `RemoteFileChangedException` instead. This would occur if
+an S3 object read cannot provide the version tracked in S3Guard metadata.
+
+S3Guard achieves this behavior by storing ETags and object version IDs in the
+S3Guard metadata store (e.g. DynamoDB). On opening a file, S3AFileSystem
+will look in S3 for the version of the file indicated by the ETag or object
+version ID stored in the metadata store. If that version is unavailable,
+`RemoteFileChangedException` is thrown. Whether ETag or version ID is used is
+determined by the
+[fs.s3a.change.detection configuration options](./index.html#Handling_Read-During-Overwrite).
+
+### No Versioning Metadata Available
+
+When the first S3AFileSystem clients are upgraded to the version of
+S3AFileSystem that contains these change tracking features, any existing
+S3Guard metadata will not contain ETags or object version IDs.  Reads of files
+tracked in such S3Guard metadata will access whatever version of the file is
+available in S3 at the time of read.  Only if the file is subsequently updated
+will S3Guard start tracking ETag and object version ID and as such generating
+`RemoteFileChangedException` if an inconsistency is detected.
+
+Similarly, when S3Guard metadata is pruned, S3Guard will no longer be able to
+detect an inconsistent read.  S3Guard metadata should be retained for at least
+as long as the perceived read-after-overwrite eventual consistency window.
+That window is expected to be short, but there are no guarantees, so it is at
+the administrator's discretion to weigh the risk.
+
+### Known Limitations
+
+ S3 Select
+
+S3 Select does not provide a capability for server-side ETag or object
+version ID qualification. Whether fs.s3a.change.detection.mode is client or
+server, S3Guard will cause a client-side check of the file version before
+opening the file with S3 Select.  If the current version does not match the
+version tracked in S3Guard, `RemoteFileChangedException` is thrown.
+
+It is still possible that the S3 Select read will access a different version of
+the file, if the visible file version changes between the version check and
+the opening of the file.  This can happen due to eventual consistency or
+an overwrite of the file between the version check and the open of the file.
+
+ Rename
+
+Rename is implemented via copy in S3.  With `fs.s3a.change.detection.mode=client`,
+a fully reliable mechanism for ensuring the copied content is the expected
+content is not possible. This is the case since there isn't necessarily a way
+to know the expected ETag or version ID to appear on the object resulting from
+the copy.
+
+Furthermore, if fs.s3a.change.detection.mode=server and a third-party S3
+implementation is used that doesn't honor the provided ETag or version ID,
+S3AFileSystem and S3Guard cannot detect it.
+
+In either `fs.s3a.change.detection.mode=server` or `client`, a client-side check
+will be performed before the copy to ensure the current version of the file
+matches S3Guard metadata.  If not, `RemoteFileChangedException` is thrown.
+As discussed with regard to S3 Select, this is not sufficient to
+guarantee that the same version is the version copied.
+
+When `fs.s3a.change.detection.mode=server`, the expected version is also specified
+in the underlying S3 `CopyObjectRequest`.  As long as the server honors it, the
+copied object will be correct.
+
+All this said, with the defaults of `fs.s3a.change.detection.mode=server` and
+`fs.s3a.change.detection.source=etag` against Amazon's S3, copy should in fact
+either copy the expected file version or, in the case of an eventual consistency
+anamoly, generate `RemoteFileChangedException`.  The same should be true with
 
 Review comment:
   typo, anamoly -> anomaly



[GitHub] [hadoop] hadoop-yetus commented on issue #632: HDDS-1255. Refactor ozone acceptance test to allow run in secure mode. Contributed by Ajay Kumar.

2019-03-27 Thread GitBox
hadoop-yetus commented on issue #632: HDDS-1255. Refactor ozone acceptance test 
to allow run in secure mode. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/632#issuecomment-477331007
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 0 | Docker mode activated. |
   | -1 | patch | 7 | https://github.com/apache/hadoop/pull/632 does not apply 
to trunk. Rebase required? Wrong Branch? See 
https://wiki.apache.org/hadoop/HowToContribute for help. |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/632 |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-632/4/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269750692
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -923,6 +923,102 @@ from previous days, and and choosing a combination of
 retry counts and an interval which allow for the clients to cope with
 some throttling, but not to time out other applications.
 
+## Read-After-Overwrite Consistency
+
+S3Guard provides read-after-overwrite consistency through ETags (default) or
 
 Review comment:
   Maybe add a brief note about client vs. server mode here, and what the 
default is - to provide some context for when client or server side checks are 
mentioned later on.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269747516
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/ChangeTracker.java
 ##
 @@ -148,16 +177,77 @@ public void processResponse(final S3Object object,
   }
 }
 
-final ObjectMetadata metadata = object.getObjectMetadata();
+processMetadata(object.getObjectMetadata(), operation, pos);
+  }
+
+  /**
+   * Process the response from the server for validation against the
+   * change policy.
+   * @param copyResult result of a copy operation
+   * @throws PathIOException raised on failure
+   * @throws RemoteFileChangedException if the remote file has changed.
+   */
+  public void processResponse(final CopyResult copyResult)
+  throws PathIOException {
+// ETag (sometimes, depending on encryption and/or multipart) is not the
+// same on the copied object as the original.  Version Id seems to never
+// be the same on the copy.  As such, there isn't really anything that
+// can be verified on the response.
+  }
+
+  /**
+   * Process an exception generated against the change policy.
+   * If the exception indicates the file has changed, this method throws
+   * {@code RemoteFileChangedException} with the original exception as the
+   * cause.
+   * @param e the exception
+   * @param operation the operation performed when the exception was
+   * generated.
+   * @throws RemoteFileChangedException if the remote file has changed.
+   */
+  public void processException(Exception e, String operation) throws
+  RemoteFileChangedException {
+if (e instanceof AmazonServiceException) {
+  AmazonServiceException serviceException = (AmazonServiceException) e;
+  if (serviceException.getStatusCode() == 412) {
+versionMismatches.incrementAndGet();
+throw new RemoteFileChangedException(uri, operation, String.format(
+RemoteFileChangedException.PRECONDITIONS_NOT_MET
++ " on %s."
++ " Version %s was unavailable",
+getSource(),
 
 Review comment:
   Does the operation string contain the path? If not, should this be getSource().getSource()?
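For context on the code path this comment discusses: HTTP 412 (Precondition Failed) is the status S3 returns when a request's ETag or version precondition is not met, and the quoted `processException` translates it into `RemoteFileChangedException`. Below is a self-contained sketch of that translation. The two nested classes are simplified stand-ins for `com.amazonaws.AmazonServiceException` and S3A's `RemoteFileChangedException`, not the real types.

```java
public class PreconditionCheck {

  /** Minimal stand-in for com.amazonaws.AmazonServiceException. */
  public static class ServiceException extends RuntimeException {
    private final int statusCode;
    public ServiceException(int statusCode, String msg) {
      super(msg);
      this.statusCode = statusCode;
    }
    public int getStatusCode() { return statusCode; }
  }

  /** Stand-in for org.apache.hadoop.fs.s3a.RemoteFileChangedException. */
  public static class RemoteFileChangedException extends RuntimeException {
    public RemoteFileChangedException(String msg, Throwable cause) {
      super(msg, cause);
    }
  }

  /**
   * Translate an HTTP 412 (Precondition Failed) from the store into the
   * change-detection exception, keeping the original as the cause;
   * anything else is rethrown untouched.
   */
  public static void processException(RuntimeException e, String uri, String operation) {
    if (e instanceof ServiceException
        && ((ServiceException) e).getStatusCode() == 412) {
      throw new RemoteFileChangedException(
          operation + " " + uri + ": constraints not met; the file has changed", e);
    }
    throw e;
  }

  public static void main(String[] args) {
    try {
      processException(new ServiceException(412, "PreconditionFailed"),
          "s3a://bucket/file", "copy");
    } catch (RemoteFileChangedException expected) {
      System.out.println("translated: " + expected.getMessage());
    }
  }
}
```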





[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269742590
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3LocatedFileStatus.java
 ##
 @@ -0,0 +1,55 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import org.apache.hadoop.fs.BlockLocation;
+import org.apache.hadoop.fs.LocatedFileStatus;
+
+/**
+ * {@link LocatedFileStatus} extended to also carry ETag and object version ID.
+ */
+public class S3LocatedFileStatus extends LocatedFileStatus {
+  private final String eTag;
+  private final String versionId;
+
+  public S3LocatedFileStatus(S3AFileStatus status, BlockLocation[] locations,
+  String eTag, String versionId) {
+super(status, locations);
+this.eTag = eTag;
+this.versionId = versionId;
+  }
+
+  public String getETag() {
+return eTag;
+  }
+
+  public String getVersionId() {
+return versionId;
+  }
+
+  @Override
+  public boolean equals(Object o) {
+return super.equals(o);
 
 Review comment:
   Originally I didn't override this.  FindBugs flagged it as an issue, which 
prompted me to add it.  The base LocatedFileStatus equality is only based on 
Path and this implementation doesn't need to be different.
   
   It looks like some would argue FindBugs shouldn't flag this:
   https://sourceforge.net/p/findbugs/bugs/1379/
   
   I'll just add a comment to explain why I'm overriding it and why it's OK 
not to include ETag and version ID.
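
For illustration, the pattern being discussed can be sketched with plain-Java stand-ins (`BaseStatus` plays the role of `LocatedFileStatus`; none of these class names come from the Hadoop codebase): equality and hashing stay path-based, and the overrides exist purely to document that eTag and versionId are intentionally excluded.

```java
// Self-contained sketch of the equals/hashCode decision discussed above,
// using plain-Java stand-ins rather than the real Hadoop classes.
class BaseStatus {
  private final String path;

  BaseStatus(String path) {
    this.path = path;
  }

  @Override
  public boolean equals(Object o) {
    // Equality is based on the path alone, mirroring LocatedFileStatus.
    return o instanceof BaseStatus && ((BaseStatus) o).path.equals(path);
  }

  @Override
  public int hashCode() {
    return path.hashCode();
  }
}

class LocatedStatusSketch extends BaseStatus {
  private final String eTag;      // deliberately excluded from equality
  private final String versionId; // deliberately excluded from equality

  LocatedStatusSketch(String path, String eTag, String versionId) {
    super(path);
    this.eTag = eTag;
    this.versionId = versionId;
  }

  // Overridden only to document (and satisfy FindBugs on) the intent:
  // the base class's path-only equality is exactly what is wanted here,
  // even though this subclass carries extra fields.
  @Override
  public boolean equals(Object o) {
    return super.equals(o);
  }

  @Override
  public int hashCode() {
    return super.hashCode();
  }
}
```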





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269741838
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/ChangeTracker.java
 ##
 @@ -148,16 +177,77 @@ public void processResponse(final S3Object object,
   }
 }
 
-final ObjectMetadata metadata = object.getObjectMetadata();
+processMetadata(object.getObjectMetadata(), operation, pos);
+  }
+
+  /**
+   * Process the response from the server for validation against the
+   * change policy.
+   * @param copyResult result of a copy operation
+   * @throws PathIOException raised on failure
+   * @throws RemoteFileChangedException if the remote file has changed.
+   */
+  public void processResponse(final CopyResult copyResult)
+  throws PathIOException {
+// ETag (sometimes, depending on encryption and/or multipart) is not the
+// same on the copied object as the original.  Version Id seems to never
+// be the same on the copy.  As such, there isn't really anything that
+// can be verified on the response.
+  }
+
+  /**
+   * Process an exception generated against the change policy.
+   * If the exception indicates the file has changed, this method throws
+   * {@code RemoteFileChangedException} with the original exception as the
+   * cause.
+   * @param e the exception
+   * @param operation the operation performed when the exception was
+   * generated.
 
 Review comment:
   can you provide example(s) for typical operation values / descriptions?
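
For illustration, the javadoc could spell that out along these lines (the sample operation strings are hypothetical, not quoted from the patch):

```java
/**
 * Process an exception generated against the change policy.
 * If the exception indicates the file has changed, this method throws
 * {@code RemoteFileChangedException} with the original exception as the
 * cause.
 * @param e the exception
 * @param operation a short description of the operation in progress when
 * the exception was raised, used only in error messages
 * (hypothetical examples: "open", "re-open", "copy")
 */
```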





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269741272
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/ChangeTracker.java
 ##
 @@ -148,16 +177,77 @@ public void processResponse(final S3Object object,
   }
 }
 
-final ObjectMetadata metadata = object.getObjectMetadata();
+processMetadata(object.getObjectMetadata(), operation, pos);
+  }
+
+  /**
+   * Process the response from the server for validation against the
+   * change policy.
+   * @param copyResult result of a copy operation
+   * @throws PathIOException raised on failure
+   * @throws RemoteFileChangedException if the remote file has changed.
+   */
+  public void processResponse(final CopyResult copyResult)
 
 Review comment:
   +1, and maybe add debug logging as well 





[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269739781
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
 ##
 @@ -696,7 +698,7 @@ public S3AInputPolicy getInputPolicy() {
* @return the change detection policy
*/
   @VisibleForTesting
 
 Review comment:
   I actually thought a little about this.  I was thinking of 
`@VisibleForTesting` in this case as documenting that this method is only 
public to allow access in tests (in a different package).  I know it is more 
typically used on protected or package-private methods.
   
   I'm curious if there is any feedback about this being public in general?  
I'm accessing it across packages in a couple of tests (in the s3guard and 
select packages).
   
   I can reinforce that it is only visible for tests by mentioning that 
explicitly in the javadoc.  
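
One way to make that explicit (a sketch only; the javadoc wording is mine, and the method/field names are reconstructed from the surrounding diff context, not copied from the patch):

```java
/**
 * Get the change detection policy of this filesystem.
 * Public only so that tests in other packages (for example the s3guard
 * and select test packages) can call it; not intended as a stable API.
 * @return the change detection policy
 */
@VisibleForTesting
public ChangeDetectionPolicy getChangeDetectionPolicy() {
  return changeDetectionPolicy;
}
```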





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269735660
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/ChangeDetectionPolicy.java
 ##
 @@ -277,11 +295,32 @@ public String getRevisionId(ObjectMetadata 
objectMetadata, String uri) {
   return objectMetadata.getETag();
 }
 
+@Override
+public String getRevisionId(S3ObjectAttributes s3Attributes) {
+  return s3Attributes.getETag();
+}
+
+@Override
+public String getRevisionId(CopyResult copyResult) {
+  return copyResult.getETag();
+}
+
 @Override
 public void applyRevisionConstraint(GetObjectRequest request,
 String revisionId) {
-  LOG.debug("Restricting request to etag {}", revisionId);
-  request.withMatchingETagConstraint(revisionId);
+  if (revisionId != null) {
+LOG.debug("Restricting request to etag {}", revisionId);
 
 Review comment:
   could clarify "get request" / "copy request" in these log messages so that 
they are unique (same applies for VersionIdChangeDetectionPolicy)
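
As a sketch of that suggestion (a list-backed stand-in for the SLF4J logger; the method names are illustrative, not the real ChangeDetectionPolicy API), distinct messages per request type would look like:

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch: include the request type in the debug message so
// the "get" and "copy" code paths produce distinguishable log lines.
// The list-backed LOG is a stand-in for a real SLF4J logger.
class EtagConstraintSketch {
  static final List<String> LOG = new ArrayList<>();

  static void applyGetRevisionConstraint(String revisionId) {
    if (revisionId != null) {
      LOG.add("Restricting get request to etag " + revisionId);
    }
  }

  static void applyCopyRevisionConstraint(String revisionId) {
    if (revisionId != null) {
      LOG.add("Restricting copy request to etag " + revisionId);
    }
  }
}
```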








[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269735003
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/ChangeDetectionPolicy.java
 ##
 @@ -52,6 +55,10 @@
   private final Mode mode;
   private final boolean requireVersion;
 
+  public abstract String getRevisionId(S3ObjectAttributes s3Attributes);
 
 Review comment:
   add javadoc for these 2 new methods
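
For instance, the requested javadoc might read along these lines (the wording is illustrative, not taken from the patch; the signatures are quoted from the diff):

```java
/**
 * Get the revision identifier (for example an ETag or a version ID,
 * depending on the policy) from the given object attributes.
 * @param s3Attributes the attributes of the S3 object
 * @return the revision ID, or null if not available
 */
public abstract String getRevisionId(S3ObjectAttributes s3Attributes);

/**
 * Get the revision identifier from the result of a copy operation.
 * @param copyResult the result of the copy
 * @return the revision ID, or null if not available
 */
public abstract String getRevisionId(CopyResult copyResult);
```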





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269734035
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3LocatedFileStatus.java
 ##
 @@ -0,0 +1,55 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import org.apache.hadoop.fs.BlockLocation;
+import org.apache.hadoop.fs.LocatedFileStatus;
+
+/**
+ * {@link LocatedFileStatus} extended to also carry ETag and object version ID.
+ */
+public class S3LocatedFileStatus extends LocatedFileStatus {
+  private final String eTag;
+  private final String versionId;
+
+  public S3LocatedFileStatus(S3AFileStatus status, BlockLocation[] locations,
+  String eTag, String versionId) {
+super(status, locations);
+this.eTag = eTag;
+this.versionId = versionId;
+  }
+
+  public String getETag() {
+return eTag;
+  }
+
+  public String getVersionId() {
+return versionId;
+  }
+
+  @Override
+  public boolean equals(Object o) {
+return super.equals(o);
 
 Review comment:
   If the behavior isn't changing, there's no need to override the equals and 
hashCode methods here. The only valid reason might be to draw attention to the 
fact that the eTag and versionId are ignored by them; in that case, a comment 
should be added to explain why.





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269730870
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
 ##
 @@ -2859,12 +2893,15 @@ public String getCanonicalServiceName() {
* @param srcKey source object path
* @param dstKey destination object path
* @param size object size
+   * @param srcAttributes S3 attributes of the source object
+   * @param readContext the read context
* @throws AmazonClientException on failures inside the AWS SDK
* @throws InterruptedIOException the operation was interrupted
* @throws IOException Other IO problems
*/
   @Retries.RetryMixed
-  private void copyFile(String srcKey, String dstKey, long size)
+  private CopyResult copyFile(String srcKey, String dstKey, long size,
 
 Review comment:
   add @return javadoc
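
A sketch of the requested tag (the wording is illustrative, not from the patch):

```java
 * @return the result of the copy, exposing the new object's ETag and
 * version ID for change-detection checks
```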





[GitHub] [hadoop] hadoop-yetus commented on issue #641: HDDS-1318. Fix MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.

2019-03-27 Thread GitBox
hadoop-yetus commented on issue #641: HDDS-1318. Fix 
MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.
URL: https://github.com/apache/hadoop/pull/641#issuecomment-477312454
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 28 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 3 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 61 | Maven dependency ordering for branch |
   | +1 | mvninstall | 997 | trunk passed |
   | +1 | compile | 921 | trunk passed |
   | +1 | checkstyle | 178 | trunk passed |
   | +1 | mvnsite | 137 | trunk passed |
   | +1 | shadedclient | 997 | branch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 110 | trunk passed |
   | +1 | javadoc | 85 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 21 | Maven dependency ordering for patch |
   | +1 | mvninstall | 95 | the patch passed |
   | +1 | compile | 946 | the patch passed |
   | +1 | javac | 946 | the patch passed |
   | +1 | checkstyle | 185 | the patch passed |
   | +1 | mvnsite | 116 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 616 | patch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 119 | the patch passed |
   | +1 | javadoc | 104 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 35 | client in the patch passed. |
   | +1 | unit | 77 | common in the patch passed. |
   | -1 | unit | 1456 | integration-test in the patch failed. |
   | +1 | asflicense | 38 | The patch does not generate ASF License warnings. |
   | | | 7273 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.ozone.om.TestOzoneManagerHA |
   |   | hadoop.ozone.container.TestContainerReplication |
   |   | hadoop.ozone.container.metrics.TestContainerMetrics |
   |   | hadoop.ozone.container.ozoneimpl.TestSecureOzoneContainer |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-641/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/641 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  |
   | uname | Linux 089b0d54691b 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / a4cd75e |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-641/2/artifact/out/patch-unit-hadoop-ozone_integration-test.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-641/2/testReport/ |
   | Max. process+thread count | 3674 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/client hadoop-hdds/common 
hadoop-ozone/integration-test U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-641/2/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269729362
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
 ##
 @@ -696,7 +698,7 @@ public S3AInputPolicy getInputPolicy() {
* @return the change detection policy
*/
   @VisibleForTesting
 
 Review comment:
   No longer needs @VisibleForTesting annotation now, since public





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269727362
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileStatus.java
 ##
 @@ -67,31 +89,64 @@ public S3AFileStatus(Tristate isemptydir,
* @param path path
* @param blockSize block size
* @param owner owner
+   * @param eTag eTag of the S3 object if available, else null
+   * @param versionId versionId of the S3 object if available, else null
*/
   public S3AFileStatus(long length, long modification_time, Path path,
-  long blockSize, String owner) {
-super(length, false, 1, blockSize, modification_time, path);
+  long blockSize, String owner, String eTag, String versionId) {
+super(length, false, 1, blockSize, modification_time,
+path);
 isEmptyDirectory = Tristate.FALSE;
+this.eTag = eTag;
+this.versionId = versionId;
 setOwner(owner);
 setGroup(owner);
   }
 
+  /**
+   * A simple file.
+   * @param length file length
+   * @param modification_time mod time
+   * @param access_time  access time
+   * @param path path
+   * @param blockSize block size
+   * @param owner owner
+   * @param group group
+   * @param permission persmission
 
 Review comment:
   typo, persmission





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269726736
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileStatus.java
 ##
 @@ -54,10 +57,29 @@ public S3AFileStatus(boolean isemptydir,
   public S3AFileStatus(Tristate isemptydir,
   Path path,
   String owner) {
-super(0, true, 1, 0, 0, path);
+this(isemptydir, path, owner, owner, 0, 0, null);
+  }
+
+  /**
+   * Create a directory status.
+   * @param isemptydir is this an empty directory?
+   * @param path the path
+   * @param owner the owner
+   * @param group the group
+   * @param modification_time the modification time
+   */
 
 Review comment:
   add @param doc for access_time and permission





[GitHub] [hadoop] noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #646: HADOOP-16085: use 
object version or etags to protect against inconsistent read after 
replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269724305
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
 ##
 @@ -545,7 +545,7 @@ private Constants() {
   public static final String S3GUARD_METASTORE_LOCAL_ENTRY_TTL =
   "fs.s3a.s3guard.local.ttl";
   public static final int DEFAULT_S3GUARD_METASTORE_LOCAL_ENTRY_TTL
-  = 10 * 1000;
+  = 120 * 1000;
 
 Review comment:
   seems to be a common problem, this was increased to 60s in 
https://github.com/apache/hadoop/pull/624 / 
https://github.com/apache/hadoop/pull/630





[jira] [Created] (HADOOP-16215) Genconfig does not generate LOG4j configs

2019-03-27 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-16215:
-

 Summary: Genconfig does not generate LOG4j configs
 Key: HADOOP-16215
 URL: https://issues.apache.org/jira/browse/HADOOP-16215
 Project: Hadoop Common
  Issue Type: Task
Affects Versions: 0.3.0
Reporter: Hrishikesh Gadre
Assignee: Hrishikesh Gadre


Genconfig does not generate Log4j configs. This is needed for Ozone configs to 
work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] hadoop-yetus commented on issue #650: HADOOP-16209. Create simple docker based pseudo-cluster for hdfs

2019-03-27 Thread GitBox
hadoop-yetus commented on issue #650: HADOOP-16209. Create simple docker based 
pseudo-cluster for hdfs
URL: https://github.com/apache/hadoop/pull/650#issuecomment-477300962
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 34 | Docker mode activated. |
   ||| _ Prechecks _ |
   | 0 | yamllint | 0 | yamllint was not available. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | -1 | test4tests | 0 | The patch doesn't appear to include any new or 
modified tests.  Please justify why no new tests are needed for this patch. 
Also please list what manual steps were performed to verify this patch. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 81 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1087 | trunk passed |
   | +1 | compile | 930 | trunk passed |
   | +1 | mvnsite | 798 | trunk passed |
   | +1 | shadedclient | 706 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 334 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 26 | Maven dependency ordering for patch |
   | +1 | mvninstall | 1115 | the patch passed |
   | +1 | compile | 925 | the patch passed |
   | +1 | javac | 925 | the patch passed |
   | +1 | mvnsite | 804 | the patch passed |
   | +1 | shellcheck | 2 | There were no new shellcheck issues. |
   | +1 | shelldocs | 12 | There were no new shelldocs issues. |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 709 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 365 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 8705 | root in the patch failed. |
   | -1 | asflicense | 54 | The patch generated 2 ASF License warnings. |
   | | | 16874 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
   |   | hadoop.hdfs.TestReconstructStripedFile |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-650/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/650 |
   | Optional Tests |  dupname  asflicense  shellcheck  shelldocs  compile  
javac  javadoc  mvninstall  mvnsite  unit  shadedclient  yamllint  |
   | uname | Linux 27969e09fceb 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri 
Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / b226958 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | shellcheck | v0.4.6 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-650/1/artifact/out/patch-unit-root.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-650/1/testReport/ |
   | asflicense | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-650/1/artifact/out/patch-asflicense-problems.txt
 |
   | Max. process+thread count | 3937 (vs. ulimit of 5500) |
   | modules | C: hadoop-dist . U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-650/1/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[jira] [Comment Edited] (HADOOP-16144) Create a Hadoop RPC based KMS client

2019-03-27 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803195#comment-16803195
 ] 

Anu Engineer edited comment on HADOOP-16144 at 3/27/19 6:36 PM:


I have a KMS server with Hadoop RPC working on my machine. It is not complete; 
the client side and tools are what I am working on now. We are on track. Sorry, 
I got pulled into some Ozone-4.0 release work too. But I will have this done by 
the end of this week. At least something that you will be able to take and 
start using.


was (Author: anu):
I have a KMS server with Hadoop RPC working on my machine. It is not complete, 
the client side and tools is what I am working on now. We are on track. Sorry, 
I got pulled into some Ozone-4.0 release too work too. But I have this done by 
end of this week. At least something that you will be able to take and start 
using.

> Create a Hadoop RPC based KMS client
> 
>
> Key: HADOOP-16144
> URL: https://issues.apache.org/jira/browse/HADOOP-16144
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: kms
>Reporter: Wei-Chiu Chuang
>Assignee: Anu Engineer
>Priority: Major
>
> Create a new KMS client implementation that speaks Hadoop RPC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16144) Create a Hadoop RPC based KMS client

2019-03-27 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803195#comment-16803195
 ] 

Anu Engineer commented on HADOOP-16144:
---

I have a KMS server with Hadoop RPC working on my machine. It is not complete;
the client side and tools are what I am working on now. We are on track. Sorry,
I got pulled into some Ozone-4.0 release work too. But I will have this done
by the end of this week, at least something that you will be able to take and
start using.

> Create a Hadoop RPC based KMS client
> 
>
> Key: HADOOP-16144
> URL: https://issues.apache.org/jira/browse/HADOOP-16144
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: kms
>Reporter: Wei-Chiu Chuang
>Assignee: Anu Engineer
>Priority: Major
>
> Create a new KMS client implementation that speaks Hadoop RPC.






[jira] [Commented] (HADOOP-16144) Create a Hadoop RPC based KMS client

2019-03-27 Thread Daryn Sharp (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803176#comment-16803176
 ] 

Daryn Sharp commented on HADOOP-16144:
--

[~anu], let me know if you will have cycles this week. First deploy of RPC/TLS
is a success. All RPC services sans task/AM umbilicals are encrypted (the AM
lacks a cert). I'm eager to convert the KMS and will work on it myself later
this week if you don't have cycles to spare.

> Create a Hadoop RPC based KMS client
> 
>
> Key: HADOOP-16144
> URL: https://issues.apache.org/jira/browse/HADOOP-16144
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: kms
>Reporter: Wei-Chiu Chuang
>Assignee: Anu Engineer
>Priority: Major
>
> Create a new KMS client implementation that speaks Hadoop RPC.






[GitHub] [hadoop] hadoop-yetus commented on issue #628: HADOOP-16186. NPE in ITestS3AFileSystemContract teardown in DynamoDBMetadataStore.lambda$listChildren

2019-03-27 Thread GitBox
hadoop-yetus commented on issue #628: HADOOP-16186. NPE in 
ITestS3AFileSystemContract teardown in DynamoDBMetadataStore.lambda$listChildren
URL: https://github.com/apache/hadoop/pull/628#issuecomment-477289010
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 75 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 1390 | trunk passed |
   | +1 | compile | 39 | trunk passed |
   | +1 | checkstyle | 27 | trunk passed |
   | +1 | mvnsite | 43 | trunk passed |
   | +1 | shadedclient | 885 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | findbugs | 60 | trunk passed |
   | +1 | javadoc | 30 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | +1 | mvninstall | 36 | the patch passed |
   | +1 | compile | 36 | the patch passed |
   | +1 | javac | 36 | the patch passed |
   | +1 | checkstyle | 22 | the patch passed |
   | +1 | mvnsite | 39 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 953 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | findbugs | 63 | the patch passed |
   | +1 | javadoc | 27 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 283 | hadoop-aws in the patch passed. |
   | +1 | asflicense | 33 | The patch does not generate ASF License warnings. |
   | | | 4125 | |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-628/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/628 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  |
   | uname | Linux 1f31ddf4858c 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri 
Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / a4cd75e |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-628/3/testReport/ |
   | Max. process+thread count | 320 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-628/3/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[GitHub] [hadoop] noslowerdna commented on a change in pull request #647: HADOOP-16118. S3Guard to support on-demand DDB tables.

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #647: HADOOP-16118. S3Guard 
to support on-demand DDB tables.
URL: https://github.com/apache/hadoop/pull/647#discussion_r269702216
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -906,22 +909,102 @@ If operations, especially directory operations, are 
slow, check the AWS
 console. It is also possible to set up AWS alerts for capacity limits
 being exceeded.
 
+###  On-Demand Dynamo Capacity
+
+[Amazon DynamoDB 
On-Demand](https://aws.amazon.com/blogs/aws/amazon-dynamodb-on-demand-no-capacity-planning-and-pay-per-request-pricing/)
+removes the need to pre-allocate I/O capacity for S3Guard tables.
+Instead, the caller is _only_ charged per I/O operation.
+
+* There are no SLA capacity guarantees. This is generally not an issue
+for S3Guard applications.
+* There's no explicit limit on I/O capacity, so operations which make
+heavy use of S3Guard tables (for example: SQL query planning) do not
+get throttled.
+* There's no way to put a limit on the I/O; you may unintentionally run up
+large bills through sustained heavy load.
+* The `s3guard set-capacity` command fails: it no longer makes sense.
+
+When idle, S3Guard tables are only billed for the data stored, not for
+any unused capacity. For this reason, there is no benefit from sharing
+a single S3Guard table across multiple buckets.
+
+*Enabling DynamoDB On-Demand for a S3Guard table*
+
+You cannot currently enable DynamoDB on-demand from the `s3guard` command
+when creating or updating a bucket.
+
+Instead it must be done through the AWS console or [the 
CLI](https://docs.aws.amazon.com/cli/latest/reference/dynamodb/update-table.html).
+From the Web console or the command line, switch the billing to 
pay-per-request.
+
+Once enabled, the read and write capacities of the table listed in the
+`hadoop s3guard bucket-info` command become "0", and the "billing-mode"
+attribute changes to "per-request":
+
+```
+> hadoop s3guard bucket-info s3a://example-bucket/
+
+Filesystem s3a://example-bucket
+Location: eu-west-1
+Filesystem s3a://example-bucket is using S3Guard with store
+  DynamoDBMetadataStore{region=eu-west-1, tableName=example-bucket,
+  tableArn=arn:aws:dynamodb:eu-west-1:11:table/example-bucket}
+Authoritative S3Guard: fs.s3a.metadatastore.authoritative=false
+Metadata Store Diagnostics:
+  ARN=arn:aws:dynamodb:eu-west-1:11:table/example-bucket
+  billing-mode=per-request
+  description=S3Guard metadata store in DynamoDB
+  name=example-bucket
+  persist.authoritative.bit=true
+  read-capacity=0
+  region=eu-west-1
+  retryPolicy=ExponentialBackoffRetry(maxRetries=9, sleepTime=250 MILLISECONDS)
+  size=66797
+  status=ACTIVE
+  table={AttributeDefinitions:
+[{AttributeName: child,AttributeType: S},
+ {AttributeName: parent,AttributeType: S}],
+ TableName: example-bucket,
+ KeySchema: [{
+   AttributeName: parent,KeyType: HASH},
+   {AttributeName: child,KeyType: RANGE}],
+ TableStatus: ACTIVE,
+ CreationDateTime: Thu Oct 11 18:51:14 BST 2018,
+ ProvisionedThroughput: {
+   LastIncreaseDateTime: Tue Oct 30 16:48:45 GMT 2018,
+   LastDecreaseDateTime: Tue Oct 30 18:00:03 GMT 2018,
+   NumberOfDecreasesToday: 0,
+   ReadCapacityUnits: 0,
+   WriteCapacityUnits: 0},
+ TableSizeBytes: 66797,
+ ItemCount: 415,
+ TableArn: arn:aws:dynamodb:eu-west-1:11:table/example-bucket,
+ TableId: a7b0728a-f008-4260-b2a0-ab,}
+  write-capacity=0
+The "magic" committer is supported
+```
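The billing-mode switch itself is a single AWS CLI call. As a hedged sketch, the table name and region are taken from the example above; substitute your own:

```bash
# Switch an existing S3Guard table to on-demand (pay-per-request) billing.
aws dynamodb update-table \
  --table-name example-bucket \
  --region eu-west-1 \
  --billing-mode PAY_PER_REQUEST
```

Switching back to provisioned capacity later requires `--billing-mode PROVISIONED` together with a `--provisioned-throughput` argument.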
+
+###  Autoscaling S3Guard tables
+
 [DynamoDB Auto 
Scaling](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.html)
 can automatically increase and decrease the allocated capacity.
-This is good for keeping capacity high when needed, but avoiding large
-bills when it is not.
+
+Before DynamoDB On-Demand was introduced, autoscaling was the sole form
+of dynamic scaling. 
 
 Experiments with S3Guard and DynamoDB Auto Scaling have shown that any Auto 
Scaling
 operation will only take place after callers have been throttled for a period 
of
 time. The clients will still need to be configured to retry when overloaded
 until any extra capacity is allocated. Furthermore, as this retrying will
-block the threads from performing other operations -including more IO, the
+block the threads from performing other operations, including more I/O,
 the autoscale may not scale fast enough.
 
-We recommend experimenting with this, based on usage information collected
-from previous days, and and choosing a combination of
-retry counts and an interval which allow for the clients to cope with
-some throttling, but not to time out other applications.
+This is why DynamoDB On-Demand appears to be a better option for
workloads with Hadoop, Spark, Hive and other applications.
+
+If autoscaling is to be used, we recommend experimenting with

[GitHub] [hadoop] noslowerdna commented on a change in pull request #647: HADOOP-16118. S3Guard to support on-demand DDB tables.

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #647: HADOOP-16118. S3Guard 
to support on-demand DDB tables.
URL: https://github.com/apache/hadoop/pull/647#discussion_r269702082
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##
 @@ -906,22 +909,102 @@ If operations, especially directory operations, are 
slow, check the AWS
 console. It is also possible to set up AWS alerts for capacity limits
 being exceeded.
 
+###  On-Demand Dynamo Capacity
+
+[Amazon DynamoDB 
On-Demand](https://aws.amazon.com/blogs/aws/amazon-dynamodb-on-demand-no-capacity-planning-and-pay-per-request-pricing/)
+removes the need to pre-allocate I/O capacity for S3Guard tables.
+Instead the caller is _only_ charged per I/O Operation.
+
+* There are no SLA capacity guarantees. This is generally not an issue
+for S3Guard applications.
+* There's no explicit limit on I/O capacity, so operations which make
+heavy use of S3Guard tables (for example: SQL query planning) do not
+get throttled.
+* There's no way put a limit on the I/O; you may unintentionally run up
+large bills through sustained heavy load.
+* The `s3guard set-capacity` command fails: it does not make sense any more.
+
+When idle, S3Guard tables are only billed for the data stored, not for
+any unused capacity. For this reason, there is no benefit from sharing
+a single S3Guard table across multiple buckets.
+
+*Enabling DynamoDB On-Demand for a S3Guard table*
+
+You cannot currently enable DynamoDB on-demand from the `s3guard` command
+when creating or updating a bucket.
+
+Instead it must be done through the AWS console or [the 
CLI](https://docs.aws.amazon.com/cli/latest/reference/dynamodb/update-table.html).
+From the Web console or the command line, switch the billing to 
pay-per-request.
+
+Once enabled, the read and write capacities of the table listed in the
+`hadoop s3guard bucket-info` command become "0", and the "billing-mode"
+attribute changes to "per-request":
+
+```
+> hadoop s3guard bucket-info s3a://example-bucket/
+
+Filesystem s3a://example-bucket
+Location: eu-west-1
+Filesystem s3a://example-bucket is using S3Guard with store
+  DynamoDBMetadataStore{region=eu-west-1, tableName=example-bucket,
+  tableArn=arn:aws:dynamodb:eu-west-1:11:table/example-bucket}
+Authoritative S3Guard: fs.s3a.metadatastore.authoritative=false
+Metadata Store Diagnostics:
+  ARN=arn:aws:dynamodb:eu-west-1:11:table/example-bucket
+  billing-mode=per-request
+  description=S3Guard metadata store in DynamoDB
+  name=example-bucket
+  persist.authoritative.bit=true
+  read-capacity=0
+  region=eu-west-1
+  retryPolicy=ExponentialBackoffRetry(maxRetries=9, sleepTime=250 MILLISECONDS)
+  size=66797
+  status=ACTIVE
+  table={AttributeDefinitions:
+[{AttributeName: child,AttributeType: S},
+ {AttributeName: parent,AttributeType: S}],
+ TableName: example-bucket,
+ KeySchema: [{
+   AttributeName: parent,KeyType: HASH},
+   {AttributeName: child,KeyType: RANGE}],
+ TableStatus: ACTIVE,
+ CreationDateTime: Thu Oct 11 18:51:14 BST 2018,
+ ProvisionedThroughput: {
+   LastIncreaseDateTime: Tue Oct 30 16:48:45 GMT 2018,
+   LastDecreaseDateTime: Tue Oct 30 18:00:03 GMT 2018,
+   NumberOfDecreasesToday: 0,
+   ReadCapacityUnits: 0,
+   WriteCapacityUnits: 0},
+ TableSizeBytes: 66797,
+ ItemCount: 415,
+ TableArn: arn:aws:dynamodb:eu-west-1:11:table/example-bucket,
+ TableId: a7b0728a-f008-4260-b2a0-ab,}
+  write-capacity=0
+The "magic" committer is supported
+```
+
+###  Autoscaling S3Guard tables.
+
 [DynamoDB Auto 
Scaling](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AutoScaling.html)
 can automatically increase and decrease the allocated capacity.
-This is good for keeping capacity high when needed, but avoiding large
-bills when it is not.
+
+Before DynamoDB On-Demand was introduced, autoscaling was the sole form
+of dynamic scaling. 
 
 Experiments with S3Guard and DynamoDB Auto Scaling have shown that any Auto 
Scaling
 operation will only take place after callers have been throttled for a period 
of
 time. The clients will still need to be configured to retry when overloaded
 until any extra capacity is allocated. Furthermore, as this retrying will
-block the threads from performing other operations -including more IO, the
+block the threads from performing other operations -including more I/O, the
 the autoscale may not scale fast enough.
 
-We recommend experimenting with this, based on usage information collected
-from previous days, and and choosing a combination of
-retry counts and an interval which allow for the clients to cope with
-some throttling, but not to time out other applications.
+This is why the DynamoDB On-Demand appears to be a better option for
+workloads with Hadoop, Spark, Hive and othe rapplications.
 
 Review comment:
   typo, "other applications"


[GitHub] [hadoop] noslowerdna commented on a change in pull request #645: HADOOP-16132 Support multipart download in S3AFileSystem

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #645: HADOOP-16132 Support 
multipart download in S3AFileSystem
URL: https://github.com/apache/hadoop/pull/645#discussion_r269699900
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
 ##
 @@ -721,4 +721,24 @@ private Constants() {
* Default change detection require version: true.
*/
   public static final boolean CHANGE_DETECT_REQUIRE_VERSION_DEFAULT = true;
+
+  public static final String MULTIPART_DOWNLOAD_ENABLED =
+  "fs.s3a.multipartdownload.enabled";
+  public static final boolean DEFAULT_MULTIPART_DOWNLOAD_ENABLED = false;
+
+  public static final String MULTIPART_DOWNLOAD_PART_SIZE =
+  "fs.s3a.multipartdownload.part-size";
+  public static final long DEFAULT_MULTIPART_DOWNLOAD_PART_SIZE = 8_000_000;
+
+  public static final String MULTIPART_DOWNLOAD_CHUNK_SIZE =
+  "fs.s3a.multipartdownload.chunk-size";
+  public static final long DEFAULT_MULTIPART_DOWNLOAD_CHUNK_SIZE = 262_144;
+
+  public static final String MULTIPART_DOWNLOAD_BUFFER_SIZE =
+  "fs.s3a.multipartdownload.buffer-size";
+  public static final long DEFAULT_MULTIPART_DOWNLOAD_BUFFER_SIZE = 20_000_000;
 
 Review comment:
   How did you determine the optimal value for this? Javadoc for it should 
specify that the size is "(in bytes)".
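For illustration, a minimal sketch of what the suggested javadoc could look like; the constant name and value are taken from the quoted diff, the class name and wording are hypothetical:

```java
public final class MultipartDownloadDefaults {

  /**
   * Default buffer size (in bytes) used when buffering downloaded
   * parts before they are handed to the reader.
   */
  public static final long DEFAULT_MULTIPART_DOWNLOAD_BUFFER_SIZE = 20_000_000;

  private MultipartDownloadDefaults() {
    // utility class: no instances
  }
}
```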





[GitHub] [hadoop] noslowerdna commented on a change in pull request #645: HADOOP-16132 Support multipart download in S3AFileSystem

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #645: HADOOP-16132 Support 
multipart download in S3AFileSystem
URL: https://github.com/apache/hadoop/pull/645#discussion_r269699483
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
 ##
 @@ -721,4 +721,24 @@ private Constants() {
* Default change detection require version: true.
*/
   public static final boolean CHANGE_DETECT_REQUIRE_VERSION_DEFAULT = true;
+
+  public static final String MULTIPART_DOWNLOAD_ENABLED =
 
 Review comment:
   Some documentation should be added to hadoop-aws/index.md and 
hadoop-aws/performance.md about this capability.





[GitHub] [hadoop] noslowerdna commented on a change in pull request #645: HADOOP-16132 Support multipart download in S3AFileSystem

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #645: HADOOP-16132 Support 
multipart download in S3AFileSystem
URL: https://github.com/apache/hadoop/pull/645#discussion_r269698667
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/multipart/AbortableS3ObjectInputStream.java
 ##
 @@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.multipart;
+
+import com.amazonaws.services.s3.model.S3ObjectInputStream;
+
+import java.io.IOException;
+
+/**
+ * An adapter between {@link AbortableInputStream} and
+ * {@link S3ObjectInputStream}.
+ */
+public class AbortableS3ObjectInputStream extends AbortableInputStream {
+
+  private final S3ObjectInputStream s3ObjectInputStream;
+
+  public AbortableS3ObjectInputStream(S3ObjectInputStream s3ObjectInputStream) 
{
 
 Review comment:
   add constructor javadoc





[GitHub] [hadoop] noslowerdna commented on a change in pull request #645: HADOOP-16132 Support multipart download in S3AFileSystem

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #645: HADOOP-16132 Support 
multipart download in S3AFileSystem
URL: https://github.com/apache/hadoop/pull/645#discussion_r269698508
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/multipart/AbortableInputStream.java
 ##
 @@ -0,0 +1,29 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.multipart;
+
+import java.io.InputStream;
+
+/**
+ * An {@link InputStream} that supports aborts.
+ */
+public abstract class AbortableInputStream extends InputStream {
+
+  public abstract void abort();
 
 Review comment:
   add method javadoc





[GitHub] [hadoop] noslowerdna commented on a change in pull request #645: HADOOP-16132 Support multipart download in S3AFileSystem

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #645: HADOOP-16132 Support 
multipart download in S3AFileSystem
URL: https://github.com/apache/hadoop/pull/645#discussion_r269698213
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
 ##
 @@ -402,6 +416,66 @@ public void initialize(URI name, Configuration 
originalConf)
   long authDirTtl = conf.getLong(METADATASTORE_AUTHORITATIVE_DIR_TTL,
   DEFAULT_METADATASTORE_AUTHORITATIVE_DIR_TTL);
   ttlTimeProvider = new S3Guard.TtlTimeProvider(authDirTtl);
+
+  S3Downloader rawS3Downloader = new S3Downloader() {
+@Override
+public AbortableInputStream download(
+String requestBucket, String key, long rangeStart, long rangeEnd,
+ChangeTracker changeTracker, String operation) throws IOException {
+  String serverSideEncryptionKey = getServerSideEncryptionKey(
+  requestBucket, getConf());
+  GetObjectRequest request = new GetObjectRequest(requestBucket, key);
+  if (S3AEncryptionMethods.SSE_C
+  .equals(getServerSideEncryptionAlgorithm()) &&
+  StringUtils.isNotBlank(serverSideEncryptionKey)) {
+request.setSSECustomerKey(
+new SSECustomerKey(serverSideEncryptionKey));
+  }
+
+  changeTracker.maybeApplyConstraint(request);
+
+  String text = String.format("%s %s at %d",
+  operation, uri, rangeStart);
+
+  S3Object object = Invoker.once(text, "s3a://" + key + "/" + bucket,
 
 Review comment:
   Do the `key` and `bucket` need to be switched here? Also, use 
`requestBucket` instead of `bucket`?





[GitHub] [hadoop] noslowerdna commented on a change in pull request #645: HADOOP-16132 Support multipart download in S3AFileSystem

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #645: HADOOP-16132 Support 
multipart download in S3AFileSystem
URL: https://github.com/apache/hadoop/pull/645#discussion_r269696754
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
 ##
 @@ -721,4 +721,24 @@ private Constants() {
* Default change detection require version: true.
*/
   public static final boolean CHANGE_DETECT_REQUIRE_VERSION_DEFAULT = true;
+
+  public static final String MULTIPART_DOWNLOAD_ENABLED =
+  "fs.s3a.multipartdownload.enabled";
+  public static final boolean DEFAULT_MULTIPART_DOWNLOAD_ENABLED = false;
+
+  public static final String MULTIPART_DOWNLOAD_PART_SIZE =
+  "fs.s3a.multipartdownload.part-size";
+  public static final long DEFAULT_MULTIPART_DOWNLOAD_PART_SIZE = 8_000_000;
+
+  public static final String MULTIPART_DOWNLOAD_CHUNK_SIZE =
+  "fs.s3a.multipartdownload.chunk-size";
+  public static final long DEFAULT_MULTIPART_DOWNLOAD_CHUNK_SIZE = 262_144;
+
+  public static final String MULTIPART_DOWNLOAD_BUFFER_SIZE =
+  "fs.s3a.multipartdownload.buffer-size";
+  public static final long DEFAULT_MULTIPART_DOWNLOAD_BUFFER_SIZE = 20_000_000;
+
+  public static final String MULTIPART_DOWNLOAD_NUM_THREADS =
+  "fs.s3a.multipartdownload.num-threads";
+  public static final int DEFAULT_MULTIPART_DOWNLOAD_NUM_THREADS = 8;
 
 Review comment:
   why not reuse fs.s3a.threads.max (and ideally, the existing thread pool 
corresponding to that config)?





[GitHub] [hadoop] noslowerdna commented on a change in pull request #645: HADOOP-16132 Support multipart download in S3AFileSystem

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #645: HADOOP-16132 Support 
multipart download in S3AFileSystem
URL: https://github.com/apache/hadoop/pull/645#discussion_r269695258
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
 ##
 @@ -721,4 +721,24 @@ private Constants() {
* Default change detection require version: true.
*/
   public static final boolean CHANGE_DETECT_REQUIRE_VERSION_DEFAULT = true;
+
+  public static final String MULTIPART_DOWNLOAD_ENABLED =
+  "fs.s3a.multipartdownload.enabled";
+  public static final boolean DEFAULT_MULTIPART_DOWNLOAD_ENABLED = false;
+
+  public static final String MULTIPART_DOWNLOAD_PART_SIZE =
+  "fs.s3a.multipartdownload.part-size";
+  public static final long DEFAULT_MULTIPART_DOWNLOAD_PART_SIZE = 8_000_000;
+
+  public static final String MULTIPART_DOWNLOAD_CHUNK_SIZE =
+  "fs.s3a.multipartdownload.chunk-size";
+  public static final long DEFAULT_MULTIPART_DOWNLOAD_CHUNK_SIZE = 262_144;
 
 Review comment:
   the javadoc should explain the difference between part size and chunk size, 
also why a new configuration property is needed instead of reusing 
fs.s3a.multipart.size





[GitHub] [hadoop] noslowerdna commented on a change in pull request #645: HADOOP-16132 Support multipart download in S3AFileSystem

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #645: HADOOP-16132 Support 
multipart download in S3AFileSystem
URL: https://github.com/apache/hadoop/pull/645#discussion_r269694762
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
 ##
 @@ -721,4 +721,24 @@ private Constants() {
* Default change detection require version: true.
*/
   public static final boolean CHANGE_DETECT_REQUIRE_VERSION_DEFAULT = true;
+
+  public static final String MULTIPART_DOWNLOAD_ENABLED =
 
 Review comment:
   should add javadoc for each of these, and add to core-default.xml
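A sketch of what the suggested core-default.xml entries might look like; the property names and default values are taken from the quoted diff, the descriptions are placeholders only:

```xml
<property>
  <name>fs.s3a.multipartdownload.enabled</name>
  <value>false</value>
  <description>Whether to download S3 objects in multiple parallel parts.</description>
</property>

<property>
  <name>fs.s3a.multipartdownload.part-size</name>
  <value>8000000</value>
  <description>Size, in bytes, of each part fetched by the multipart
    downloader.</description>
</property>
```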





[GitHub] [hadoop] noslowerdna commented on a change in pull request #624: HADOOP-15999. S3Guard: Better support for out-of-band operations

2019-03-27 Thread GitBox
noslowerdna commented on a change in pull request #624: HADOOP-15999. S3Guard: 
Better support for out-of-band operations
URL: https://github.com/apache/hadoop/pull/624#discussion_r269690083
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
 ##
 @@ -2382,6 +2392,38 @@ S3AFileStatus innerGetFileStatus(final Path f,
 "deleted by S3Guard");
   }
 
+  // if ms is not authoritative, check S3 if there's any recent
+  // modification - compare the modTime to check if metadata is up to date
+  // Skip going to s3 if the file checked is a directory. Because if the
+  // dest is also a directory, there's no difference.
+  // TODO After HADOOP-16085 the modification detection can be done with
+  //  etags or object version instead of modTime
+  if (!pm.getFileStatus().isDirectory() &&
+  !allowAuthoritative) {
+LOG.debug("Metadata for {} found in the non-auth metastore.", path);
+final long msModTime = pm.getFileStatus().getModificationTime();
+
+S3AFileStatus s3AFileStatus;
+try {
+  s3AFileStatus = s3GetFileStatus(path, key, tombstones);
+} catch (FileNotFoundException fne) {
+  s3AFileStatus = null;
+}
+if (s3AFileStatus == null) {
+  LOG.warn("Failed to find file {}. Either it is not yet visible, or "
+  + "it has been deleted.", path);
+} else {
+  final long s3ModTime = s3AFileStatus.getModificationTime();
+
+  if(s3ModTime > msModTime) {
+LOG.debug("S3Guard metadata for {} is outdated, updating it",
 
 Review comment:
   might include the 2 mod times in the debug log
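
   A small sketch of the suggestion, logging both timestamps when the
metastore entry is judged stale. The staleness rule mirrors the quoted
`s3ModTime > msModTime` check; the message wording and helper name are
assumptions, not the actual patch:

```java
// Illustrates including both modification times in the debug message when
// deciding the S3Guard metastore entry is outdated.
public class ModTimeCompare {

  // Same comparison as the quoted S3AFileSystem snippet.
  static boolean isMetastoreOutdated(long s3ModTime, long msModTime) {
    return s3ModTime > msModTime;
  }

  public static void main(String[] args) {
    long msModTime = 1_000L;  // modification time recorded in the metastore
    long s3ModTime = 2_000L;  // modification time reported by S3
    if (isMetastoreOutdated(s3ModTime, msModTime)) {
      // The reviewer's suggestion: surface both times, not just the decision.
      System.out.printf(
          "S3Guard metadata is outdated (s3ModTime=%d > msModTime=%d), updating it%n",
          s3ModTime, msModTime);
    }
  }
}
```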





[jira] [Commented] (HADOOP-15183) S3Guard store becomes inconsistent after partial failure of rename

2019-03-27 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803109#comment-16803109
 ] 

Steve Loughran commented on HADOOP-15183:
-

Back working on this; done in a new PR.

Revisiting the code, one thing I see we need to look at is the problem of a 
partial delete where a number of children have all been deleted. Are we 
confident that there are no longer any parent entries which are in that list of 
deleted files? That is, it is never the case that the list of files to rm is

/dir1/
/dir1/file1
/dir1/file2

And so in a partial delete failure (say of file2), is there a dir1 entry to 
purge? I'm not convinced this situation will arise: every path passed into the 
delete will be independent. 





> S3Guard store becomes inconsistent after partial failure of rename
> --
>
> Key: HADOOP-15183
> URL: https://issues.apache.org/jira/browse/HADOOP-15183
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15183-001.patch, HADOOP-15183-002.patch, 
> org.apache.hadoop.fs.s3a.auth.ITestAssumeRole-output.txt
>
>
> If an S3A rename() operation fails partway through, such as when the user 
> doesn't have permissions to delete the source files after copying to the 
> destination, then the s3guard view of the world ends up inconsistent. In 
> particular the sequence
>  (assuming src/file* is a list of files file1...file10 and read only to 
> caller)
>
> # create file rename src/file1 dest/ ; expect AccessDeniedException in the 
> delete, dest/file1 will exist
> # delete file dest/file1
> # rename src/file* dest/  ; expect failure
> # list dest; you will not see dest/file1
> You will not see file1 in the listing, presumably because it will have a 
> tombstone marker and the update at the end of the rename() didn't take place: 
> the old data is still there.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-15183) S3Guard store becomes inconsistent after partial failure of rename

2019-03-27 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-15183:
---

Assignee: Steve Loughran  (was: Gabor Bota)

> S3Guard store becomes inconsistent after partial failure of rename
> --
>
> Key: HADOOP-15183
> URL: https://issues.apache.org/jira/browse/HADOOP-15183
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15183-001.patch, HADOOP-15183-002.patch, 
> org.apache.hadoop.fs.s3a.auth.ITestAssumeRole-output.txt
>
>
> If an S3A rename() operation fails partway through, such as when the user 
> doesn't have permissions to delete the source files after copying to the 
> destination, then the s3guard view of the world ends up inconsistent. In 
> particular the sequence
>  (assuming src/file* is a list of files file1...file10 and read only to 
> caller)
>
> # create file rename src/file1 dest/ ; expect AccessDeniedException in the 
> delete, dest/file1 will exist
> # delete file dest/file1
> # rename src/file* dest/  ; expect failure
> # list dest; you will not see dest/file1
> You will not see file1 in the listing, presumably because it will have a 
> tombstone marker and the update at the end of the rename() didn't take place: 
> the old data is still there.






[jira] [Commented] (HADOOP-16214) Kerberos name implementation in Hadoop does not accept principals with more than two components

2019-03-27 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803096#comment-16803096
 ] 

Eric Yang commented on HADOOP-16214:


RFC 1510 briefly describes that multiple components are allowed, and gives an 
example of using two components for service principals. Yes, this is a bug, and 
it is good to fix.
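
An illustration of the principal shapes involved; the split below is not
Hadoop's actual KerberosName regex, just a simple way to count the components
before the realm:

```java
// Counts the slash-separated components of a Kerberos principal's name part.
public class PrincipalComponents {

  static int componentCount(String principal) {
    int at = principal.lastIndexOf('@');
    String name = at >= 0 ? principal.substring(0, at) : principal;
    return name.split("/").length;
  }

  public static void main(String[] args) {
    System.out.println(componentCount("alice@EXAMPLE.COM"));                // 1
    System.out.println(componentCount("hdfs/nn.example.com@EXAMPLE.COM"));  // 2
    // Three components: valid per the Kerberos spec, but rejected by Hadoop's
    // KerberosName with a "Malformed Kerberos name:" error.
    System.out.println(componentCount("a/b/c@EXAMPLE.COM"));                // 3
  }
}
```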

> Kerberos name implementation in Hadoop does not accept principals with more 
> than two components
> ---
>
> Key: HADOOP-16214
> URL: https://issues.apache.org/jira/browse/HADOOP-16214
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: auth
>Reporter: Issac Buenrostro
>Priority: Major
>
> org.apache.hadoop.security.authentication.util.KerberosName is in charge of 
> converting a Kerberos principal to a user name in Hadoop for all of the 
> services requiring authentication.
> Although the Kerberos spec 
> ([https://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html])
>  allows for an arbitrary number of components in the principal, the Hadoop 
> implementation will throw a "Malformed Kerberos name:" error if the principal 
> has more than two components (because the regex can only read serviceName and 
> hostName).






[GitHub] [hadoop] bgaborg commented on issue #630: HADOOP-15999 S3Guard OOB: improve test resilience and probes

2019-03-27 Thread GitBox
bgaborg commented on issue #630: HADOOP-15999 S3Guard OOB: improve test 
resilience and probes
URL: https://github.com/apache/hadoop/pull/630#issuecomment-477264070
 
 
   Tested against ireland. Found one flaky/intermittent race condition issue 
fixed in https://issues.apache.org/jira/browse/HADOOP-16186. Otherwise it looks 
good.





[jira] [Commented] (HADOOP-16186) NPE in ITestS3AFileSystemContract teardown in DynamoDBMetadataStore.lambda$listChildren

2019-03-27 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803094#comment-16803094
 ] 

Gabor Bota commented on HADOOP-16186:
-

Updated the PR, tested and commented on PR.

> NPE in ITestS3AFileSystemContract teardown in  
> DynamoDBMetadataStore.lambda$listChildren
> 
>
> Key: HADOOP-16186
> URL: https://issues.apache.org/jira/browse/HADOOP-16186
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> Test run options. NPE in test teardown
> {code}
> -Dparallel-tests -DtestsThreadCount=6 -Ds3guard -Ddynamodb
> {code}
> If you look at the code, its *exactly* the place fixed in HADOOP-15827, a 
> change which HADOOP-15947 reverted. 
> There's clearly some codepath which can surface which is causing failures in 
> some situations, and having multiple patches switching between the && and || 
> operators isn't going to to fix it






[GitHub] [hadoop] xiaoyuyao commented on a change in pull request #632: HDDS-1255. Refactor ozone acceptance test to allow run in secure mode. Contributed by Ajay Kumar.

2019-03-27 Thread GitBox
xiaoyuyao commented on a change in pull request #632: HDDS-1255. Refactor ozone 
acceptance test to allow run in secure mode. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/632#discussion_r269678333
 
 

 File path: hadoop-ozone/dist/src/main/smoketest/s3/commonawslib.robot
 @@ -15,6 +15,7 @@
 
 *** Settings ***
 Resource../commonlib.robot
+Resource../commonlib.robot
 
 Review comment:
   dup line 18 can be removed.





[GitHub] [hadoop] bgaborg commented on issue #628: HADOOP-16186. NPE in ITestS3AFileSystemContract teardown in DynamoDBMetadataStore.lambda$listChildren

2019-03-27 Thread GitBox
bgaborg commented on issue #628: HADOOP-16186. NPE in 
ITestS3AFileSystemContract teardown in DynamoDBMetadataStore.lambda$listChildren
URL: https://github.com/apache/hadoop/pull/628#issuecomment-477263379
 
 
   Tested against ireland. 
   I still have one known issue: 
https://issues.apache.org/jira/browse/HADOOP-16207. 
   Maybe we can pull this, and fix that issue later?
   
   > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
66.011 s <<< FAILURE! - in 
org.apache.hadoop.fs.s3a.commit.staging.integration.ITestDirectoryCommitMRJob
   [ERROR] 
testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITestDirectoryCommitMRJob)
  Time elapsed: 37.181 s  <<< ERROR!
   java.io.FileNotFoundException: Path 
s3a://cloudera-dev-gabor-ireland/fork-0006/test/DELAY_LISTING_ME/testMRJob is 
recorded as deleted by S3Guard
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2369)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2343)
at 
org.apache.hadoop.fs.contract.ContractTestUtils.assertIsDirectory(ContractTestUtils.java:559)
at 
org.apache.hadoop.fs.contract.AbstractFSContractTestBase.assertIsDirectory(AbstractFSContractTestBase.java:327)
at 
org.apache.hadoop.fs.s3a.commit.AbstractITCommitMRJob.testMRJob(AbstractITCommitMRJob.java:133)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)





[GitHub] [hadoop] xiaoyuyao commented on a change in pull request #641: HDDS-1318. Fix MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.

2019-03-27 Thread GitBox
xiaoyuyao commented on a change in pull request #641: HDDS-1318. Fix 
MalformedTracerStateStringException on DN logs. Contributed by Xiaoyu Yao.
URL: https://github.com/apache/hadoop/pull/641#discussion_r269675601
 
 

 File path: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/ozShell/TestOzoneShell.java
 @@ -919,13 +926,19 @@ public void testGetKey() throws Exception {
 bucket.createKey(keyName, dataStr.length());
 keyOutputStream.write(dataStr.getBytes());
 keyOutputStream.close();
+assertFalse("put key without malformed tracing",
+logs.getOutput().contains("MalformedTracerStateString"));
+logs.clearOutput();
 
 String tmpPath = baseDir.getAbsolutePath() + "/testfile-"
 + UUID.randomUUID().toString();
 String[] args = new String[] {"key", "get",
 url + "/" + volumeName + "/" + bucketName + "/" + keyName,
 tmpPath};
 execute(shell, args);
+assertFalse("get key without malformed tracing",
 
 Review comment:
   additional unit test added.





[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269675602
 
 

 File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/ChangeTracker.java
 @@ -148,16 +177,77 @@ public void processResponse(final S3Object object,
   }
 }
 
-final ObjectMetadata metadata = object.getObjectMetadata();
+processMetadata(object.getObjectMetadata(), operation, pos);
+  }
+
+  /**
+   * Process the response from the server for validation against the
+   * change policy.
+   * @param copyResult result of a copy operation
+   * @throws PathIOException raised on failure
+   * @throws RemoteFileChangedException if the remote file has changed.
+   */
+  public void processResponse(final CopyResult copyResult)
 
 Review comment:
   Actually, one thing I can do here is enforce 
fs.s3a.change.detection.version.required (if set) to make sure the CopyResult 
has an ETag or versionId if one is expected.  I'll add that.
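
   The enforcement described might look roughly like the sketch below; the
method and parameter names are assumptions for illustration, not the actual
ChangeTracker API:

```java
// Hedged sketch: reject a copy result that carries neither an ETag nor a
// versionId when fs.s3a.change.detection.version.required is set.
public class CopyResultVersionCheck {

  static void checkCopyResult(boolean versionRequired, String eTag, String versionId) {
    if (versionRequired && eTag == null && versionId == null) {
      throw new IllegalStateException(
          "fs.s3a.change.detection.version.required is set,"
              + " but the copy result has neither an ETag nor a versionId");
    }
  }

  public static void main(String[] args) {
    checkCopyResult(true, "\"abc123\"", null);  // ok: ETag present
    checkCopyResult(false, null, null);         // ok: version not required
    try {
      checkCopyResult(true, null, null);        // expected to fail
    } catch (IllegalStateException expected) {
      System.out.println("rejected: " + expected.getMessage());
    }
  }
}
```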





[GitHub] [hadoop] avijayanhwx commented on a change in pull request #648: HDDS-1340. Add List Containers API for Recon

2019-03-27 Thread GitBox
avijayanhwx commented on a change in pull request #648: HDDS-1340. Add List 
Containers API for Recon
URL: https://github.com/apache/hadoop/pull/648#discussion_r269657755
 
 

 File path: hadoop-ozone/ozone-recon/src/main/java/org/apache/hadoop/ozone/recon/spi/impl/ContainerDBServiceProviderImpl.java
 @@ -163,4 +165,21 @@ public Integer getCountForForContainerKeyPrefix(
 return prefixes;
   }
 
+  /**
+   * Iterate the DB to construct a unique set of containerIDs.
+   *
+   * @return List of containerIDs.
+   * @throws IOException
+   */
+  @Override
+  public Set getContainerIDList() throws IOException {
+Set containerIDs = new HashSet<>();
 
 Review comment:
   (Minor) LinkedHashSet may be better if we want to preserve containerId 
ordering on iteration.
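
   The contrast behind the suggestion: LinkedHashSet iterates in insertion
order, while HashSet guarantees no order at all. A minimal illustration (the
real IDs would come from the DB scan, and the element type here is assumed):

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Demonstrates that LinkedHashSet preserves insertion order on iteration.
public class ContainerIdOrder {
  public static void main(String[] args) {
    Set<Long> ordered = new LinkedHashSet<>();
    ordered.add(30L);
    ordered.add(10L);
    ordered.add(20L);
    // Insertion order preserved: prints [30, 10, 20]
    System.out.println(ordered);
  }
}
```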





[jira] [Commented] (HADOOP-16214) Kerberos name implementation in Hadoop does not accept principals with more than two components

2019-03-27 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803018#comment-16803018
 ] 

Erik Krogen commented on HADOOP-16214:
--

I see some discussion of Hadoop's handling of Kerberos names containing slashes 
in HADOOP-12751 (starting 
[here|https://issues.apache.org/jira/browse/HADOOP-12751?focusedCommentId=15124818&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15124818]).
 It looks like eventually it was decided that it makes sense to allow Kerberos 
identities which contain slashes in a way that doesn't conform to Hadoop's 
normal expectation of {{user/host@realm}} (see 
[this|https://issues.apache.org/jira/browse/HADOOP-12751?focusedCommentId=15239016&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15239016]).
 So it seems to me that it would be worthwhile to fix this issue.

Ping [~templedf], [~steve_l], [~daryn] who have been involved in previous 
efforts in this area.

> Kerberos name implementation in Hadoop does not accept principals with more 
> than two components
> ---
>
> Key: HADOOP-16214
> URL: https://issues.apache.org/jira/browse/HADOOP-16214
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: auth
>Reporter: Issac Buenrostro
>Priority: Major
>
> org.apache.hadoop.security.authentication.util.KerberosName is in charge of 
> converting a Kerberos principal to a user name in Hadoop for all of the 
> services requiring authentication.
> Although the Kerberos spec 
> ([https://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html])
>  allows for an arbitrary number of components in the principal, the Hadoop 
> implementation will throw a "Malformed Kerberos name:" error if the principal 
> has more than two components (because the regex can only read serviceName and 
> hostName).






[GitHub] [hadoop] avijayanhwx commented on a change in pull request #648: HDDS-1340. Add List Containers API for Recon

2019-03-27 Thread GitBox
avijayanhwx commented on a change in pull request #648: HDDS-1340. Add List 
Containers API for Recon
URL: https://github.com/apache/hadoop/pull/648#discussion_r269655591
 
 

 File path: hadoop-ozone/ozone-recon/src/test/java/org/apache/hadoop/ozone/recon/api/TestContainerKeyService.java
 @@ -198,6 +198,38 @@ public void testGetKeysForContainer() throws Exception {
 assertTrue(keyMetadataList.isEmpty());
   }
 
+  @Test
+  public void testGetContainerIDList() throws Exception {
+//Take snapshot of OM DB and copy over to Recon OM DB.
+DBCheckpoint checkpoint = omMetadataManager.getStore()
 
 Review comment:
    The DB snapshot step and the ContainerKeyMapperTask run can also be moved to 
the @Before method. 





[jira] [Commented] (HADOOP-16207) intermittent failure of S3A test ITestDirectoryCommitMRJob

2019-03-27 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16802995#comment-16802995
 ] 

Steve Loughran commented on HADOOP-16207:
-

could be more fundamental as in "I'm not sure the committers are correctly 
telling S3Guard about parent directories".

After each PUT is manifest, we call finishedWrite(), but that seems to be more 
about purging spurious deleted files than about adding dir entries into 
S3Guard. Provided mkdirs() is called 

Proposed: build a list of all directories which need to exist as part of a job 
commit, and only create those entries

The other strategy is for initiateMultipartUpload() to mkdir on the parent. 
That's a bit more expensive though. Better to not worry about whether it exists 
and do all this stuff during job commit only
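
The proposed strategy — collect every directory that must exist from the set of
committed paths, then create those entries once at job commit — might look
roughly like the sketch below. java.nio paths stand in for Hadoop's Path, and
the helper name is invented; this is purely illustrative:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch: derive the unique set of ancestor directories from committed files,
// so directory entries can be created once during job commit.
public class CommitDirs {

  static Set<Path> parentDirs(List<Path> committedFiles) {
    Set<Path> dirs = new LinkedHashSet<>();
    for (Path file : committedFiles) {
      // Walk up to the root, recording every ancestor directory once.
      for (Path p = file.getParent(); p != null; p = p.getParent()) {
        dirs.add(p);
      }
    }
    return dirs;
  }

  public static void main(String[] args) {
    List<Path> files = List.of(
        Paths.get("/dest/part-0000"),
        Paths.get("/dest/sub/part-0001"));
    // Unique ancestors of both files: [/dest, /, /dest/sub]
    System.out.println(parentDirs(files));
  }
}
```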

> intermittent failure of S3A test ITestDirectoryCommitMRJob
> --
>
> Key: HADOOP-16207
> URL: https://issues.apache.org/jira/browse/HADOOP-16207
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
>
> Reported failure of {{ITestDirectoryCommitMRJob}} in validation runs of 
> HADOOP-16186; assertIsDirectory with s3guard enabled and a parallel test run: 
> Path "is recorded as deleted by S3Guard"
> {code}
> waitForConsistency();
> assertIsDirectory(outputPath) /* here */
> {code}
> The file is there but there's a tombstone. Possibilities
> * some race condition with another test
> * tombstones aren't timing out
> * committers aren't creating that base dir in a way which cleans up S3Guard's 
> tombstones. 
> Remember: we do have to delete that dest dir before the committer runs unless 
> overwrite==true, so at the start of the run there will be a tombstone. It 
> should be overwritten by a success.






[jira] [Updated] (HADOOP-16214) Kerberos name implementation in Hadoop does not accept principals with more than two components

2019-03-27 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HADOOP-16214:
-
Description: 
org.apache.hadoop.security.authentication.util.KerberosName is in charge of 
converting a Kerberos principal to a user name in Hadoop for all of the 
services requiring authentication.

Although the Kerberos spec 
([https://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html])
 allows for an arbitrary number of components in the principal, the Hadoop 
implementation will throw a "Malformed Kerberos name:" error if the principal 
has more than two components (because the regex can only read serviceName and 
hostName).

  was:
org.apache.hadoop.security.authentication.util.KerberosName is in charge of 
converting a Kerberos principal to a user name in Hadoop for all of the 
services requiring authentication.

Although the Kerberos spec 
([https://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html)]
 allows for an arbitrary number of components in the principal, the Hadoop 
implementation will throw a "Malformed Kerberos name:" error if the principal 
has more than two components (because the regex can only read serviceName and 
hostName).


> Kerberos name implementation in Hadoop does not accept principals with more 
> than two components
> ---
>
> Key: HADOOP-16214
> URL: https://issues.apache.org/jira/browse/HADOOP-16214
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: auth
>Reporter: Issac Buenrostro
>Priority: Major
>
> org.apache.hadoop.security.authentication.util.KerberosName is in charge of 
> converting a Kerberos principal to a user name in Hadoop for all of the 
> services requiring authentication.
> Although the Kerberos spec 
> ([https://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html])
>  allows for an arbitrary number of components in the principal, the Hadoop 
> implementation will throw a "Malformed Kerberos name:" error if the principal 
> has more than two components (because the regex can only read serviceName and 
> hostName).






[jira] [Created] (HADOOP-16214) Kerberos name implementation in Hadoop does not accept principals with more than two components

2019-03-27 Thread Issac Buenrostro (JIRA)
Issac Buenrostro created HADOOP-16214:
-

 Summary: Kerberos name implementation in Hadoop does not accept 
principals with more than two components
 Key: HADOOP-16214
 URL: https://issues.apache.org/jira/browse/HADOOP-16214
 Project: Hadoop Common
  Issue Type: Bug
  Components: auth
Reporter: Issac Buenrostro


org.apache.hadoop.security.authentication.util.KerberosName is in charge of 
converting a Kerberos principal to a user name in Hadoop for all of the 
services requiring authentication.

Although the Kerberos spec 
([https://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html)]
 allows for an arbitrary number of components in the principal, the Hadoop 
implementation will throw a "Malformed Kerberos name:" error if the principal 
has more than two components (because the regex can only read serviceName and 
hostName).






[jira] [Updated] (HADOOP-16207) intermittent failure of S3A test ITestDirectoryCommitMRJob

2019-03-27 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-16207:

Priority: Critical  (was: Minor)

> intermittent failure of S3A test ITestDirectoryCommitMRJob
> --
>
> Key: HADOOP-16207
> URL: https://issues.apache.org/jira/browse/HADOOP-16207
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
>
> Reported failure of {{ITestDirectoryCommitMRJob}} in validation runs of 
> HADOOP-16186; assertIsDirectory with s3guard enabled and a parallel test run: 
> Path "is recorded as deleted by S3Guard"
> {code}
> waitForConsistency();
> assertIsDirectory(outputPath) /* here */
> {code}
> The file is there but there's a tombstone. Possibilities
> * some race condition with another test
> * tombstones aren't timing out
> * committers aren't creating that base dir in a way which cleans up S3Guard's 
> tombstones. 
> Remember: we do have to delete that dest dir before the committer runs unless 
> overwrite==true, so at the start of the run there will be a tombstone. It 
> should be overwritten by a success.






[jira] [Commented] (HADOOP-15960) Update guava to 27.0-jre in hadoop-project

2019-03-27 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16802949#comment-16802949
 ] 

Gabor Bota commented on HADOOP-15960:
-

h2 Testing with HBase

Tested HBase with hadoop-3.0.4-snapshot patched with guava updated to v27. 
Created HBASE-22109 to fix issues on compile time. With that fix, I was able to 
compile and test. [~psomogyi] helped me to figure out that the test failures 
and errors were not related to this change or flaky or presumably because of 
wrong environment setup. We still have to run {{hbase/hbase-it}}, but other 
than that we are good to go imho.

> Update guava to 27.0-jre in hadoop-project
> --
>
> Key: HADOOP-15960
> URL: https://issues.apache.org/jira/browse/HADOOP-15960
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common, security
>Affects Versions: 2.7.3, 3.1.0, 3.2.0, 3.0.3, 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Critical
> Attachments: HADOOP-15960-branch-3.0.001.patch, 
> HADOOP-15960-branch-3.1.001.patch, HADOOP-15960-branch-3.2.001.patch, 
> HADOOP-15960.000.WIP.patch, HADOOP-15960.001.patch
>
>
> com.google.guava:guava should be upgraded to 27.0-jre due to new CVE's found 
> [CVE-2018-10237|https://nvd.nist.gov/vuln/detail/CVE-2018-10237].






[jira] [Comment Edited] (HADOOP-15960) Update guava to 27.0-jre in hadoop-project

2019-03-27 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16802949#comment-16802949
 ] 

Gabor Bota edited comment on HADOOP-15960 at 3/27/19 3:47 PM:
--

h2. Testing with HBase

Tested HBase master with hadoop-3.0.4-snapshot patched with guava updated to 
v27. Created HBASE-22109 to fix compile-time issues. With that fix, I was 
able to compile and test. [~psomogyi] helped me figure out that the remaining 
test failures and errors were not related to this change; they were either 
flaky or presumably caused by a wrong environment setup. We still have to run 
{{hbase/hbase-it}}, but other than that we are good to go imho.


was (Author: gabor.bota):
h2 Testing with HBase

Tested HBase with hadoop-3.0.4-snapshot patched with guava updated to v27. 
Created HBASE-22109 to fix issues on compile time. With that fix, I was able to 
compile and test. [~psomogyi] helped me to figure out that the test failures 
and errors were not related to this change or flaky or presumably because of 
wrong environment setup. We still have to run {{hbase/hbase-it}}, but other 
than that we are good to go imho.







[GitHub] [hadoop] ben-roling commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite

2019-03-27 Thread GitBox
ben-roling commented on a change in pull request #646: HADOOP-16085: use object 
version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269596801
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
 ##
 @@ -3387,26 +3489,41 @@ public EtagChecksum getFileChecksum(Path f, final long length)
   @Retries.OnceTranslated
   public RemoteIterator<LocatedFileStatus> listFiles(Path f,
       boolean recursive) throws FileNotFoundException, IOException {
-    return innerListFiles(f, recursive,
-        new Listing.AcceptFilesOnly(qualify(f)));
+    return toLocatedFileStatusIterator(innerListFiles(f, recursive,
+        new Listing.AcceptFilesOnly(qualify(f))));
+  }
+
+  private static RemoteIterator<LocatedFileStatus> toLocatedFileStatusIterator(
 
 Review comment:
   I'm not sure if there is a better way to do this. I have 
`RemoteIterator<S3ALocatedFileStatus>` but need to return 
`RemoteIterator<LocatedFileStatus>` from the public methods like listFiles(), 
so I use this to do the conversion.
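
The conversion described above can be sketched as a thin wrapper that widens 
the iterator's element type. This is a hand-rolled illustration of the shape 
of such an adapter, not the code from the patch; the stand-in `RemoteIterator` 
below also omits the `throws IOException` that Hadoop's real interface 
declares, to keep the sketch short:

```java
import java.util.Iterator;
import java.util.List;

public class RemoteIteratorAdapter {

  // Minimal stand-in for Hadoop's org.apache.hadoop.fs.RemoteIterator;
  // the real interface declares "throws IOException" on both methods.
  interface RemoteIterator<E> {
    boolean hasNext();
    E next();
  }

  // Widens a RemoteIterator<S> to RemoteIterator<T> where S extends T --
  // the same idea as adapting RemoteIterator<S3ALocatedFileStatus> to
  // RemoteIterator<LocatedFileStatus>. Safe because the wrapper only ever
  // hands elements out, never puts them in.
  static <T, S extends T> RemoteIterator<T> widen(final RemoteIterator<S> source) {
    return new RemoteIterator<T>() {
      public boolean hasNext() { return source.hasNext(); }
      public T next() { return source.next(); }
    };
  }

  public static void main(String[] args) {
    Iterator<Integer> backing = List.of(1, 2, 3).iterator();
    RemoteIterator<Integer> ints = new RemoteIterator<Integer>() {
      public boolean hasNext() { return backing.hasNext(); }
      public Integer next() { return backing.next(); }
    };
    RemoteIterator<Number> nums = widen(ints);  // Integer widened to Number
    long sum = 0;
    while (nums.hasNext()) {
      sum += nums.next().longValue();
    }
    System.out.println(sum);  // prints 6
  }
}
```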


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[jira] [Updated] (HADOOP-16211) Update guava to 27.0-jre in hadoop-project branch-3.2

2019-03-27 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16211:

Target Version/s: 3.2.1  (was: 3.2.1, 3.1.3)

> Update guava to 27.0-jre in hadoop-project branch-3.2
> -
>
> Key: HADOOP-16211
> URL: https://issues.apache.org/jira/browse/HADOOP-16211
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.2.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> com.google.guava:guava should be upgraded to 27.0-jre due to a new CVE, 
> CVE-2018-10237.
> This is a sub-task for branch-3.2 from HADOOP-15960 to track issues on that 
> particular branch. 






[jira] [Updated] (HADOOP-16211) Update guava to 27.0-jre in hadoop-project branch-3.2

2019-03-27 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16211:

Affects Version/s: (was: 3.1.2)
   (was: 3.1.1)
   (was: 3.1.0)







[jira] [Updated] (HADOOP-15960) Update guava to 27.0-jre in hadoop-project

2019-03-27 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-15960:

Status: In Progress  (was: Patch Available)







[jira] [Updated] (HADOOP-16213) Update guava to 27.0-jre in hadoop-project branch-3.1

2019-03-27 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16213:

Target Version/s: 3.1.3  (was: 3.2.1, 3.1.3)

> Update guava to 27.0-jre in hadoop-project branch-3.1
> -
>
> Key: HADOOP-16213
> URL: https://issues.apache.org/jira/browse/HADOOP-16213
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.1.0, 3.1.1, 3.1.2
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> com.google.guava:guava should be upgraded to 27.0-jre due to a new CVE, 
> CVE-2018-10237.
> This is a sub-task for branch-3.1 from HADOOP-15960 to track issues on that 
> particular branch.





