[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1470822345

there are some new warnings about deprecation and unchecked casts. you can add `@SuppressWarnings` tags to suppress these

```
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java:801:45:[unchecked] unchecked cast
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java:806:45:[unchecked] unchecked cast
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java:813:41:[unchecked] unchecked cast
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java:819:43:[unchecked] unchecked cast
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AEndpointRegion.java:184:4:[deprecation] S3ClientFactory in org.apache.hadoop.fs.s3a has been deprecated
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AEndpointRegion.java:185:14:[deprecation] S3ClientFactory in org.apache.hadoop.fs.s3a has been deprecated
```

a lot of what should be unit tests are now failing with no credentials. try checking out a copy of this pr without any auth-keys.xml, unset your AWS_ env vars and see if these tests fail for you locally, then fix.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
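For anyone unsure how the two kinds of javac warning listed in that comment are usually silenced, here is a minimal sketch. `LegacyFactory`, `castList` and `useLegacy` are hypothetical names invented for illustration; they are not the code in `S3AUtils` or `ITestS3AEndpointRegion`. The idea is to put the annotation on the smallest method that needs it rather than on the whole class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a deprecated type such as S3ClientFactory. */
@Deprecated
interface LegacyFactory {
  String create();
}

class SuppressWarningsSketch {

  /** The generic cast cannot be checked at runtime, so javac reports [unchecked]. */
  @SuppressWarnings("unchecked")
  static <T> List<T> castList(Object value) {
    return (List<T>) value;
  }

  /** Referring to a @Deprecated type from another class triggers [deprecation]. */
  @SuppressWarnings("deprecation")
  static String useLegacy() {
    LegacyFactory factory = () -> "client";
    return factory.create();
  }

  public static void main(String[] args) {
    List<String> names = new ArrayList<>();
    names.add("a");
    List<String> cast = castList(names);
    System.out.println(cast.get(0) + " " + useLegacy());
  }
}
```

Keeping the annotation at method scope means any new unchecked cast or deprecated use elsewhere in the class still shows up in the build output.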
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1484983979

`Endpoint is not set || Endpoint is set && ends in amazonaws.com || ARN is set` is roughly what storediag does: https://github.com/steveloughran/cloudstore/blob/trunk/src/main/java/org/apache/hadoop/fs/store/diag/S3ADiagnosticsInfo.java#L672

it also looks for amazonaws.cn; not sure about where else it should probe.

having a flag "is.aws" would be good as a single switch to move to all 3rd party stuff, but the problem there is that features there may change over time too; fs.s3a.store.vendor would let you have a table of providers (aws, ozone, hitachi, amplidata, minio, netapp) and choose the right settings w.r.t path vs hostname, checksums, etc.

we'd have the vendor settings from properties too, e.g. fs.s3a.vendor.ozone.change.detection.mode = none, so they could be overridden in core-site/per-bucket.

that could actually simplify a lot of our internal "doesn't work with vendor XYZ" issues, where problems 1 and 2 are

1. you forgot to set the endpoint and aws are rejecting you/your bucket
2. use path resolution
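The endpoint predicate quoted at the top of that comment could be sketched roughly as below. This is an illustration only, not the storediag or S3A implementation: the class name, method names and suffix list are assumptions. The `amazonaws.com.cn` entry is added here because AWS China region endpoints use that suffix; `amazonaws.cn` is kept because the comment names it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class AwsEndpointCheck {

  /** Assumed suffix list; would need checking against the real endpoint names. */
  static final List<String> AWS_SUFFIXES =
      Arrays.asList("amazonaws.com", "amazonaws.com.cn", "amazonaws.cn");

  /** Reduce "https://host:port/path" or a bare hostname to the lower-cased host. */
  static String hostOf(String endpoint) {
    String s = endpoint.trim().toLowerCase(Locale.ROOT);
    int scheme = s.indexOf("://");
    if (scheme >= 0) {
      s = s.substring(scheme + 3);
    }
    int slash = s.indexOf('/');
    if (slash >= 0) {
      s = s.substring(0, slash);
    }
    int colon = s.indexOf(':');
    if (colon >= 0) {
      s = s.substring(0, colon);
    }
    return s;
  }

  /** endpoint not set || endpoint ends in an AWS suffix || ARN is set. */
  static boolean isAwsEndpoint(String endpoint, boolean arnSet) {
    if (arnSet || endpoint == null || endpoint.trim().isEmpty()) {
      return true;
    }
    String host = hostOf(endpoint);
    for (String suffix : AWS_SUFFIXES) {
      if (host.equals(suffix) || host.endsWith("." + suffix)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(isAwsEndpoint("https://s3.eu-west-2.amazonaws.com", false));
    System.out.println(isAwsEndpoint("http://localhost:9000", false));
  }
}
```

Matching on `"." + suffix` rather than a bare `endsWith(suffix)` avoids treating a host such as `notamazonaws.com` as AWS.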
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1490346831 i don't know about those third party stores; someone in your sdk team probably knows better there. I think generally it is just account + secret, unless something like kerberos/active directory is used
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1490348322 w.r.t merging, add the amazonaws.cn check so it doesn't get forgotten about. then we should be good to merge
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1508895984

ok, before merging, that spotbugs needs to be quietened

```
Code | Warning
-- | --
IS | Inconsistent synchronization of org.apache.hadoop.fs.s3a.S3AFileSystem.s3AsyncClient; locked 50% of time

[Bug type IS2_INCONSISTENT_SYNC (click for details)](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5421/14/artifact/out/branch-spotbugs-hadoop-tools_hadoop-aws-warnings.html#IS2_INCONSISTENT_SYNC)
In class org.apache.hadoop.fs.s3a.S3AFileSystem
Field org.apache.hadoop.fs.s3a.S3AFileSystem.s3AsyncClient
Synchronized 50% of the time
Unsynchronized access at S3AFileSystem.java:[line 1675]
Unsynchronized access at S3AFileSystem.java:[line 994]
Synchronized access at S3AFileSystem.java:[line 4051]
Synchronized access at S3AFileSystem.java:[line 4056]
```

if it is a false alarm, add an exclusion in `hadoop-tools/hadoop-aws/dev-support/findbugs-exclude.xml`

do see if there is a real risk first, before just disabling though. Assuming it is created in the unsynchronized initialize() method, if access, even internally, is through an unsync getter all should be good.
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1515158123

it looks exactly the same as the one above, doesn't it?

Sometimes with findbugs it is simplest to just surrender: create some setter which is synchronized and set it through that. If you look at `MeanStatistic` you can see how I had to sync everything to get it to STFU, even the equals() method.

BTW, if you don't know already, it is possible to run findbugs on the command line. I normally only do this and the checkstyle calls when trying to fix things they've reported through yetus

```
mvn findbugs:findbugs
```
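A minimal sketch of the "synchronized setter" approach described above, assuming the aim is simply that every read and write of the field goes through the same lock so the inconsistent-synchronization check has nothing mixed to report. `AsyncClientHolder` and its `Object` field are hypothetical stand-ins, not the real `S3AFileSystem` or its `s3AsyncClient`.

```java
class AsyncClientHolder {

  /** Hypothetical stand-in for the client field flagged by spotbugs. */
  private Object asyncClient;

  /** Initialization writes the field only through the synchronized setter. */
  void initialize() {
    setAsyncClient(new Object());
  }

  private synchronized void setAsyncClient(Object client) {
    this.asyncClient = client;
  }

  /** All reads, including internal ones, go through this accessor. */
  synchronized Object getAsyncClient() {
    return asyncClient;
  }

  /** Clearing the field uses the same setter, so the locking stays uniform. */
  void close() {
    setAsyncClient(null);
  }

  public static void main(String[] args) {
    AsyncClientHolder holder = new AsyncClientHolder();
    holder.initialize();
    System.out.println(holder.getAsyncClient() != null);
    holder.close();
  }
}
```

The cost is an uncontended lock on each access, which is usually a fair trade for a clean report; whether it is needed for correctness is a separate question from whether it quietens the tool.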
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1517736384 spotbugs is happy; deprecation warnings fine...all that is left is to get the name of the new method right
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1520403758 ok, let's give up on spotbugs. this is one of its eternal losing battles, right?
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1520404440 here's what I propose: merge then fix
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1520411654

> do you mean reverting the changes made for spotbugs (moving to a separate method)?

new method is fine, we just need to add in whatever shuts spotbugs up. I would have hoped that was it
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1523433345 oh, and based on some other changes, a spotbugs warning about extant spotbugs issues doesn't mean you've added a new one, just that there is one to get rid of. yetus can still warn there even if the pr fixes it, which is what you have just done.
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1547877112 I'm going to merge this in and then play with it myself; see if there are any more final-final-final changes to worry about. then pull into trunk.
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1547909482

so you want me to merge this as is? oh, here's my list for the next iteration

```
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/AwsCredentialListProvider.java
  L184 access denied exception. add test for this?

AWSClientConfig
  TODO: Don't think you can set a socket factory for the netty client.

cloudstore: add the new paths
  import software.amazon.awssdk.http.apache.ApacheHttpClient;
  import software.amazon.awssdk.thirdparty.org.apache.http.conn.ssl.SSLConnectionSocketFactory;
  software.amazon.awssdk.services.s3.model.HeadBucketResponse;

hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
  +add test for getHeaders(/) to see what comes back

hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABucketExistence.java
  L128 use explicit region constant rather than inline string

hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AConfiguration.java
  L552: use intercept()

hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AEndpointRegion.java
  L75: just throw the exception again
  L87, L90, use constants

hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AAWSCredentialsProvider.java
  L44 move o.a.h. imports into "real" hadoop block; include the sets one too

hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AProxy.java
  is new ssl.proxy setting consistent with what this pr does

hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/auth/delegation/ITestSessionDelegationInFileystem.java
  L335 TODO open, getObjectMetadata("/") ? maybe just raise the exception, as getFileStatus() will create a fake entry here, and nothing else should be calling root explicitly

+cut hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/InconsistentS3ClientFactory.java
```
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1550074617

sorry, i'm creating confusion. I was going to merge this into the feature branch.

I hadn't realised that the sdk was going to hurt rename; yes, that would be an issue. How hard would it be for us to add it, rather than wait for an SDK feature? We have the thread pool so it is a matter of putting the copies into there and awaiting completion...doing it ourselves lets us add the audit headers we can't do today, and lets us be aware of all throttling/retries going on. It's why there's long been interest in doing it internally, just not enough interest to justify the effort.

I think if we do add our implementation, we just need the ability to disable it and fall back to the single COPY in case it turns out that we've gone and broken it...
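The "put the copies into the thread pool and await completion, with a switch to fall back to the single COPY" idea could look something like the sketch below. Everything here is an assumption made for illustration: the comment does not say the copy would be split into byte ranges, and `ParallelCopySketch`, `RangeCopier`, the `singleCopy` callback and the `parallelEnabled` flag are hypothetical names, not S3A or AWS SDK APIs. The callbacks are where real code would issue the store requests (and attach audit headers or count retries).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ParallelCopySketch {

  /** Hypothetical callback copying bytes [start, endInclusive] of the source. */
  interface RangeCopier {
    void copyRange(long start, long endInclusive);
  }

  /**
   * Copy in parts on the pool, or fall back to one whole-object copy.
   * @return the number of copy operations issued
   */
  static int copy(long length, long partSize, boolean parallelEnabled,
      ExecutorService pool, Runnable singleCopy, RangeCopier rangeCopier) {
    if (partSize <= 0) {
      throw new IllegalArgumentException("partSize must be positive");
    }
    if (!parallelEnabled || length <= partSize) {
      singleCopy.run(); // the single COPY fallback path
      return 1;
    }
    List<CompletableFuture<Void>> parts = new ArrayList<>();
    for (long start = 0; start < length; start += partSize) {
      final long from = start;
      final long to = Math.min(start + partSize, length) - 1;
      parts.add(CompletableFuture.runAsync(
          () -> rangeCopier.copyRange(from, to), pool));
    }
    // Await completion; a failed part surfaces here as a CompletionException.
    CompletableFuture.allOf(parts.toArray(new CompletableFuture[0])).join();
    return parts.size();
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      int ops = copy(25, 10, true, pool,
          () -> System.out.println("single copy"),
          (s, e) -> System.out.println("copy " + s + "-" + e));
      System.out.println("operations: " + ops);
    } finally {
      pool.shutdown();
    }
  }
}
```

Keeping the fallback as a plain boolean checked at the top means disabling the feature bypasses the new code path entirely, which is the property the comment asks for.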
[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.
steveloughran commented on PR #5421: URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1550129607 did a local build and test with -Dparallel-tests -DtestsThreadCount=8 -Dscale against s3 london; failures in ITestS3ABucketExistence were seen, but I've not looked at the details there.