[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-03-15 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1470822345

   there are some new warnings about deprecation and unchecked casts. you can 
add `@SuppressWarnings` annotations to suppress these
   ```
   hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java:801:45:[unchecked] unchecked cast
   hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java:806:45:[unchecked] unchecked cast
   hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java:813:41:[unchecked] unchecked cast
   hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java:819:43:[unchecked] unchecked cast
   hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AEndpointRegion.java:184:4:[deprecation] S3ClientFactory in org.apache.hadoop.fs.s3a has been deprecated
   hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AEndpointRegion.java:185:14:[deprecation] S3ClientFactory in org.apache.hadoop.fs.s3a has been deprecated
   ```
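
For reference, a minimal sketch of how warnings like the ones listed above are usually silenced with Java's standard `@SuppressWarnings` annotation. The names `SuppressionSketch`, `LegacyFactory`, `castToList` and `useLegacyFactory` are made up for illustration and do not come from `S3AUtils` or the PR; the point is only to keep each annotation on the narrowest scope that contains the cast or the deprecated reference.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a deprecated type such as the one named in the warning. */
@Deprecated
final class LegacyFactory {
  String name() {
    return "legacy";
  }
}

// Hypothetical illustration; none of these names come from the PR.
final class SuppressionSketch {

  private SuppressionSketch() {
  }

  /** An unchecked cast, suppressed for this one method only. */
  @SuppressWarnings("unchecked")
  static <T> List<T> castToList(Object o) {
    return (List<T>) o;
  }

  /** A reference to a deprecated class, suppressed at method level. */
  @SuppressWarnings("deprecation")
  static String useLegacyFactory() {
    return new LegacyFactory().name();
  }

  public static void main(String[] args) {
    List<String> source = new ArrayList<>();
    source.add("a");
    List<String> typed = castToList(source);
    System.out.println(typed.get(0) + " " + useLegacyFactory());
  }
}
```

Annotating a single method (or a single local variable declaration) rather than the whole class keeps any new unchecked or deprecation warnings elsewhere in the file visible.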
   
   a lot of what should be unit tests are now failing with no credentials. try 
checking out a copy of this pr without any auth-keys.xml, unset your AWS_ env 
vars and see if these tests fail for you locally, then fix them.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-03-27 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1484983979

   `Endpoint is not set || Endpoint is set && ends in amazonaws.com || ARN is 
set` is roughly what storediag does: 
https://github.com/steveloughran/cloudstore/blob/trunk/src/main/java/org/apache/hadoop/fs/store/diag/S3ADiagnosticsInfo.java#L672
   
   it also looks for amazonaws.cn; not sure about where else it should probe. 
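
A minimal sketch of the check described above, written as a standalone helper. It is not the storediag or S3A code: the class `AwsEndpointCheck`, its methods and the way the host is extracted are assumptions made for illustration. The comment mentions `amazonaws.cn`; AWS China endpoints, as far as I know, end in `amazonaws.com.cn`, so the sketch accepts both suffixes.

```java
import java.util.Locale;

// Hypothetical helper; not the storediag or S3A implementation.
final class AwsEndpointCheck {

  // Suffixes treated as "this is AWS"; the list is an assumption.
  private static final String[] AWS_SUFFIXES = {
      "amazonaws.com", "amazonaws.cn", "amazonaws.com.cn"
  };

  private AwsEndpointCheck() {
  }

  /**
   * Endpoint is not set || endpoint ends in an AWS suffix || ARN is set.
   * @param endpoint configured endpoint; null or blank means "not set"
   * @param arnSet whether an access point ARN is configured
   */
  static boolean looksLikeAws(String endpoint, boolean arnSet) {
    if (arnSet || endpoint == null || endpoint.trim().isEmpty()) {
      return true;
    }
    String host = hostOf(endpoint.trim().toLowerCase(Locale.ROOT));
    for (String suffix : AWS_SUFFIXES) {
      if (host.equals(suffix) || host.endsWith("." + suffix)) {
        return true;
      }
    }
    return false;
  }

  /** Strip any scheme, path and port so only the hostname is compared. */
  static String hostOf(String endpoint) {
    String s = endpoint;
    int scheme = s.indexOf("://");
    if (scheme >= 0) {
      s = s.substring(scheme + 3);
    }
    int slash = s.indexOf('/');
    if (slash >= 0) {
      s = s.substring(0, slash);
    }
    int colon = s.indexOf(':');
    if (colon >= 0) {
      s = s.substring(0, colon);
    }
    return s;
  }

  public static void main(String[] args) {
    System.out.println(looksLikeAws("s3.eu-west-2.amazonaws.com", false));
    System.out.println(looksLikeAws("http://minio.example.org:9000", false));
  }
}
```

Matching on `"." + suffix` rather than a bare `endsWith(suffix)` avoids treating a host such as `notamazonaws.com` as AWS.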
   
   having a flag "is.aws" would be good as a single switch to move to all 3rd 
party stuff, but the problem there is that features there may change over time 
too; `fs.s3a.store.vendor` would let you have a table of providers (aws, ozone, 
hitachi, amplidata, minio, netapp) and choose the right settings w.r.t path vs 
hostname, checksums, etc.
   
   we'd have the vendor settings from properties too, e.g.
   fs.s3a.vendor.ozone.change.detection.mode = none
   so they could be overridden in core-site/per-bucket
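
One way the lookup could work, as a rough sketch only. The two property names `fs.s3a.store.vendor` and `fs.s3a.vendor.ozone.change.detection.mode` are from the comment; everything else (the class `VendorSettings`, the resolution order, the `aws` fallback vendor, the plain `Map` standing in for a Hadoop `Configuration`) is an assumption for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the resolution order is an assumption, not a design from the PR.
final class VendorSettings {

  static final String VENDOR_KEY = "fs.s3a.store.vendor";

  private final Map<String, String> conf;

  VendorSettings(Map<String, String> conf) {
    // Defensive copy: the map stands in for the merged core-site/per-bucket config.
    this.conf = new HashMap<>(conf);
  }

  /**
   * Resolve an option: an explicit fs.s3a.OPTION wins, then the vendor's
   * fs.s3a.vendor.VENDOR.OPTION entry, then the supplied default.
   */
  String resolve(String option, String defaultValue) {
    String explicit = conf.get("fs.s3a." + option);
    if (explicit != null) {
      return explicit;
    }
    String vendor = conf.getOrDefault(VENDOR_KEY, "aws");
    String vendorValue = conf.get("fs.s3a.vendor." + vendor + "." + option);
    return vendorValue != null ? vendorValue : defaultValue;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(VENDOR_KEY, "ozone");
    conf.put("fs.s3a.vendor.ozone.change.detection.mode", "none");
    System.out.println(new VendorSettings(conf).resolve("change.detection.mode", "server"));
  }
}
```

Because the vendor table is itself just properties, a site could change one vendor's entry without touching code, and an explicit per-option setting would still take priority.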
   
   that could actually simplify a lot of our internal "doesn't work with vendor 
XYZ" issues, where problems 1 and 2 are:
   1. you forgot to set the endpoint and aws are rejecting you/your bucket
   2. you need to use path resolution 





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-03-30 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1490346831

   I don't know about those third-party stores; someone in your SDK team 
probably knows better there. I think generally it is just account + secret, 
unless something like Kerberos/Active Directory is used.





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-03-30 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1490348322

   w.r.t merging, add the amazonaws.cn check so it doesn't get forgotten about. 
then we should be good to merge.





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-04-14 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1508895984

   ok, before merging, that spotbugs warning needs to be quietened
   ```
   Code | Warning
   -- | --
   IS | Inconsistent synchronization of org.apache.hadoop.fs.s3a.S3AFileSystem.s3AsyncClient; locked 50% of time

   Bug type IS2_INCONSISTENT_SYNC
   (details: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5421/14/artifact/out/branch-spotbugs-hadoop-tools_hadoop-aws-warnings.html#IS2_INCONSISTENT_SYNC)
   In class org.apache.hadoop.fs.s3a.S3AFileSystem
   Field org.apache.hadoop.fs.s3a.S3AFileSystem.s3AsyncClient
   Synchronized 50% of the time
   Unsynchronized access at S3AFileSystem.java:[line 1675]
   Unsynchronized access at S3AFileSystem.java:[line 994]
   Synchronized access at S3AFileSystem.java:[line 4051]
   Synchronized access at S3AFileSystem.java:[line 4056]
   ```
   
   if it is a false alarm, add an exclusion in 
`hadoop-tools/hadoop-aws/dev-support/findbugs-exclude.xml`.
   do see if there is a real risk first, before just disabling it, though. Assuming 
it is created in the unsynchronized initialize() method, if access, even 
internally, is through an unsync getter, all should be good.





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-04-19 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1515158123

   it looks exactly the same as the one above, doesn't it?
   
   Sometimes with findbugs it is simplest to just surrender: create some setter 
which is synchronized and set it through that. If you look at `MeanStatistic` 
you can see how I had to sync everything to get it to STFU, even the equals() 
method.
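
A minimal sketch of the "surrender" approach, assuming a hypothetical holder class. The field name echoes the spotbugs report, but `AsyncClientHolder` is not `S3AFileSystem` code and `Object` stands in for the real client type. The idea is that every read and write of the field goes through a synchronized accessor, so the analysis never sees a mix of locked and unlocked access.

```java
// Hypothetical holder; not S3AFileSystem code. Object stands in for the client type.
final class AsyncClientHolder {

  private Object s3AsyncClient;

  /** Every write goes through this synchronized setter. */
  synchronized void setS3AsyncClient(Object client) {
    this.s3AsyncClient = client;
  }

  /** Every read goes through this synchronized getter, including internal reads. */
  synchronized Object getS3AsyncClient() {
    return s3AsyncClient;
  }

  /** Initialization does not touch the field directly; it uses the setter. */
  void initialize(Object client) {
    setS3AsyncClient(client);
  }

  /** Internal callers use the getter too, rather than reading the field. */
  boolean isInitialized() {
    return getS3AsyncClient() != null;
  }

  public static void main(String[] args) {
    AsyncClientHolder holder = new AsyncClientHolder();
    holder.initialize(new Object());
    System.out.println(holder.isInitialized());
  }
}
```

Whether this is enough to quieten the warning in a given class depends on every access site being converted, which is the tedious part.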
   
   BTW, if you don't know already, it is possible to run the findbugs on the 
command line. I normally only do this and the checkstyle calls when trying to 
fix things they've reported through yetus
   
   ```
   mvn findbugs:findbugs
   ```
   
   





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-04-21 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1517736384

   spotbugs is happy; deprecation warnings fine...all that is left is to get 
the name of the new method right





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-04-24 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1520403758

   ok, let's give up on spotbugs. this is one of its eternal losing battles, 
right?





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-04-24 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1520404440

   here's what I propose: merge then fix





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-04-24 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1520411654

   > do you mean reverting the changes made for spotbugs (moving to a separate 
method)?
   
   new method is fine, we just need to add in whatever shuts spotbugs up. I 
would have hoped that was it





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-04-26 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1523433345

   oh, and based on some other changes: a spotbugs warning about extant spotbugs 
issues doesn't mean you've added a new one, just that there is one to get rid of. 
yetus can still warn there even if the pr fixes it, which is what you have just 
done.





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-05-15 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1547877112

   I'm going to merge this in and then play with it myself; see if there are any 
more final-final-final changes to worry about. then pull into trunk. 





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-05-15 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1547909482

   so you want me to merge this as is?
   
   oh, here's my list for the next iteration
   
   
   
   ```
   hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/AwsCredentialListProvider.java
   L184 access denied exception. add test for this?

   AWSClientConfig
   TODO: Don't think you can set a socket factory for the netty client.

   cloudstore: add the new paths
   import software.amazon.awssdk.http.apache.ApacheHttpClient;
   import software.amazon.awssdk.thirdparty.org.apache.http.conn.ssl.SSLConnectionSocketFactory;
   software.amazon.awssdk.services.s3.model.HeadBucketResponse;

   hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
   +add test for getHeaders(/) to see what comes back

   hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABucketExistence.java
   L128 use explicit region constant rather than inline string

   hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AConfiguration.java
   L552: use intercept()

   hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AEndpointRegion.java
   L75: just throw the exception again
   L87, L90, use constants

   hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AAWSCredentialsProvider.java
   L44 move o.a.h. imports into "real" hadoop block; include the sets one too

   hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AProxy.java
   is new ssl.proxy setting consistent with what this pr does

   hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/auth/delegation/ITestSessionDelegationInFileystem.java
   L335 TODO open, getObjectMetadata("/")
   ? maybe just raise the exception, as getFileStatus() will create a fake entry here, and nothing else should be calling root explicitly

   +cut hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/InconsistentS3ClientFactory.java
   ```
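
Two of the items above ("use intercept()" and "use explicit region constant rather than inline string") can be illustrated with a small sketch. Hadoop's test utilities provide `LambdaTestUtils.intercept()`; the version below is a simplified stand-in written for this example so that it runs without Hadoop on the classpath, and the constant `EU_WEST_2` and the method `requireRegion` are hypothetical.

```java
import java.util.concurrent.Callable;

// Simplified stand-in written for this example; it is not Hadoop's LambdaTestUtils.
final class InterceptSketch {

  /** Hypothetical region constant, replacing an inline string in a test. */
  static final String EU_WEST_2 = "eu-west-2";

  private InterceptSketch() {
  }

  /**
   * Run the callable and return the exception if it is of the expected type;
   * fail if it returns normally or throws anything else.
   */
  static <T, E extends Throwable> E intercept(Class<E> expected, Callable<T> eval) {
    T result;
    try {
      result = eval.call();
    } catch (Throwable t) {
      if (expected.isInstance(t)) {
        return expected.cast(t);
      }
      throw new AssertionError("Expected " + expected.getName() + " but caught " + t, t);
    }
    throw new AssertionError(
        "Expected " + expected.getName() + " but call returned " + result);
  }

  /** Hypothetical code under test: rejects an unset region. */
  static String requireRegion(String region) {
    if (region == null || region.isEmpty()) {
      throw new IllegalArgumentException("region is not set");
    }
    return region;
  }

  public static void main(String[] args) {
    IllegalArgumentException e =
        intercept(IllegalArgumentException.class, () -> requireRegion(""));
    System.out.println(e.getMessage());
    System.out.println(requireRegion(EU_WEST_2));
  }
}
```

Compared with a hand-written try/catch/fail block, the intercept style keeps the expected exception type and the call under test on one line and hands the caught exception back for further assertions.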
   





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-05-16 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1550074617

   sorry, i'm creating confusion. I was going to merge this into the feature 
branch. 
   
   I hadn't realised that the sdk was going to hurt rename; yes, that would be 
an issue.
   How hard would it be for us to add it, rather than wait for an SDK feature? 
We have the thread pool so it is a matter of putting the copies into there and 
awaiting completion...doing it ourselves lets us add the audit headers we can't 
do today, and lets us be aware of all throttling/retries going on. It's why 
there's long been interest in doing it internally, just not enough interest to 
justify the effort.
   
   I think if we do add our implementation, we just need the ability to disable 
it and fall back to the single COPY in case it turns out that we've gone and 
broken it...
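
A rough sketch of the shape this could take: split the object into ranges, put the part copies into a thread pool, await completion, and keep a switch that falls back to the single COPY. Everything here is hypothetical: `PartCopier`, `copyRange` and `copyWhole` stand in for the store's part-copy and single-COPY calls, no SDK API is used or implied, and the steps that start and complete a multipart upload are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; not the S3A rename/copy implementation.
final class ParallelCopySketch {

  /** Stand-in for the store operations; both methods are assumptions. */
  interface PartCopier {
    void copyRange(long start, long endInclusive);

    void copyWhole();
  }

  private ParallelCopySketch() {
  }

  /**
   * Copy an object of the given length, in parallel parts when enabled.
   * @return the number of copy requests issued
   */
  static int copy(PartCopier copier, long length, long partSize,
      boolean parallelEnabled, ExecutorService pool) {
    if (!parallelEnabled || length <= partSize) {
      // Fallback path: the single COPY request.
      copier.copyWhole();
      return 1;
    }
    List<CompletableFuture<Void>> parts = new ArrayList<>();
    for (long start = 0; start < length; start += partSize) {
      final long from = start;
      final long to = Math.min(start + partSize, length) - 1;
      parts.add(CompletableFuture.runAsync(() -> copier.copyRange(from, to), pool));
    }
    // join() rethrows the first failure wrapped in a CompletionException.
    CompletableFuture.allOf(parts.toArray(new CompletableFuture[0])).join();
    return parts.size();
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      PartCopier printer = new PartCopier() {
        @Override
        public void copyRange(long start, long endInclusive) {
          System.out.println("copy part " + start + "-" + endInclusive);
        }

        @Override
        public void copyWhole() {
          System.out.println("single COPY");
        }
      };
      copy(printer, 25, 10, true, pool);
      copy(printer, 25, 10, false, pool);
    } finally {
      pool.shutdown();
    }
  }
}
```

Doing the copies in our own pool is what would make it possible to attach audit headers and to observe throttling and retries per part, while the `parallelEnabled` flag is the escape hatch back to the single COPY.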





[GitHub] [hadoop] steveloughran commented on pull request #5421: HADOOP-18565. Completes outstanding items for the SDK V2 upgrade.

2023-05-16 Thread via GitHub


steveloughran commented on PR #5421:
URL: https://github.com/apache/hadoop/pull/5421#issuecomment-1550129607

   did a local build and test with `-Dparallel-tests -DtestsThreadCount=8 
-Dscale` against s3 london; failures in ITestS3ABucketExistence were seen, but 
I've not looked at the details there. 
   

