[
https://issues.apache.org/jira/browse/BEAM-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732370#comment-16732370
]
Steve Loughran commented on BEAM-6266:
--------------------------------------
AWS SDK updates have a tendency to print new things in the logs (warnings,
stack traces &c). These changes can sneak past system tests, which don't
actually look for new messages. Sometimes they are bugs in the SDK that need
to be fixed; sometimes they're actual issues in your code.
# The latest S3A update process tells people to [look through the
logs|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md#-qualifying-an-aws-sdk-update]
to catch this.
# And we don't use the builder mechanism to create the client, just the
deprecated one, on the basis that it's actually better to work with.
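
The log check in the first point could be partly automated. Below is a minimal sketch, not taken from the S3A test process: the class name {{SdkLogScan}} and the regular expression are assumptions for illustration, and the pattern would need adjusting to whatever log layout is actually in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: flags log lines that look like warnings or stack traces. */
final class SdkLogScan {
    // Assumed pattern: WARN/ERROR levels, exception class names, and "at ..." stack frames.
    private static final Pattern SUSPICIOUS = Pattern.compile(
        "\\b(WARN|ERROR)\\b|Exception\\b|^\\s+at\\s+\\S+\\(");

    private SdkLogScan() {}

    /** Returns the subset of logLines that match the suspicious pattern, in order. */
    static List<String> suspiciousLines(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            if (SUSPICIOUS.matcher(line).find()) {
                hits.add(line);
            }
        }
        return hits;
    }
}
```

Comparing the hits from a test run before and after the SDK upgrade would show which messages are new; a scan like this only narrows down what a person still has to read.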
> Newest AWS SDK causes errors on startup without usage
> -----------------------------------------------------
>
> Key: BEAM-6266
> URL: https://issues.apache.org/jira/browse/BEAM-6266
> Project: Beam
> Issue Type: Bug
> Components: io-java-aws
> Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0
> Reporter: Mike Kaplinskiy
> Assignee: Ismaël Mejía
> Priority: Major
>
> The S3 filesystem implementation in Beam logs a message like this if it is
> linked in:
> {code:java}
> 2018-12-18 22:17:55 INFO S3FileSystem:104 - The AWS S3 Beam extension was
> included in this build, but the awsRegion flag was not specified. If you
> don't plan to use S3, then ignore this message.{code}
> Previous versions of the AWS libraries seemed to allow you to {{ignore this
> message}}, but the newest libraries throw an exception:
> {code:java}
> org.apache.beam.sdk.Pipeline.create Pipeline.java: 145
> org.apache.beam.sdk.PipelineRunner.fromOptions PipelineRunner.java: 47
> org.apache.beam.sdk.io.FileSystems.setDefaultPipelineOptions FileSystems.java: 482
> org.apache.beam.sdk.io.FileSystems.verifySchemesAreUnique FileSystems.java: 492
> org.apache.beam.sdk.io.aws.s3.S3FileSystemRegistrar.fromOptions S3FileSystemRegistrar.java: 39
> org.apache.beam.sdk.io.aws.s3.S3FileSystem.<init> S3FileSystem.java: 108
> org.apache.beam.sdk.io.aws.s3.S3FileSystem.buildAmazonS3Client S3FileSystem.java: 122
> com.amazonaws.client.builder.AwsClientBuilder.withRegion AwsClientBuilder.java: 245
> com.amazonaws.client.builder.AwsClientBuilder.getRegionObject AwsClientBuilder.java: 258
> com.amazonaws.SdkClientException: Could not find region information for 'null' in SDK metadata.
> retryable: true{code}
> (this stack trace is from 2.6.0, but master is also affected)
>
> The root cause is this code that gets run unconditionally
> (https://github.com/apache/beam/blob/master/sdks/java/io/amazon-web-services/src/main/java/org/apache/beam/sdk/io/aws/s3/DefaultS3ClientBuilderFactory.java#L42):
> {code:java}
> builder = builder.withRegion(s3Options.getAwsRegion());{code}
> Unfortunately the latest AWS library will throw if you pass {{null}} to
> {{withRegion}}. It previously did nothing. The following commit to the AWS
> Java SDK added the error check:
> [https://github.com/aws/aws-sdk-java/commit/8f07cc35eec9047f7cfdbc7de3abc6d4327b08d0]
>
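
One way to avoid the unconditional {{withRegion}} call described in the issue is to skip it when no region is configured. The sketch below illustrates that guard only and is not necessarily the fix adopted in Beam. To stay self-contained it uses a hypothetical stand-in, {{FakeClientBuilder}}, instead of the real AWS SDK builder; {{RegionGuard}} is likewise an invented name, and the exception type is a placeholder for the SDK's {{SdkClientException}}.

```java
/** Hypothetical stand-in for the SDK client builder; models only the null check. */
final class FakeClientBuilder {
    private String region;

    FakeClientBuilder withRegion(String region) {
        if (region == null) {
            // Mimics the behaviour reported in the issue: a null region is rejected.
            throw new IllegalArgumentException(
                "Could not find region information for 'null' in SDK metadata.");
        }
        this.region = region;
        return this;
    }

    String region() {
        return region;
    }
}

/** Hypothetical helper showing the guard around withRegion. */
final class RegionGuard {
    private RegionGuard() {}

    /** Applies the region only when one was actually configured. */
    static FakeClientBuilder configure(FakeClientBuilder builder, String awsRegion) {
        if (awsRegion == null || awsRegion.isEmpty()) {
            // No region given: leave the builder untouched, since S3 may never be used.
            return builder;
        }
        return builder.withRegion(awsRegion);
    }
}
```

With this guard, a pipeline that links in the S3 extension but never sets a region would get past client construction; whether the client should be built at all in that case is a separate design question.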