On 3 May 2016 at 17:22, Gourav Sengupta <gourav.sengu...@gmail.com> wrote:
> Hi,
>
> The best thing to do is start the EMR clusters with proper permissions in
> the roles; that way you do not need to worry about the keys at all.
>
> Another thing: why are we using s3a:// instead of s3:// ?

Probably because of what's said about s3:// and s3n:// here (which is why I
use s3a://): https://wiki.apache.org/hadoop/AmazonS3

Regards,
James

> Besides that, you can increase S3 speeds using the instructions mentioned
> here:
> https://aws.amazon.com/blogs/aws/aws-storage-update-amazon-s3-transfer-acceleration-larger-snowballs-in-more-regions/
>
> Regards,
> Gourav
>
> On Tue, May 3, 2016 at 12:04 PM, Steve Loughran <ste...@hortonworks.com>
> wrote:
>
>> Don't put your secret in the URI; it'll only creep out in the logs.
>>
>> Use the specific properties covered in
>> http://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html,
>> which you can set in your Spark context by prefixing them with spark.hadoop.
>>
>> You can also set the env vars AWS_ACCESS_KEY_ID and
>> AWS_SECRET_ACCESS_KEY; SparkEnv will pick these up and set the relevant
>> Spark context keys for you.
>>
>> On 3 May 2016, at 01:53, Zhang, Jingyu <jingyu.zh...@news.com.au> wrote:
>>
>> Hi All,
>>
>> I am using Eclipse with Maven for developing Spark applications. I got an
>> error reading from S3 in Scala, but it works fine in Java when I run
>> them in the same project in Eclipse.
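For reference, Steve's suggestion — keep the credentials out of the s3a:// URI and pass the fs.s3a.* properties through the spark.hadoop. prefix instead — might look like this in Scala. This is a minimal sketch, not the poster's actual code: the app name, bucket path, and the `key`/`seckey` variables are placeholders, and it assumes hadoop-aws on the classpath (fs.s3a.access.key and fs.s3a.secret.key are the property names the s3a connector reads):

```scala
import java.io.InputStream
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.{SparkConf, SparkContext}

// Credentials go into the Hadoop configuration via the spark.hadoop. prefix,
// not into the s3a:// URI, so they never leak into logs or stack traces.
val conf = new SparkConf()
  .setAppName("GraphCluster")                     // placeholder app name
  .set("spark.hadoop.fs.s3a.access.key", key)     // placeholder variable
  .set("spark.hadoop.fs.s3a.secret.key", seckey)  // placeholder variable
val ctx = new SparkContext(conf)

// The URI itself now carries no secrets at all.
val pt = new Path("s3a://graphclustering/config.properties")
val fs = FileSystem.get(pt.toUri, ctx.hadoopConfiguration)
val inputStream: InputStream = fs.open(pt)
```

Alternatively, as Steve notes, export AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY before launching and let SparkEnv propagate them into the context for you.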
>> The Scala/Java code and the errors follow.
>>
>> Scala:
>>
>> val uri = URI.create("s3a://" + key + ":" + seckey + "@" +
>>   "graphclustering/config.properties")
>> val pt = new Path("s3a://" + key + ":" + seckey + "@" +
>>   "graphclustering/config.properties")
>> val fs = FileSystem.get(uri, ctx.hadoopConfiguration)
>> val inputStream: InputStream = fs.open(pt)
>>
>> ---- Exception: on aws-java-1.7.4 and hadoop-aws-2.6.1 ----
>>
>> Exception in thread "main"
>> com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service:
>> Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID:
>> 8A56DC7BF0BFF09A), S3 Extended Request ID
>>
>> at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1160)
>> at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:748)
>> at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:467)
>> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:302)
>> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
>> at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1050)
>> at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
>> at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:688)
>> at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:222)
>> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
>> at com.news.report.graph.GraphCluster$.main(GraphCluster.scala:53)
>> at com.news.report.graph.GraphCluster.main(GraphCluster.scala)
>>
>> 16/05/03 10:49:17 INFO SparkContext: Invoking stop() from shutdown hook
>> 16/05/03 10:49:17 INFO SparkUI: Stopped Spark web UI at http://10.65.80.125:4040
>> 16/05/03 10:49:17 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
>> 16/05/03 10:49:17 INFO MemoryStore: MemoryStore cleared
>> 16/05/03 10:49:17 INFO BlockManager: BlockManager stopped
>>
>> ---- Exception: on aws-java-1.7.4 and hadoop-aws-2.7.2 ----
>>
>> 16/05/03 10:23:40 INFO Slf4jLogger: Slf4jLogger started
>> 16/05/03 10:23:40 INFO Remoting: Starting remoting
>> 16/05/03 10:23:40 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.65.80.125:61860]
>> 16/05/03 10:23:40 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 61860.
>> 16/05/03 10:23:40 INFO SparkEnv: Registering MapOutputTracker
>> 16/05/03 10:23:40 INFO SparkEnv: Registering BlockManagerMaster
>> 16/05/03 10:23:40 INFO DiskBlockManager: Created local directory at /private/var/folders/sc/tdmkbvr1705b8p70xqj1kqks5l9p
>> 16/05/03 10:23:40 INFO MemoryStore: MemoryStore started with capacity 1140.4 MB
>> 16/05/03 10:23:40 INFO SparkEnv: Registering OutputCommitCoordinator
>> 16/05/03 10:23:40 INFO Utils: Successfully started service 'SparkUI' on port 4040.
>> 16/05/03 10:23:40 INFO SparkUI: Started SparkUI at http://10.65.80.125:4040
>> 16/05/03 10:23:40 INFO Executor: Starting executor ID driver on host localhost
>> 16/05/03 10:23:40 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 61861.
>> 16/05/03 10:23:40 INFO NettyBlockTransferService: Server created on 61861
>> 16/05/03 10:23:40 INFO BlockManagerMaster: Trying to register BlockManager
>> 16/05/03 10:23:40 INFO BlockManagerMasterEndpoint: Registering block manager localhost:61861 with 1140.4 MB RAM, BlockManagerId(driver, localhost, 61861)
>> 16/05/03 10:23:40 INFO BlockManagerMaster: Registered BlockManager
>>
>> Exception in thread "main" java.lang.NoSuchMethodError:
>> com.amazonaws.services.s3.transfer.TransferManagerConfiguration.setMultipartUploadThreshold(I)V
>>
>> at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:285)
>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>> at com.news.report.graph.GraphCluster$.main(GraphCluster.scala:52)
>> at com.news.report.graph.GraphCluster.main(GraphCluster.scala)
>>
>> 16/05/03 10:23:51 INFO SparkContext: Invoking stop() from shutdown hook
>> 16/05/03 10:23:51 INFO SparkUI: Stopped Spark web UI at http://10.65.80.125:4040
>> 16/05/03 10:23:51 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
>> 16/05/03 10:23:51 INFO MemoryStore: MemoryStore cleared
>> 16/05/03 10:23:51 INFO BlockManager: BlockManager stopped
>> 16/05/03 10:23:51 INFO BlockManagerMaster: BlockManagerMaster stopped
>> 16/05/03 10:23:51 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
>> 16/05/03 10:23:51 INFO SparkContext: Successfully stopped SparkContext
>> 16/05/03 10:23:51 INFO ShutdownHookManager: Shutdown hook called
>> 16/05/03 10:23:51 INFO ShutdownHookManager: Deleting directory /private/var/folders/sc/tdmkbvr1705b8p70xqj1kqks5l9pk9/T/spark-53cf244a-2947-48c9-ba97-7302c9985f35
>>
>> 16/05/03 10:49:17 INFO S3AFileSystem: Caught an AmazonServiceException, which means your request made it to Amazon S3, but was rejected with an error response for some reason.
>> 16/05/03 10:49:17 INFO S3AFileSystem: Error Message: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 8A56DC7BF0BFF09A)
>> 16/05/03 10:49:17 INFO S3AFileSystem: HTTP Status Code: 403
>> 16/05/03 10:49:17 INFO S3AFileSystem: AWS Error Code: 403 Forbidden
>> 16/05/03 10:49:17 INFO S3AFileSystem: Error Type: Client
>> 16/05/03 10:49:17 INFO S3AFileSystem: Request ID: 8A56DC7BF0BFF09A
>> 16/05/03 10:49:17 INFO S3AFileSystem: Class Name: com.amazonaws.services.s3.model.AmazonS3Exception
>>
>> But the Java code works without error:
>>
>> URI uri = URI.create("s3a://" + key + ":" + seckey + "@" +
>>   "graphclustering/config.properties");
>> Path pt = new Path("s3a://" + key + ":" + seckey + "@" +
>>   "graphclustering/config.properties");
>> FileSystem fs = FileSystem.get(uri, ctx.hadoopConfiguration());
>> inputStream = fs.open(pt);
>>
>> Thanks,
>>
>> Jingyu
>>
>> This message and its attachments may contain legally privileged or
>> confidential information. It is intended solely for the named addressee. If
>> you are not the addressee indicated in this message or responsible for
>> delivery of the message to the addressee, you may not copy or deliver this
>> message or its attachments to anyone. Rather, you should permanently delete
>> this message and its attachments and kindly notify the sender by reply
>> e-mail.
>> Any content of this message and its attachments which does not
>> relate to the official business of the sending company must be taken not to
>> have been sent or endorsed by that company or any of its related entities.
>> No warranty is made that the e-mail or attachments are free from computer
>> virus or other defect.