Re: use S3-Compatible Storage with spark

2015-07-28 Thread Schmirr Wurst
Hi, recompiled and retried; now it's looking like this with s3a: com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain. S3n is working fine (only problem is still the endpoint).
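A minimal sketch of supplying the s3a credentials explicitly from the spark-shell so the provider chain can find them (property names as documented for hadoop-aws 2.6.x; the key values are placeholders):

    // Set the s3a credentials directly in the Hadoop configuration used by this SparkContext.
    sc.hadoopConfiguration.set("fs.s3a.access.key", "YOUR_ACCESS_KEY")
    sc.hadoopConfiguration.set("fs.s3a.secret.key", "YOUR_SECRET_KEY")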

Re: use S3-Compatible Storage with spark

2015-07-28 Thread Akhil Das
With s3n, try this out: s3service.s3-endpoint - the host name of the S3 service. You should only ever change this value from the default if you need to contact an alternative S3 endpoint for testing purposes. Default: s3.amazonaws.com. Thanks Best Regards On Tue, Jul 28, 2015 at 1:54 PM, Schmirr

Re: use S3-Compatible Storage with spark

2015-07-28 Thread Schmirr Wurst
I tried those 3 possibilities, and everything works as before, i.e. the endpoint param has no effect: sc.hadoopConfiguration.set("s3service.s3-endpoint", "test") sc.hadoopConfiguration.set("fs.s3n.endpoint", "test") sc.hadoopConfiguration.set("fs.s3n.s3-endpoint", "test") 2015-07-28 10:28 GMT+02:00 Akhil Das

Re: use S3-Compatible Storage with spark

2015-07-27 Thread Schmirr Wurst
No, with s3a I have the following error: java.lang.NoSuchMethodError: com.amazonaws.services.s3.transfer.TransferManagerConfiguration.setMultipartUploadThreshold(I)V at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:285) 2015-07-27 11:17 GMT+02:00 Akhil Das

Re: use S3-Compatible Storage with spark

2015-07-27 Thread Akhil Das
That error is a jar conflict; you must have multiple versions of the Hadoop/AWS jars on the classpath. First make sure you are able to access your AWS S3 with s3a, then set the endpoint configuration and try to access the custom storage. Thanks Best Regards On Mon, Jul 27, 2015 at 4:02 PM,
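One way to confirm such a conflict from the spark-shell is to ask the classloader where the offending class actually comes from; a small sketch (the class name is the one from the stack trace above):

    // Locate the jar that provides the AWS SDK class named in the NoSuchMethodError,
    // to see whether an older aws-java-sdk is shadowing the one hadoop-aws expects.
    val clazz = Class.forName("com.amazonaws.services.s3.transfer.TransferManagerConfiguration")
    println(clazz.getProtectionDomain.getCodeSource.getLocation)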

Re: use S3-Compatible Storage with spark

2015-07-27 Thread Akhil Das
So you are able to access your AWS S3 with s3a now? What is the error that you are getting when you try to access the custom storage with fs.s3a.endpoint? Thanks Best Regards On Mon, Jul 27, 2015 at 2:44 PM, Schmirr Wurst schmirrwu...@gmail.com wrote: I was able to access Amazon S3, but for

Re: use S3-Compatible Storage with spark

2015-07-27 Thread Schmirr Wurst
I was able to access Amazon S3, but for some reason the endpoint parameter is ignored, and I'm not able to access the storage from my provider...: sc.hadoopConfiguration.set("fs.s3a.endpoint", "test") sc.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", )
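For an S3-compatible (non-AWS) provider, the pieces discussed in this thread would fit together roughly as below; a sketch only, assuming hadoop-aws 2.6.x is on the classpath, with the endpoint, keys, and bucket name as placeholders:

    // Point s3a at the provider's endpoint instead of the default s3.amazonaws.com.
    sc.hadoopConfiguration.set("fs.s3a.endpoint", "storage.example.com")
    sc.hadoopConfiguration.set("fs.s3a.access.key", "YOUR_ACCESS_KEY")
    sc.hadoopConfiguration.set("fs.s3a.secret.key", "YOUR_SECRET_KEY")

    // Read from the bucket through the s3a scheme.
    val data = sc.textFile("s3a://my-bucket/path/to/data")
    println(data.count())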

Re: use S3-Compatible Storage with spark

2015-07-22 Thread Schmirr Wurst
your spark job, add --jars path/to/thejar From: Schmirr Wurst schmirrwu...@gmail.com Sent: Wednesday, July 22, 2015 12:06 PM To: Thomas Demoor Subject: Re: use S3-Compatible Storage with spark Hi Thomas, thanks, could you just tell me what exactly I

Re: use S3-Compatible Storage with spark

2015-07-21 Thread Akhil Das
You can add the jar to the classpath, and you can set the property like: sc.hadoopConfiguration.set("fs.s3a.endpoint", "storage.sigmoid.com") Thanks Best Regards On Mon, Jul 20, 2015 at 9:41 PM, Schmirr Wurst schmirrwu...@gmail.com wrote: Thanks, that is what I was looking for... Any Idea

Re: use S3-Compatible Storage with spark

2015-07-21 Thread Schmirr Wurst
Which version do you have? I tried with Spark 1.4.1 for Hadoop 2.6, but here I had the issue that the aws module is not there somehow: java.io.IOException: No FileSystem for scheme: s3n, and the same for s3a: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
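A quick way to answer the version question from the spark-shell, as a sketch (VersionInfo is part of hadoop-common, which Spark already ships):

    // Print the Hadoop version this Spark build is running against, so the hadoop-aws jar
    // that gets added (e.g. hadoop-aws-2.6.0.jar) can be matched to it.
    println(org.apache.hadoop.util.VersionInfo.getVersion)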

Re: use S3-Compatible Storage with spark

2015-07-21 Thread Akhil Das
Did you try with s3a? It seems it's more like an issue with hadoop. Thanks Best Regards On Tue, Jul 21, 2015 at 2:31 PM, Schmirr Wurst schmirrwu...@gmail.com wrote: It seems to work for the credentials, but the endpoint is ignored.. : I've changed it to

Re: use S3-Compatible Storage with spark

2015-07-21 Thread Schmirr Wurst
It seems to work for the credentials, but the endpoint is ignored...: I've changed it to sc.hadoopConfiguration.set("fs.s3n.endpoint", "test.com") and I continue to get my data from Amazon; how could that be? (I also use s3n in my text url.) 2015-07-21 9:30 GMT+02:00 Akhil Das

Re: use S3-Compatible Storage with spark

2015-07-20 Thread Akhil Das
Not in the URI, but in the Hadoop configuration you can specify it: <property> <name>fs.s3a.endpoint</name> <description>AWS S3 endpoint to connect to. An up-to-date list is provided in the AWS Documentation: regions and endpoints. Without this property, the standard region (s3.amazonaws.com)

Re: use S3-Compatible Storage with spark

2015-07-20 Thread Schmirr Wurst
Thanks, that is what I was looking for... Any idea where I have to store and reference the corresponding hadoop-aws-2.6.0.jar? java.io.IOException: No FileSystem for scheme: s3n 2015-07-20 8:33 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com: Not in the uri, but in the hadoop configuration
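Once hadoop-aws-2.6.0.jar (and its aws-java-sdk dependency) is actually on the driver and executor classpath, e.g. via --jars as suggested elsewhere in the thread, the schemes can also be declared explicitly in case they are not picked up automatically; a sketch, not a guaranteed fix:

    // Tell Hadoop which classes implement the s3n and s3a schemes (both classes ship in hadoop-aws 2.6.x).
    sc.hadoopConfiguration.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
    sc.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")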

Re: use S3-Compatible Storage with spark

2015-07-19 Thread Akhil Das
Could you name the storage service that you are using? Most of them provide an S3-like REST API endpoint for you to hit. Thanks Best Regards On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst schmirrwu...@gmail.com wrote: Hi, I wonder how to use S3-compatible storage in Spark? If I'm using

Re: use S3-Compatible Storage with spark

2015-07-17 Thread Sujit Pal
Hi Schmirr, The part after the s3n:// is your bucket name and folder name, i.e. s3n://${bucket_name}/${folder_name}[/${subfolder_name}]*. Bucket names are unique across S3, so the resulting path is also unique. There is no concept of hostname in S3 URLs as far as I know. -sujit On Fri, Jul 17,
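To illustrate the URL shape described above, a small sketch (bucket and folder names are placeholders):

    // The bucket is the first path component after the scheme; everything after it is the key prefix.
    val lines = sc.textFile("s3n://my-bucket/some-folder/some-subfolder/*")
    lines.take(5).foreach(println)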

Re: use S3-Compatible Storage with spark

2015-07-17 Thread Ankur Chauhan
The endpoint is the property you want to set. I would look at the source for that. Sent from my iPhone On Jul 17, 2015, at 08:55, Sujit Pal sujitatgt...@gmail.com wrote: Hi Schmirr, The part after the s3n:// is your bucket name and folder name, ie