Re: use S3-Compatible Storage with spark

2015-07-28 Thread Schmirr Wurst
Hi, I recompiled and retried; now it's looking like this with s3a:
com.amazonaws.AmazonClientException: Unable to load AWS credentials
from any provider in the chain

S3n is working fine (the only remaining problem is the endpoint).

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: use S3-Compatible Storage with spark

2015-07-28 Thread Akhil Das
With s3n try this out:

*s3service.s3-endpoint*: The host name of the S3 service. You should only
ever change this value from the default if you need to contact an
alternative S3 endpoint for testing purposes.
Default: s3.amazonaws.com

Thanks
Best Regards

On Tue, Jul 28, 2015 at 1:54 PM, Schmirr Wurst schmirrwu...@gmail.com
wrote:

 Hi, I recompiled and retried; now it's looking like this with s3a:
 com.amazonaws.AmazonClientException: Unable to load AWS credentials
 from any provider in the chain

 S3n is working fine (the only remaining problem is the endpoint).



Re: use S3-Compatible Storage with spark

2015-07-28 Thread Schmirr Wurst
I tried those 3 possibilities, and everything still works exactly as before,
i.e. the endpoint parameter is not being picked up:
sc.hadoopConfiguration.set("s3service.s3-endpoint", "test")
sc.hadoopConfiguration.set("fs.s3n.endpoint", "test")
sc.hadoopConfiguration.set("fs.s3n.s3-endpoint", "test")

2015-07-28 10:28 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:

 With s3n try this out:

 *s3service.s3-endpoint*: The host name of the S3 service. You should only
 ever change this value from the default if you need to contact an
 alternative S3 endpoint for testing purposes.
 Default: s3.amazonaws.com

 Thanks
 Best Regards

 On Tue, Jul 28, 2015 at 1:54 PM, Schmirr Wurst schmirrwu...@gmail.com
 wrote:

  Hi, I recompiled and retried; now it's looking like this with s3a:
 com.amazonaws.AmazonClientException: Unable to load AWS credentials
 from any provider in the chain

  S3n is working fine (the only remaining problem is the endpoint).





Re: use S3-Compatible Storage with spark

2015-07-27 Thread Schmirr Wurst
No, with s3a I have the following error:
java.lang.NoSuchMethodError:
com.amazonaws.services.s3.transfer.TransferManagerConfiguration.setMultipartUploadThreshold(I)V
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:285)

2015-07-27 11:17 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
 So you are able to access your AWS S3 with s3a now? What is the error that
 you are getting when you try to access the custom storage with
 fs.s3a.endpoint?

 Thanks
 Best Regards

 On Mon, Jul 27, 2015 at 2:44 PM, Schmirr Wurst schmirrwu...@gmail.com
 wrote:

  I was able to access Amazon S3, but for some reason the endpoint
  parameter is ignored and I'm not able to access the storage from my
  provider:

  sc.hadoopConfiguration.set("fs.s3a.endpoint", "test")
  sc.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", "...")
  sc.hadoopConfiguration.set("fs.s3a.awsSecretAccessKey", "...")

 Any Idea why it doesn't work ?

 2015-07-20 18:11 GMT+02:00 Schmirr Wurst schmirrwu...@gmail.com:
  Thanks, that is what I was looking for...
 
  Any Idea where I have to store and reference the corresponding
  hadoop-aws-2.6.0.jar ?:
 
  java.io.IOException: No FileSystem for scheme: s3n
 
  2015-07-20 8:33 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
  Not in the uri, but in the hadoop configuration you can specify it.
 
  <property>
    <name>fs.s3a.endpoint</name>
    <description>AWS S3 endpoint to connect to. An up-to-date list is
    provided in the AWS Documentation: regions and endpoints. Without this
    property, the standard region (s3.amazonaws.com) is assumed.
    </description>
  </property>
 
 
  Thanks
  Best Regards
 
  On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst schmirrwu...@gmail.com
  wrote:
 
   I want to use pithos; where can I specify that endpoint, is it
   possible in the URL?
 
  2015-07-19 17:22 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
    Could you name the storage service that you are using? Most of them
    provide an S3-like REST API endpoint for you to hit.
  
   Thanks
   Best Regards
  
   On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst
   schmirrwu...@gmail.com
   wrote:
  
   Hi,
  
    I wonder how to use S3-compatible storage in Spark?
    If I'm using the s3n:// URL schema, then it will point to Amazon; is
    there a way I can specify the host somewhere?
  
  
   -
   To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
   For additional commands, e-mail: user-h...@spark.apache.org
  
  
 
  -
  To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
  For additional commands, e-mail: user-h...@spark.apache.org
 
 



-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: use S3-Compatible Storage with spark

2015-07-27 Thread Akhil Das
That error is a jar conflict; you must have multiple versions of the
hadoop jars on the classpath. First make sure you are able to access
your AWS S3 with s3a, then set the endpoint configuration and try to
access the custom storage.
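
One way to confirm that (a small diagnostic sketch for spark-shell, not specific to any particular setup) is to ask the JVM which jar each of the clashing classes is actually loaded from:

// run in spark-shell: prints the location (jar) each class was loaded from
classOf[org.apache.hadoop.fs.s3a.S3AFileSystem].getProtectionDomain.getCodeSource.getLocation
classOf[com.amazonaws.services.s3.transfer.TransferManagerConfiguration].getProtectionDomain.getCodeSource.getLocation

If the second location is an aws-java-sdk jar that doesn't match the version your hadoop-aws jar was built against, that mismatch is the conflict described above.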

Thanks
Best Regards

On Mon, Jul 27, 2015 at 4:02 PM, Schmirr Wurst schmirrwu...@gmail.com
wrote:

 No, with s3a I have the following error:
 java.lang.NoSuchMethodError:

 com.amazonaws.services.s3.transfer.TransferManagerConfiguration.setMultipartUploadThreshold(I)V
 at
 org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:285)

 2015-07-27 11:17 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
  So you are able to access your AWS S3 with s3a now? What is the error
 that
  you are getting when you try to access the custom storage with
  fs.s3a.endpoint?
 
  Thanks
  Best Regards
 
  On Mon, Jul 27, 2015 at 2:44 PM, Schmirr Wurst schmirrwu...@gmail.com
  wrote:
 
  I was able to access Amazon S3, but for some reason the endpoint
  parameter is ignored and I'm not able to access the storage from my
  provider:
 
  sc.hadoopConfiguration.set("fs.s3a.endpoint", "test")
  sc.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", "...")
  sc.hadoopConfiguration.set("fs.s3a.awsSecretAccessKey", "...")
 
  Any Idea why it doesn't work ?
 
  2015-07-20 18:11 GMT+02:00 Schmirr Wurst schmirrwu...@gmail.com:
   Thanks, that is what I was looking for...
  
   Any Idea where I have to store and reference the corresponding
   hadoop-aws-2.6.0.jar ?:
  
   java.io.IOException: No FileSystem for scheme: s3n
  
   2015-07-20 8:33 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
   Not in the uri, but in the hadoop configuration you can specify it.
  
   <property>
     <name>fs.s3a.endpoint</name>
     <description>AWS S3 endpoint to connect to. An up-to-date list is
     provided in the AWS Documentation: regions and endpoints. Without this
     property, the standard region (s3.amazonaws.com) is assumed.
     </description>
   </property>
  
  
   Thanks
   Best Regards
  
   On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst 
 schmirrwu...@gmail.com
   wrote:
  
   I want to use pithos; where can I specify that endpoint, is it
   possible in the URL?
  
   2015-07-19 17:22 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
    Could you name the storage service that you are using? Most of them
    provide an S3-like REST API endpoint for you to hit.
   
Thanks
Best Regards
   
On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst
schmirrwu...@gmail.com
wrote:
   
Hi,
   
    I wonder how to use S3-compatible storage in Spark?
    If I'm using the s3n:// URL schema, then it will point to Amazon; is
    there a way I can specify the host somewhere?
   
   
   
 -
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
   
   
  
  
 -
   To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
   For additional commands, e-mail: user-h...@spark.apache.org
  
  
 
 



Re: use S3-Compatible Storage with spark

2015-07-27 Thread Akhil Das
So you are able to access your AWS S3 with s3a now? What is the error that
you are getting when you try to access the custom storage with
fs.s3a.endpoint?

Thanks
Best Regards

On Mon, Jul 27, 2015 at 2:44 PM, Schmirr Wurst schmirrwu...@gmail.com
wrote:

 I was able to access Amazon S3, but for some reason the endpoint
 parameter is ignored and I'm not able to access the storage from my
 provider:

 sc.hadoopConfiguration.set("fs.s3a.endpoint", "test")
 sc.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", "...")
 sc.hadoopConfiguration.set("fs.s3a.awsSecretAccessKey", "...")

 Any Idea why it doesn't work ?

 2015-07-20 18:11 GMT+02:00 Schmirr Wurst schmirrwu...@gmail.com:
  Thanks, that is what I was looking for...
 
  Any Idea where I have to store and reference the corresponding
  hadoop-aws-2.6.0.jar ?:
 
  java.io.IOException: No FileSystem for scheme: s3n
 
  2015-07-20 8:33 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
  Not in the uri, but in the hadoop configuration you can specify it.
 
  <property>
    <name>fs.s3a.endpoint</name>
    <description>AWS S3 endpoint to connect to. An up-to-date list is
    provided in the AWS Documentation: regions and endpoints. Without this
    property, the standard region (s3.amazonaws.com) is assumed.
    </description>
  </property>
 
 
  Thanks
  Best Regards
 
  On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst schmirrwu...@gmail.com
  wrote:
 
  I want to use pithos; where can I specify that endpoint, is it
  possible in the URL?
 
  2015-07-19 17:22 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
   Could you name the storage service that you are using? Most of them
   provide an S3-like REST API endpoint for you to hit.
  
   Thanks
   Best Regards
  
   On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst 
 schmirrwu...@gmail.com
   wrote:
  
   Hi,
  
   I wonder how to use S3-compatible storage in Spark?
   If I'm using the s3n:// URL schema, then it will point to Amazon; is
   there a way I can specify the host somewhere?
  
  
 -
   To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
   For additional commands, e-mail: user-h...@spark.apache.org
  
  
 
  -
  To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
  For additional commands, e-mail: user-h...@spark.apache.org
 
 



Re: use S3-Compatible Storage with spark

2015-07-27 Thread Schmirr Wurst
I was able to access Amazon S3, but for some reason the endpoint
parameter is ignored and I'm not able to access the storage from my
provider:

sc.hadoopConfiguration.set("fs.s3a.endpoint", "test")
sc.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", "...")
sc.hadoopConfiguration.set("fs.s3a.awsSecretAccessKey", "...")

Any Idea why it doesn't work ?

2015-07-20 18:11 GMT+02:00 Schmirr Wurst schmirrwu...@gmail.com:
 Thanks, that is what I was looking for...

 Any Idea where I have to store and reference the corresponding
 hadoop-aws-2.6.0.jar ?:

 java.io.IOException: No FileSystem for scheme: s3n

 2015-07-20 8:33 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
 Not in the uri, but in the hadoop configuration you can specify it.

 <property>
   <name>fs.s3a.endpoint</name>
   <description>AWS S3 endpoint to connect to. An up-to-date list is
   provided in the AWS Documentation: regions and endpoints. Without this
   property, the standard region (s3.amazonaws.com) is assumed.
   </description>
 </property>


 Thanks
 Best Regards

 On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst schmirrwu...@gmail.com
 wrote:

  I want to use pithos; where can I specify that endpoint, is it
  possible in the URL?

 2015-07-19 17:22 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
  Could you name the storage service that you are using? Most of them
  provide an S3-like REST API endpoint for you to hit.
 
  Thanks
  Best Regards
 
  On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst schmirrwu...@gmail.com
  wrote:
 
  Hi,
 
  I wonder how to use S3-compatible storage in Spark?
  If I'm using the s3n:// URL schema, then it will point to Amazon; is there
  a way I can specify the host somewhere?
 
  -
  To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
  For additional commands, e-mail: user-h...@spark.apache.org
 
 

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org



-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: use S3-Compatible Storage with spark

2015-07-22 Thread Schmirr Wurst
I was able to get a little further:
- installed spark-1.4.1-without-hadoop
- unpacked hadoop 2.7.1
- added the following to spark-env.sh

HADOOP_HOME=/opt/hadoop-2.7.1/
SPARK_DIST_CLASSPATH=/opt/hadoop-2.7.1/opt/hadoop-2.7.1/share/hadoop/tools/lib/*/share/hadoop/tools/lib/*:/opt/hadoop-2.7.1/etc/hadoop:/opt/hadoop-2.7.1/share/hadoop/common/lib/*:/opt/had$

and start spark-shell with :
bin/spark-shell --jars
/opt/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-aws-2.7.1.jar

Now spark-shell is starting with
spark.SparkContext: Added JAR
file:/opt/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-aws-2.7.1.jar at
http://185.19.29.91:46368/jars/hadoop-aws-2.7.1.jar with timestamp
1437575186830

But when trying to access S3 I get:
java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem:
Provider org.apache.hadoop.fs.s3a.S3AFileSystem could not be
instantiated

In fact it doesn't even matter whether I try to use s3n or s3a; the
error is the same (strange!)

2015-07-22 12:19 GMT+02:00 Thomas Demoor thomas.dem...@hgst.com:
 You need to get the hadoop-aws.jar from hadoop-tools (use hadoop 2.7+) - you
 can get the source and build it with mvn, or take it from a prebuilt hadoop
 distro. Then when you run your spark job, add --jars path/to/thejar

 
 From: Schmirr Wurst schmirrwu...@gmail.com
 Sent: Wednesday, July 22, 2015 12:06 PM
 To: Thomas Demoor
 Subject: Re: use S3-Compatible Storage with spark

 Hi Thomas, thanks, could you just tell me what exactly I need to do?
 I'm not familiar with Java programming.
 - where do I get the jar from, do I need to compile it with mvn ?
 - where should I update the classpath, and how ?



 2015-07-22 11:55 GMT+02:00 Thomas Demoor thomas.dem...@hgst.com:
 The classes are not found. Is the jar on your classpath?

 Take care: there are multiple S3 connectors in Hadoop: the legacy s3n, based
 on a 3rd-party S3 lib (Jets3t), and the recent (functional since Hadoop 2.7)
 s3a, based on the Amazon SDK. Make sure you stick to one: use fs.s3a.endpoint
 with the URL s3a://bucket/object, or fs.s3n.endpoint with s3n://bucket/object.
 I recommend s3a but I'm biased :P
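
 A minimal sketch of that stick-to-s3a route in spark-shell (the endpoint and
 bucket below are hypothetical; fs.s3a.access.key and fs.s3a.secret.key are the
 credential property names used by hadoop-aws 2.7):

 // s3a properties paired with an s3a:// URL; fs.s3n.* settings have no effect here
 sc.hadoopConfiguration.set("fs.s3a.endpoint", "storage.example.com")  // hypothetical endpoint
 sc.hadoopConfiguration.set("fs.s3a.access.key", "...")                // redacted
 sc.hadoopConfiguration.set("fs.s3a.secret.key", "...")
 sc.textFile("s3a://my-bucket/some/prefix").count()                    // hypothetical bucket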

 Regards,
 Thomas

 
 From: Schmirr Wurst schmirrwu...@gmail.com
 Sent: Tuesday, July 21, 2015 11:59 AM
 To: Akhil Das
 Cc: user@spark.apache.org
 Subject: Re: use S3-Compatible Storage with spark

 Which version do you have ?

 - I tried with Spark 1.4.1 for hdp 2.6, but there I had the issue that
 the aws module is somehow not there:
 java.io.IOException: No FileSystem for scheme: s3n
 and the same for s3a:
 java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
 org.apache.hadoop.fs.s3a.S3AFileSystem not found

 - On Spark 1.4.1 for hdp 2.4, the module is there and works out of
 the box for s3n (except for the endpoint),
 but there I get: java.io.IOException: No FileSystem for scheme: s3a

 :-|

 2015-07-21 11:09 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
 Did you try with s3a? It seems it's more like an issue with Hadoop.

 Thanks
 Best Regards

 On Tue, Jul 21, 2015 at 2:31 PM, Schmirr Wurst schmirrwu...@gmail.com
 wrote:

 It seems to work for the credentials, but the endpoint is ignored:
 I've changed it to
 sc.hadoopConfiguration.set("fs.s3n.endpoint", "test.com")

 And I continue to get my data from Amazon; how can that be? (I also
 use s3n in my text url)

 2015-07-21 9:30 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
  You can add the jar in the classpath, and you can set the property like:
 
  sc.hadoopConfiguration.set("fs.s3a.endpoint", "storage.sigmoid.com")
 
 
 
  Thanks
  Best Regards
 
  On Mon, Jul 20, 2015 at 9:41 PM, Schmirr Wurst schmirrwu...@gmail.com
  wrote:
 
  Thanks, that is what I was looking for...
 
  Any Idea where I have to store and reference the corresponding
  hadoop-aws-2.6.0.jar ?:
 
  java.io.IOException: No FileSystem for scheme: s3n
 
  2015-07-20 8:33 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
   Not in the uri, but in the hadoop configuration you can specify it.
  
   <property>
     <name>fs.s3a.endpoint</name>
     <description>AWS S3 endpoint to connect to. An up-to-date list is
     provided in the AWS Documentation: regions and endpoints. Without this
     property, the standard region (s3.amazonaws.com) is assumed.
     </description>
   </property>
  
  
   Thanks
   Best Regards
  
   On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst
   schmirrwu...@gmail.com
   wrote:
  
   I want to use pithos; where can I specify that endpoint, is it
   possible in the URL?
  
   2015-07-19 17:22 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
    Could you name the storage service that you are using? Most of them
    provide an S3-like REST API endpoint for you to hit.
   
Thanks
Best Regards
   
On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst
schmirrwu...@gmail.com
wrote:
   
Hi,
   
 I wonder how to use S3-compatible storage in Spark?
 If I'm using the s3n:// URL schema, then it will point

Re: use S3-Compatible Storage with spark

2015-07-21 Thread Akhil Das
You can add the jar in the classpath, and you can set the property like:

sc.hadoopConfiguration.set("fs.s3a.endpoint", "storage.sigmoid.com")
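
A quick way to sanity-check that the setting is actually in effect (a small sketch reusing the example value above) is to read it back from the same configuration:

// set the endpoint, then read it back to confirm the running context will use it
sc.hadoopConfiguration.set("fs.s3a.endpoint", "storage.sigmoid.com")
sc.hadoopConfiguration.get("fs.s3a.endpoint")   // should echo the value just set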



Thanks
Best Regards

On Mon, Jul 20, 2015 at 9:41 PM, Schmirr Wurst schmirrwu...@gmail.com
wrote:

 Thanks, that is what I was looking for...

 Any Idea where I have to store and reference the corresponding
 hadoop-aws-2.6.0.jar ?:

 java.io.IOException: No FileSystem for scheme: s3n

 2015-07-20 8:33 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
  Not in the uri, but in the hadoop configuration you can specify it.
 
  <property>
    <name>fs.s3a.endpoint</name>
    <description>AWS S3 endpoint to connect to. An up-to-date list is
    provided in the AWS Documentation: regions and endpoints. Without this
    property, the standard region (s3.amazonaws.com) is assumed.
    </description>
  </property>
 
 
  Thanks
  Best Regards
 
  On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst schmirrwu...@gmail.com
  wrote:
 
  I want to use pithos; where can I specify that endpoint, is it
  possible in the URL?
 
  2015-07-19 17:22 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
   Could you name the storage service that you are using? Most of them
   provide an S3-like REST API endpoint for you to hit.
  
   Thanks
   Best Regards
  
   On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst 
 schmirrwu...@gmail.com
   wrote:
  
   Hi,
  
   I wonder how to use S3-compatible storage in Spark?
   If I'm using the s3n:// URL schema, then it will point to Amazon; is there
   a way I can specify the host somewhere?
  
   -
   To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
   For additional commands, e-mail: user-h...@spark.apache.org
  
  
 
  -
  To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
  For additional commands, e-mail: user-h...@spark.apache.org
 
 



Re: use S3-Compatible Storage with spark

2015-07-21 Thread Schmirr Wurst
Which version do you have ?

- I tried with Spark 1.4.1 for hdp 2.6, but there I had the issue that
the aws module is somehow not there:
java.io.IOException: No FileSystem for scheme: s3n
and the same for s3a:
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
org.apache.hadoop.fs.s3a.S3AFileSystem not found

- On Spark 1.4.1 for hdp 2.4, the module is there and works out of
the box for s3n (except for the endpoint),
but there I get: java.io.IOException: No FileSystem for scheme: s3a

:-|

2015-07-21 11:09 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
 Did you try with s3a? It seems it's more like an issue with Hadoop.

 Thanks
 Best Regards

 On Tue, Jul 21, 2015 at 2:31 PM, Schmirr Wurst schmirrwu...@gmail.com
 wrote:

 It seems to work for the credentials, but the endpoint is ignored:
 I've changed it to
 sc.hadoopConfiguration.set("fs.s3n.endpoint", "test.com")

 And I continue to get my data from Amazon; how can that be? (I also
 use s3n in my text url)

 2015-07-21 9:30 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
  You can add the jar in the classpath, and you can set the property like:
 
  sc.hadoopConfiguration.set("fs.s3a.endpoint", "storage.sigmoid.com")
 
 
 
  Thanks
  Best Regards
 
  On Mon, Jul 20, 2015 at 9:41 PM, Schmirr Wurst schmirrwu...@gmail.com
  wrote:
 
  Thanks, that is what I was looking for...
 
  Any Idea where I have to store and reference the corresponding
  hadoop-aws-2.6.0.jar ?:
 
  java.io.IOException: No FileSystem for scheme: s3n
 
  2015-07-20 8:33 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
   Not in the uri, but in the hadoop configuration you can specify it.
  
   <property>
     <name>fs.s3a.endpoint</name>
     <description>AWS S3 endpoint to connect to. An up-to-date list is
     provided in the AWS Documentation: regions and endpoints. Without this
     property, the standard region (s3.amazonaws.com) is assumed.
     </description>
   </property>
  
  
   Thanks
   Best Regards
  
   On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst
   schmirrwu...@gmail.com
   wrote:
  
   I want to use pithos; where can I specify that endpoint, is it
   possible in the URL?
  
   2015-07-19 17:22 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
    Could you name the storage service that you are using? Most of them
    provide an S3-like REST API endpoint for you to hit.
   
Thanks
Best Regards
   
On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst
schmirrwu...@gmail.com
wrote:
   
Hi,
   
 I wonder how to use S3-compatible storage in Spark?
 If I'm using the s3n:// URL schema, then it will point to Amazon; is there
 a way I can specify the host somewhere?
   
   
   
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
   
   
  
  
   -
   To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
   For additional commands, e-mail: user-h...@spark.apache.org
  
  
 
 



-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: use S3-Compatible Storage with spark

2015-07-21 Thread Akhil Das
Did you try with s3a? It seems it's more like an issue with Hadoop.

Thanks
Best Regards

On Tue, Jul 21, 2015 at 2:31 PM, Schmirr Wurst schmirrwu...@gmail.com
wrote:

 It seems to work for the credentials, but the endpoint is ignored:
 I've changed it to sc.hadoopConfiguration.set("fs.s3n.endpoint", "test.com")

 And I continue to get my data from Amazon; how can that be? (I also
 use s3n in my text url)

 2015-07-21 9:30 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
  You can add the jar in the classpath, and you can set the property like:
 
  sc.hadoopConfiguration.set("fs.s3a.endpoint", "storage.sigmoid.com")
 
 
 
  Thanks
  Best Regards
 
  On Mon, Jul 20, 2015 at 9:41 PM, Schmirr Wurst schmirrwu...@gmail.com
  wrote:
 
  Thanks, that is what I was looking for...
 
  Any Idea where I have to store and reference the corresponding
  hadoop-aws-2.6.0.jar ?:
 
  java.io.IOException: No FileSystem for scheme: s3n
 
  2015-07-20 8:33 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
   Not in the uri, but in the hadoop configuration you can specify it.
  
   <property>
     <name>fs.s3a.endpoint</name>
     <description>AWS S3 endpoint to connect to. An up-to-date list is
     provided in the AWS Documentation: regions and endpoints. Without this
     property, the standard region (s3.amazonaws.com) is assumed.
     </description>
   </property>
  
  
   Thanks
   Best Regards
  
   On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst 
 schmirrwu...@gmail.com
   wrote:
  
   I want to use pithos; where can I specify that endpoint, is it
   possible in the URL?
  
   2015-07-19 17:22 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
    Could you name the storage service that you are using? Most of them
    provide an S3-like REST API endpoint for you to hit.
   
Thanks
Best Regards
   
On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst
schmirrwu...@gmail.com
wrote:
   
Hi,
   
 I wonder how to use S3-compatible storage in Spark?
 If I'm using the s3n:// URL schema, then it will point to Amazon; is there
 a way I can specify the host somewhere?
   
   
   
 -
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
   
   
  
   -
   To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
   For additional commands, e-mail: user-h...@spark.apache.org
  
  
 
 



Re: use S3-Compatible Storage with spark

2015-07-21 Thread Schmirr Wurst
It seems to work for the credentials, but the endpoint is ignored:
I've changed it to sc.hadoopConfiguration.set("fs.s3n.endpoint", "test.com")

And I continue to get my data from Amazon; how can that be? (I also
use s3n in my text url)

2015-07-21 9:30 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
 You can add the jar in the classpath, and you can set the property like:

 sc.hadoopConfiguration.set("fs.s3a.endpoint", "storage.sigmoid.com")



 Thanks
 Best Regards

 On Mon, Jul 20, 2015 at 9:41 PM, Schmirr Wurst schmirrwu...@gmail.com
 wrote:

 Thanks, that is what I was looking for...

 Any Idea where I have to store and reference the corresponding
 hadoop-aws-2.6.0.jar ?:

 java.io.IOException: No FileSystem for scheme: s3n

 2015-07-20 8:33 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
  Not in the uri, but in the hadoop configuration you can specify it.
 
  <property>
    <name>fs.s3a.endpoint</name>
    <description>AWS S3 endpoint to connect to. An up-to-date list is
    provided in the AWS Documentation: regions and endpoints. Without this
    property, the standard region (s3.amazonaws.com) is assumed.
    </description>
  </property>
 
 
  Thanks
  Best Regards
 
  On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst schmirrwu...@gmail.com
  wrote:
 
  I want to use pithos; where can I specify that endpoint, is it
  possible in the URL?
 
  2015-07-19 17:22 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
   Could you name the storage service that you are using? Most of them
   provide an S3-like REST API endpoint for you to hit.
  
   Thanks
   Best Regards
  
   On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst
   schmirrwu...@gmail.com
   wrote:
  
   Hi,
  
   I wonder how to use S3-compatible storage in Spark?
   If I'm using the s3n:// URL schema, then it will point to Amazon; is
   there a way I can specify the host somewhere?
  
  
   -
   To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
   For additional commands, e-mail: user-h...@spark.apache.org
  
  
 
  -
  To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
  For additional commands, e-mail: user-h...@spark.apache.org
 
 



-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: use S3-Compatible Storage with spark

2015-07-20 Thread Akhil Das
Not in the uri, but in the hadoop configuration you can specify it.

<property>
  <name>fs.s3a.endpoint</name>
  <description>AWS S3 endpoint to connect to. An up-to-date list is
  provided in the AWS Documentation: regions and endpoints. Without this
  property, the standard region (s3.amazonaws.com) is assumed.
  </description>
</property>
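
The same setting can also be applied programmatically on the SparkContext's Hadoop configuration, which is handy from spark-shell (a one-line sketch with a hypothetical endpoint):

sc.hadoopConfiguration.set("fs.s3a.endpoint", "s3.example-provider.com")  // hypothetical S3-compatible endpoint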


Thanks
Best Regards

On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst schmirrwu...@gmail.com
wrote:

 I want to use pithos; where can I specify that endpoint, is it
 possible in the URL?

 2015-07-19 17:22 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
  Could you name the storage service that you are using? Most of them
  provide an S3-like REST API endpoint for you to hit.
 
  Thanks
  Best Regards
 
  On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst schmirrwu...@gmail.com
  wrote:
 
  Hi,
 
  I wonder how to use S3-compatible storage in Spark?
  If I'm using the s3n:// URL schema, then it will point to Amazon; is there
  a way I can specify the host somewhere?
 
  -
  To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
  For additional commands, e-mail: user-h...@spark.apache.org
 
 

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




Re: use S3-Compatible Storage with spark

2015-07-20 Thread Schmirr Wurst
Thanks, that is what I was looking for...

Any Idea where I have to store and reference the corresponding
hadoop-aws-2.6.0.jar ?:

java.io.IOException: No FileSystem for scheme: s3n

2015-07-20 8:33 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
 Not in the uri, but in the hadoop configuration you can specify it.

 <property>
   <name>fs.s3a.endpoint</name>
   <description>AWS S3 endpoint to connect to. An up-to-date list is
   provided in the AWS Documentation: regions and endpoints. Without this
   property, the standard region (s3.amazonaws.com) is assumed.
   </description>
 </property>


 Thanks
 Best Regards

 On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst schmirrwu...@gmail.com
 wrote:

  I want to use pithos; where can I specify that endpoint, is it
  possible in the URL?

 2015-07-19 17:22 GMT+02:00 Akhil Das ak...@sigmoidanalytics.com:
   Could you name the storage service that you are using? Most of them
   provide an S3-like REST API endpoint for you to hit.
 
  Thanks
  Best Regards
 
  On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst schmirrwu...@gmail.com
  wrote:
 
  Hi,
 
  I wonder how to use S3-compatible storage in Spark?
  If I'm using the s3n:// URL schema, then it will point to Amazon; is there
  a way I can specify the host somewhere?
 
  -
  To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
  For additional commands, e-mail: user-h...@spark.apache.org
 
 

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org



-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: use S3-Compatible Storage with spark

2015-07-19 Thread Akhil Das
Could you name the storage service that you are using? Most of them
provide an S3-like REST API endpoint for you to hit.

Thanks
Best Regards

On Fri, Jul 17, 2015 at 2:06 PM, Schmirr Wurst schmirrwu...@gmail.com
wrote:

 Hi,

 I wonder how to use S3-compatible storage in Spark?
 If I'm using the s3n:// URL schema, then it will point to Amazon; is there
 a way I can specify the host somewhere?

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




Re: use S3-Compatible Storage with spark

2015-07-17 Thread Sujit Pal
Hi Schmirr,

The part after the s3n:// is your bucket name and folder name, i.e.
s3n://${bucket_name}/${folder_name}[/${subfolder_name}]*. Bucket names are
unique across S3, so the resulting path is also unique. There is no concept
of hostname in S3 URLs as far as I know.
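
For example (a minimal sketch with a hypothetical bucket and prefix):

// the URL names only a bucket and a key prefix; there is no host component to point elsewhere
val logs = sc.textFile("s3n://my-bucket/logs/2015/07/*")  // hypothetical bucket/prefix
logs.count()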

-sujit


On Fri, Jul 17, 2015 at 1:36 AM, Schmirr Wurst schmirrwu...@gmail.com
wrote:

 Hi,

 I wonder how to use S3-compatible storage in Spark?
 If I'm using the s3n:// URL schema, then it will point to Amazon; is there
 a way I can specify the host somewhere?

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




Re: use S3-Compatible Storage with spark

2015-07-17 Thread Ankur Chauhan
The endpoint is the property you want to set. I would look at the source for 
that.

Sent from my iPhone

 On Jul 17, 2015, at 08:55, Sujit Pal sujitatgt...@gmail.com wrote:
 
 Hi Schmirr,
 
  The part after the s3n:// is your bucket name and folder name, i.e.
  s3n://${bucket_name}/${folder_name}[/${subfolder_name}]*. Bucket names are
  unique across S3, so the resulting path is also unique. There is no concept
  of hostname in S3 URLs as far as I know.
 
 -sujit
 
 
 On Fri, Jul 17, 2015 at 1:36 AM, Schmirr Wurst schmirrwu...@gmail.com 
 wrote:
 Hi,
 
 I wonder how to use S3-compatible storage in Spark?
 If I'm using the s3n:// URL schema, then it will point to Amazon; is there
 a way I can specify the host somewhere?
 
 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org