Re: use S3-Compatible Storage with spark
Hi, recompiled and retried; now it's failing like this with s3a:

  com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain

S3n is working fine (the only remaining problem is still the endpoint).
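A plausible cause of that credentials error, as a side note: the S3A connector does not read the s3n-style key names. In Hadoop's S3A the credential properties are fs.s3a.access.key and fs.s3a.secret.key, so a sketch along these lines (the values are placeholders, not real keys) should feed the provider chain:

  // hedged sketch: S3A credential property names; values are placeholders
  sc.hadoopConfiguration.set("fs.s3a.access.key", "YOUR_ACCESS_KEY")
  sc.hadoopConfiguration.set("fs.s3a.secret.key", "YOUR_SECRET_KEY")
  // s3n, by contrast, uses fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey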
Re: use S3-Compatible Storage with spark
With s3n, try this out:

  s3service.s3-endpoint
  The host name of the S3 service. You should only ever change this value from the default if you need to contact an alternative S3 endpoint for testing purposes. Default: s3.amazonaws.com

Thanks, Best Regards
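Worth noting: s3service.s3-endpoint is a JetS3t property, and JetS3t reads it from a jets3t.properties file on the classpath rather than from the Hadoop configuration, so setting it via sc.hadoopConfiguration will likely have no effect. A hedged sketch (the host name is illustrative):

  # jets3t.properties, placed on the driver and executor classpath
  s3service.s3-endpoint=storage.example.com
  # disable if the target store only speaks plain HTTP
  s3service.https-only=true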
Re: use S3-Compatible Storage with spark
I tried those 3 possibilities, and everything works exactly as before, i.e. the endpoint param has no effect:

  sc.hadoopConfiguration.set("s3service.s3-endpoint", "test")
  sc.hadoopConfiguration.set("fs.s3n.endpoint", "test")
  sc.hadoopConfiguration.set("fs.s3n.s3-endpoint", "test")
Re: use S3-Compatible Storage with spark
No, with s3a I get the following error:

  java.lang.NoSuchMethodError: com.amazonaws.services.s3.transfer.TransferManagerConfiguration.setMultipartUploadThreshold(I)V
      at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:285)
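That particular NoSuchMethodError usually indicates an AWS SDK version mismatch: hadoop-aws 2.7.x is built against aws-java-sdk 1.7.4, where setMultipartUploadThreshold takes an int, while later SDK versions changed the parameter to a long. A hedged sketch of pinning the matching pair shipped with Hadoop (paths are illustrative):

  bin/spark-shell --jars /opt/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-aws-2.7.1.jar,/opt/hadoop-2.7.1/share/hadoop/tools/lib/aws-java-sdk-1.7.4.jar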
Re: use S3-Compatible Storage with spark
That error is a jar conflict; you must have multiple versions of the hadoop/aws jars on the classpath. First make sure you can access your AWS S3 with s3a, then set the endpoint configuration and try to access the custom storage. Thanks, Best Regards
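To confirm which jar a suspect class is actually loaded from, a quick diagnostic in spark-shell (a sketch, nothing more):

  // print the jar that supplied the conflicting AWS class
  val c = Class.forName("com.amazonaws.services.s3.transfer.TransferManagerConfiguration")
  println(c.getProtectionDomain.getCodeSource.getLocation)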
Re: use S3-Compatible Storage with spark
So you are able to access your AWS S3 with s3a now? What is the error that you are getting when you try to access the custom storage with fs.s3a.endpoint? Thanks, Best Regards
Re: use S3-Compatible Storage with spark
I was able to access Amazon S3, but for some reason the endpoint parameter is ignored and I'm not able to access the storage from my provider:

  sc.hadoopConfiguration.set("fs.s3a.endpoint", "test")
  sc.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", "")
  sc.hadoopConfiguration.set("fs.s3a.awsSecretAccessKey", "")

Any idea why it doesn't work?
Re: use S3-Compatible Storage with spark
I could get a little further:
- installed spark-1.4.1-without-hadoop
- unpacked hadoop 2.7.1
- added the following in spark-env.sh:

  HADOOP_HOME=/opt/hadoop-2.7.1/
  SPARK_DIST_CLASSPATH=/opt/hadoop-2.7.1/share/hadoop/tools/lib/*:/opt/hadoop-2.7.1/etc/hadoop:/opt/hadoop-2.7.1/share/hadoop/common/lib/*:/opt/had$

and started spark-shell with:

  bin/spark-shell --jars /opt/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-aws-2.7.1.jar

Now spark-shell starts with:

  spark.SparkContext: Added JAR file:/opt/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-aws-2.7.1.jar at http://185.19.29.91:46368/jars/hadoop-aws-2.7.1.jar with timestamp 1437575186830

But when trying to access S3 I get:

  java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.s3a.S3AFileSystem could not be instantiated

In fact it doesn't even matter whether I try to use s3n or s3a, the error is the same (strange!)

2015-07-22 12:19 GMT+02:00 Thomas Demoor thomas.dem...@hgst.com:
You need to get the hadoop-aws.jar from hadoop-tools (use hadoop 2.7+). You can get the source and build it with mvn, or take it from a prebuilt hadoop distro. Then when you run your spark job, add --jars path/to/thejar

From: Schmirr Wurst schmirrwu...@gmail.com, Wednesday, July 22, 2015 12:06 PM:
Hi Thomas, thanks, could you just tell me what exactly I need to do? I'm not familiar with Java programming:
- where do I get the jar from, do I need to compile it with mvn?
- where should I update the classpath, and how?

2015-07-22 11:55 GMT+02:00 Thomas Demoor thomas.dem...@hgst.com:
The classes are not found. Is the jar on your classpath? Take care: there are multiple S3 connectors in hadoop: the legacy s3n, based on the 3rd-party S3 library JetS3t, and the recent (functional since hadoop 2.7) s3a, based on the Amazon SDK. Make sure you stick to one: use fs.s3a.endpoint and the url s3a://bucket/object, or fs.s3n.endpoint and s3n://bucket/object. I recommend s3a, but I'm biased :P Regards, Thomas
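For the hadoop-free build, letting Hadoop compute its own classpath is usually less fragile than hand-writing SPARK_DIST_CLASSPATH, and a ServiceConfigurationError like the one above often wraps a NoClassDefFoundError for a missing transitive dependency such as the AWS SDK. A hedged spark-env.sh sketch, assuming Hadoop is unpacked at /opt/hadoop-2.7.1 as in the message above:

  # spark-env.sh (sketch): derive the classpath from the hadoop launcher
  export HADOOP_HOME=/opt/hadoop-2.7.1
  export SPARK_DIST_CLASSPATH=$(${HADOOP_HOME}/bin/hadoop classpath)
  # hadoop-aws and the aws-java-sdk still need to be supplied, e.g. via --jars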
Re: use S3-Compatible Storage with spark
You can add the jar to the classpath, and you can set the property like:

  sc.hadoopConfiguration.set("fs.s3a.endpoint", "storage.sigmoid.com")

Thanks, Best Regards
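To make the combination concrete, a minimal end-to-end sketch (endpoint, keys and bucket are made-up values):

  // point s3a at a non-Amazon endpoint, then read through the same scheme
  sc.hadoopConfiguration.set("fs.s3a.endpoint", "storage.example.com")
  sc.hadoopConfiguration.set("fs.s3a.access.key", "YOUR_ACCESS_KEY")
  sc.hadoopConfiguration.set("fs.s3a.secret.key", "YOUR_SECRET_KEY")
  val lines = sc.textFile("s3a://my-bucket/path/to/data.txt")
  println(lines.count())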
Re: use S3-Compatible Storage with spark
Which version do you have?

- I tried with the spark 1.4.1 build for hadoop 2.6, but there the aws module is somehow missing:

  java.io.IOException: No FileSystem for scheme: s3n

  and the same for s3a:

  java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found

- On the spark 1.4.1 build for hadoop 2.4, the module is there and works out of the box for s3n (except for the endpoint), but I get:

  java.io.IOException: No FileSystem for scheme: s3a :-|
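One guess for the "No FileSystem for scheme: s3a" case: besides having the hadoop-aws jar on the classpath, some setups need the scheme explicitly mapped to its implementation class. A sketch:

  // explicitly register the s3a filesystem implementation for the scheme
  sc.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")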
Re: use S3-Compatible Storage with spark
Did you try with s3a? It seems more like an issue with hadoop. Thanks, Best Regards
Re: use S3-Compatible Storage with spark
It seems to work for the credentials, but the endpoint is ignored: I've changed it to

  sc.hadoopConfiguration.set("fs.s3n.endpoint", "test.com")

and I continue to get my data from Amazon. How can that be? (I also use s3n in my text url)
Re: use S3-Compatible Storage with spark
Not in the uri, but you can specify it in the hadoop configuration:

  <property>
    <name>fs.s3a.endpoint</name>
    <description>AWS S3 endpoint to connect to. An up-to-date list is provided in the AWS Documentation: regions and endpoints. Without this property, the standard region (s3.amazonaws.com) is assumed.</description>
  </property>

Thanks, Best Regards

On Sun, Jul 19, 2015 at 9:13 PM, Schmirr Wurst schmirrwu...@gmail.com wrote:
I want to use pithos; where can I specify that endpoint, is it possible in the url?
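For reference, that snippet goes inside the <configuration> element of core-site.xml; with a concrete value filled in it would look roughly like this (the endpoint is illustrative):

  <configuration>
    <property>
      <name>fs.s3a.endpoint</name>
      <value>storage.example.com</value>
    </property>
  </configuration>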
Re: use S3-Compatible Storage with spark
Thanks, that is what I was looking for... Any idea where I have to store and reference the corresponding hadoop-aws-2.6.0.jar? I get:

  java.io.IOException: No FileSystem for scheme: s3n
Re: use S3-Compatible Storage with spark
Could you name the storage service that you are using? Most of them provide an S3-like REST API endpoint for you to hit. Thanks, Best Regards
Re: use S3-Compatible Storage with spark
Hi Schmirr,

The part after the s3n:// is your bucket name and folder name, i.e. s3n://${bucket_name}/${folder_name}[/${subfolder_name}]*. Bucket names are unique across S3, so the resulting path is also unique. There is no concept of a hostname in s3 urls as far as I know.

-sujit

On Fri, Jul 17, 2015 at 1:36 AM, Schmirr Wurst schmirrwu...@gmail.com wrote:
Hi, I wonder how to use S3-compatible storage in Spark. If I'm using the s3n:// url schema, then it will point to Amazon; is there a way I can specify the host somewhere?
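In other words, the path encodes only a bucket and a key, with no host component; a typical read looks like this (bucket and path are made up):

  // s3n url: scheme://bucket/key
  val rdd = sc.textFile("s3n://my-bucket/my-folder/part-00000")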
Re: use S3-Compatible Storage with spark
The endpoint is the property you want to set. I would look at the source for that.

Sent from my iPhone