[ https://issues.apache.org/jira/browse/SPARK-15965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332369#comment-15332369 ]
Sean Owen commented on SPARK-15965:
-----------------------------------

CC [~steve_l], but I think this is a classpath issue or a Hadoop issue, not a Spark one.

> No FileSystem for scheme: s3n or s3a with spark-2.0.0 and spark-1.6.1
> ---------------------------------------------------------------------
>
>                 Key: SPARK-15965
>                 URL: https://issues.apache.org/jira/browse/SPARK-15965
>             Project: Spark
>          Issue Type: Bug
>          Components: Build
>    Affects Versions: 1.6.1
>        Environment: Debian GNU/Linux 8
>                     java version "1.7.0_79"
>           Reporter: thauvin damien
>  Original Estimate: 8h
> Remaining Estimate: 8h
>
> The Spark programming guide explains that Spark can create distributed
> datasets on Amazon S3. But since the pre-built "Hadoop 2.6" packages, S3
> access does not work with either s3n or s3a:
>
> sc.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", "XXXZZZHHH")
> sc.hadoopConfiguration.set("fs.s3a.awsSecretAccessKey", "xxxxxxxxxxxxxxxxxxxxxxxxxxx")
> val lines = sc.textFile("s3a://poc-XXX/access/2016/02/20160201202001_xxx.log.gz")
>
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
> org.apache.hadoop.fs.s3a.S3AFileSystem not found
>
> This happens with every version of Spark: spark-1.3.1, spark-1.6.1, and even
> spark-2.0.0 with Hadoop 2.7.2. I understand this is a Hadoop issue
> (SPARK-7442), but could you add some documentation explaining which jars we
> need to add, and where, for a standalone installation? Are
> hadoop-aws-x.x.x.jar and aws-java-sdk-x.x.x.jar enough? Which environment
> variables do we need to set, and which files do we need to modify? Is it
> "$CLASSPATH", or the "spark.driver.extraClassPath" and
> "spark.executor.extraClassPath" variables in "spark-defaults.conf"?
> Note that it still works with spark-1.6.1 pre-built with hadoop2.4.
> Thanks

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
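As a minimal sketch of the classpath approach the reporter asks about (the jar paths and versions below are illustrative assumptions, not a confirmed fix): both jars can be placed on the driver and executor classpaths via spark-defaults.conf, matching the hadoop-aws version to the Hadoop version the Spark build targets.

```
# spark-defaults.conf -- paths and versions are illustrative;
# match hadoop-aws to the Hadoop version of the Spark build
spark.driver.extraClassPath    /opt/spark/extra/hadoop-aws-2.7.2.jar:/opt/spark/extra/aws-java-sdk-1.7.4.jar
spark.executor.extraClassPath  /opt/spark/extra/hadoop-aws-2.7.2.jar:/opt/spark/extra/aws-java-sdk-1.7.4.jar
```

Separately, note that the S3A connector reads the properties `fs.s3a.access.key` and `fs.s3a.secret.key`; the `fs.s3a.awsAccessKeyId`/`fs.s3a.awsSecretAccessKey` names used in the snippet above follow the older s3n-style naming and are ignored by S3AFileSystem, so they would need to be changed even once the ClassNotFoundException is resolved.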