[ https://issues.apache.org/jira/browse/SPARK-12557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-12557.
-------------------------------
    Resolution: Not A Problem

> Spark 1.5.1 is unable to read S3 file system (Java exception -
> s3a.S3AFileSystem not found)
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-12557
>                 URL: https://issues.apache.org/jira/browse/SPARK-12557
>             Project: Spark
>          Issue Type: Bug
>          Components: EC2, PySpark
>    Affects Versions: 1.5.1
>         Environment: AWS (EC2) instances + S3 + Hadoop CDH
>            Reporter: Chiragkumar
>
> Hello Technical Support team,
>
> This is a critical production issue we are facing on Spark 1.5.1: it throws the
> Java runtime exception "Class org.apache.hadoop.fs.s3a.S3AFileSystem not found",
> although the same query works on Spark 1.3.1. Is this a known issue in Spark
> 1.5.1? I have opened a case with Cloudera for CDH, but they do not fully support
> this yet. Our end users now rely heavily on spark-shell (Scala) to run their
> HQL, and most of our datasets live in an S3 bucket. Note that there is no error
> when the same dataset is read from HDFS, so the problem appears to be related to
> my Spark configuration or something similar. Please help identify the root cause
> and a solution.
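The "Not A Problem" resolution reflects that, starting with Hadoop 2.6, the S3A filesystem classes live in the separate hadoop-aws module, which Spark does not put on its classpath by default; the usual remedy is to supply that module (and the AWS SDK it depends on) when launching the shell. A minimal sketch, with the caveat that the version numbers 2.6.0 and 1.7.4 and the jar paths below are assumptions and must match the Hadoop version your Spark/CDH build was compiled against:

```shell
# Sketch: put the S3A filesystem classes on the classpath for spark-shell.
# The versions are assumptions; align them with `hadoop version` on your cluster.
spark-shell --packages org.apache.hadoop:hadoop-aws:2.6.0

# Or point at jars already shipped with your Hadoop/CDH installation
# (paths are placeholders):
spark-shell --jars /path/to/hadoop-aws-2.6.0.jar,/path/to/aws-java-sdk-1.7.4.jar
```

Either form makes org.apache.hadoop.fs.s3a.S3AFileSystem loadable on both the driver and the executors, which is why reads from HDFS succeed while s3a:// paths fail until the extra jars are added.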
> Following is the more technical info for review:
>
>     scala> val rdf1 = sqlContext.sql("Select * from ntcom.nc_currency_dim").collect()
>     rdf1: Array[org.apache.spark.sql.Row] = Array([-1,UNK,UNKNOWN,UNKNOWN,0.74,1.35,1.0,1.0,DBUDAL,11-JUN-2014 20:36:41,JHOSLE,2008-03-26 00:00:00.0,105.0,6.1,2014-06-11 20:36:41,2015-07-08 22:10:02,N], [-1,UNK,UNKNOWN,UNKNOWN,1.0,1.0,1.0,1.0,PDHAVA,08-JUL-2015 22:10:03,JHOSLE,2008-03-26 00:00:00.0,null,null,2015-07-08 22:10:03,3000-01-01 00:00:00,Y], [1,DKK,Danish Krone,Danish Krone,0.13,7.46,0.180965147453,5.53,DBUDAL,11-JUN-2014 20:36:41,NCBATCH,2007-01-16 00:00:00.0,19.0,1.1,2014-06-11 20:36:41,2015-07-08 22:10:02,N], [1,DKK,Danish Krone,Danish Krone,0.134048257372654,7.46,0.134048257372654,7.46,PDHAVA,08-JUL-2015 22:10:03,NCBATCH,2007-01-16 00:00:00.0,null,null,2015-07-08 22:10:03,3000-01-01 00:00:00,Y], [2,EUR,Euro,EMU currency (Euro),1.0,1.0,1.35,0.74,DBUDAL,11-JUN-2014 20:36:41,NCBA...
>
>     rdf1 = sqlContext.sql("Select * from dev_ntcom.nc_currency_dim").collect()
>     java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
>         at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2578)
>         at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$getTableOption$1$$anonfun$2.apply(ClientWrapper.scala:303)
>         at scala.Option.map(Option.scala:145)
>     Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
>         at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1980)
>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2072)
>         ... 120 more
>     15/11/05 20:31:01 ERROR log: error in initSerDe: org.apache.hadoop.hive.serde2.SerDeException Encountered exception determining schema. Returning signal schema to indicate problem: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
>     org.apache.hadoop.hive.serde2.SerDeException: Encountered exception determining schema. Returning signal schema to indicate problem: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs
>         at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:524)
>         at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:391)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org