On Wed, May 26, 2021 at 12:31 AM, Eric Beabes <mailinglist...@gmail.com> wrote:
> I keep getting the following exception when I try to read a Parquet
> file from a path on S3 in Spark/Scala. Note: I am running this on EMR.
>
> java.lang.NullPointerException
>         at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:144)
>         at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:142)
>         at org.apache.spark.sql.DataFrameReader.<init>(DataFrameReader.scala:789)
>         at org.apache.spark.sql.SparkSession.read(SparkSession.scala:656)
>
> Interestingly, I can read the same path from the Spark shell:
>
> scala> val df = spark.read.parquet("s3://my-path/").count
> df: Long = 47
>
> I've created the SparkSession as follows:
>
> val sparkConf = new SparkConf().setAppName("My spark app")
> val spark = SparkSession.builder.config(sparkConf).enableHiveSupport().getOrCreate()
> spark.sparkContext.setLogLevel("WARN")
> spark.sparkContext.hadoopConfiguration.set("java.library.path",
>   "/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native")
> spark.conf.set("spark.sql.parquet.mergeSchema", "true")
> spark.conf.set("spark.speculation", "false")
> spark.conf.set("spark.sql.crossJoin.enabled", "true")
> spark.conf.set("spark.sql.sources.partitionColumnTypeInference.enabled", "true")
> spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
> spark.sparkContext.hadoopConfiguration.set("mapreduce.fileoutputcommitter.algorithm.version", "2")
> spark.sparkContext.hadoopConfiguration.setBoolean("mapreduce.fileoutputcommitter.cleanup.skipped", true)
> spark.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", System.getenv("AWS_ACCESS_KEY_ID"))
> spark.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", System.getenv("AWS_SECRET_ACCESS_KEY"))
> spark.sparkContext.hadoopConfiguration.set("fs.s3a.endpoint", "s3.amazonaws.com")
>
> Here's the line where I am getting this exception:
>
> val df1 = spark.read.parquet(pathToRead)
>
> What am I doing wrong? I have also tried without setting the access key
> and secret key, with no luck.
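>
> For completeness, this is roughly what the credential-free variant I tried
> looked like (a minimal sketch; I'm assuming that when the fs.s3a.* keys are
> not set, the default EMR instance-profile credentials get picked up for the
> S3 path):
>
> import org.apache.spark.SparkConf
> import org.apache.spark.sql.SparkSession
>
> // Same builder as above, but with no fs.s3a.access.key / fs.s3a.secret.key
> // settings; on EMR the instance profile should supply the credentials.
> val sparkConf = new SparkConf().setAppName("My spark app")
> val spark = SparkSession.builder.config(sparkConf).enableHiveSupport().getOrCreate()
>
> // pathToRead is the same S3 path as above; this still throws the same NPE.
> val df1 = spark.read.parquet(pathToRead)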