Hi All, I am trying to configure Spark with a MapR Hadoop cluster. I built Spark 2.0 from source with the hadoop-provided option and then, as per the documentation, pointed spark-env.sh at my Hadoop libraries. However, I get an error while the SessionCatalog is being created; please see the exception stack trace below. The point to note is that the default scheme for MapR is "maprfs://", hence the error.
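For anyone hitting the same thing, a possible workaround (a sketch only, untested against MapR) is to set the warehouse location explicitly so that Spark does not try to create "spark-warehouse" relative to the driver's working directory, which resolves against the cluster's default maprfs:// scheme. The app name and the file:///tmp path below are illustrative assumptions, not values from my setup:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical workaround: point spark.sql.warehouse.dir at a location
// the driver user can actually write to, instead of letting Spark derive
// it from the working directory (which picks up the maprfs:// scheme).
val spark = SparkSession.builder()
  .appName("warehouse-dir-example")   // example name, an assumption
  // file:///tmp/spark-warehouse is only an example; any writable URI works
  .config("spark.sql.warehouse.dir", "file:///tmp/spark-warehouse")
  .getOrCreate()
```

The same setting should also be possible cluster-wide via a `spark.sql.warehouse.dir` entry in spark-defaults.conf, avoiding any code change.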
I can see some fixes went in earlier to solve this problem (https://github.com/apache/spark/pull/13348), but another PR later removed that code (https://github.com/apache/spark/pull/13868/files). If I apply the changes from the first PR, it works perfectly fine. Is this intentional, or is it a bug? If it is intentional, does the user always have to run drivers on a Hadoop cluster node? That might make "some" sense in a production environment, but it is not very helpful during development.

Git revision on my fork: d16f9a0b7c464728d7b11899740908e23820a797

Regards,
Rishitesh Mishra, SnappyData (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra

Exception Stack
=====================================================================
2016-08-29 18:30:17,0869 ERROR JniCommon fs/client/fileclient/cc/jni_MapRClient.cc:2073 Thread: 18258 mkdirs failed for /rishim1/POCs/spark-2.0.1-SNAPSHOT-bin-custom-spark/spar, error 13
org.apache.spark.SparkException: Unable to create database default as failed to create its directory maprfs:///rishim1/POCs/spark-2.0.1-SNAPSHOT-bin-custom-spark/spark-warehouse
  at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.liftedTree1$1(InMemoryCatalog.scala:126)
  at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.createDatabase(InMemoryCatalog.scala:120)
  at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createDatabase(SessionCatalog.scala:147)
  at org.apache.spark.sql.catalyst.catalog.SessionCatalog.<init>(SessionCatalog.scala:89)
  at org.apache.spark.sql.internal.SessionState.catalog$lzycompute(SessionState.scala:95)
  at org.apache.spark.sql.internal.SessionState.catalog(SessionState.scala:95)
  at org.apache.spark.sql.internal.SessionState$$anon$1.<init>(SessionState.scala:112)
  at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:112)
  at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:111)
  at
org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
  at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:382)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:143)
  at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:427)
  at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:411)
  ... 48 elided
Caused by: org.apache.hadoop.security.AccessControlException: User rishim(user id 1000) has been denied access to create spark-warehouse
  at com.mapr.fs.MapRFileSystem.makeDir(MapRFileSystem.java:1239)
  at com.mapr.fs.MapRFileSystem.mkdirs(MapRFileSystem.java:1259)
  at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1913)
  at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.liftedTree1$1(InMemoryCatalog.scala:123)
  ... 62 more