I have stumbled across an interesting (potential) bug.  I have an
environment that runs MapR FS and Mesos.  I've posted a bit in the past
about getting this setup to work with Spark on Mesos, and the MapR and
Spark communities have been helpful.

In 1.4.1, I was able to get Spark working in this setup by setting
"spark.driver.extraClassPath" and "spark.executor.extraClassPath" to
include some specific MapR libraries. That worked, and all was well.
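
For reference, this is roughly the shape of it (shown spark-defaults.conf
style; the /opt/mapr/lib path is only illustrative, the actual list of MapR
jars depends on the install):

# spark-defaults.conf -- illustrative paths, not my exact jar list
spark.driver.extraClassPath    /opt/mapr/lib/*
spark.executor.extraClassPath  /opt/mapr/lib/*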

Fast forward to today: I was trying to work with the "hadoop-provided"
download from the Apache site, and things seemed to work with that setting,
i.e. in pyspark, when I ran this test:

tf = sc.textFile("/path/to/a/file/in/maprfs/file.json")
tf.count()

I didn't get the "No filesystem for scheme maprfs" but instead I got the
NPE on the tasks (NPE listed below). which was really similar to the NPE
issues I was getting on the 1.4.1 when I was not setting the classpath.

So I was frustrated and started playing with settings, as I am apt to do,
and what I realized is that when I changed the Mesos resource allocation to
coarse mode, it worked!

So I did some other tests to isolate this, and sure enough, ONLY changing
the spark.mesos.coarse setting to true, while leaving everything else the
same, fixes the issue. When I set it to false (fine-grained), it failed
with the NPE.
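
In other words, with everything else identical, this one line is what
decides pass/fail for me (same illustrative paths as above):

# everything else unchanged; only the allocation mode differs
spark.mesos.coarse    true
# with spark.mesos.coarse=false (fine-grained), executors die with the NPE below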

So I thought about this, and I did open a case with MapR, but it seems to
me that something odd is happening in how Spark handles this. Perhaps the
classpath information isn't being properly propagated to the tasks in
fine-grained mode, but when I run in coarse-grained mode the executor
properly sees spark.executor.extraClassPath?  Is there a
spark.task.extraClassPath? (I will be googling this as well.)

I am curious about this behavior, and whether it's something that might
point to a bug or if it's just classic uninitiated user error :)

John



NPE in Fine Grained Mode:

15/11/12 13:52:00 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-94b6962b-2c28-4c10-946c-bd3b5c8c8069
15/11/12 13:52:00 INFO storage.MemoryStore: MemoryStore started with capacity 1060.0 MB
java.lang.NullPointerException
        at com.mapr.fs.ShimLoader.getRootClassLoader(ShimLoader.java:109)
        at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:251)
        at com.mapr.fs.ShimLoader.load(ShimLoader.java:213)
        at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:61)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2147)
        at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2362)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2579)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2531)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2444)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1156)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1128)
        at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:107)
        at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:52)
        at org.apache.spark.deploy.SparkHadoopUtil$.hadoop$lzycompute(SparkHadoopUtil.scala:383)
        at org.apache.spark.deploy.SparkHadoopUtil$.hadoop(SparkHadoopUtil.scala:383)
        at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:403)
        at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:2049)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:97)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:173)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:347)
        at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:218)
        at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:70)
java.lang.RuntimeException: Failure loading MapRClient.
        at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:295)
        at com.mapr.fs.ShimLoader.load(ShimLoader.java:213)
        at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:61)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2147)
        at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2362)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2579)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2531)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2444)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1156)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1128)
        at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:107)
        at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:52)
        at org.apache.spark.deploy.SparkHadoopUtil$.hadoop$lzycompute(SparkHadoopUtil.scala:383)
        at org.apache.spark.deploy.SparkHadoopUtil$.hadoop(SparkHadoopUtil.scala:383)
        at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:403)
        at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:2049)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:97)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:173)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:347)
        at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:218)
        at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:70)
Caused by: java.lang.NullPointerException
        at com.mapr.fs.ShimLoader.getRootClassLoader(ShimLoader.java:109)
        at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:251)
        ... 22 more
java.lang.ExceptionInInitializerError
        at com.mapr.fs.ShimLoader.load(ShimLoader.java:233)
        at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:61)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2147)
        at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2362)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2579)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2531)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2444)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1156)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1128)
        at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:107)
        at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:52)
        at org.apache.spark.deploy.SparkHadoopUtil$.hadoop$lzycompute(SparkHadoopUtil.scala:383)
        at org.apache.spark.deploy.SparkHadoopUtil$.hadoop(SparkHadoopUtil.scala:383)
        at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:403)
        at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:2049)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:97)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:173)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:347)
        at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:218)
        at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:70)
Caused by: java.lang.RuntimeException: Failure loading MapRClient.
        at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:295)
        at com.mapr.fs.ShimLoader.load(ShimLoader.java:213)
        ... 21 more
Caused by: java.lang.NullPointerException
        at com.mapr.fs.ShimLoader.getRootClassLoader(ShimLoader.java:109)
        at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:251)
        ... 22 more
Exception in thread "Thread-1" I1112 13:52:01.186414 19602 exec.cpp:416] Deactivating the executor libprocess
15/11/12 13:52:01 INFO storage.DiskBlockManager: Shutdown hook called
15/11/12 13:52:01 INFO util.ShutdownHookManager: Shutdown hook called
