John, I took the liberty of reopening SPARK-1403, since I have sufficient JIRA permissions (not sure if you do). It would be good if you could add your relevant comments/investigations there.
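To make the suspected failure mode concrete, here is a minimal sketch in
Java. It is hypothetical: the com.mapr.fs.ShimLoader source is not shown
in this thread, and the assumption (based on the getRootClassLoader frame
in the trace quoted below) is that it walks up from the thread context
class loader, which appears to be null on the Mesos executor thread.

// Hypothetical reconstruction of the failure, not MapR's actual code.
public class ShimLoaderSketch {

    static ClassLoader getRootClassLoader() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        // If cl is null, the next line throws the same kind of
        // java.lang.NullPointerException seen in the quoted trace.
        while (cl.getParent() != null) {
            cl = cl.getParent();
        }
        return cl;
    }

    public static void main(String[] args) {
        // Simulate an executor thread that arrives with no context
        // class loader set, which is what seems to happen under Mesos.
        Thread.currentThread().setContextClassLoader(null);
        getRootClassLoader(); // NullPointerException
    }
}

Running this throws an NPE from the class-loader walk, which matches the
ShimLoader.getRootClassLoader frame at the top of the trace below.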
On Thu, Jun 11, 2015 at 8:34 AM, John Omernik <j...@omernik.com> wrote:

> Hey all, following up on my other post about Spark 1.3.1 issues, I think
> we found an issue related to a previously closed JIRA
> (https://issues.apache.org/jira/browse/SPARK-1403). Basically, it looks
> like the thread context class loader is null, which is causing the NPE
> in MapR code, similar to what that JIRA describes. New comments have
> been added to that JIRA, but I am not sure how to trace back changes to
> determine why it was null in 0.9, apparently fixed in 1.0, still working
> in 1.2.0, and then broken again from 1.2.2 onward.
>
> Is it possible to reopen a closed JIRA? Should I open another? I think
> MapR is working to handle this in their code, but someone (with more
> knowledge than I have) should probably look into this on the Spark side
> as well, since the behavior appears to have changed between versions.
>
> Thoughts?
>
> John
>
>
> Previous post:
>
> All -
>
> I am facing an odd issue and I am not really sure where to go for
> support at this point. I am running MapR, which complicates things as
> it relates to Mesos; however, this HAS worked in the past with no
> issues, so I am stumped here.
>
> For starters, here is what I am trying to run. It is a simple "show
> tables" using the HiveContext:
>
> from pyspark import SparkContext, SparkConf
> from pyspark.sql import SQLContext, Row, HiveContext
>
> # sc is the SparkContext the pyspark shell provides
> sparkhc = HiveContext(sc)
> test = sparkhc.sql("show tables")
> for r in test.collect():
>     print r
>
> When I run it on 1.3.1 using ./bin/pyspark --master local, this works
> with no issues.
>
> When I run it using Mesos with all the settings configured (as they had
> worked in the past), I get lost tasks, and when I zoom in on them, the
> error being reported is the one below: basically a NullPointerException
> in com.mapr.fs.ShimLoader. What's weird to me is that when I compared
> the two instances, the classpath and everything else were exactly the
> same. Yet running in local mode works, and running on Mesos fails. Also
> of note: when the task is scheduled to run on the very node I run
> locally on, it fails too! (Baffling.)
>
> For comparison, here is how I configured Mesos: I downloaded the mapr4
> package from spark.apache.org and used the exact same configuration
> file as with 1.2.0 (except for changing the executor tgz from 1.2.0 to
> 1.3.1). When I run this example with the mapr4 package for 1.2.0, there
> is no issue on Mesos; everything runs as intended. With the same setup
> for 1.3.1, it fails.
>
> (Also of note: 1.2.1 gives a 404 error, 1.2.2 fails, and 1.3.0 fails as
> well.)
>
> So basically, when I used 1.2.0 and followed a set of steps, it worked
> on Mesos, while 1.3.1 fails. And even though 1.3.1 is a "current"
> version of Spark, MapR supports only 1.2.1 right now. (Still working on
> that.)
>
> I guess I am at a loss on why this would be happening; any pointers on
> where I could look or what I could tweak would be greatly appreciated.
> Additionally, if there is something I could specifically draw to MapR's
> attention on this problem, please let me know. I am perplexed by the
> change in behavior from 1.2.0 to 1.3.1.
>
> Thank you,
>
> John
>
>
> Full error on 1.3.1 on Mesos:
>
> 15/05/19 09:31:26 INFO MemoryStore: MemoryStore started with capacity 1060.3 MB
> java.lang.NullPointerException
>     at com.mapr.fs.ShimLoader.getRootClassLoader(ShimLoader.java:96)
>     at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:232)
>     at com.mapr.fs.ShimLoader.load(ShimLoader.java:194)
>     at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:60)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:274)
>     at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1847)
>     at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2062)
>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2272)
>     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2224)
>     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2141)
>     at org.apache.hadoop.conf.Configuration.set(Configuration.java:992)
>     at org.apache.hadoop.conf.Configuration.set(Configuration.java:966)
>     at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:98)
>     at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:43)
>     at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:220)
>     at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
>     at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1959)
>     at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:104)
>     at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:179)
>     at org.apache.spark.SparkEnv$.create(SparkEnv.scala:310)
>     at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:186)
>     at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:70)
>
> java.lang.RuntimeException: Failure loading MapRClient.
>     at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:283)
>     at com.mapr.fs.ShimLoader.load(ShimLoader.java:194)
>     at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:60)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:274)
>     at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1847)
>     at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2062)
>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2272)
>     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2224)
>     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2141)
>     at org.apache.hadoop.conf.Configuration.set(Configuration.java:992)
>     at org.apache.hadoop.conf.Configuration.set(Configuration.java:966)
>     at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:98)
>     at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:43)
>     at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:220)
>     at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
>     at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1959)
>     at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:104)
>     at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:179)
>     at org.apache.spark.SparkEnv$.create(SparkEnv.scala:310)
>     at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:186)
>     at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:70)
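If that diagnosis is right, two workarounds suggest themselves. Below is
a hypothetical Java helper (ClassLoaderGuard is my name, not anything in
Spark or MapR) showing the usual defensive pattern: never trust the
context class loader to be non-null, and set it explicitly on executor
threads before any Hadoop/MapR class is touched. Setting the loader on
the executor thread is the kind of fix SPARK-1403 dealt with; whether
MapR also adds a null-check fallback on their side is up to them.

// Hypothetical helper, a sketch of the defensive pattern only.
public final class ClassLoaderGuard {

    private ClassLoaderGuard() {}

    // Return the context class loader if set, otherwise fall back to
    // the caller's own loader, then the system loader, so downstream
    // code never dereferences null.
    public static ClassLoader contextOrFallback(Class<?> caller) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl != null) {
            return cl;
        }
        ClassLoader own = caller.getClassLoader();
        return own != null ? own : ClassLoader.getSystemClassLoader();
    }

    public static void main(String[] args) {
        // Simulate the bad state, then repair it the way an executor
        // callback could before initializing Hadoop configuration.
        Thread.currentThread().setContextClassLoader(null);
        Thread.currentThread().setContextClassLoader(
                contextOrFallback(ClassLoaderGuard.class));
        System.out.println(Thread.currentThread().getContextClassLoader());
    }
}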