Re: Spark 1.3.1 On Mesos Issues.
It appears this may be related: https://issues.apache.org/jira/browse/SPARK-1403

Granted, the NPE is in MapR's code, but having Spark switch its behavior (if that's what it is doing; I am not an expert here, just basing this on the comments) probably isn't good either. I guess the level this is happening at is way above my head. :)

On Fri, Jun 5, 2015 at 4:38 PM, John Omernik j...@omernik.com wrote:
Re: Spark 1.3.1 On Mesos Issues.
On 2 Jun 2015, at 00:14, Dean Wampler deanwamp...@gmail.com wrote: It would be nice to see the code for the MapR FS Java API, but my google foo failed me (assuming it's open source)...

I know that MapRFS is closed source; I don't know about the Java JAR. Why not ask Ted Dunning (cc'd) nicely to see if he can track down the stack trace for you?
Re: Spark 1.3.1 On Mesos Issues.
Thanks all. The answers.mapr.com post is me too; I multi-thread. Ted is aware too, and MapR is helping me with it. I shall report the answer of that investigation when we have it.

As to reproduction: I've installed the MapR file system and tried both versions 4.0.2 and 4.1.0. I have Mesos running alongside MapR, and then I use standard methods for submitting Spark jobs to Mesos. I don't have my configs now (on vacation :) but I can share on Monday.

I appreciate the support I am getting from everyone: the Mesos community, the Spark community, and MapR. Great to see folks solving problems, and I will be sure to report back findings as they arise.

On Friday, June 5, 2015, Tim Chen t...@mesosphere.io wrote:
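For reference while John's actual configs are unavailable, the usual Spark-on-Mesos wiring for this kind of setup looks roughly like the sketch below. Every hostname, path, and URI here is a placeholder made up for illustration; only the property names (spark.master and spark.executor.uri) are standard Spark settings.

```
# spark-defaults.conf (illustrative placeholders, not John's actual values)
spark.master         mesos://zk://zk-host:2181/mesos
# Tarball each Mesos executor downloads and unpacks; this is the "executor tgz"
# John switched from 1.2.0 to 1.3.1.
spark.executor.uri   maprfs:///apps/spark/spark-1.3.1-bin-mapr4.tgz
```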
Re: Spark 1.3.1 On Mesos Issues.
It seems like there is another thread going on: http://answers.mapr.com/questions/163353/spark-from-apache-downloads-site-for-mapr.html

I'm not sure why, but it seems the problem is that getting the current context class loader returns null in this instance. Do you have some repro steps or a config we can try this with?

Tim

On Fri, Jun 5, 2015 at 3:40 AM, Steve Loughran ste...@hortonworks.com wrote:
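Tim's diagnosis (that Thread.currentThread().getContextClassLoader() can return null on the executor thread) can be illustrated with a small standalone sketch. ClassLoaderDemo, rootLoader, and safeLoader below are hypothetical names for illustration, not MapR's actual ShimLoader code; the sketch just reproduces the same failure pattern: code that walks the loader chain without a null check throws an NPE, exactly where ShimLoader.getRootClassLoader does.

```java
public class ClassLoaderDemo {
    // Mimics code that assumes getContextClassLoader() is non-null:
    // walk the parent chain to find the root class loader.
    static ClassLoader rootLoader(ClassLoader cl) {
        while (cl.getParent() != null) {   // NPE here if cl is null
            cl = cl.getParent();
        }
        return cl;
    }

    // Defensive variant: fall back to this class's own loader when the
    // thread has no context class loader set.
    static ClassLoader safeLoader() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return (cl != null) ? cl : ClassLoaderDemo.class.getClassLoader();
    }

    public static void main(String[] args) {
        // Simulate the suspected executor state.
        Thread.currentThread().setContextClassLoader(null);
        try {
            rootLoader(Thread.currentThread().getContextClassLoader());
            System.out.println("no NPE");
        } catch (NullPointerException e) {
            System.out.println("NPE, same shape as ShimLoader.getRootClassLoader");
        }
        System.out.println("fallback loader is null? " + (safeLoader() == null));
    }
}
```

Whether Spark 1.3.1 actually changed which threads carry a context class loader is exactly the open question in SPARK-1403; the sketch only shows why a null one would surface as this particular NPE.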
Re: Spark 1.3.1 On Mesos Issues.
So, a few updates. When I run in local mode as stated before, it works fine. When I run on YARN (via Apache Myriad on Mesos) it also runs fine. The only issue is specifically with Mesos. I wonder if there is some sort of classpath issue I need to fix, or something along those lines. Any tips would be appreciated.

Thanks!
John

On Mon, Jun 1, 2015 at 6:14 PM, Dean Wampler deanwamp...@gmail.com wrote:
Spark 1.3.1 On Mesos Issues.
All -

I am facing an odd issue and I am not really sure where to go for support at this point. I am running MapR, which complicates things as it relates to Mesos; however, this HAS worked in the past with no issues, so I am stumped here.

For starters, here is what I am trying to run. This is a simple "show tables" using the Hive context:

    from pyspark import SparkContext, SparkConf
    from pyspark.sql import SQLContext, Row, HiveContext

    sparkhc = HiveContext(sc)
    test = sparkhc.sql("show tables")
    for r in test.collect():
        print r

When I run it on 1.3.1 using ./bin/pyspark --master local, this works with no issues. When I run it using Mesos with all the settings configured (as they had worked in the past), I get lost tasks, and when I zoom in on them, the error being reported is below. Basically it's a NullPointerException in com.mapr.fs.ShimLoader. What's weird to me is that I took each instance and compared the two, and the classpath, everything, is exactly the same. Yet running in local mode works, and running in Mesos fails. Also of note: when the task is scheduled to run on the same node as where I run locally, that fails too! (Baffling.)

For comparison, how I configured Mesos was to download the mapr4 package from spark.apache.org, using the exact same configuration file as for 1.2.0 (except for changing the executor tgz from 1.2.0 to 1.3.1). When I run this example with the mapr4 package for 1.2.0, there is no issue in Mesos; everything runs as intended. Using the same package for 1.3.1, it fails. (Also of note: 1.2.1 gives a 404 error, and 1.2.2 and 1.3.0 fail as well.) So basically, when I used 1.2.0 and followed a set of steps, it worked on Mesos, and 1.3.1 fails. Since this is a current version of Spark, MapR supports 1.2.1 only. (Still working on that.)

I guess I am at a loss right now on why this would be happening; any pointers on where I could look or what I could tweak would be greatly appreciated. Additionally, if there is something I could specifically draw to the attention of MapR on this problem, please let me know; I am perplexed by the change from 1.2.0 to 1.3.1.

Thank you,
John

Full error on 1.3.1 on Mesos:

    15/05/19 09:31:26 INFO MemoryStore: MemoryStore started with capacity 1060.3 MB
    java.lang.NullPointerException
        at com.mapr.fs.ShimLoader.getRootClassLoader(ShimLoader.java:96)
        at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:232)
        at com.mapr.fs.ShimLoader.load(ShimLoader.java:194)
        at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:60)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:274)
        at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1847)
        at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2062)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2272)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2224)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2141)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:992)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:966)
        at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:98)
        at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:43)
        at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:220)
        at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
        at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1959)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:104)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:179)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:310)
        at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:186)
        at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:70)
    java.lang.RuntimeException: Failure loading MapRClient.
        at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:283)
        at com.mapr.fs.ShimLoader.load(ShimLoader.java:194)
        at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:60)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:274)
        at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1847)
        at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2062)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2272)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2224)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2141)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:992)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:966)
        at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:98)
        at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:43)
Re: Spark 1.3.1 On Mesos Issues.
It would be nice to see the code for the MapR FS Java API, but my google foo failed me (assuming it's open source)... So, shooting in the dark ;) there are a few things I would check, if you haven't already:

1. Could there be 1.2 versions of some Spark jars that get picked up at run time (but apparently not in local mode) on one or more nodes? (Side question: does your node experiment fail on all nodes?) Put another way, are the classpaths good for all JVM tasks?
2. Can you use just MapR and Spark 1.3.1 successfully, bypassing Mesos?

Incidentally, how are you combining Mesos and MapR? Are you running Spark in Mesos, but accessing data in MapR-FS? Perhaps the MapR shim library doesn't support Spark 1.3.1.

HTH,
dean

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition http://shop.oreilly.com/product/0636920033073.do (O'Reilly)
Typesafe http://typesafe.com
@deanwampler http://twitter.com/deanwampler
http://polyglotprogramming.com

On Mon, Jun 1, 2015 at 2:49 PM, John Omernik j...@omernik.com wrote:
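Dean's point 1 (stale 1.2.x Spark jars sneaking onto some executors' classpaths) can be checked mechanically. The ClasspathCheck class below is a hypothetical sketch, not part of Spark or MapR: it scans a ':'-separated classpath string for Spark jars whose name lacks the expected version string.

```java
import java.util.Arrays;

public class ClasspathCheck {

    // Return the classpath entries that look like Spark jars but do not
    // carry the expected version string in their file name.
    static String[] mismatches(String classpath, String expectedVersion) {
        return Arrays.stream(classpath.split(":"))
                .filter(p -> p.contains("spark-"))
                .filter(p -> !p.contains(expectedVersion))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // On a real executor you would pass System.getProperty("java.class.path");
        // the sample classpath here is made up for illustration.
        String cp = "/opt/spark/lib/spark-assembly-1.3.1-hadoop2.4.0-mapr-4.0.2.jar"
                  + ":/opt/stale/spark-core_2.10-1.2.0.jar";
        for (String bad : mismatches(cp, "1.3.1")) {
            System.out.println("suspect jar: " + bad);
        }
    }
}
```

Running something like this on each Mesos agent (and comparing against the local-mode JVM) would confirm or rule out the mixed-version theory node by node.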