Re: Spark 1.3.1 On Mesos Issues.

2015-06-08 Thread John Omernik
It appears this may be related.

https://issues.apache.org/jira/browse/SPARK-1403

Granted, the NPE is in MapR's code, but having Spark switch its behavior (if
that's what it is doing; I am not an expert here, just going off the comments)
probably isn't good either. I guess the level that this is happening at is way
above my head.  :)
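
If I'm reading the JIRA right, the usual defensive pattern here (a sketch
only; MapR's code is closed source, so this is an assumption about the kind
of fix, not their implementation) is to fall back when the context class
loader is null:

// Sketch: null-safe class loader lookup for library code, rather than
// assuming Thread.currentThread().getContextClassLoader is always set.
object ClassLoaderUtil {
  def resolve(): ClassLoader = {
    val ctx = Thread.currentThread().getContextClassLoader
    if (ctx != null) ctx else ClassLoaderUtil.getClass.getClassLoader
  }
}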



On Fri, Jun 5, 2015 at 4:38 PM, John Omernik j...@omernik.com wrote:

 [...]

Re: Spark 1.3.1 On Mesos Issues.

2015-06-05 Thread Steve Loughran

On 2 Jun 2015, at 00:14, Dean Wampler deanwamp...@gmail.com wrote:

It would be nice to see the code for the MapR FS Java API, but my google foo
failed me (assuming it's open source)...


I know that MapRFS is closed source; I don't know about the Java JAR. Why not
ask Ted Dunning (cc'd) nicely to see if he can track down the stack trace for
you?

[...]

Re: Spark 1.3.1 On Mesos Issues.

2015-06-05 Thread John Omernik
Thanks all. The answers.mapr.com post is mine too; I multi-thread. Ted is
aware too, and MapR is helping me with it. I shall report the outcome of that
investigation when we have it.

As to reproduction: I've installed the MapR file system (tried both versions
4.0.2 and 4.1.0), have Mesos running alongside MapR, and then use standard
methods for submitting Spark jobs to Mesos. I don't have my configs now (on
vacation :) but I can share them on Monday.
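
By "standard methods" I mean a submission of roughly this shape; the master
URL, host, and tarball location below are placeholders, not my actual config:

./bin/pyspark --master mesos://<mesos-master>:5050 \
  --conf spark.executor.uri=http://<some-host>/spark-1.3.1-bin-mapr4.tgz

# or, equivalently, in conf/spark-defaults.conf:
# spark.master        mesos://<mesos-master>:5050
# spark.executor.uri  http://<some-host>/spark-1.3.1-bin-mapr4.tgz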

I appreciate the support I am getting from everyone: the Mesos community, the
Spark community, and MapR. It is great to see folks solving problems, and I
will be sure to report back findings as they arise.



On Friday, June 5, 2015, Tim Chen t...@mesosphere.io wrote:

 [...]

Re: Spark 1.3.1 On Mesos Issues.

2015-06-05 Thread Tim Chen
It seems like there is another thread going on:

http://answers.mapr.com/questions/163353/spark-from-apache-downloads-site-for-mapr.html

I'm not particularly sure why, but it seems the problem is that getting the
current context class loader returns null in this instance.

Do you have some repro steps or a config we can use to try this?

Tim

On Fri, Jun 5, 2015 at 3:40 AM, Steve Loughran ste...@hortonworks.com
wrote:


 [...]

Re: Spark 1.3.1 On Mesos Issues.

2015-06-04 Thread John Omernik
So, a few updates. When I run local as stated before, it works fine. When I
run on YARN (via Apache Myriad on Mesos), it also runs fine. The only issue is
specifically with Mesos. I wonder if there is some sort of classpath fix I
need to make, or something along those lines. Any tips would be appreciated.
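
To make the classpath question concrete, here is the kind of check I have in
mind (a sketch for spark-shell; the partition count is arbitrary). On Mesos
the executor dies during startup, so realistically this only yields output in
local and YARN modes, which is exactly the comparison I want:

// Ask each executor JVM for its hostname, whether its context class
// loader is null, and its classpath, then de-duplicate the answers.
val report = sc.parallelize(1 to 1000, 100).map { _ =>
  val host = java.net.InetAddress.getLocalHost.getHostName
  val nullCtx = Thread.currentThread().getContextClassLoader == null
  (host, nullCtx, System.getProperty("java.class.path"))
}.distinct().collect()
report.foreach(println)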

Thanks!

John

On Mon, Jun 1, 2015 at 6:14 PM, Dean Wampler deanwamp...@gmail.com wrote:

 [...]

Spark 1.3.1 On Mesos Issues.

2015-06-01 Thread John Omernik
All -

I am facing an odd issue and I am not really sure where to go for support at
this point. I am running MapR, which complicates things as it relates to
Mesos; however, this HAS worked in the past with no issues, so I am stumped
here.

So for starters, here is what I am trying to run. This is a simple
"show tables" using the Hive context:

from pyspark import SparkContext, SparkConf
from pyspark.sql import SQLContext, Row, HiveContext

# sc is provided by the pyspark shell
sparkhc = HiveContext(sc)
test = sparkhc.sql("show tables")
for r in test.collect():
  print r

When I run it on 1.3.1 using ./bin/pyspark --master local, this works with no
issues.

When I run it using Mesos with all the settings configured (as they had
worked in the past), I get lost tasks, and when I zoom in on them, the error
being reported is below. Basically it's a NullPointerException in
com.mapr.fs.ShimLoader. What's weird to me is that I took each instance and
compared the two, and the classpath, everything, is exactly the same. Yet
running in local mode works, and running in Mesos fails. Also of note: when
the task is scheduled to run on the same node I run on locally, it fails too!
(Baffling.)

OK, for comparison: how I configured Mesos was to download the mapr4 package
from spark.apache.org, using the exact same configuration file as for 1.2.0
(except for changing the executor tgz from 1.2.0 to 1.3.1). When I run this
example with the mapr4 package for 1.2.0 there is no issue in Mesos;
everything runs as intended. Using the same package for 1.3.1, it fails.
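
Put another way, the only delta between working and failing is the executor
tarball version; illustratively (assuming the standard spark.executor.uri
property and a placeholder URL, not my literal config):

# working:
# spark.executor.uri  http://<host>/spark-1.2.0-bin-mapr4.tgz
# failing:
# spark.executor.uri  http://<host>/spark-1.3.1-bin-mapr4.tgz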

(Also of note, 1.2.1 gives a 404 error, 1.2.2 fails, and 1.3.0 fails as
well).

So basically: when I used 1.2.0 and followed a set of steps, it worked on
Mesos, while 1.3.1 fails. And although 1.3.1 is a current version of Spark,
MapR supports 1.2.1 only. (Still working on that.)

I guess I am at a loss right now as to why this would be happening; any
pointers on where I could look or what I could tweak would be greatly
appreciated. Additionally, if there is something I could specifically draw to
MapR's attention on this problem, please let me know. I am perplexed by the
change from 1.2.0 to 1.3.1.

Thank you,

John




Full Error on 1.3.1 on Mesos:
15/05/19 09:31:26 INFO MemoryStore: MemoryStore started with capacity 1060.3 MB
java.lang.NullPointerException
  at com.mapr.fs.ShimLoader.getRootClassLoader(ShimLoader.java:96)
  at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:232)
  at com.mapr.fs.ShimLoader.load(ShimLoader.java:194)
  at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:60)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:274)
  at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1847)
  at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2062)
  at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2272)
  at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2224)
  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2141)
  at org.apache.hadoop.conf.Configuration.set(Configuration.java:992)
  at org.apache.hadoop.conf.Configuration.set(Configuration.java:966)
  at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:98)
  at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:43)
  at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:220)
  at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
  at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1959)
  at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:104)
  at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:179)
  at org.apache.spark.SparkEnv$.create(SparkEnv.scala:310)
  at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:186)
  at org.apache.spark.executor.MesosExecutorBackend.registered(MesosExecutorBackend.scala:70)
java.lang.RuntimeException: Failure loading MapRClient.
  at com.mapr.fs.ShimLoader.injectNativeLoader(ShimLoader.java:283)
  at com.mapr.fs.ShimLoader.load(ShimLoader.java:194)
  at org.apache.hadoop.conf.CoreDefaultProperties.<clinit>(CoreDefaultProperties.java:60)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:274)
  at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1847)
  at org.apache.hadoop.conf.Configuration.getProperties(Configuration.java:2062)
  at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2272)
  at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2224)
  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2141)
  at org.apache.hadoop.conf.Configuration.set(Configuration.java:992)
  at org.apache.hadoop.conf.Configuration.set(Configuration.java:966)
  at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:98)
  at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:43)

Re: Spark 1.3.1 On Mesos Issues.

2015-06-01 Thread Dean Wampler
It would be nice to see the code for the MapR FS Java API, but my google foo
failed me (assuming it's open source)...

So, shooting in the dark ;) there are a few things I would check, if you
haven't already:

1. Could there be 1.2 versions of some Spark jars that get picked up at run
time (but apparently not in local mode) on one or more nodes? (Side
question: Does your node experiment fail on all nodes?) Put another way,
are the classpaths good for all JVM tasks?
2. Can you use just MapR and Spark 1.3.1 successfully, bypassing Mesos? (A
quick smoke test is sketched below.)
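
For item 2, a smoke test along these lines (spark-shell in local mode; the
maprfs path is a placeholder) exercises the MapR client from Spark 1.3.1 with
Mesos out of the picture:

// Run with: ./bin/spark-shell --master local
// Reads a file straight from MapR-FS; if the ShimLoader is the problem,
// this should fail here too, independent of Mesos.
val lines = sc.textFile("maprfs:///user/someuser/some-file.txt")
println(lines.count())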

Incidentally, how are you combining Mesos and MapR? Are you running Spark
in Mesos, but accessing data in MapR-FS?

Perhaps the MapR shim library doesn't support Spark 1.3.1.

HTH,

dean

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
http://shop.oreilly.com/product/0636920033073.do (O'Reilly)
Typesafe http://typesafe.com
@deanwampler http://twitter.com/deanwampler
http://polyglotprogramming.com

On Mon, Jun 1, 2015 at 2:49 PM, John Omernik j...@omernik.com wrote:

 [...]