Re: debug jsonRDD problem?

2015-05-28 Thread Michael Stone
On Wed, May 27, 2015 at 02:06:16PM -0700, Ted Yu wrote: Looks like the exception was caused by resolved.get(prefix ++ a) returning None: a = StructField(a.head, resolved.get(prefix ++ a).get, nullable = true). There are three occurrences of resolved.get() in createSchema() - None should

debug jsonRDD problem?

2015-05-27 Thread Michael Stone
Can anyone provide some suggestions on how to debug this? Using spark 1.3.1. The json itself seems to be valid (other programs can parse it) and the problem seems to lie in jsonRDD trying to infer a schema. scala> sqlContext.jsonRDD(rdd).count() java.util.NoSuchElementException:

Re: debug jsonRDD problem?

2015-05-27 Thread Michael Stone
On Wed, May 27, 2015 at 01:13:43PM -0700, Ted Yu wrote: Can you tell us a bit more about (schema of) your JSON ? It's fairly simple, consisting of 22 fields with values that are mostly strings or integers, except that some of the fields are objects with http header/value pairs. I'd guess it's
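
One way to sidestep the inference failure is to hand jsonRDD an explicit schema, which Spark 1.3's SQLContext accepts as a second argument. A sketch (the field names and types here are hypothetical, standing in for the 22 fields described above):

```
scala> import org.apache.spark.sql.types._
scala> val schema = StructType(Seq(
     |   StructField("id", LongType, nullable = true),
     |   StructField("headers", MapType(StringType, StringType), nullable = true)))
scala> sqlContext.jsonRDD(rdd, schema).count()
```

Supplying the schema skips the sampling/resolution step where the NoSuchElementException is being thrown, which can also help confirm that schema inference (rather than parsing) is the culprit.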

dynamicAllocation spark-shell

2015-04-23 Thread Michael Stone
If I enable dynamicAllocation and then use spark-shell or pyspark, things start out working as expected: running simple commands causes new executors to start and complete tasks. If the shell is left idle for a while, executors start getting killed off: 15/04/23 10:52:43 INFO
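
The idle-kill behavior described above is by design: dynamic allocation releases executors that sit idle past a timeout. A sketch of a spark-shell invocation (assuming Spark 1.3 on YARN, where the timeout property takes seconds; the value shown is illustrative):

```
spark-shell --master yarn \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.executorIdleTimeout=600
```

Raising spark.dynamicAllocation.executorIdleTimeout keeps executors alive longer through idle stretches in an interactive shell; new executors are still requested when tasks queue up again.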

Re: spark.dynamicAllocation.minExecutors

2015-04-16 Thread Michael Stone
On Thu, Apr 16, 2015 at 12:16:13PM -0700, Marcelo Vanzin wrote: I think Michael is referring to this: Exception in thread main java.lang.IllegalArgumentException: You must specify at least 1 executor! Usage: org.apache.spark.deploy.yarn.Client [options] Yes, sorry, there were too many mins

Re: spark.dynamicAllocation.minExecutors

2015-04-16 Thread Michael Stone
On Thu, Apr 16, 2015 at 07:47:51PM +0100, Sean Owen wrote: IIRC that was fixed already in 1.3 https://github.com/apache/spark/commit/b2047b55c5fc85de6b63276d8ab9610d2496e08b From that commit: + private val minNumExecutors = conf.getInt("spark.dynamicAllocation.minExecutors", 0) ... + if

Re: spark.dynamicAllocation.minExecutors

2015-04-16 Thread Michael Stone
On Thu, Apr 16, 2015 at 08:10:54PM +0100, Sean Owen wrote: Yes, look what it was before -- would also reject a minimum of 0. That's the case you are hitting. 0 is a fine minimum. How can 0 be a fine minimum if it's rejected? Changing the value is easy enough, but in general it's nice for

spark.dynamicAllocation.minExecutors

2015-04-16 Thread Michael Stone
The default for spark.dynamicAllocation.minExecutors is 0, but that value causes a runtime error and a message that the minimum is 1. Perhaps the default should be changed to 1? Mike Stone
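
Until the default is fixed, the error can be avoided by setting the minimum explicitly. A config fragment (property names are those discussed in the thread):

```
# spark-defaults.conf
spark.dynamicAllocation.enabled       true
spark.dynamicAllocation.minExecutors  1
```

Equivalently, pass --conf spark.dynamicAllocation.minExecutors=1 on the spark-submit or spark-shell command line.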

Re: HDP 2.2 AM abort : Unable to find ExecutorLauncher class

2015-03-28 Thread Michael Stone
I've also been having trouble running 1.3.0 on HDP. The spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041 configuration directive seems to work with pyspark, but does not propagate when using spark-shell. (That is, everything works fine with pyspark, and spark-shell fails with the bad
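
A workaround that is commonly suggested for HDP is to set the version property for both the driver and the YARN AM in spark-defaults.conf, so it applies however the shell is launched (a sketch; the hdp.version value is the one from this thread):

```
# spark-defaults.conf
spark.driver.extraJavaOptions   -Dhdp.version=2.2.0.0-2041
spark.yarn.am.extraJavaOptions  -Dhdp.version=2.2.0.0-2041
```

In yarn-client mode the AM options come from spark.yarn.am.extraJavaOptions, while in yarn-cluster mode the driver runs inside the AM and reads spark.driver.extraJavaOptions, so setting both covers both launch paths.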