Re: unit testing with spark

2014-02-19 Thread Ameet Kini
AM, Ameet Kini ameetk...@gmail.com wrote: Thanks, that really helps. So that helps me cache the spark context within a suite but not across suites. The closest I could find to caching across suites is extending Suites [1] and adding @DoNotDiscover annotations to the nested suites class

unit testing with spark

2014-02-18 Thread Ameet Kini
I'm writing unit tests with Spark and need some help. I've already read this helpful article: http://blog.quantifind.com/posts/spark-unit-test/ There are a couple differences in my testing environment versus the blog. 1. I'm using FunSpec instead of FunSuite. So my tests look like class

ghost executor messing up UI's stdout/stderr links

2013-12-30 Thread Ameet Kini
I refreshed my Spark version to the master branch as of this morning, and am noticing some strange behavior with executors and the UI reading executor logs while running a job in what used to be standalone mode (is it now called coarse-grained scheduler mode, or still standalone mode?). For

Re: debugging NotSerializableException while using Kryo

2013-12-24 Thread Ameet Kini
line 195, you will at least know what caused the NPE. We can start from there. On Dec 23, 2013, at 10:21 AM, Ameet Kini ameetk...@gmail.com wrote: Thanks Imran. I tried setting spark.closure.serializer to org.apache.spark.serializer.KryoSerializer and now end up seeing

Re: debugging NotSerializableException while using Kryo

2013-12-24 Thread Ameet Kini
use both closure and object serializers while executing. This is, IMO, inconsistent with the assumption that the same data type should be supported uniformly regardless of where it is serialized, but that's the state of things as it stands. On Mon, Dec 23, 2013 at 8:21 AM, Ameet Kini ameetk

Re: debugging NotSerializableException while using Kryo

2013-12-23 Thread Ameet Kini
closures include referenced variables, like your TileIdWritable. So you need to either change that to use Kryo, or make your object serializable with Java serialization. On Fri, Dec 20, 2013 at 2:18 PM, Ameet Kini ameetk...@gmail.com wrote: I'm getting the below NotSerializableException despite using Kryo

Re: debugging NotSerializableException while using Kryo

2013-12-23 Thread Ameet Kini
...@gmail.com wrote: maybe try to implement your class with serializable... 2013/12/23 Ameet Kini ameetk...@gmail.com Thanks Imran. I tried setting spark.closure.serializer to org.apache.spark.serializer.KryoSerializer and now end up seeing NullPointerException when the executor starts up

examples of map-side join of two hadoop sequence files

2013-10-18 Thread Ameet Kini
I've seen discussions where the suggestion is to do a map-side join, but haven't seen an example yet, and can certainly use one. I have two sequence files where the key is unique within each file, so the join is a one-to-one join, and can hence benefit from a map-side join. However both sequence
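
For readers looking for the shape of a map-side join, here is a minimal sketch in plain Python rather than Spark code. It assumes one of the two inputs is small enough to hold in memory as a lookup table, which the original message does not confirm. The function name `map_side_join` and the sample records are hypothetical and used only for illustration. In Spark, the in-memory table would typically be shipped to the workers as a broadcast variable so that the larger input can be joined without a shuffle.

```python
from typing import Dict, Iterable, Iterator, Tuple


def map_side_join(
    small: Iterable[Tuple[int, str]],
    large: Iterable[Tuple[int, str]],
) -> Iterator[Tuple[int, Tuple[str, str]]]:
    """Join two keyed datasets by holding the smaller one in memory.

    Assumption: keys are unique within each input (a one-to-one join),
    as described in the message above.
    """
    # Build the lookup table once from the smaller input.
    lookup: Dict[int, str] = dict(small)
    # Stream the larger input and probe the table; no sorting or shuffling.
    for key, right_value in large:
        if key in lookup:
            yield key, (lookup[key], right_value)


# Hypothetical sample records standing in for the two sequence files.
small_side = [(1, "a"), (2, "b"), (3, "c")]
large_side = [(2, "x"), (3, "y"), (4, "z")]

joined = list(map_side_join(small_side, large_side))
print(joined)
```

Keys present in only one input (1 and 4 here) are dropped, which gives inner-join behavior. If neither input fits in memory, this approach does not apply as written.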