Hi Emre, thanks for the help, will have a look. Cheers!
On Tue, Apr 21, 2015 at 1:46 PM, Emre Sevinc emre.sev...@gmail.com wrote:
Hello James,
Did you check the following resources:
-
https://github.com/apache/spark/tree/master/streaming/src/test/java/org/apache/spark/streaming
-
http://www.slideshare.net/databricks/strata-sj-everyday-im-shuffling-tips-for-writing-better-spark-programs
--
Emre Sevinç
I'm trying to write some unit tests for my Spark code.
I need to pass a JavaPairDStream<String, String> to my Spark class.
Is there a way to create a JavaPairDStream using the Java API?
Also, is there a good resource that covers an approach (or approaches) for
unit testing using Java?
Regards
jk
Thanks for the answers!
On a concrete example, here is what I did to test my (wrong :) ) hypothesis
before writing my email:
class SomethingNotSerializable {
  def process(a: Int): Int = 2 * a
}
object NonSerializableClosure extends App {
  val sc = new spark.SparkContext(
    "local",
+1, at least with current code
just watch the log printed by DAGScheduler…
--
Nan Zhu
On Wednesday, May 14, 2014 at 1:58 PM, Mark Hamstra wrote:
serDe
Hi,
Spark's local mode is great for creating simple unit tests for our Spark
logic. The disadvantage, however, is that certain types of problems are never
exposed in local mode because things never need to be put on the wire.
E.g. if I accidentally use a closure which has something non-serializable
There's an undocumented mode that looks like it simulates a cluster:
SparkContext.scala:
// Regular expression for simulating a Spark cluster of [N, cores, memory] locally
val LOCAL_CLUSTER_REGEX =
  """local-cluster\[\s*([0-9]+)\s*,\s*([0-9]+)\s*,\s*([0-9]+)\s*]""".r
Can you try running your tests
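For reference, the quoted pattern accepts master strings such as local-cluster[2, 1, 512]. Below is a minimal Java sketch of how such a string breaks into its three numbers, using only java.util.regex. The class name LocalClusterMaster and the parse helper are made up for illustration and are not part of Spark. Reading the three groups as workers, cores per worker, and memory per worker is my interpretation of the "[N, cores, memory]" comment.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Spark class.
class LocalClusterMaster {

    // Same pattern as quoted from SparkContext.scala, escaped for a Java string literal.
    static final Pattern LOCAL_CLUSTER_REGEX = Pattern.compile(
        "local-cluster\\[\\s*([0-9]+)\\s*,\\s*([0-9]+)\\s*,\\s*([0-9]+)\\s*]");

    // Returns the three captured numbers (assumed: workers, cores per worker,
    // memory per worker), or empty if the whole string does not match.
    static Optional<int[]> parse(String master) {
        Matcher m = LOCAL_CLUSTER_REGEX.matcher(master);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        });
    }

    public static void main(String[] args) {
        // A local-cluster master string matches and yields its three numbers.
        System.out.println(Arrays.toString(parse("local-cluster[2, 1, 512]").get()));
        // A plain local master string does not match this pattern.
        System.out.println(parse("local[4]").isPresent());
    }
}
```

So a test could pass a string of that shape as the master instead of "local", if the undocumented mode behaves the way the comment suggests.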
Have you actually found this to be true? I have found Spark local mode
to be quite good about blowing up if there is something non-serializable
and so my unit tests have been great for detecting this. I have never
seen something that worked in local mode that didn't work on the cluster
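One way to check this independently of the master setting is to push the closure through plain Java serialization in a unit test. This is a minimal sketch, assuming the closure would be shipped with Java serialization. SerializableFunction and isJavaSerializable are hypothetical helpers, not Spark APIs, and SomethingNotSerializable mirrors the class from the example earlier in the thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

// Hypothetical test helper, not part of Spark.
class ClosureSerializationCheck {

    // Stand-in for the class in the thread: it does not implement Serializable.
    static class SomethingNotSerializable {
        int process(int a) {
            return 2 * a;
        }
    }

    // A function type that is also Serializable, so lambdas of this type
    // can be written to an ObjectOutputStream.
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    // Tries to serialize the object in memory; returns false if any part of
    // its object graph is not serializable.
    static boolean isJavaSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Closure that captures a non-serializable helper object.
    static SerializableFunction<Integer, Integer> capturing() {
        SomethingNotSerializable helper = new SomethingNotSerializable();
        return a -> helper.process(a);
    }

    // Closure that captures nothing.
    static SerializableFunction<Integer, Integer> selfContained() {
        return a -> 2 * a;
    }

    public static void main(String[] args) {
        System.out.println(isJavaSerializable(selfContained())); // prints "true"
        System.out.println(isJavaSerializable(capturing()));     // prints "false"
    }
}
```

Both closures compute the same result when called directly, which is why a check like this only fails once the closure actually has to be written out.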