Re: looking for a easier way to count the number of items in a JavaDStream

2015-12-16 Thread Bryan Cutler
To follow up with your other issue, if you are just trying to count elements in a DStream, you can do that without an Accumulator. foreachRDD is meant to be an output action; it does not return anything, and it is actually run in the driver program. Because Java (before 8) handles closures a
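Since the callback runs in the driver, per-batch counts can be collected in an ordinary final collection rather than an Accumulator. Below is a minimal plain-Java sketch of that idea, written without Spark so it runs on its own. The names BatchCountSketch, BatchCallback, forEachBatch and countPerBatch are hypothetical stand-ins, not Spark APIs. In real streaming code, the rough equivalent would be calling count() on the JavaDStream (which yields a JavaDStream<Long> with one value per batch) and recording that value inside foreachRDD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BatchCountSketch {

    // Hypothetical stand-in for the function object passed to foreachRDD.
    interface BatchCallback<T> {
        void call(List<T> batch);
    }

    // Hypothetical stand-in for foreachRDD: invokes the callback once per
    // batch, in the calling ("driver") thread, and returns nothing.
    static <T> void forEachBatch(List<List<T>> batches, BatchCallback<T> callback) {
        for (List<T> batch : batches) {
            callback.call(batch);
        }
    }

    // Collects one count per batch without any accumulator.
    static <T> List<Long> countPerBatch(List<List<T>> batches) {
        // Declared final so a pre-Java-8 anonymous class can capture it;
        // synchronized because a real streaming job would call back from
        // a different thread than the test thread.
        final List<Long> counts = Collections.synchronizedList(new ArrayList<Long>());
        forEachBatch(batches, new BatchCallback<T>() {
            @Override
            public void call(List<T> batch) {
                counts.add((long) batch.size());
            }
        });
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("d"),
                Collections.<String>emptyList());
        System.out.println(countPerBatch(batches)); // prints [3, 1, 0]
    }
}
```

A JUnit test could then assert on the collected list once the expected batches have been processed.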

Re: looking for a easier way to count the number of items in a JavaDStream

2015-12-16 Thread Todd Nist
Another possible alternative is to register a StreamingListener and then reference BatchInfo.numRecords; there is a good example here: https://gist.github.com/akhld/b10dc491aad1a2007183. After registering the listener, simply implement the appropriate "onEvent" method, where onEvent is onBatchStarted,

Re: looking for a easier way to count the number of items in a JavaDStream

2015-12-16 Thread Bryan Cutler
Hi Andy, Regarding the foreachRDD return value, this Jira, which will be in 1.6, should take care of that (https://issues.apache.org/jira/browse/SPARK-4557) and make things a little simpler. On Dec 15, 2015 6:55 PM, "Andy Davidson" wrote: > I am writing a JUnit test

looking for a easier way to count the number of items in a JavaDStream

2015-12-15 Thread Andy Davidson
I am writing a JUnit test for some simple streaming code. I want to make assertions about how many things are in a given JavaDStream. I wonder if there is an easier way in Java to get the count? I think there are two points of friction. 1. Is it easy to create an accumulator of type double or