RE: RDD.aggregate versus accumulables...

2014-11-17 Thread Segerlind, Nathan L
Thanks for the link to the bug. Unfortunately, using accumulators like this is getting spread around as a recommended practice despite the bug.
From: Daniel Siegmann [mailto:daniel.siegm...@velos.io] Sent: Monday, November 17, 2014 8:32 AM To: Segerlind, Nathan L Cc: user Subject: Re

RDD.aggregate versus accumulables...

2014-11-16 Thread Segerlind, Nathan L
Hi All. I am trying to get my head around why using accumulators and accumulables seems to be the most recommended method for accumulating running sums, averages, variances and the like, whereas the aggregate method seems to me to be the right one. I have no performance measurements as of yet,
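
For readers unfamiliar with the aggregate pattern the message refers to, here is a minimal sketch in plain Python, not taken from the thread. It mimics the shape of Spark's `RDD.aggregate(zeroValue, seqOp, combOp)` without needing a cluster. The `partitions` list, the `aggregate` helper, and the sample numbers are all hypothetical stand-ins. The running state is a `(count, sum, sum of squares)` triple, from which a mean and a population variance can be derived.

```python
from functools import reduce

# Hypothetical stand-in for an RDD: a list of partitions, each a list of numbers.
partitions = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]

def seq_op(acc, x):
    """Fold one element into a partition-local (count, sum, sum_sq) triple."""
    n, s, sq = acc
    return (n + 1, s + x, sq + x * x)

def comb_op(a, b):
    """Merge the triples produced by two partitions."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def aggregate(parts, zero, seq_op, comb_op):
    """Local imitation of aggregate: fold each partition, then merge the results."""
    per_partition = [reduce(seq_op, part, zero) for part in parts]
    return reduce(comb_op, per_partition, zero)

n, s, sq = aggregate(partitions, (0, 0.0, 0.0), seq_op, comb_op)
mean = s / n
variance = sq / n - mean * mean  # population variance, E[x^2] - E[x]^2
```

With PySpark, the same two functions could presumably be passed straight to `rdd.aggregate((0, 0.0, 0.0), seq_op, comb_op)`. The result comes back as the return value of an action rather than as a side effect on a shared variable, which is the property the message appears to be arguing for. The sum-of-squares formula is used here for brevity; it can lose precision on large or badly scaled data.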

is it possible to initiate Spark jobs from Oozie?

2014-04-09 Thread Segerlind, Nathan L
Howdy. Is it possible to initiate Spark jobs from Oozie (presumably as a Java action)? If so, are there known limitations to this? And would anybody have a pointer to an example? Thanks, Nate