Thank you Ted!

On Wed, Feb 17, 2016 at 2:12 PM Ted Yu <yuzhih...@gmail.com> wrote:

> If the accumulators are all updated in the same pass over the RDD,
> calling foreach() once should perform better. Each foreach() is a
> separate action, so three calls mean three traversals of the RDD.
>
> > On Feb 17, 2016, at 4:30 PM, Daniel Imberman <daniel.imber...@gmail.com>
> wrote:
> >
> > Hi all,
> >
> > I'm currently working out how best to update three separate
> > accumulators:
> >
> > val a:Accumulator
> > val b:Accumulator
> > val c:Accumulator
> >
> > I have an r: RDD[Thing], and the code currently reads:
> >
> > r.foreach { thing =>
> >   a += thing
> >   b += thing
> >   c += thing
> > }
> >
> >
> > Ideally, I would much prefer to split this up so that I can separate
> > concerns. I'm considering creating something along the lines of:
> >
> > def handleA(a: Accumulator, r: RDD[Thing]): Unit = {
> >   // a's logic
> >   r.foreach { thing => a += thing }
> > }
> >
> >
> > def handleB(b: Accumulator, r: RDD[Thing]): Unit = {
> >   // b's logic
> >   r.foreach { thing => b += thing }
> > }
> >
> > and so on. However, I'm worried that this would cause a performance
> > hit. Does anyone have any thoughts on whether this would be a bad idea?
> >
> > Thank you!
> >
> >
> >
> > --
> > View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Running-multiple-foreach-loops-tp26256.html
> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
> >
> >
>
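
One way to keep the concerns separate without running several foreach()
calls is to give each accumulator its own small handler and apply all of
them inside a single foreach(). The sketch below is only an illustration
under some assumptions: it uses the LongAccumulator API from Spark 2.0 and
later instead of the Accumulator class in the original message, the fields
of Thing are hypothetical, and the logic inside each handler is made up for
the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.util.LongAccumulator

// Hypothetical record type standing in for the thread's Thing.
case class Thing(size: Long, valid: Boolean)

object SinglePassAccumulators {
  // One factory per accumulator keeps each concern in its own place.
  // The logic inside each handler is illustrative only.
  def handleA(a: LongAccumulator): Thing => Unit = _ => a.add(1L)                 // count records
  def handleB(b: LongAccumulator): Thing => Unit = t => b.add(t.size)             // sum sizes
  def handleC(c: LongAccumulator): Thing => Unit = t => if (!t.valid) c.add(1L)   // count invalid

  def run(spark: SparkSession, things: Seq[Thing]): (Long, Long, Long) = {
    val sc = spark.sparkContext
    val a = sc.longAccumulator("a")
    val b = sc.longAccumulator("b")
    val c = sc.longAccumulator("c")

    // The handlers are defined separately but collected into one list.
    val handlers: List[Thing => Unit] = List(handleA(a), handleB(b), handleC(c))

    val r = sc.parallelize(things)

    // A single action: every handler sees each element in the same pass.
    r.foreach(thing => handlers.foreach(h => h(thing)))

    (a.sum, b.sum, c.sum)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("single-pass-accumulators")
      .getOrCreate()
    try {
      val sample = Seq(Thing(3L, valid = true), Thing(5L, valid = false), Thing(7L, valid = true))
      val (a, b, c) = run(spark, sample)
      println(s"a=$a b=$b c=$c")
    } finally {
      spark.stop()
    }
  }
}
```

If separate foreach() calls per handler are still preferred, calling
r.cache() first avoids recomputing the RDD's lineage for each one, although
every call remains its own job.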
