Hi Kira,

I'm having some trouble understanding your question. Could you please give
a code example?

From what I think you're asking, there are two issues with what you're
looking to do. (Please keep in mind I could be totally wrong on both of
these assumptions, but this is what I've been led to believe.)

1. The contract of an accumulator is that you can't reliably read its value
while the job is running, because the values in the accumulator don't
actually mean anything until they have been reduced back to the driver. If
you're looking for progress in a local context, you could use mapPartitions
and keep a local counter per partition (see the first sketch after this
list), but I don't think it's possible to get the actual accumulator value
in the middle of the map job.

2. As far as performing ac2 while ac1 is still running, I'm pretty sure
that's not possible. The way lazy evaluation works in Spark, the
transformations behind each action run as part of that action's job, so the
actions execute serially (the second sketch below illustrates this). Having
it any other way would actually be really bad, because then ac1 could be
changing the data, thereby making ac2's output unpredictable.
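
To make the per-partition idea from point 1 concrete, here's a minimal
sketch (Spark 1.x-era Scala; the data set, partition count, and log
messages are all made up for illustration). Each task keeps an ordinary
local counter that it can read, and here log, while its partition is being
processed:

import org.apache.spark.{SparkConf, SparkContext}

object PartitionProgress {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("partition-progress").setMaster("local[*]"))

    // Hypothetical data set, just to have something to iterate over.
    val rdd = sc.parallelize(1 to 1000000, numSlices = 8)

    val doubled = rdd.mapPartitionsWithIndex { (partId, iter) =>
      // The "local accumulator": a plain counter that lives inside the
      // task, so it can be read while the partition is processed.
      var processed = 0L
      iter.map { x =>
        processed += 1
        if (processed % 100000 == 0)
          println(s"partition $partId: $processed records so far")
        x * 2
      }
    }

    doubled.count() // run the job; progress lines appear in executor logs
    sc.stop()
  }
}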
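And to illustrate the contract from point 2, again just a sketch using the
Spark 1.x accumulator API with assumed names: the driver only sees a
meaningful value once the action has finished.

import org.apache.spark.{SparkConf, SparkContext}

object AccumulatorContract {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("accumulator-contract").setMaster("local[*]"))

    val processed = sc.accumulator(0L, "processed") // Spark 1.x API

    // Lazy transformation: nothing executes yet, so the accumulator stays 0.
    val doubled = sc.parallelize(1 to 100).map { x => processed += 1L; x * 2 }
    println(s"before the action: ${processed.value}") // prints 0

    doubled.collect() // the action (our "ac1") actually runs the map

    // Only now is the value well-defined on the driver.
    println(s"after the action: ${processed.value}") // prints 100
    sc.stop()
  }
}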

That being said, with a more specific example it might be possible to help
figure out a solution that accomplishes what you are trying to do.

On Wed, Jan 13, 2016 at 5:43 AM Kira <mennou...@gmail.com> wrote:

> Hi,
>
> So I have an action on one RDD that is relatively long; let's call it ac1.
> What I want to do is execute another action (ac2) on the same RDD to watch
> the evolution of the first one (ac1). To that end I want to use an
> accumulator and read its value progressively, to see the changes on it (on
> the fly) while ac1 is still running. My problem is that the accumulator is
> only updated once ac1 has finished, which is not helpful for me :/
>
> I've seen here
> <http://apache-spark-user-list.1001560.n3.nabble.com/Asynchronous-Broadcast-from-driver-to-workers-is-it-possible-td15758.html>
> what may seem like a solution for me, but it doesn't work: "While Spark
> already offers support for asynchronous reduce (collect data from workers,
> while not interrupting execution of a parallel transformation) through
> accumulator"
>
> Another post suggested using a SparkListener to do that.
>
> Are these solutions correct? If yes, could you give me a simple example?
> Are there other solutions?
>
> Thank you.
> Regards