TaskMetrics.get("f1-time")

However, I don't think this would be possible with the named accumulators
-- I believe they'd need to be passed to every function that needs them,
which I think would be cumbersome in any application of reasonable
complexity.

This is what I was trying to solve with my proposal for dynamic variables
in Spark. However, the ability to retrieve named ...

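As a minimal sketch of the kind of access I mean, using Scala's
scala.util.DynamicVariable (the object name and the Map payload here are
placeholders, not a concrete API proposal):

    import scala.util.DynamicVariable

    // Placeholder sketch: "TaskMetrics" here stands in for the proposed
    // dynamic variable, not Spark's internal TaskMetrics class.
    object TaskMetrics {
      private val current = new DynamicVariable[Map[String, Long]](Map.empty)

      // Make `metrics` visible to any code running on this thread
      // for the duration of `body`.
      def withMetrics[T](metrics: Map[String, Long])(body: => T): T =
        current.withValue(metrics)(body)

      // Static access from anywhere inside the dynamic scope -- no need
      // to pass the metrics object through every function signature.
      def get(name: String): Option[Long] = current.value.get(name)
    }

With something like this, deeply nested code could call
TaskMetrics.get("f1-time") without every intermediate function taking the
metrics as a parameter.
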
Hi Patrick.
That looks very useful. The thing that seems to be missing from Shivaram's
example is the ability to access TaskMetrics statically (this is the same
problem that I am trying to solve with dynamic variables).
You mention defining an accumulator on the RDD. Perhaps I am missing ...

Shivaram,
You should take a look at this patch which adds support for naming
accumulators - this is likely to get merged in soon. I actually
started this patch by supporting named TaskMetrics similar to what you
have there, but then I realized there is too much semantic overlap
with accumulators, ...

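For anyone following along, usage with that patch would presumably look
roughly like this (sketch only; it assumes the SparkContext.accumulator
overload that takes a display name, plus the SparkContext._ import for the
implicit AccumulatorParam):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._  // implicit AccumulatorParam instances

    val sc = new SparkContext(new SparkConf().setAppName("metrics-demo"))

    // Sketch: the second argument names the accumulator so it can be
    // surfaced in the web UI alongside other task metrics.
    val f1Time = sc.accumulator(0L, "f1-time")

    sc.parallelize(1 to 100).foreach { i =>
      val start = System.nanoTime()
      // ... real work here ...
      f1Time += (System.nanoTime() - start)
    }
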
Hi Christopher
Thanks for your reply. I'll try and address your points -- please let me
know if I missed anything.
Regarding clarifying the problem statement, let me try and do that with a
real-world example. I have a method that I want to measure the performance
of, which has the following signature ...

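(Since the original signature is cut off here, the following is a
hypothetical stand-in that shows the shape of the problem: the accumulator
has to appear in the signature of every function on the call path.)

    import org.apache.spark.Accumulator
    import org.apache.spark.rdd.RDD

    // Hypothetical example only: the accumulator must be threaded through
    // the signature just so the method body can record its timing.
    def computeF1(reads: RDD[String], f1Time: Accumulator[Long]): Double = {
      val start = System.nanoTime()
      val result = reads.count().toDouble  // stand-in for the real work
      f1Time += (System.nanoTime() - start)
      result
    }
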
From reading Neil's first e-mail, I think the motivation is to get some
metrics in ADAM? -- I've run into a similar use-case with having
user-defined metrics in long-running tasks and I think a nice way to solve
this would be to have user-defined TaskMetrics.
To state my problem more clearly, ...

Hi Reynold
Thanks for your reply.
Accumulators are, of course, stored in the Accumulators object as
thread-local variables. However, the Accumulators object isn't public, so
when a Task is executing there's no way to get the set of accumulators for
the current thread -- accumulators still have to ...

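To illustrate (this is not Spark's actual code, just a sketch of the kind
of public, static accessor over thread-local state that is missing):

    // Illustrative only: a public registry over thread-local state,
    // roughly the capability the private Accumulators object keeps hidden.
    object AccumulatorRegistry {
      private val local = new ThreadLocal[Map[String, Any]] {
        override def initialValue(): Map[String, Any] = Map.empty
      }
      def register(name: String, acc: Any): Unit =
        local.set(local.get() + (name -> acc))
      def get(name: String): Option[Any] = local.get().get(name)
    }
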
Thanks for the thoughtful email, Neil and Christopher.
If I understand this correctly, it seems like the dynamic variable is just
a variant of the accumulator (a static one since it is a global object).
Accumulators are already implemented using thread-local variables under the
hood. Am I misunderstanding ...

Hi Neil, first off, I'm generally a sympathetic advocate for making changes
to Spark internals to make it easier/better/faster/more awesome.
In this case, I'm (a) not clear about what you're trying to accomplish, and
(b) a bit worried about the proposed solution.
On (a): it is stated that you want ...

Hi all
I have been adding some metrics to the ADAM project
https://github.com/bigdatagenomics/adam, which runs on Spark, and have a
proposal for an enhancement to Spark that would make this work cleaner and
easier.
I need to pass some Accumulators around, which will aggregate metrics
(timing stats, ...)

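A minimal sketch of the current pattern (names and paths are hypothetical):
the driver creates the accumulator, worker-side closures add to it, and the
driver reads the aggregated value after an action.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._  // implicit AccumulatorParam instances

    val sc = new SparkContext(new SparkConf().setAppName("adam-metrics"))
    val parseTime = sc.accumulator(0L)  // aggregates nanoseconds across tasks

    val records = sc.textFile("reads.sam").map { line =>
      val start = System.nanoTime()
      val parsed = line.split("\t")  // stand-in for the real parsing
      parseTime += (System.nanoTime() - start)
      parsed
    }
    records.count()  // an action forces evaluation, populating the accumulator
    println(s"total parse time: ${parseTime.value} ns")
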