Please send me the patch; I will apply and test it.

On Fri, 17 Jan 2020 at 10:33 PM, Ryan Blue <rb...@netflix.com> wrote:

> We've implemented these metrics in the RDD (for input metrics) and in the
> v2 DataWritingSparkTask. That approach gives you the same metrics in the
> stage views that you get with v1 sources, regardless of the v2
> implementation.
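>
> A minimal sketch of the idea in Scala, assuming Spark's internal
> InputMetrics API (taskMetrics and incRecordsRead are private[spark],
> so this only compiles inside Spark itself, e.g. in the v2 read RDD;
> rowIterator stands in for the source's row iterator):
>
>   import org.apache.spark.TaskContext
>
>   // Wrap the v2 row iterator and bump the task-level input metrics
>   // as rows are consumed; these counters feed the stage view.
>   val inputMetrics = TaskContext.get().taskMetrics().inputMetrics
>   val metered = rowIterator.map { row =>
>     inputMetrics.incRecordsRead(1)  // bytes would be added similarly
>     row
>   }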
>
> I'm not sure why they weren't included from the start. It looks like the
> way metrics are collected is changing: there are a couple of metrics for
> the number of rows; one appears to go to the Spark SQL tab and one is
> used for the stages view.
>
> If you'd like, I can send you a patch.
>
> rb
>
> On Fri, Jan 17, 2020 at 5:09 AM Wenchen Fan <cloud0...@gmail.com> wrote:
>
>> I think there are a few details we need to discuss.
>>
>> How frequently should a source update its metrics? For example, if the
>> file source reports size metrics per row, it will be very slow (a
>> batching sketch follows these questions).
>>
>> What metrics should a source report? Data size? numFiles? Read time?
>>
>> Shall we show metrics in the SQL web UI as well?
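>>
>> On the frequency question, a hedged sketch of one option in Scala:
>> batch the updates so a row-level source doesn't pay a reporting cost
>> on every record (the 1000-row interval and the report callback are
>> illustrative assumptions, not an existing Spark API):
>>
>>   // Accumulates sizes locally and only invokes the (assumed)
>>   // reporting hook once per `interval` rows.
>>   class BatchedMetrics(report: Long => Unit, interval: Int = 1000) {
>>     private var pendingBytes = 0L
>>     private var pendingRows = 0
>>
>>     def onRow(rowBytes: Long): Unit = {
>>       pendingBytes += rowBytes
>>       pendingRows += 1
>>       if (pendingRows >= interval) flush()
>>     }
>>
>>     // Call once more at end-of-task so the tail is not lost.
>>     def flush(): Unit = {
>>       report(pendingBytes)
>>       pendingBytes = 0L
>>       pendingRows = 0
>>     }
>>   }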
>>
>> On Fri, Jan 17, 2020 at 3:07 PM Sandeep Katta <
>> sandeep0102.opensou...@gmail.com> wrote:
>>
>>> Hi Devs,
>>>
>>> Currently DS V2 does not update any input metrics. SPARK-30362 aims to
>>> solve this problem.
>>>
>>> We could take the following approach: introduce a marker interface,
>>> say "ReportMetrics".
>>>
>>> If a DataSource implements this interface, it becomes easy to collect
>>> its metrics.
>>>
>>> For example, FilePartitionReaderFactory could support metrics by
>>> implementing ReportMetrics, as in the sketch below.
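>>>
>>> A rough Scala sketch of what the marker interface might look like
>>> (all names here are hypothetical, not an existing Spark API):
>>>
>>>   // Hypothetical opt-in trait: a source that implements it exposes
>>>   // its metrics so Spark can pull them into the stage/SQL views.
>>>   trait ReportMetrics {
>>>     def currentMetrics(): Map[String, Long]
>>>   }
>>>
>>>   // E.g. a file-based reader factory could mix it in:
>>>   class MeteredFileReaderFactory extends ReportMetrics {
>>>     private var bytesRead = 0L
>>>     private var recordsRead = 0L
>>>
>>>     override def currentMetrics(): Map[String, Long] =
>>>       Map("bytesRead" -> bytesRead, "recordsRead" -> recordsRead)
>>>   }
>>>
>>>   // Spark could then check for the trait when collecting metrics:
>>>   //   factory match { case r: ReportMetrics => r.currentMetrics() }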
>>>
>>>
>>> Please share your views, or let me know if we should consider a
>>> different solution or design.
>>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
