Re: [Discuss] Metrics Support for DS V2

2020-01-20 Thread Ryan Blue
I sent them to you. I had to go direct because the ASF mailing list will remove attachments. I'm happy to send them to others if needed as well. On Sun, Jan 19, 2020 at 9:01 PM Sandeep Katta <sandeep0102.opensou...@gmail.com> wrote: > Please send me the patch, I will apply and test. > > On

Re: [Discuss] Metrics Support for DS V2

2020-01-19 Thread Sandeep Katta
Please send me the patch, I will apply and test. On Fri, 17 Jan 2020 at 10:33 PM, Ryan Blue wrote: > We've implemented these metrics in the RDD (for input metrics) and in the > v2 DataWritingSparkTask. That approach gives you the same metrics in the > stage views that you get with v1 sources,

Re: [Discuss] Metrics Support for DS V2

2020-01-17 Thread Ryan Blue
We've implemented these metrics in the RDD (for input metrics) and in the v2 DataWritingSparkTask. That approach gives you the same metrics in the stage views that you get with v1 sources, regardless of the v2 implementation. I'm not sure why they weren't included from the start. It looks like

Re: [Discuss] Metrics Support for DS V2

2020-01-17 Thread Wenchen Fan
I think there are a few details we need to discuss. How frequently should a source update its metrics? For example, if a file source needs to report size metrics per row, it'll be super slow. What metrics should a source report? Data size? numFiles? Read time? Shall we show metrics in SQL web UI

[Discuss] Metrics Support for DS V2

2020-01-16 Thread Sandeep Katta
Hi Devs, Currently DS V2 does not update any input metrics. SPARK-30362 aims to solve this problem. We can take the approach below. Have a marker interface, let's say "ReportMetrics". If the DataSource implements this interface, then it will be easy to collect the metrics. For e.g.
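
As an illustration only of the opt-in interface idea described in this message, here is a minimal, self-contained Java sketch. It is not the example from the original message, and it does not use any Spark API. Only the name `ReportMetrics` comes from the thread; its methods (`recordsRead`, `bytesRead`), the `Reader` interface, `InMemoryReader`, and `drainAndCollect` are hypothetical names invented for this sketch. The reader keeps its counters locally and the caller reads them once after the scan, which is one possible way to avoid the per-row reporting cost raised elsewhere in the thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

public class MetricsSketch {
    /** Hypothetical opt-in interface. The name is from the thread; the methods are assumptions. */
    public interface ReportMetrics {
        long recordsRead();
        long bytesRead();
    }

    /** Hypothetical stand-in for a source's reader; this is not Spark's reader API. */
    public interface Reader {
        boolean next();
        String get();
    }

    /** A toy reader that opts in to metrics by implementing ReportMetrics. */
    public static final class InMemoryReader implements Reader, ReportMetrics {
        private final Iterator<String> rows;
        private String current;
        private long records;
        private long bytes;

        public InMemoryReader(List<String> rows) {
            this.rows = rows.iterator();
        }

        @Override
        public boolean next() {
            if (!rows.hasNext()) {
                return false;
            }
            current = rows.next();
            // Counters are updated locally; nothing is pushed to the engine per row.
            records++;
            bytes += current.getBytes(StandardCharsets.UTF_8).length;
            return true;
        }

        @Override public String get() { return current; }
        @Override public long recordsRead() { return records; }
        @Override public long bytesRead() { return bytes; }
    }

    /**
     * Consumes the reader, then collects {records, bytes} once at the end,
     * but only if the reader opted in by implementing ReportMetrics.
     */
    public static Optional<long[]> drainAndCollect(Reader reader) {
        while (reader.next()) {
            reader.get();
        }
        if (reader instanceof ReportMetrics) {
            ReportMetrics m = (ReportMetrics) reader;
            return Optional.of(new long[] { m.recordsRead(), m.bytesRead() });
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<long[]> metrics =
            drainAndCollect(new InMemoryReader(Arrays.asList("a", "bb", "ccc")));
        metrics.ifPresent(m ->
            System.out.println("records=" + m[0] + " bytes=" + m[1]));
        // prints "records=3 bytes=6"
    }
}
```

A reader that does not implement `ReportMetrics` yields an empty result here, so existing sources would keep working unchanged under this assumed design.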