It would be a dream to have an easy-to-use dynamic metric system AND a reliable counting system (accumulator-like) in Spark...
Thanks Roberto

On Tue, May 7, 2019 at 3:54 AM Saisai Shao <sai.sai.s...@gmail.com> wrote:

> I think the main reason why that was not merged is that Spark itself
> doesn't have such a requirement, and the metrics system is mainly used by
> Spark itself. Most of the needs come from custom sources/sinks, but
> Spark's MetricsSystem is not designed as a public API.
>
> I think we could revisit or improve that PR if there's a solid reason
> for it.
>
> Thanks
> Saisai
>
> Sergey Zhemzhitsky <szh.s...@gmail.com> 于2019年5月7日周二 上午5:49写道:
>
>> Hi Saisai,
>>
>> Thanks a lot for the link! This is exactly what I need.
>> Just curious why this PR has not been merged, as it seems to implement
>> a rather natural requirement.
>>
>> There are a number of use cases which can benefit from this feature, e.g.
>> - collecting business metrics based on the data's attributes and
>> reporting them into the monitoring system as a side effect of the data
>> processing
>> - visualizing technical metrics by means of alternative software (e.g.
>> Grafana) - currently it's hardly possible to know the actual number of
>> jobs, stages and tasks, or their names and IDs, in advance, so as to
>> register all the corresponding metrics statically.
>>
>> Kind Regards,
>> Sergey
>>
>> On Mon, May 6, 2019, 16:07 Saisai Shao <sai.sai.s...@gmail.com> wrote:
>>
>>> I remember there was a PR about doing a similar thing (
>>> https://github.com/apache/spark/pull/18406). From my understanding,
>>> this seems like a quite specific requirement; it may require code changes
>>> to support your needs.
>>>
>>> Thanks
>>> Saisai
>>>
>>> Sergey Zhemzhitsky <szh.s...@gmail.com> 于2019年5月4日周六 下午4:44写道:
>>>
>>>> Hello Spark Users!
>>>>
>>>> Just wondering whether it is possible to register a metric source
>>>> without its metrics being known in advance, and to add the metrics
>>>> themselves to this source later on?
>>>>
>>>> It seems that currently the MetricsSystem puts all the metrics from the
>>>> source's MetricRegistry into the shared MetricRegistry of the
>>>> MetricsSystem during metric source registration [1].
>>>>
>>>> So if a new metric with a new name is added to the source's registry
>>>> after this source has been registered, the new metric will not be
>>>> reported to the sinks.
>>>>
>>>> What I'd like to achieve is to be able to register new metrics with new
>>>> names dynamically using a single metric source.
>>>> Is that somehow possible?
>>>>
>>>> [1]
>>>> https://github.com/apache/spark/blob/51de86baed0776304c6184f2c04b6303ef48df90/core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala#L162
>>>
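[Editor's note: the snapshot-at-registration behavior discussed in this thread can be sketched with a simplified model. This is plain Scala with no Spark or Dropwizard dependency; the `Registry` class and the `mySource` prefix are made up for illustration and only mimic how MetricsSystem.registerSource copies a source's metrics into the shared registry once, at registration time.]

```scala
import scala.collection.mutable

// Simplified stand-in for a Dropwizard MetricRegistry: metric name -> gauge.
class Registry {
  val metrics = mutable.Map.empty[String, () => Long]
  def register(name: String, gauge: () => Long): Unit = metrics(name) = gauge
}

object SnapshotCopyDemo {
  def main(args: Array[String]): Unit = {
    val sourceRegistry = new Registry // the source's own registry
    val sharedRegistry = new Registry // stand-in for the MetricsSystem's shared registry

    sourceRegistry.register("jobs.completed", () => 42L)

    // Models registration: the source's metrics are copied once, at this point.
    sourceRegistry.metrics.foreach { case (n, g) =>
      sharedRegistry.register(s"mySource.$n", g)
    }

    // A metric added to the source *after* registration...
    sourceRegistry.register("stages.failed", () => 7L)

    // ...never reaches the shared registry, so sinks never report it.
    println(sharedRegistry.metrics.keySet)
  }
}
```

Under this model, only `mySource.jobs.completed` ends up in the shared registry, which matches the behavior described above: the copy happens once, so dynamically added metrics stay invisible to the sinks.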