It would be a dream to have an easy-to-use dynamic metric system AND a
reliable counting system (accumulator-like) in Spark...
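The copy-at-registration behavior Sergey describes further down in the thread is easy to simulate in a few lines. This is a language-agnostic sketch of the semantics only, not Spark's actual API; `register_source` and the registry names here are purely illustrative:

```python
# Illustrative sketch (NOT Spark's actual API) of why metrics added to a
# source's registry after registration are never reported: the metrics
# system copies the source's metrics into its shared registry exactly
# once, at registration time.

shared_registry = {}  # stands in for the system's shared MetricRegistry

def register_source(source_registry):
    """Simulates source registration: a one-time snapshot copy."""
    shared_registry.update(source_registry)

source = {"jobs.completed": 1}   # metrics known at registration time
register_source(source)          # snapshot taken here

source["stages.failed"] = 2      # added dynamically, after registration

# Sinks report from the shared registry, so the late metric is invisible.
print(sorted(shared_registry))   # ['jobs.completed']
```

Presumably, any fix along these lines would keep the source's registry and the shared one in sync (or re-read the source's registry at report time) instead of copying once.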

Thanks
Roberto

On Tue, May 7, 2019 at 3:54 AM Saisai Shao <sai.sai.s...@gmail.com> wrote:

> I think the main reason that PR was not merged is that Spark itself
> doesn't have such a requirement, and the metrics system is mainly used by
> Spark itself. Most of the need comes from custom sources/sinks, but
> Spark's MetricsSystem is not designed as a public API.
>
> I think we could revisit or improve that PR if there's a solid reason
> for it.
>
> Thanks
> Saisai
>
> On Tue, May 7, 2019 at 5:49 AM Sergey Zhemzhitsky <szh.s...@gmail.com> wrote:
>
>> Hi Saisai,
>>
>> Thanks a lot for the link! This is exactly what I need.
>> Just curious why this PR has not been merged, as it seems to implement a
>> rather natural requirement.
>>
>> There are a number of use cases which can benefit from this feature, e.g.
>> - collecting business metrics based on the data's attributes and
>> reporting them to the monitoring system as a side effect of the data
>> processing
>> - visualizing technical metrics with alternative software (e.g.
>> Grafana) - currently it's hardly possible to know the actual number of
>> jobs, stages, and tasks, or their names and IDs, in advance in order to
>> register all the corresponding metrics statically.
>>
>>
>> Kind Regards,
>> Sergey
>>
>>
>> On Mon, May 6, 2019, 16:07 Saisai Shao <sai.sai.s...@gmail.com> wrote:
>>
>>> I remember there was a PR about doing a similar thing (
>>> https://github.com/apache/spark/pull/18406). From my understanding,
>>> this seems like quite a specific requirement; it may require code changes
>>> to support your needs.
>>>
>>> Thanks
>>> Saisai
>>>
>>> On Sat, May 4, 2019 at 4:44 PM Sergey Zhemzhitsky <szh.s...@gmail.com> wrote:
>>>
>>>> Hello Spark Users!
>>>>
>>>> Just wondering whether it is possible to register a metric source
>>>> whose metrics are not known in advance and then add the metrics to this
>>>> source later on?
>>>>
>>>> It seems that currently MetricsSystem copies all the metrics from the
>>>> source's MetricRegistry into the shared MetricRegistry of the
>>>> MetricsSystem during metric source registration [1].
>>>>
>>>> So if a new metric with a new name is added to the source's registry
>>>> after the source has been registered, this new metric will not be
>>>> reported to the sinks.
>>>>
>>>> What I'd like to achieve is to be able to register new metrics with new
>>>> names dynamically using a single metric source.
>>>> Is it somehow possible?
>>>>
>>>>
>>>> [1]
>>>> https://github.com/apache/spark/blob/51de86baed0776304c6184f2c04b6303ef48df90/core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala#L162
>>>>
>>>
