Re: Dynamic metric names

2019-05-07 Thread Roberto Coluccio
It would be a dream to have an easy-to-use dynamic metric system AND a
reliable counting system (accumulator-like) in Spark...

Thanks
Roberto

On Tue, May 7, 2019 at 3:54 AM Saisai Shao wrote:

> I think the main reason it was not merged is that Spark itself doesn't
> have such a requirement, and the metrics system is mainly used by Spark
> itself. Most of the need comes from custom sources/sinks, but Spark's
> MetricsSystem is not designed as a public API.
>
> I think we could revisit or improve that PR if there's a solid reason
> for it.
>
> Thanks
> Saisai
>
> Sergey Zhemzhitsky wrote on Tue, May 7, 2019 at 5:49 AM:
>
>> Hi Saisai,
>>
>> Thanks a lot for the link! This is exactly what I need.
>> Just curious, why has this PR not been merged? It seems to implement a
>> rather natural requirement.
>>
>> There are a number of use cases which can benefit from this feature, e.g.
>> - collecting business metrics based on the data's attributes and
>> reporting them to the monitoring system as a side effect of the data
>> processing
>> - visualizing technical metrics by means of alternative software (e.g.
>> Grafana) - currently it's hardly possible to know the actual number of
>> jobs, stages, tasks and their names and IDs in advance, so the
>> corresponding metrics cannot all be registered statically.
>>
>>
>> Kind Regards,
>> Sergey
>>
>>
>> On Mon, May 6, 2019, 16:07 Saisai Shao wrote:
>>
>>> I remember there was a PR doing a similar thing (
>>> https://github.com/apache/spark/pull/18406). From my understanding,
>>> this seems like quite a specific requirement; it may require code
>>> changes to support your needs.
>>>
>>> Thanks
>>> Saisai
>>>
>>> Sergey Zhemzhitsky wrote on Sat, May 4, 2019 at 4:44 PM:
>>>
>>>> Hello Spark Users!
>>>>
>>>> Just wondering whether it is possible to register a metric source
>>>> without metrics known in advance and add the metrics themselves to this
>>>> source later on?
>>>>
>>>> It seems that MetricsSystem currently puts all the metrics from the
>>>> source's MetricRegistry into the shared MetricRegistry of the
>>>> MetricsSystem during metric source registration [1].
>>>>
>>>> So if a new metric with a new name is added to the source's registry
>>>> after the source has been registered, this new metric will not be
>>>> reported to the sinks.
>>>>
>>>> What I'd like to achieve is to be able to register new metrics with new
>>>> names dynamically using a single metric source.
>>>> Is it somehow possible?
>>>>
>>>>
>>>> [1]
>>>> https://github.com/apache/spark/blob/51de86baed0776304c6184f2c04b6303ef48df90/core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala#L162
>>>>
>>>


Re: Dynamic metric names

2019-05-06 Thread Sergey Zhemzhitsky
Hi Saisai,

Thanks a lot for the link! This is exactly what I need.
Just curious, why has this PR not been merged? It seems to implement a
rather natural requirement.

There are a number of use cases which can benefit from this feature, e.g.
- collecting business metrics based on the data's attributes and reporting
them to the monitoring system as a side effect of the data processing
- visualizing technical metrics by means of alternative software (e.g.
Grafana) - currently it's hardly possible to know the actual number of
jobs, stages, tasks and their names and IDs in advance, so the
corresponding metrics cannot all be registered statically.
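
As a rough illustration of the first use case, metric names can be derived
from the data itself, so they are not known before processing starts. The
sketch below is plain Python and uses no Spark or Dropwizard API; the record
layout, the "orders.by_<attribute>.<value>" naming scheme and the
count_by_attribute helper are all hypothetical.

```python
from collections import Counter


def count_by_attribute(records, attribute):
    """Count records under a metric name built from an attribute value.

    Hypothetical helper: the metric name is only known once the record
    has been seen, which is why it cannot be registered up front.
    """
    counts = Counter()
    for record in records:
        metric_name = f"orders.by_{attribute}.{record[attribute]}"
        counts[metric_name] += 1
    return counts


# Made-up sample data for illustration only.
records = [{"country": "DE"}, {"country": "FR"}, {"country": "DE"}]
counts = count_by_attribute(records, "country")

for name in sorted(counts):
    print(name, counts[name])
```

Each distinct attribute value produces a new metric name, so the set of
names grows with the data rather than being fixed at registration time.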


Kind Regards,
Sergey


On Mon, May 6, 2019, 16:07 Saisai Shao wrote:

> I remember there was a PR doing a similar thing (
> https://github.com/apache/spark/pull/18406). From my understanding, this
> seems like quite a specific requirement; it may require code changes to
> support your needs.
>
> Thanks
> Saisai
>
> Sergey Zhemzhitsky wrote on Sat, May 4, 2019 at 4:44 PM:
>
>> Hello Spark Users!
>>
>> Just wondering whether it is possible to register a metric source without
>> metrics known in advance and add the metrics themselves to this source
>> later on?
>>
>> It seems that MetricsSystem currently puts all the metrics from the
>> source's MetricRegistry into the shared MetricRegistry of the
>> MetricsSystem during metric source registration [1].
>>
>> So if a new metric with a new name is added to the source's registry
>> after the source has been registered, this new metric will not be
>> reported to the sinks.
>>
>> What I'd like to achieve is to be able to register new metrics with new
>> names dynamically using a single metric source.
>> Is it somehow possible?
>>
>>
>> [1]
>> https://github.com/apache/spark/blob/51de86baed0776304c6184f2c04b6303ef48df90/core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala#L162
>>
>


Re: Dynamic metric names

2019-05-06 Thread Saisai Shao
I remember there was a PR doing a similar thing (
https://github.com/apache/spark/pull/18406). From my understanding, this
seems like quite a specific requirement; it may require code changes to
support your needs.

Thanks
Saisai

Sergey Zhemzhitsky wrote on Sat, May 4, 2019 at 4:44 PM:

> Hello Spark Users!
>
> Just wondering whether it is possible to register a metric source without
> metrics known in advance and add the metrics themselves to this source
> later on?
>
> It seems that MetricsSystem currently puts all the metrics from the
> source's MetricRegistry into the shared MetricRegistry of the
> MetricsSystem during metric source registration [1].
>
> So if a new metric with a new name is added to the source's registry
> after the source has been registered, this new metric will not be
> reported to the sinks.
>
> What I'd like to achieve is to be able to register new metrics with new
> names dynamically using a single metric source.
> Is it somehow possible?
>
>
> [1]
> https://github.com/apache/spark/blob/51de86baed0776304c6184f2c04b6303ef48df90/core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala#L162
>


Dynamic metric names

2019-05-04 Thread Sergey Zhemzhitsky
Hello Spark Users!

Just wondering whether it is possible to register a metric source without
metrics known in advance and add the metrics themselves to this source
later on?

It seems that MetricsSystem currently puts all the metrics from the
source's MetricRegistry into the shared MetricRegistry of the MetricsSystem
during metric source registration [1].

So if a new metric with a new name is added to the source's registry after
the source has been registered, this new metric will not be reported to the
sinks.

What I'd like to achieve is to be able to register new metrics with new
names dynamically using a single metric source.
Is it somehow possible?


[1]
https://github.com/apache/spark/blob/51de86baed0776304c6184f2c04b6303ef48df90/core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala#L162
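
The behaviour described above can be reproduced with a small toy model. This
is a minimal sketch in plain Python, assuming that registration copies the
source's metrics once and keeps no link to the source afterwards, as the
message describes. ToyRegistry, ToyMetricsSystem and the metric names are
hypothetical stand-ins, not Spark's or Dropwizard's actual classes.

```python
class ToyRegistry:
    """Toy stand-in for a metric registry: maps metric name to a value."""

    def __init__(self):
        self.metrics = {}

    def register(self, name, value):
        self.metrics[name] = value


class ToyMetricsSystem:
    """Toy stand-in for a metrics system that owns one shared registry."""

    def __init__(self):
        self.shared = ToyRegistry()

    def register_source(self, source_name, source_registry):
        # Copy whatever the source holds right now into the shared registry.
        # Nothing connects the two registries after this loop finishes.
        for name, value in source_registry.metrics.items():
            self.shared.register(f"{source_name}.{name}", value)

    def report(self):
        # A sink only ever sees the shared registry.
        return sorted(self.shared.metrics)


source = ToyRegistry()
source.register("records.read", 10)

system = ToyMetricsSystem()
system.register_source("mySource", source)

# Added after registration: present in the source, absent from the shared
# registry, so it never reaches the sinks.
source.register("records.rejected", 3)

reported = system.report()
print(reported)  # ['mySource.records.read']
```

In this model the only way to get "records.rejected" reported is to change
register_source itself, which matches the question of whether a single
source can expose new metric names after it has been registered.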