Nikita,

Looks like all we need now is one simple metric: are operations blocked?
Just a true or false.
Let's start with this.
All other metrics can be extracted from the logs for now and can be
implemented later.
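
For illustration, such a flag exposed over JMX might look roughly like the
sketch below. All names in it (PmeBlockedMetricSketch, PmeStatusMXBean,
setBlocked, the "example.pme" object name) are hypothetical and are not
existing Ignite API; it only uses the standard javax.management classes and
assumes the exchange code flips the flag itself.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class PmeBlockedMetricSketch {
    /** Hypothetical management interface; the MXBean suffix marks it as a JMX MXBean. */
    public interface PmeStatusMXBean {
        /** Exposed to JMX clients as the boolean attribute "OperationsBlocked". */
        boolean isOperationsBlocked();
    }

    /** Holds the flag. The exchange code is assumed to call setBlocked() itself. */
    public static class PmeStatus implements PmeStatusMXBean {
        private final AtomicBoolean blocked = new AtomicBoolean();

        /** Not part of the MXBean interface, so it is not visible over JMX. */
        public void setBlocked(boolean val) {
            blocked.set(val);
        }

        @Override public boolean isOperationsBlocked() {
            return blocked.get();
        }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer srv = ManagementFactory.getPlatformMBeanServer();
        // Hypothetical object name, chosen only for this example.
        ObjectName name = new ObjectName("example.pme:type=PmeStatus");

        PmeStatus status = new PmeStatus();
        srv.registerMBean(status, name);

        status.setBlocked(true);   // e.g. when the exchange starts blocking operations
        System.out.println(srv.getAttribute(name, "OperationsBlocked"));

        status.setBlocked(false);  // e.g. when the exchange completes
        System.out.println(srv.getAttribute(name, "OperationsBlocked"));
    }
}
```

An external tool such as jconsole would then read the single boolean
attribute without parsing any logs.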

On Tue, Jul 16, 2019 at 12:46 PM Nikolay Izhikov <nizhi...@apache.org>
wrote:

> +1.
>
> Nikita, please, go ahead.
>
>
> On Tue, Jul 16, 2019 at 11:45 AM Nikita Amelchev <nsamelc...@gmail.com> wrote:
>
> > Hello, Igniters.
> >
> > I suggest adding some useful metrics about the partition map exchange
> > (PME). For now, the duration of PME stages is available only in log files
> > and cannot be obtained using JMX or other external tools. [1]
> >
> > I made a list of local node metrics that help to understand the
> > actual status of the current PME:
> >
> > 1. initialVersion. Topology version that initiated the exchange.
> > 2. initTime. Time when PME started.
> > 3. initEvent. Event that triggered PME.
> > 4. partitionReleaseTime. Time when a node finished waiting for all
> > updates and transactions on the previous topology.
> > 5. sendSingleMessageTime. Time when a node sent a single message.
> > 6. recieveFullMessageTime. Time when a node received a full message.
> > 7. finishTime. Time when PME ended.
> >
> > When a new PME starts, all these metrics are reset.
> >
> > These metrics help to understand:
> > - how long PME took (current or previous).
> > - how long the wait for all updates to complete took.
> > - which node is blocking PME (has not sent a single message).
> > - what triggered PME.
> >
> > Thoughts?
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-11961
> >
> > --
> > Best wishes,
> > Amelchev Nikita
> >
>
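
For reference, the durations mentioned in the quoted proposal could be
derived from the proposed timestamps roughly as in the sketch below. The class
PmeTimingsSketch, its methods, and the "0 means not reached yet" convention
are assumptions made for this example only; just the field names initTime,
partitionReleaseTime and finishTime come from the proposal.

```java
public class PmeTimingsSketch {
    /** Marker for a stage that has not happened yet (an assumption of this sketch). */
    public static final long NOT_SET = 0L;

    // Timestamps in milliseconds, named after the metrics in the proposal.
    private final long initTime;
    private final long partitionReleaseTime;
    private final long finishTime;

    public PmeTimingsSketch(long initTime, long partitionReleaseTime, long finishTime) {
        this.initTime = initTime;
        this.partitionReleaseTime = partitionReleaseTime;
        this.finishTime = finishTime;
    }

    /** How long PME took; for an exchange still in progress, measured up to 'now'. */
    public long totalDuration(long now) {
        return endOf(finishTime, now) - initTime;
    }

    /** How long the node waited for updates on the previous topology to complete. */
    public long partitionReleaseDuration(long now) {
        return endOf(partitionReleaseTime, now) - initTime;
    }

    /** Uses 'now' as the end point while the stage timestamp is still unset. */
    private static long endOf(long stageTime, long now) {
        return stageTime == NOT_SET ? now : stageTime;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        PmeTimingsSketch finished = new PmeTimingsSketch(1_000L, 1_400L, 2_500L);

        System.out.println("PME took, ms: " + finished.totalDuration(now));
        System.out.println("Waited for updates, ms: " + finished.partitionReleaseDuration(now));
    }
}
```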
