Hello, Igniters.

Especially, Ignite veterans.

I've prepared PR [1] for the ticket IGNITE-11926 [2].

I found that we don't have any tests for the current GridJobMetrics 
implementation.
So I added basic tests for the current implementation in the PR.

Guys, do we have real-world usages of numbers from these metrics?

Back to my PR: I think we should migrate only a few of the existing 
GridJobProcessor metrics.
Here's why:

1. We shouldn't migrate aggregate metrics - max*, avg*.
Aggregation should be done by the metrics collection system (Prometheus, Graphite, 
etc.).

2. We shouldn't migrate `cpuLoadAvg`.
CPU metrics should be available from separate sources (OS sensors or 
similar).

3. We shouldn't migrate `curidleTime`, `totalIdleTime`.
Idle metrics don't make sense to me.

They can be obtained from a regularly scraped `activeJobs` value.
It also seems they can't be used in the real world. Imagine a 32-CPU server with only 
one active job. 
Idle time will be 0 in this scenario, even though almost all of the CPUs are idle.

4. Execution (waiting) time should be available per job in the job list.
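
To illustrate points 1 and 3, here is a rough sketch (not part of the PR) of how max, average and an approximate idle time could be derived on the collector side from periodically scraped `activeJobs` values. The class `ActiveJobsAggregates`, its `Sample` type and all method names are hypothetical and exist only for this example. A real setup would do this with the query language of Prometheus, Graphite or a similar system.

```java
import java.util.List;

/** Hypothetical helper: derives aggregates from periodically scraped activeJobs values. */
final class ActiveJobsAggregates {
    /** One scrape: timestamp in milliseconds and the activeJobs value seen at that moment. */
    static final class Sample {
        final long tsMillis;
        final long activeJobs;

        Sample(long tsMillis, long activeJobs) {
            this.tsMillis = tsMillis;
            this.activeJobs = activeJobs;
        }
    }

    /** Maximum of the scraped values (0 for an empty list). */
    static long max(List<Sample> samples) {
        long max = 0;

        for (Sample s : samples)
            max = Math.max(max, s.activeJobs);

        return max;
    }

    /** Plain average of the scraped values (0 for an empty list). */
    static double avg(List<Sample> samples) {
        if (samples.isEmpty())
            return 0;

        long sum = 0;

        for (Sample s : samples)
            sum += s.activeJobs;

        return (double)sum / samples.size();
    }

    /** Approximate idle time: sum of the intervals that start with a sample where activeJobs == 0. */
    static long idleMillis(List<Sample> samples) {
        long idle = 0;

        for (int i = 0; i + 1 < samples.size(); i++) {
            if (samples.get(i).activeJobs == 0)
                idle += samples.get(i + 1).tsMillis - samples.get(i).tsMillis;
        }

        return idle;
    }
}
```

The precision of the idle estimate is limited by the scrape interval, which is an assumption of this sketch and not something the PR defines.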

So my PR contains counters for the numbers listed below.
All the code that belongs to the GridJobProcessor becomes deprecated.

Can someone do the review?

```
    /** Number of started jobs. */
    final LongMetricImpl startedJobsMetric;

    /** Number of active jobs currently executing. */
    final LongMetricImpl activeJobsMetric;

    /** Number of currently queued jobs waiting to be executed. */
    final LongMetricImpl waitingJobsMetric;

    /** Number of cancelled jobs that are still running. */
    final LongMetricImpl canceledJobsMetric;

    /** Number of jobs rejected after more recent collision resolution operation. */
    final LongMetricImpl rejectedJobsMetric;

    /** Number of finished jobs. */
    final LongMetricImpl finishedJobsMetric;
```
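
For reviewers who want a mental model of these counters, here is a minimal sketch of how they might move over a job's lifecycle. It is an assumption for illustration, not the code from the PR: `java.util.concurrent.atomic.LongAdder` stands in for `LongMetricImpl`, and the class `JobCountersSketch` with its `on*` callbacks is hypothetical.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for the counters above; LongAdder replaces LongMetricImpl. */
final class JobCountersSketch {
    final LongAdder started = new LongAdder();
    final LongAdder active = new LongAdder();
    final LongAdder waiting = new LongAdder();
    final LongAdder canceled = new LongAdder();
    final LongAdder rejected = new LongAdder();
    final LongAdder finished = new LongAdder();

    /** Job was put into the waiting queue. */
    void onQueued() {
        waiting.increment();
    }

    /** Job started executing, either directly or after leaving the queue. */
    void onStarted(boolean wasQueued) {
        if (wasQueued)
            waiting.decrement();

        started.increment();
        active.increment();
    }

    /** Running job was cancelled but has not stopped yet. */
    void onCanceled() {
        canceled.increment();
    }

    /** Queued job was rejected by collision resolution (assumed to leave the queue). */
    void onRejected() {
        waiting.decrement();
        rejected.increment();
    }

    /** Job finished; a cancelled job stops counting as "cancelled but still running". */
    void onFinished(boolean wasCanceled) {
        active.decrement();

        if (wasCanceled)
            canceled.decrement();

        finished.increment();
    }
}
```

In this sketch `started` and `finished` only grow, while `active`, `waiting` and `canceled` behave like gauges that go up and down.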




[1] https://github.com/apache/ignite/pull/6622
[2] https://issues.apache.org/jira/browse/IGNITE-11926
