Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-11 Thread via GitHub


mlbiscoc merged PR #3453:
URL: https://github.com/apache/solr/pull/3453


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-08 Thread via GitHub


dsmiley commented on code in PR #3453:
URL: https://github.com/apache/solr/pull/3453#discussion_r2263993095


##
solr/core/src/java/org/apache/solr/core/ZkContainer.java:
##
@@ -141,18 +140,82 @@ public void initZooKeeper(final CoreContainer cc, 
CloudConfig config) {
 }
 
 this.zkController = zkController;
-MetricsMap metricsMap = new 
MetricsMap(zkController.getZkClient().getMetrics());
+
 metricProducer =
 new SolrMetricProducer() {
   SolrMetricsContext ctx;
 
-  // TODO SOLR-17458: Migrate to Otel
   @Override
   public void initializeMetrics(
   SolrMetricsContext parentContext, Attributes attributes, 
String scope) {
 ctx = parentContext.getChildContext(this);
-ctx.gauge(
-metricsMap, true, scope, null, 
SolrInfoBean.Category.CONTAINER.toString());
+
+var metricsListener = zkController.getZkClient().getMetrics();
+
+ctx.observableLongCounter(
+"solr_zk_ops",
+"Total number of ZooKeeper operations",
+measurement -> {
+  measurement.record(
+  metricsListener.getReads(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"read").build());
+  measurement.record(
+  metricsListener.getDeletes(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"delete").build());
+  measurement.record(
+  metricsListener.getWrites(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"write").build());
+  measurement.record(
+  metricsListener.getMultiOps(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"multi").build());
+  measurement.record(
+  metricsListener.getExistsChecks(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"exists").build());
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_read",
+"Total bytes read from ZooKeeper",
+measurement -> {
+  measurement.record(metricsListener.getBytesRead(), 
attributes);
+},
+"By");
+
+ctx.observableLongCounter(
+"solr_zk_watches_fired",
+"Total number of ZooKeeper watches fired",
+measurement -> {
+  measurement.record(metricsListener.getWatchesFired(), 
attributes);
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_written_total",

Review Comment:
   ah; cool.
   
   Merge away!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-08 Thread via GitHub


mlbiscoc commented on code in PR #3453:
URL: https://github.com/apache/solr/pull/3453#discussion_r2263833115


##
solr/core/src/java/org/apache/solr/core/ZkContainer.java:
##
@@ -141,18 +140,82 @@ public void initZooKeeper(final CoreContainer cc, 
CloudConfig config) {
 }
 
 this.zkController = zkController;
-MetricsMap metricsMap = new 
MetricsMap(zkController.getZkClient().getMetrics());
+
 metricProducer =
 new SolrMetricProducer() {
   SolrMetricsContext ctx;
 
-  // TODO SOLR-17458: Migrate to Otel
   @Override
   public void initializeMetrics(
   SolrMetricsContext parentContext, Attributes attributes, 
String scope) {
 ctx = parentContext.getChildContext(this);
-ctx.gauge(
-metricsMap, true, scope, null, 
SolrInfoBean.Category.CONTAINER.toString());
+
+var metricsListener = zkController.getZkClient().getMetrics();
+
+ctx.observableLongCounter(
+"solr_zk_ops",
+"Total number of ZooKeeper operations",
+measurement -> {
+  measurement.record(
+  metricsListener.getReads(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"read").build());
+  measurement.record(
+  metricsListener.getDeletes(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"delete").build());
+  measurement.record(
+  metricsListener.getWrites(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"write").build());
+  measurement.record(
+  metricsListener.getMultiOps(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"multi").build());
+  measurement.record(
+  metricsListener.getExistsChecks(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"exists").build());
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_read",
+"Total bytes read from ZooKeeper",
+measurement -> {
+  measurement.record(metricsListener.getBytesRead(), 
attributes);
+},
+"By");
+
+ctx.observableLongCounter(
+"solr_zk_watches_fired",
+"Total number of ZooKeeper watches fired",
+measurement -> {
+  measurement.record(metricsListener.getWatchesFired(), 
attributes);
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_written_total",

Review Comment:
   I fixed it and updated the PR description with the correct naming in 
[ee09699](https://github.com/apache/solr/pull/3453/commits/ee09699eddd7a35ef2811939ced9a821b7ee22af).
   
   Something I think isn't obvious though is how OTELs prometheus metric reader 
names things after and how OTEL sets units as it's metadata. 
   
   For example, I just fixed up the metric for zk bytes read to just name 
`solr_zk_read` but the prometheus output does `solr_zk_read_bytes_total`. Thats 
because the metric reader does some sanitizing and tries to follow prometheus 
data model and format with the unit metadata having `byte` and counters always 
ending with `total` even though I didn't add it.
   
   What I kind of regret of regret was the way I setup the `unit` parameter 
because you can see I set bytes as "By". Thats because OTEL only follows 
[UCUM](https://ucum.org/ucum) for its units otherwise the metric throws an 
exception. I am going to go back with another PR to fix the units so that devs 
can just use a common set of constants UCUM units instead of passing in a 
string.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-08 Thread via GitHub


mlbiscoc commented on code in PR #3453:
URL: https://github.com/apache/solr/pull/3453#discussion_r2263833115


##
solr/core/src/java/org/apache/solr/core/ZkContainer.java:
##
@@ -141,18 +140,82 @@ public void initZooKeeper(final CoreContainer cc, 
CloudConfig config) {
 }
 
 this.zkController = zkController;
-MetricsMap metricsMap = new 
MetricsMap(zkController.getZkClient().getMetrics());
+
 metricProducer =
 new SolrMetricProducer() {
   SolrMetricsContext ctx;
 
-  // TODO SOLR-17458: Migrate to Otel
   @Override
   public void initializeMetrics(
   SolrMetricsContext parentContext, Attributes attributes, 
String scope) {
 ctx = parentContext.getChildContext(this);
-ctx.gauge(
-metricsMap, true, scope, null, 
SolrInfoBean.Category.CONTAINER.toString());
+
+var metricsListener = zkController.getZkClient().getMetrics();
+
+ctx.observableLongCounter(
+"solr_zk_ops",
+"Total number of ZooKeeper operations",
+measurement -> {
+  measurement.record(
+  metricsListener.getReads(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"read").build());
+  measurement.record(
+  metricsListener.getDeletes(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"delete").build());
+  measurement.record(
+  metricsListener.getWrites(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"write").build());
+  measurement.record(
+  metricsListener.getMultiOps(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"multi").build());
+  measurement.record(
+  metricsListener.getExistsChecks(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"exists").build());
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_read",
+"Total bytes read from ZooKeeper",
+measurement -> {
+  measurement.record(metricsListener.getBytesRead(), 
attributes);
+},
+"By");
+
+ctx.observableLongCounter(
+"solr_zk_watches_fired",
+"Total number of ZooKeeper watches fired",
+measurement -> {
+  measurement.record(metricsListener.getWatchesFired(), 
attributes);
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_written_total",

Review Comment:
   I fixed it and updated the PR description with the correct naming in 
[ee09699](https://github.com/apache/solr/pull/3453/commits/ee09699eddd7a35ef2811939ced9a821b7ee22af).
   
   Something I think isn't obvious though is how OTELs prometheus metric reader 
names things after and how OTEL sets units as it's metadata. 
   
   For example, I just fixed up the metric for zk bytes read to just name 
`solr_zk_read` but the prometheus output does `solr_zk_read_bytes_total`. Thats 
because the metric reader does some sanitizing and tries to follow prometheus 
data model and format with the unit metadata having `byte` and counters always 
ending with `total` even though I didn't add it.
   
   What I kind of regret was the way I setup the `unit` parameter because you 
can see I set bytes as "By". Thats because OTEL only follows 
[UCUM](https://ucum.org/ucum) for its units otherwise the metric throws an 
exception. I am going to go back with another PR to fix the units so that devs 
can just use a common set of constants UCUM units instead of passing in a 
string.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-07 Thread via GitHub


dsmiley commented on PR #3453:
URL: https://github.com/apache/solr/pull/3453#issuecomment-3166427156

   Doh!  :palm-to-face: indeed you have the excerpt in the PR description, a 
good/obvious location.  I looked everywhere _else_ (JIRA & bottom of PR).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-07 Thread via GitHub


dsmiley commented on code in PR #3453:
URL: https://github.com/apache/solr/pull/3453#discussion_r2261839053


##
solr/core/src/java/org/apache/solr/core/ZkContainer.java:
##
@@ -141,18 +140,82 @@ public void initZooKeeper(final CoreContainer cc, 
CloudConfig config) {
 }
 
 this.zkController = zkController;
-MetricsMap metricsMap = new 
MetricsMap(zkController.getZkClient().getMetrics());
+
 metricProducer =
 new SolrMetricProducer() {
   SolrMetricsContext ctx;
 
-  // TODO SOLR-17458: Migrate to Otel
   @Override
   public void initializeMetrics(
   SolrMetricsContext parentContext, Attributes attributes, 
String scope) {
 ctx = parentContext.getChildContext(this);
-ctx.gauge(
-metricsMap, true, scope, null, 
SolrInfoBean.Category.CONTAINER.toString());
+
+var metricsListener = zkController.getZkClient().getMetrics();
+
+ctx.observableLongCounter(
+"solr_zk_ops",
+"Total number of ZooKeeper operations",
+measurement -> {
+  measurement.record(
+  metricsListener.getReads(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"read").build());
+  measurement.record(
+  metricsListener.getDeletes(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"delete").build());
+  measurement.record(
+  metricsListener.getWrites(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"write").build());
+  measurement.record(
+  metricsListener.getMultiOps(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"multi").build());
+  measurement.record(
+  metricsListener.getExistsChecks(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"exists").build());
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_read",
+"Total bytes read from ZooKeeper",
+measurement -> {
+  measurement.record(metricsListener.getBytesRead(), 
attributes);
+},
+"By");
+
+ctx.observableLongCounter(
+"solr_zk_watches_fired",
+"Total number of ZooKeeper watches fired",
+measurement -> {
+  measurement.record(metricsListener.getWatchesFired(), 
attributes);
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_written_total",

Review Comment:
   the excerpt in the PR description shows `solr_zk_bytes_written_bytes_total` 
instead of what you have here.  What's here looks better, since the PR 
description shows "bytes" repeated.  Similar for the read metric.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-07 Thread via GitHub


mlbiscoc commented on PR #3453:
URL: https://github.com/apache/solr/pull/3453#issuecomment-3165290824

   > Can you post an excerpt (prometheus format) on what these looks like?
   The sample is in the PR description.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-07 Thread via GitHub


dsmiley commented on code in PR #3453:
URL: https://github.com/apache/solr/pull/3453#discussion_r2261089800


##
solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/SolrZKMetricsListener.java:
##
@@ -16,40 +16,27 @@
  */
 package org.apache.solr.common.cloud;
 
-import java.io.IOException;
 import java.util.concurrent.atomic.LongAdder;
 import org.apache.curator.drivers.AdvancedTracerDriver;
 import org.apache.curator.drivers.EventTrace;
 import org.apache.curator.drivers.OperationTrace;
 import org.apache.curator.framework.CuratorFramework;
 import org.apache.curator.framework.api.CuratorEvent;
 import org.apache.curator.framework.api.CuratorListener;
-import org.apache.solr.common.MapWriter;
-import org.apache.solr.common.annotation.JsonProperty;
-import org.apache.solr.common.util.ReflectMapWriter;
 
-public class SolrZKMetricsListener extends AdvancedTracerDriver
-implements ReflectMapWriter, CuratorListener {

Review Comment:
   I wonder why RefflectMapWriter was implemented (hence `JsonProperty` 
annotations)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-06 Thread via GitHub


mlbiscoc commented on PR #3453:
URL: https://github.com/apache/solr/pull/3453#issuecomment-3161086049

   There doesn't appear to be any tests around these ZkContainer metrics. If 
someone wants, I can create tests to assert these are correct, otherwise this 
PR was just a straight migration.
   
   I also squeezed a few other changes that fixes tests when I was merging 
other PRs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]