Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-11 Thread via GitHub


mlbiscoc merged PR #3453:
URL: https://github.com/apache/solr/pull/3453


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-08 Thread via GitHub


dsmiley commented on code in PR #3453:
URL: https://github.com/apache/solr/pull/3453#discussion_r2263993095


##
solr/core/src/java/org/apache/solr/core/ZkContainer.java:
##
@@ -141,18 +140,82 @@ public void initZooKeeper(final CoreContainer cc, 
CloudConfig config) {
 }
 
 this.zkController = zkController;
-MetricsMap metricsMap = new 
MetricsMap(zkController.getZkClient().getMetrics());
+
 metricProducer =
 new SolrMetricProducer() {
   SolrMetricsContext ctx;
 
-  // TODO SOLR-17458: Migrate to Otel
   @Override
   public void initializeMetrics(
   SolrMetricsContext parentContext, Attributes attributes, 
String scope) {
 ctx = parentContext.getChildContext(this);
-ctx.gauge(
-metricsMap, true, scope, null, 
SolrInfoBean.Category.CONTAINER.toString());
+
+var metricsListener = zkController.getZkClient().getMetrics();
+
+ctx.observableLongCounter(
+"solr_zk_ops",
+"Total number of ZooKeeper operations",
+measurement -> {
+  measurement.record(
+  metricsListener.getReads(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"read").build());
+  measurement.record(
+  metricsListener.getDeletes(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"delete").build());
+  measurement.record(
+  metricsListener.getWrites(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"write").build());
+  measurement.record(
+  metricsListener.getMultiOps(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"multi").build());
+  measurement.record(
+  metricsListener.getExistsChecks(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"exists").build());
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_read",
+"Total bytes read from ZooKeeper",
+measurement -> {
+  measurement.record(metricsListener.getBytesRead(), 
attributes);
+},
+"By");
+
+ctx.observableLongCounter(
+"solr_zk_watches_fired",
+"Total number of ZooKeeper watches fired",
+measurement -> {
+  measurement.record(metricsListener.getWatchesFired(), 
attributes);
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_written_total",

Review Comment:
   ah; cool.
   
   Merge away!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-08 Thread via GitHub


mlbiscoc commented on code in PR #3453:
URL: https://github.com/apache/solr/pull/3453#discussion_r2263833115


##
solr/core/src/java/org/apache/solr/core/ZkContainer.java:
##
@@ -141,18 +140,82 @@ public void initZooKeeper(final CoreContainer cc, 
CloudConfig config) {
 }
 
 this.zkController = zkController;
-MetricsMap metricsMap = new 
MetricsMap(zkController.getZkClient().getMetrics());
+
 metricProducer =
 new SolrMetricProducer() {
   SolrMetricsContext ctx;
 
-  // TODO SOLR-17458: Migrate to Otel
   @Override
   public void initializeMetrics(
   SolrMetricsContext parentContext, Attributes attributes, 
String scope) {
 ctx = parentContext.getChildContext(this);
-ctx.gauge(
-metricsMap, true, scope, null, 
SolrInfoBean.Category.CONTAINER.toString());
+
+var metricsListener = zkController.getZkClient().getMetrics();
+
+ctx.observableLongCounter(
+"solr_zk_ops",
+"Total number of ZooKeeper operations",
+measurement -> {
+  measurement.record(
+  metricsListener.getReads(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"read").build());
+  measurement.record(
+  metricsListener.getDeletes(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"delete").build());
+  measurement.record(
+  metricsListener.getWrites(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"write").build());
+  measurement.record(
+  metricsListener.getMultiOps(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"multi").build());
+  measurement.record(
+  metricsListener.getExistsChecks(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"exists").build());
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_read",
+"Total bytes read from ZooKeeper",
+measurement -> {
+  measurement.record(metricsListener.getBytesRead(), 
attributes);
+},
+"By");
+
+ctx.observableLongCounter(
+"solr_zk_watches_fired",
+"Total number of ZooKeeper watches fired",
+measurement -> {
+  measurement.record(metricsListener.getWatchesFired(), 
attributes);
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_written_total",

Review Comment:
   I fixed it and updated the PR description with the correct naming in 
[ee09699](https://github.com/apache/solr/pull/3453/commits/ee09699eddd7a35ef2811939ced9a821b7ee22af).
   
   Something I think isn't obvious though is how OTELs prometheus metric reader 
names things after and how OTEL sets units as it's metadata. 
   
   For example, I just fixed up the metric for zk bytes read to just name 
`solr_zk_read` but the prometheus output does `solr_zk_read_bytes_total`. Thats 
because the metric reader does some sanitizing and tries to follow prometheus 
data model and format with the unit metadata having `byte` and counters always 
ending with `total` even though I didn't add it.
   
   What I kind of regret of regret was the way I setup the `unit` parameter 
because you can see I set bytes as "By". Thats because OTEL only follows 
[UCUM](https://ucum.org/ucum) for its units otherwise the metric throws an 
exception. I am going to go back with another PR to fix the units so that devs 
can just use a common set of constants UCUM units instead of passing in a 
string.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-08 Thread via GitHub


mlbiscoc commented on code in PR #3453:
URL: https://github.com/apache/solr/pull/3453#discussion_r2263833115


##
solr/core/src/java/org/apache/solr/core/ZkContainer.java:
##
@@ -141,18 +140,82 @@ public void initZooKeeper(final CoreContainer cc, 
CloudConfig config) {
 }
 
 this.zkController = zkController;
-MetricsMap metricsMap = new 
MetricsMap(zkController.getZkClient().getMetrics());
+
 metricProducer =
 new SolrMetricProducer() {
   SolrMetricsContext ctx;
 
-  // TODO SOLR-17458: Migrate to Otel
   @Override
   public void initializeMetrics(
   SolrMetricsContext parentContext, Attributes attributes, 
String scope) {
 ctx = parentContext.getChildContext(this);
-ctx.gauge(
-metricsMap, true, scope, null, 
SolrInfoBean.Category.CONTAINER.toString());
+
+var metricsListener = zkController.getZkClient().getMetrics();
+
+ctx.observableLongCounter(
+"solr_zk_ops",
+"Total number of ZooKeeper operations",
+measurement -> {
+  measurement.record(
+  metricsListener.getReads(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"read").build());
+  measurement.record(
+  metricsListener.getDeletes(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"delete").build());
+  measurement.record(
+  metricsListener.getWrites(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"write").build());
+  measurement.record(
+  metricsListener.getMultiOps(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"multi").build());
+  measurement.record(
+  metricsListener.getExistsChecks(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"exists").build());
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_read",
+"Total bytes read from ZooKeeper",
+measurement -> {
+  measurement.record(metricsListener.getBytesRead(), 
attributes);
+},
+"By");
+
+ctx.observableLongCounter(
+"solr_zk_watches_fired",
+"Total number of ZooKeeper watches fired",
+measurement -> {
+  measurement.record(metricsListener.getWatchesFired(), 
attributes);
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_written_total",

Review Comment:
   I fixed it and updated the PR description with the correct naming in 
[ee09699](https://github.com/apache/solr/pull/3453/commits/ee09699eddd7a35ef2811939ced9a821b7ee22af).
   
   Something I think isn't obvious though is how OTELs prometheus metric reader 
names things after and how OTEL sets units as it's metadata. 
   
   For example, I just fixed up the metric for zk bytes read to just name 
`solr_zk_read` but the prometheus output does `solr_zk_read_bytes_total`. Thats 
because the metric reader does some sanitizing and tries to follow prometheus 
data model and format with the unit metadata having `byte` and counters always 
ending with `total` even though I didn't add it.
   
   What I kind of regret was the way I setup the `unit` parameter because you 
can see I set bytes as "By". Thats because OTEL only follows 
[UCUM](https://ucum.org/ucum) for its units otherwise the metric throws an 
exception. I am going to go back with another PR to fix the units so that devs 
can just use a common set of constants UCUM units instead of passing in a 
string.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-07 Thread via GitHub


dsmiley commented on PR #3453:
URL: https://github.com/apache/solr/pull/3453#issuecomment-3166427156

   Doh!  :palm-to-face: indeed you have the excerpt in the PR description, a 
good/obvious location.  I looked everywhere _else_ (JIRA & bottom of PR).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-07 Thread via GitHub


dsmiley commented on code in PR #3453:
URL: https://github.com/apache/solr/pull/3453#discussion_r2261839053


##
solr/core/src/java/org/apache/solr/core/ZkContainer.java:
##
@@ -141,18 +140,82 @@ public void initZooKeeper(final CoreContainer cc, 
CloudConfig config) {
 }
 
 this.zkController = zkController;
-MetricsMap metricsMap = new 
MetricsMap(zkController.getZkClient().getMetrics());
+
 metricProducer =
 new SolrMetricProducer() {
   SolrMetricsContext ctx;
 
-  // TODO SOLR-17458: Migrate to Otel
   @Override
   public void initializeMetrics(
   SolrMetricsContext parentContext, Attributes attributes, 
String scope) {
 ctx = parentContext.getChildContext(this);
-ctx.gauge(
-metricsMap, true, scope, null, 
SolrInfoBean.Category.CONTAINER.toString());
+
+var metricsListener = zkController.getZkClient().getMetrics();
+
+ctx.observableLongCounter(
+"solr_zk_ops",
+"Total number of ZooKeeper operations",
+measurement -> {
+  measurement.record(
+  metricsListener.getReads(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"read").build());
+  measurement.record(
+  metricsListener.getDeletes(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"delete").build());
+  measurement.record(
+  metricsListener.getWrites(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"write").build());
+  measurement.record(
+  metricsListener.getMultiOps(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"multi").build());
+  measurement.record(
+  metricsListener.getExistsChecks(),
+  attributes.toBuilder().put(OPERATION_ATTR, 
"exists").build());
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_read",
+"Total bytes read from ZooKeeper",
+measurement -> {
+  measurement.record(metricsListener.getBytesRead(), 
attributes);
+},
+"By");
+
+ctx.observableLongCounter(
+"solr_zk_watches_fired",
+"Total number of ZooKeeper watches fired",
+measurement -> {
+  measurement.record(metricsListener.getWatchesFired(), 
attributes);
+});
+
+ctx.observableLongCounter(
+"solr_zk_bytes_written_total",

Review Comment:
   the excerpt in the PR description shows `solr_zk_bytes_written_bytes_total` 
instead of what you have here.  What's here looks better, since the PR 
description shows "bytes" repeated.  Similar for the read metric.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-07 Thread via GitHub


mlbiscoc commented on PR #3453:
URL: https://github.com/apache/solr/pull/3453#issuecomment-3165290824

   > Can you post an excerpt (prometheus format) on what these looks like?
   The sample is in the PR description.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-07 Thread via GitHub


dsmiley commented on code in PR #3453:
URL: https://github.com/apache/solr/pull/3453#discussion_r2261089800


##
solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/SolrZKMetricsListener.java:
##
@@ -16,40 +16,27 @@
  */
 package org.apache.solr.common.cloud;
 
-import java.io.IOException;
 import java.util.concurrent.atomic.LongAdder;
 import org.apache.curator.drivers.AdvancedTracerDriver;
 import org.apache.curator.drivers.EventTrace;
 import org.apache.curator.drivers.OperationTrace;
 import org.apache.curator.framework.CuratorFramework;
 import org.apache.curator.framework.api.CuratorEvent;
 import org.apache.curator.framework.api.CuratorListener;
-import org.apache.solr.common.MapWriter;
-import org.apache.solr.common.annotation.JsonProperty;
-import org.apache.solr.common.util.ReflectMapWriter;
 
-public class SolrZKMetricsListener extends AdvancedTracerDriver
-implements ReflectMapWriter, CuratorListener {

Review Comment:
   I wonder why RefflectMapWriter was implemented (hence `JsonProperty` 
annotations)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-06 Thread via GitHub


mlbiscoc commented on PR #3453:
URL: https://github.com/apache/solr/pull/3453#issuecomment-3161086049

   There doesn't appear to be any tests around these ZkContainer metrics. If 
someone wants, I can create tests to assert these are correct, otherwise this 
PR was just a straight migration.
   
   I also squeezed a few other changes that fixes tests when I was merging 
other PRs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



[PR] SOLR-17806: Migrate ZkContainer metrics to OTEL [solr]

2025-08-06 Thread via GitHub


mlbiscoc opened a new pull request, #3453:
URL: https://github.com/apache/solr/pull/3453

   https://issues.apache.org/jira/browse/SOLR-17806
   
   Migrate ZkContainer metrics and remove Dropwizard initializations.
   
   ```
   # HELP solr_zk_bytes_read_bytes_total Total bytes read from ZooKeeper
   # TYPE solr_zk_bytes_read_bytes_total counter
   
solr_zk_bytes_read_bytes_total{category="CONTAINER",otel_scope_name="org.apache.solr"}
 463403.0
   # HELP solr_zk_bytes_written_bytes_total Total bytes written to ZooKeeper
   # TYPE solr_zk_bytes_written_bytes_total counter
   solr_zk_bytes_written_bytes_total{otel_scope_name="org.apache.solr"} 941.0
   # HELP solr_zk_child_fetches_total Total number of ZooKeeper child node 
fetches
   # TYPE solr_zk_child_fetches_total counter
   solr_zk_child_fetches_total{otel_scope_name="org.apache.solr"} 19.0
   # HELP solr_zk_cumulative_children_fetched_total Total cumulative children 
fetched count
   # TYPE solr_zk_cumulative_children_fetched_total counter
   solr_zk_cumulative_children_fetched_total{otel_scope_name="org.apache.solr"} 
10.0
   # HELP solr_zk_cumulative_multi_ops_total Total cumulative multi-operations 
count
   # TYPE solr_zk_cumulative_multi_ops_total counter
   solr_zk_cumulative_multi_ops_total{otel_scope_name="org.apache.solr"} 8.0
   # HELP solr_zk_ops_total Total number of ZooKeeper operations
   # TYPE solr_zk_ops_total counter
   
solr_zk_ops_total{category="CONTAINER",ops="delete",otel_scope_name="org.apache.solr"}
 0.0
   
solr_zk_ops_total{category="CONTAINER",ops="exists",otel_scope_name="org.apache.solr"}
 219.0
   
solr_zk_ops_total{category="CONTAINER",ops="multi",otel_scope_name="org.apache.solr"}
 3.0
   
solr_zk_ops_total{category="CONTAINER",ops="read",otel_scope_name="org.apache.solr"}
 232.0
   
solr_zk_ops_total{category="CONTAINER",ops="write",otel_scope_name="org.apache.solr"}
 5.0
   # HELP solr_zk_watches_fired_total Total number of ZooKeeper watches fired
   # TYPE solr_zk_watches_fired_total counter
   
solr_zk_watches_fired_total{category="CONTAINER",otel_scope_name="org.apache.solr"}
 5.0
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]