Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-23 Thread via GitHub


mlbiscoc merged PR #3384:
URL: https://github.com/apache/solr/pull/3384


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-20 Thread via GitHub


mlbiscoc commented on PR #3384:
URL: https://github.com/apache/solr/pull/3384#issuecomment-2992448965

   So I didn't move attributes into SolrMetricsContext. Ended up being a bigger 
refactor because currently keeping Dropwizard in parallel temporarily to 
incrementally migrate.  Maybe will do it closer to the end of this migration 
depending what happens to SolrMetricsContext.
   
   I have another PR pretty close to ready and would like to put that up for 
review. 
   It adds in dynamic solr core creation and deletion with dedicated 
SdkMeterProvider as well as adds a bunch of tests around these foundational 
components. There are some larger changes in-order to make this work that you'd 
be probably be interested in. 
   
   I also have a WIP for some basic filtering but there are some "gotchas" that 
we need to discuss but will put it aside for a separate discussion.
   
   If you have nothing else here, I'd like to merge this and put up the new PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-17 Thread via GitHub


mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2153298468


##
solr/core/src/java/org/apache/solr/metrics/SolrMetricProducer.java:
##
@@ -40,20 +41,16 @@ static String getUniqueMetricTag(Object o, String 
parentName) {
   }
 
   /**
-   * Initialize metrics specific to this producer.
-   *
-   * @param parentContext parent metrics context. If this component has the 
same life-cycle as the
-   * parent it can simply use the parent context, otherwise it should 
obtain a child context
-   * using {@link SolrMetricsContext#getChildContext(Object)} passing 
this as the
-   * child object.
-   * @param scope component scope
+   * Deprecated entry point for initializing metrics. TODO SOLR-17458: This 
will be removed Change
+   * to only take attributes completely removing Dropwizard
*/
-  void initializeMetrics(SolrMetricsContext parentContext, String scope);
+  void initializeMetrics(SolrMetricsContext parentContext, Attributes 
attributes, String scope);

Review Comment:
   > We don't need an attribute for the registry so long as our metrics are 
reasonably differentiated / clear. We can change our mind in the future with 
basically no disturbance, I suppose. I wouldn't get rid of SolrMetricsContext 
yet. AB also kind of recommended keeping it to reduce the review/impact.
   
   If we keep `SolrMetricsContext` then it might actually then make sense to 
just place attributes in here like you said. The `Base Attributes` can still 
sit in here then be augmented with additional attributes on the metric no 
differently and it would significantly reduce number of PR line changes. Also 
gives `SolrMetricContext` a use besides just being a passthrough to 
`metricManager`. Currently Solr creates a new `SolrMetricsContext` per 
registry, but we will instead need to create on per implementation of 
`SolrMetricProducer` . I'm going to play with this and see if it makes sense 
otherwise I'll keep it as is.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-17 Thread via GitHub


dsmiley commented on PR #3384:
URL: https://github.com/apache/solr/pull/3384#issuecomment-2981828661

   We don't need an attribute for the registry so long as our metrics are 
reasonably differentiated / clear.  We can change our mind in the future with 
basically no disturbance, I suppose.
   
   I wouldn't get rid of SolrMetricsContext _yet_.  AB also kind of recommended 
keeping it to reduce the review/impact.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-17 Thread via GitHub


mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2152733217


##
solr/core/src/java/org/apache/solr/metrics/SolrCoreMetricManager.java:
##
@@ -135,8 +144,19 @@ public void registerMetricProducer(String scope, 
SolrMetricProducer producer) {
   + ", producer = "
   + producer);
 }
+
+// NOCOMMIT SOLR-17458: These attributes may not work for standalone mode
 // use deprecated method for back-compat, remove in 9.0
-producer.initializeMetrics(solrMetricsContext, scope);
+producer.initializeMetrics(
+solrMetricsContext,
+Attributes.builder()
+.put(CORE_ATTR, core.getCoreDescriptor().getName())
+.put(COLLECTION_ATTR, collectionName)
+.put(SHARD_ATTR, shardName)
+.put(REPLICA_ATTR, replicaName)
+.put((scope.startsWith("/")) ? HANDLER_ATTR : SCOPE_ATTR, scope)
+.build(),
+scope);

Review Comment:
   Look inside `RequestHandlerBase` at the `HandlerMetrics` constructor.
   
   ```
   Attributes.builder()
 .putAll(attributes)
 .put(AttributeKey.stringKey("source"), "client")
 .put(TYPE_ATTR, "errors")
 .build()
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-17 Thread via GitHub


mlbiscoc commented on PR #3384:
URL: https://github.com/apache/solr/pull/3384#issuecomment-2981090664

   > I think scope is intended to basically refer to all of Solr (e.g. be 
something like "org.apache.solr"), possibly with module consideration added 
(e.g. add ".llm").
   
   Actually this makes sense. If you have a single place where many different 
metrics are stored, this scope name might be the differentiator especially if 
all your systems have some general metric like `http_requests_total`. So then 
all scope should be `org.apache.solr` technically. Do you still find value in 
keeping the concept of registry as a label on all metrics then? 
   
   If not, I don't see the need to keep `SolrMetricsContext` and we should just 
create metrics through `SolrMetricManager` because that is currently all I am 
using it for. I don't see much value in aggregating on a label like `registry` 
except for `solr.core.*` but we have core as a label already.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-17 Thread via GitHub


dsmiley commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2151901980


##
solr/core/src/java/org/apache/solr/metrics/SolrCoreMetricManager.java:
##
@@ -135,8 +144,19 @@ public void registerMetricProducer(String scope, 
SolrMetricProducer producer) {
   + ", producer = "
   + producer);
 }
+
+// NOCOMMIT SOLR-17458: These attributes may not work for standalone mode
 // use deprecated method for back-compat, remove in 9.0
-producer.initializeMetrics(solrMetricsContext, scope);
+producer.initializeMetrics(
+solrMetricsContext,
+Attributes.builder()
+.put(CORE_ATTR, core.getCoreDescriptor().getName())
+.put(COLLECTION_ATTR, collectionName)
+.put(SHARD_ATTR, shardName)
+.put(REPLICA_ATTR, replicaName)
+.put((scope.startsWith("/")) ? HANDLER_ATTR : SCOPE_ATTR, scope)
+.build(),
+scope);

Review Comment:
   I'm looking for where you augment base attributes with more info



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-16 Thread via GitHub


mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2151201538


##
solr/core/src/java/org/apache/solr/metrics/SolrCoreMetricManager.java:
##
@@ -135,8 +144,19 @@ public void registerMetricProducer(String scope, 
SolrMetricProducer producer) {
   + ", producer = "
   + producer);
 }
+
+// NOCOMMIT SOLR-17458: These attributes may not work for standalone mode
 // use deprecated method for back-compat, remove in 9.0
-producer.initializeMetrics(solrMetricsContext, scope);
+producer.initializeMetrics(
+solrMetricsContext,
+Attributes.builder()
+.put(CORE_ATTR, core.getCoreDescriptor().getName())
+.put(COLLECTION_ATTR, collectionName)
+.put(SHARD_ATTR, shardName)
+.put(REPLICA_ATTR, replicaName)
+.put((scope.startsWith("/")) ? HANDLER_ATTR : SCOPE_ATTR, scope)
+.build(),
+scope);

Review Comment:
   > Can you point me where you augment the base attributes with more so I can 
see what this looks like?
   
   @dsmiley Here is where I initialize the base attributes for 
`requestHandlerBase` that was instrumented.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-16 Thread via GitHub


dsmiley commented on PR #3384:
URL: https://github.com/apache/solr/pull/3384#issuecomment-2978644556

   Remember AB's 
[comment](https://issues.apache.org/jira/browse/SOLR-17458?focusedCommentId=17941031&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17941031).
  To add/clarify something he's getting at is the need to unregister metrics 
related to a core.  The DropWizard MetricsRegistry per core was a way to do 
that.  I chatted with Google Gemini about this matter, and it recommended 
creating a dedicated SdkMeterProvider in place of a DropWizard MetricRegistry.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-16 Thread via GitHub


dsmiley commented on PR #3384:
URL: https://github.com/apache/solr/pull/3384#issuecomment-2978617096

   "Context" and "Scope" are almost synonymous... "Context" just potentially 
holds more than a scope.  I'm not sure a rename is warranted; let's table that. 
 Better to do big renames at the end to make the change easier to review 
beforehand, then any wrote renames that we think are warranted.  Perhaps in a 
completely separate PR/commit.
   
   I read about [instrumentation 
scope](https://opentelemetry.io/docs/specs/otel/common/instrumentation-scope/); 
thanks.  My overall sense reading it is that it's _not_ for how you propose to 
use it.  Although, the last example given is "Internal application components 
emitting their own telemetry, relying on InstrumentationScope attributes to 
differentiate themselves in case multiple instances of the same type exist.".   
Which might mean how you are using it, or I suspect more likely it was thinking 
a big multi-versioned isolated system capable of running different versions of 
a component.  Shrug.  As I look at the code internally, I _do_ think this scope 
is structurally set up to be used as we want.  Maybe we could identify another 
system using OTEL metrics, some DB-like system, and follow in their footsteps.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-16 Thread via GitHub


dsmiley commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2151112226


##
solr/core/src/java/org/apache/solr/metrics/SolrMetricProducer.java:
##
@@ -40,20 +41,16 @@ static String getUniqueMetricTag(Object o, String 
parentName) {
   }
 
   /**
-   * Initialize metrics specific to this producer.
-   *
-   * @param parentContext parent metrics context. If this component has the 
same life-cycle as the
-   * parent it can simply use the parent context, otherwise it should 
obtain a child context
-   * using {@link SolrMetricsContext#getChildContext(Object)} passing 
this as the
-   * child object.
-   * @param scope component scope
+   * Deprecated entry point for initializing metrics. TODO SOLR-17458: This 
will be removed Change
+   * to only take attributes completely removing Dropwizard
*/
-  void initializeMetrics(SolrMetricsContext parentContext, String scope);
+  void initializeMetrics(SolrMetricsContext parentContext, Attributes 
attributes, String scope);

Review Comment:
   h.  Can you point me where you augment the base attributes with more so 
I can see what this looks like?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-16 Thread via GitHub


mlbiscoc commented on PR #3384:
URL: https://github.com/apache/solr/pull/3384#issuecomment-2977881603

   Actually the more I work on this, I think I am going to go with the path of 
renaming `SolrMetricsContext` to `SolrMetricsScope`.
   
   For `otel_scope_name`, we should update scope based on how the [OTEL 
instrumentation scope 
spec](https://opentelemetry.io/docs/specs/otel/common/instrumentation-scope/) 
recommends 
   
   > Developers can decide what denotes a reasonable instrumentation scope. For 
example, they can select a module, a package, or a class as the instrumentation 
scope.
   
   So for `requestHandlerBase`, the `scopeName` in `SolrMetricsScope` should 
maybe be the package name of `org.apache.solr.handler`.  If we really want to 
keep registries, then we can add a label called registry but I don't know if on 
an attribute based framework we still need the registry tag? Core metrics for 
example already have the `core` name as a label making it unique.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-16 Thread via GitHub


mlbiscoc commented on PR #3384:
URL: https://github.com/apache/solr/pull/3384#issuecomment-2977412580

   > The introduction of use of OTEL's "scope" will be very interesting & 
foundational to Solr's adoption of OTEL metrics. I think that's the next step 
after this PR.
   
   This PR already has "scope" in the metrics. All metrics have a tag called 
`otel_scope_name`. When creating an instrument from `SolrMetricsContext`, it 
has to create the instrument from a `scope` to `SolrMetricManager` which is the 
registry name that the SolrMetricsContext holds. Unless we want scope to be 
something else, I am keeping it the same registries or "scope` which is 
`solr.node`, `solr.core.*`, `solr.jetty`...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-16 Thread via GitHub


mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2150416427


##
solr/core/src/java/org/apache/solr/blockcache/Metrics.java:
##
@@ -54,8 +55,10 @@ public class Metrics extends SolrCacheBase implements 
SolrInfoBean {
   private SolrMetricsContext solrMetricsContext;
   private long previous = System.nanoTime();
 
+  // TODO SOLR-17458: Migrate to Otel

Review Comment:
   Didn't know about `NOCOMMIT` I updated a few places with it and will use 
that instead going forward and keep the rest as TODO for now.



##
solr/core/src/java/org/apache/solr/metrics/SolrMetricsContext.java:
##
@@ -116,21 +134,94 @@ public SolrMetricsContext getChildContext(Object child) {
* Register a metric name that this component reports. This method is called 
by various metric
* registration methods in {@link org.apache.solr.metrics.SolrMetricManager} 
in order to capture
* what metric names are reported from this component (which in turn is 
called from {@link
-   * 
org.apache.solr.metrics.SolrMetricProducer#initializeMetrics(SolrMetricsContext,
 String)}).
+   * SolrMetricProducer#initializeMetrics(SolrMetricsContext,
+   * io.opentelemetry.api.common.Attributes, String)}).
*/
+  // TODO We can continue to register metric names
   public void registerMetricName(String name) {
 metricNames.add(name);
   }
 
   /** Return a snapshot of metric values that this component reports. */
+  // TODO Don't think this is needed anymore
   public Map getMetricsSnapshot() {
 return MetricUtils.convertMetrics(getMetricRegistry(), metricNames);
   }
 
+  public LongCounter longCounter(String metricName, String description) {

Review Comment:
   I'm debating if SolrMetricsContext isn't necessarily needed so I'll either 
delete it entirely or rename it to just `SolrMetricsScope` to pin instrument 
creation to the scope it wants to create to metricManager. I didn't determine 
that entirely in this PR yet.
   
   In this PR, I should have mentioned it as well but the concept of 
`registryName` just changes to `otel scope` which is why you see when we pass 
`registryName` into metric manager. Also if you look at the metrics, there is a 
tag that OTEL created itself called `otel_scope_name` which was the registry or 
"scope" that this metric was created in. So we still keep the `solr.node` 
`solr.core.*` registries unless we want some much more granular.
   
   ```
   solr_metrics_core_requests_times_milliseconds_bucket{category="QUERY", ... 
otel_scope_name="solr.core.foobar.shard1.replica_n1", ...} 1
   
   ```



##
solr/core/src/test/org/apache/solr/handler/admin/MetricsHandlerTest.java:
##
@@ -706,459 +693,6 @@ public void testExprMetrics() throws Exception {
 assertTrue(map.toString(), map.containsKey("count"));
   }
 
-  @Test
-  @SuppressWarnings("unchecked")
-  public void testPrometheusMetricsCore() throws Exception {

Review Comment:
   Many of these Prometheus Tests are relating to the PrometheusFormatter 
features that was removed. A version of this should come back eventually 
further into the migration but with metric naming and attributes may change, I 
didn't see a reason to currently just fix it if naming is subject to change.



##
solr/core/src/java/org/apache/solr/handler/RequestHandlerBase.java:
##
@@ -193,14 +203,79 @@ public static class HandlerMetrics {
 public final Timer requestTimes;
 public final Counter totalTime;
 
-public HandlerMetrics(SolrMetricsContext solrMetricsContext, String... 
metricPath) {
+public AttributedLongCounter otelRequests;
+public AttributedLongCounter otelNumErrors;
+public AttributedLongCounter otelNumServerErrors;
+public AttributedLongCounter otelNumClientErrors;
+public AttributedLongCounter otelNumTimeouts;
+public AttributedLongTimer otelRequestTimes;
+
+public HandlerMetrics(
+SolrMetricsContext solrMetricsContext, Attributes attributes, 
String... metricPath) {
+
+  // TODO SOLR-17458: To be removed
   numErrors = solrMetricsContext.meter("errors", metricPath);
   numServerErrors = solrMetricsContext.meter("serverErrors", metricPath);
   numClientErrors = solrMetricsContext.meter("clientErrors", metricPath);
   numTimeouts = solrMetricsContext.meter("timeouts", metricPath);
   requests = solrMetricsContext.counter("requests", metricPath);
   requestTimes = solrMetricsContext.timer("requestTimes", metricPath);
   totalTime = solrMetricsContext.counter("totalTime", metricPath);
+
+  var baseRequestMetric =
+  solrMetricsContext.longCounter("solr_metrics_core_requests", "HTTP 
Solr request counts");
+
+  var baseRequestTimeMetric =
+  solrMetricsContext.longHistogram(
+  "solr_metrics_core_requests_times", "HTTP Solr request times", 
"ms");
+
+  otelRequests =
+  new AttributedLongCounter(
+  baseRequestMetric,
+   

Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-15 Thread via GitHub


dsmiley commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2147630116


##
solr/core/src/java/org/apache/solr/blockcache/Metrics.java:
##
@@ -54,8 +55,10 @@ public class Metrics extends SolrCacheBase implements 
SolrInfoBean {
   private SolrMetricsContext solrMetricsContext;
   private long previous = System.nanoTime();
 
+  // TODO SOLR-17458: Migrate to Otel

Review Comment:
   The Lucene & Solr projects use the magic word "NOCOMMIT" not TODO.  TODO 
means someday maybe.  NOCOMMIT means, do NOT merge this into a release branch.  
"precommit" will fail.
   
   But since you've written this so many times already, we could leave the 
existing ones as-is and remember to search for "TODO SOLR-17458"



##
solr/core/src/java/org/apache/solr/handler/RequestHandlerBase.java:
##
@@ -193,14 +203,79 @@ public static class HandlerMetrics {
 public final Timer requestTimes;
 public final Counter totalTime;
 
-public HandlerMetrics(SolrMetricsContext solrMetricsContext, String... 
metricPath) {
+public AttributedLongCounter otelRequests;
+public AttributedLongCounter otelNumErrors;
+public AttributedLongCounter otelNumServerErrors;
+public AttributedLongCounter otelNumClientErrors;
+public AttributedLongCounter otelNumTimeouts;
+public AttributedLongTimer otelRequestTimes;
+
+public HandlerMetrics(
+SolrMetricsContext solrMetricsContext, Attributes attributes, 
String... metricPath) {
+
+  // TODO SOLR-17458: To be removed
   numErrors = solrMetricsContext.meter("errors", metricPath);
   numServerErrors = solrMetricsContext.meter("serverErrors", metricPath);
   numClientErrors = solrMetricsContext.meter("clientErrors", metricPath);
   numTimeouts = solrMetricsContext.meter("timeouts", metricPath);
   requests = solrMetricsContext.counter("requests", metricPath);
   requestTimes = solrMetricsContext.timer("requestTimes", metricPath);
   totalTime = solrMetricsContext.counter("totalTime", metricPath);
+
+  var baseRequestMetric =
+  solrMetricsContext.longCounter("solr_metrics_core_requests", "HTTP 
Solr request counts");
+
+  var baseRequestTimeMetric =
+  solrMetricsContext.longHistogram(
+  "solr_metrics_core_requests_times", "HTTP Solr request times", 
"ms");
+
+  otelRequests =
+  new AttributedLongCounter(
+  baseRequestMetric,
+  Attributes.builder()
+  .putAll(attributes)
+  .put(AttributeKey.stringKey("type"), "requests")
+  .build());
+
+  otelNumErrors =

Review Comment:
   I wonder if we're using OTEL the right way here to handle this scenario -- 
basically there are errors, and some are server and some are client.  I see you 
simply vary the "type" for each.  Maybe... instead we should only have the 
server & client, both have "type=errors", but differentiate by another 
attribute like "errorSource=client|server".  This still allows users to easily 
look at metrics undifferentiated if they don't care (i.e. errors of whatever 
source).  Presumably this would also work?



##
solr/core/src/java/org/apache/solr/metrics/otel/instruments/AttributedLongTimer.java:
##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.metrics.otel.instruments;
+
+import io.opentelemetry.api.common.Attributes;
+import io.opentelemetry.api.metrics.LongHistogram;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * This class records the elapsed time (in milliseconds) to a {@link 
LongHistogram}. Each thread
+ * measures its own timer allowing concurrent timing operations on separate 
threads..
+ */
+public class AttributedLongTimer extends AttributedLongHistogram {
+
+  private final ThreadLocal startTimeNanos = new ThreadLocal<>();

Review Comment:
   The use of ThreadLocal concerns me; seems like an over-use of ThreadLocal.  
I could be convinced otherwise, though.  Instead of conceptually changing the 
state of this timer (scoped to this thread, granted), couldn't we return a 
Timer object that the caller closes?  Note that Solr has RTimer / RTimerTree

Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]

2025-06-12 Thread via GitHub


mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2141160310


##
solr/core/src/java/org/apache/solr/handler/RequestHandlerBase.java:
##
@@ -193,14 +203,79 @@ public static class HandlerMetrics {
 public final Timer requestTimes;
 public final Counter totalTime;
 
-public HandlerMetrics(SolrMetricsContext solrMetricsContext, String... 
metricPath) {
+public AttributedLongCounter otelRequests;
+public AttributedLongCounter otelNumErrors;
+public AttributedLongCounter otelNumServerErrors;
+public AttributedLongCounter otelNumClientErrors;
+public AttributedLongCounter otelNumTimeouts;
+public AttributedLongTimer otelRequestTimes;
+
+public HandlerMetrics(
+SolrMetricsContext solrMetricsContext, Attributes attributes, 
String... metricPath) {
+
+  // TODO SOLR-17458: To be removed
   numErrors = solrMetricsContext.meter("errors", metricPath);
   numServerErrors = solrMetricsContext.meter("serverErrors", metricPath);
   numClientErrors = solrMetricsContext.meter("clientErrors", metricPath);
   numTimeouts = solrMetricsContext.meter("timeouts", metricPath);
   requests = solrMetricsContext.counter("requests", metricPath);
   requestTimes = solrMetricsContext.timer("requestTimes", metricPath);
   totalTime = solrMetricsContext.counter("totalTime", metricPath);
+
+  var baseRequestMetric =
+  solrMetricsContext.longCounter("solr_metrics_core_requests", "HTTP 
Solr request counts");
+
+  var baseRequestTimeMetric =
+  solrMetricsContext.longHistogram(
+  "solr_metrics_core_requests_times", "HTTP Solr request times", 
"ms");
+
+  otelRequests =
+  new AttributedLongCounter(
+  baseRequestMetric,
+  Attributes.builder()
+  .putAll(attributes)
+  .put(AttributeKey.stringKey("type"), "requests")
+  .build());

Review Comment:
   These are the base metrics. Then you bind them to attributes so we don't 
need to keep rebuilding them every time we record an event.



##
solr/core/src/java/org/apache/solr/handler/admin/MetricsHandler.java:
##
@@ -127,11 +126,12 @@ private void handleRequest(SolrParams params, 
BiConsumer consume
   return;
 }
 
+// TODO SOLR-17458: Make this the default option after dropwizard removal
 if (PROMETHEUS_METRICS_WT.equals(params.get(CommonParams.WT))) {
-  response = handlePrometheusExport(params);
-  consumer.accept("metrics", response);
+  consumer.accept("metrics", metricManager.getMetricReader().collect());

Review Comment:
   Instead of going through the whole dropwizard -> prometheus formatter logic. 
Very simply the metric reader does the OTEL -> prometheus for us through the 
SDK.



##
solr/core/src/test/org/apache/solr/response/TestPrometheusResponseWriter.java:
##
@@ -127,54 +131,6 @@ public void testPrometheusStructureOutput() throws 
Exception {
 }
   }
 
-  public void testPrometheusDummyOutput() throws Exception {

Review Comment:
   Pretty much all Prometheus Formmater tests go away with the exception of 
`testPrometheusStructureOutput` which tests for prometheus exposition format 
from the OTEL sdk.



##
solr/core/src/java/org/apache/solr/handler/RequestHandlerBase.java:
##
@@ -193,14 +203,79 @@ public static class HandlerMetrics {
 public final Timer requestTimes;
 public final Counter totalTime;
 
-public HandlerMetrics(SolrMetricsContext solrMetricsContext, String... 
metricPath) {
+public AttributedLongCounter otelRequests;
+public AttributedLongCounter otelNumErrors;
+public AttributedLongCounter otelNumServerErrors;
+public AttributedLongCounter otelNumClientErrors;
+public AttributedLongCounter otelNumTimeouts;
+public AttributedLongTimer otelRequestTimes;

Review Comment:
   These are essentially equivalent to Dropwizards metrics above. Once we do 
deprecations, we can just change the name to the ones above and remove them.



##
solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java:
##
@@ -152,18 +153,28 @@ public void init(PluginInfo info) {
 }
   }
 
+  // TODO SOLR-17458: Fix metric Attributes
   @Override
-  public void initializeMetrics(SolrMetricsContext parentContext, String 
scope) {
-super.initializeMetrics(parentContext, scope);
+  public void initializeMetrics(
+  SolrMetricsContext parentContext, Attributes attributes, String scope) {
+super.initializeMetrics(
+parentContext,
+Attributes.builder()
+.putAll(attributes)
+.put(AttributeKey.stringKey("category"), getCategory().toString())
+.put(AttributeKey.stringKey("internal"), "false")

Review Comment:
   Instead of having the handler do something like `/select[shard]`, we use a 
tag saying `internal` true/false



##
solr/