Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
mlbiscoc merged PR #3384: URL: https://github.com/apache/solr/pull/3384 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
mlbiscoc commented on PR #3384: URL: https://github.com/apache/solr/pull/3384#issuecomment-2992448965 So I didn't move attributes into SolrMetricsContext. Ended up being a bigger refactor because currently keeping Dropwizard in parallel temporarily to incrementally migrate. Maybe will do it closer to the end of this migration depending what happens to SolrMetricsContext. I have another PR pretty close to ready and would like to put that up for review. It adds in dynamic solr core creation and deletion with dedicated SdkMeterProvider as well as adds a bunch of tests around these foundational components. There are some larger changes in-order to make this work that you'd be probably be interested in. I also have a WIP for some basic filtering but there are some "gotchas" that we need to discuss but will put it aside for a separate discussion. If you have nothing else here, I'd like to merge this and put up the new PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2153298468
##
solr/core/src/java/org/apache/solr/metrics/SolrMetricProducer.java:
##
@@ -40,20 +41,16 @@ static String getUniqueMetricTag(Object o, String
parentName) {
}
/**
- * Initialize metrics specific to this producer.
- *
- * @param parentContext parent metrics context. If this component has the
same life-cycle as the
- * parent it can simply use the parent context, otherwise it should
obtain a child context
- * using {@link SolrMetricsContext#getChildContext(Object)} passing
this as the
- * child object.
- * @param scope component scope
+ * Deprecated entry point for initializing metrics. TODO SOLR-17458: This
will be removed Change
+ * to only take attributes completely removing Dropwizard
*/
- void initializeMetrics(SolrMetricsContext parentContext, String scope);
+ void initializeMetrics(SolrMetricsContext parentContext, Attributes
attributes, String scope);
Review Comment:
> We don't need an attribute for the registry so long as our metrics are
reasonably differentiated / clear. We can change our mind in the future with
basically no disturbance, I suppose. I wouldn't get rid of SolrMetricsContext
yet. AB also kind of recommended keeping it to reduce the review/impact.
If we keep `SolrMetricsContext` then it might actually then make sense to
just place attributes in here like you said. The `Base Attributes` can still
sit in here then be augmented with additional attributes on the metric no
differently and it would significantly reduce number of PR line changes. Also
gives `SolrMetricContext` a use besides just being a passthrough to
`metricManager`. Currently Solr creates a new `SolrMetricsContext` per
registry, but we will instead need to create on per implementation of
`SolrMetricProducer` . I'm going to play with this and see if it makes sense
otherwise I'll keep it as is.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
dsmiley commented on PR #3384: URL: https://github.com/apache/solr/pull/3384#issuecomment-2981828661 We don't need an attribute for the registry so long as our metrics are reasonably differentiated / clear. We can change our mind in the future with basically no disturbance, I suppose. I wouldn't get rid of SolrMetricsContext _yet_. AB also kind of recommended keeping it to reduce the review/impact. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2152733217
##
solr/core/src/java/org/apache/solr/metrics/SolrCoreMetricManager.java:
##
@@ -135,8 +144,19 @@ public void registerMetricProducer(String scope,
SolrMetricProducer producer) {
+ ", producer = "
+ producer);
}
+
+// NOCOMMIT SOLR-17458: These attributes may not work for standalone mode
// use deprecated method for back-compat, remove in 9.0
-producer.initializeMetrics(solrMetricsContext, scope);
+producer.initializeMetrics(
+solrMetricsContext,
+Attributes.builder()
+.put(CORE_ATTR, core.getCoreDescriptor().getName())
+.put(COLLECTION_ATTR, collectionName)
+.put(SHARD_ATTR, shardName)
+.put(REPLICA_ATTR, replicaName)
+.put((scope.startsWith("/")) ? HANDLER_ATTR : SCOPE_ATTR, scope)
+.build(),
+scope);
Review Comment:
Look inside `RequestHandlerBase` at the `HandlerMetrics` constructor.
```
Attributes.builder()
.putAll(attributes)
.put(AttributeKey.stringKey("source"), "client")
.put(TYPE_ATTR, "errors")
.build()
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
mlbiscoc commented on PR #3384: URL: https://github.com/apache/solr/pull/3384#issuecomment-2981090664 > I think scope is intended to basically refer to all of Solr (e.g. be something like "org.apache.solr"), possibly with module consideration added (e.g. add ".llm"). Actually this makes sense. If you have a single place where many different metrics are stored, this scope name might be the differentiator especially if all your systems have some general metric like `http_requests_total`. So then all scope should be `org.apache.solr` technically. Do you still find value in keeping the concept of registry as a label on all metrics then? If not, I don't see the need to keep `SolrMetricsContext` and we should just create metrics through `SolrMetricManager` because that is currently all I am using it for. I don't see much value in aggregating on a label like `registry` except for `solr.core.*` but we have core as a label already. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
dsmiley commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2151901980
##
solr/core/src/java/org/apache/solr/metrics/SolrCoreMetricManager.java:
##
@@ -135,8 +144,19 @@ public void registerMetricProducer(String scope,
SolrMetricProducer producer) {
+ ", producer = "
+ producer);
}
+
+// NOCOMMIT SOLR-17458: These attributes may not work for standalone mode
// use deprecated method for back-compat, remove in 9.0
-producer.initializeMetrics(solrMetricsContext, scope);
+producer.initializeMetrics(
+solrMetricsContext,
+Attributes.builder()
+.put(CORE_ATTR, core.getCoreDescriptor().getName())
+.put(COLLECTION_ATTR, collectionName)
+.put(SHARD_ATTR, shardName)
+.put(REPLICA_ATTR, replicaName)
+.put((scope.startsWith("/")) ? HANDLER_ATTR : SCOPE_ATTR, scope)
+.build(),
+scope);
Review Comment:
I'm looking for where you augment base attributes with more info
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2151201538
##
solr/core/src/java/org/apache/solr/metrics/SolrCoreMetricManager.java:
##
@@ -135,8 +144,19 @@ public void registerMetricProducer(String scope,
SolrMetricProducer producer) {
+ ", producer = "
+ producer);
}
+
+// NOCOMMIT SOLR-17458: These attributes may not work for standalone mode
// use deprecated method for back-compat, remove in 9.0
-producer.initializeMetrics(solrMetricsContext, scope);
+producer.initializeMetrics(
+solrMetricsContext,
+Attributes.builder()
+.put(CORE_ATTR, core.getCoreDescriptor().getName())
+.put(COLLECTION_ATTR, collectionName)
+.put(SHARD_ATTR, shardName)
+.put(REPLICA_ATTR, replicaName)
+.put((scope.startsWith("/")) ? HANDLER_ATTR : SCOPE_ATTR, scope)
+.build(),
+scope);
Review Comment:
> Can you point me where you augment the base attributes with more so I can
see what this looks like?
@dsmiley Here is where I initialize the base attributes for
`requestHandlerBase` that was instrumented.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
dsmiley commented on PR #3384: URL: https://github.com/apache/solr/pull/3384#issuecomment-2978644556 Remember AB's [comment](https://issues.apache.org/jira/browse/SOLR-17458?focusedCommentId=17941031&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17941031). To add/clarify something he's getting at is the need to unregister metrics related to a core. The DropWizard MetricsRegistry per core was a way to do that. I chatted with Google Gemini about this matter, and it recommended creating a dedicated SdkMeterProvider in place of a DropWizard MetricRegistry. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
dsmiley commented on PR #3384: URL: https://github.com/apache/solr/pull/3384#issuecomment-2978617096 "Context" and "Scope" are almost synonymous... "Context" just potentially holds more than a scope. I'm not sure a rename is warranted; let's table that. Better to do big renames at the end to make the change easier to review beforehand, then any wrote renames that we think are warranted. Perhaps in a completely separate PR/commit. I read about [instrumentation scope](https://opentelemetry.io/docs/specs/otel/common/instrumentation-scope/); thanks. My overall sense reading it is that it's _not_ for how you propose to use it. Although, the last example given is "Internal application components emitting their own telemetry, relying on InstrumentationScope attributes to differentiate themselves in case multiple instances of the same type exist.". Which might mean how you are using it, or I suspect more likely it was thinking a big multi-versioned isolated system capable of running different versions of a component. Shrug. As I look at the code internally, I _do_ think this scope is structurally set up to be used as we want. Maybe we could identify another system using OTEL metrics, some DB-like system, and follow in their footsteps. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
dsmiley commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2151112226
##
solr/core/src/java/org/apache/solr/metrics/SolrMetricProducer.java:
##
@@ -40,20 +41,16 @@ static String getUniqueMetricTag(Object o, String
parentName) {
}
/**
- * Initialize metrics specific to this producer.
- *
- * @param parentContext parent metrics context. If this component has the
same life-cycle as the
- * parent it can simply use the parent context, otherwise it should
obtain a child context
- * using {@link SolrMetricsContext#getChildContext(Object)} passing
this as the
- * child object.
- * @param scope component scope
+ * Deprecated entry point for initializing metrics. TODO SOLR-17458: This
will be removed Change
+ * to only take attributes completely removing Dropwizard
*/
- void initializeMetrics(SolrMetricsContext parentContext, String scope);
+ void initializeMetrics(SolrMetricsContext parentContext, Attributes
attributes, String scope);
Review Comment:
h. Can you point me where you augment the base attributes with more so
I can see what this looks like?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
mlbiscoc commented on PR #3384: URL: https://github.com/apache/solr/pull/3384#issuecomment-2977881603 Actually the more I work on this, I think I am going to go with the path of renaming `SolrMetricsContext` to `SolrMetricsScope`. For `otel_scope_name`, we should update scope based on how the [OTEL instrumentation scope spec](https://opentelemetry.io/docs/specs/otel/common/instrumentation-scope/) recommends > Developers can decide what denotes a reasonable instrumentation scope. For example, they can select a module, a package, or a class as the instrumentation scope. So for `requestHandlerBase`, the `scopeName` in `SolrMetricsScope` should maybe be the package name of `org.apache.solr.handler`. If we really want to keep registries, then we can add a label called registry but I don't know if on an attribute based framework we still need the registry tag? Core metrics for example already have the `core` name as a label making it unique. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
mlbiscoc commented on PR #3384: URL: https://github.com/apache/solr/pull/3384#issuecomment-2977412580 > The introduction of use of OTEL's "scope" will be very interesting & foundational to Solr's adoption of OTEL metrics. I think that's the next step after this PR. This PR already has "scope" in the metrics. All metrics have a tag called `otel_scope_name`. When creating an instrument from `SolrMetricsContext`, it has to create the instrument from a `scope` to `SolrMetricManager` which is the registry name that the SolrMetricsContext holds. Unless we want scope to be something else, I am keeping it the same registries or "scope` which is `solr.node`, `solr.core.*`, `solr.jetty`... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2150416427
##
solr/core/src/java/org/apache/solr/blockcache/Metrics.java:
##
@@ -54,8 +55,10 @@ public class Metrics extends SolrCacheBase implements
SolrInfoBean {
private SolrMetricsContext solrMetricsContext;
private long previous = System.nanoTime();
+ // TODO SOLR-17458: Migrate to Otel
Review Comment:
Didn't know about `NOCOMMIT` I updated a few places with it and will use
that instead going forward and keep the rest as TODO for now.
##
solr/core/src/java/org/apache/solr/metrics/SolrMetricsContext.java:
##
@@ -116,21 +134,94 @@ public SolrMetricsContext getChildContext(Object child) {
* Register a metric name that this component reports. This method is called
by various metric
* registration methods in {@link org.apache.solr.metrics.SolrMetricManager}
in order to capture
* what metric names are reported from this component (which in turn is
called from {@link
- *
org.apache.solr.metrics.SolrMetricProducer#initializeMetrics(SolrMetricsContext,
String)}).
+ * SolrMetricProducer#initializeMetrics(SolrMetricsContext,
+ * io.opentelemetry.api.common.Attributes, String)}).
*/
+ // TODO We can continue to register metric names
public void registerMetricName(String name) {
metricNames.add(name);
}
/** Return a snapshot of metric values that this component reports. */
+ // TODO Don't think this is needed anymore
public Map getMetricsSnapshot() {
return MetricUtils.convertMetrics(getMetricRegistry(), metricNames);
}
+ public LongCounter longCounter(String metricName, String description) {
Review Comment:
I'm debating if SolrMetricsContext isn't necessarily needed so I'll either
delete it entirely or rename it to just `SolrMetricsScope` to pin instrument
creation to the scope it wants to create to metricManager. I didn't determine
that entirely in this PR yet.
In this PR, I should have mentioned it as well but the concept of
`registryName` just changes to `otel scope` which is why you see when we pass
`registryName` into metric manager. Also if you look at the metrics, there is a
tag that OTEL created itself called `otel_scope_name` which was the registry or
"scope" that this metric was created in. So we still keep the `solr.node`
`solr.core.*` registries unless we want some much more granular.
```
solr_metrics_core_requests_times_milliseconds_bucket{category="QUERY", ...
otel_scope_name="solr.core.foobar.shard1.replica_n1", ...} 1
```
##
solr/core/src/test/org/apache/solr/handler/admin/MetricsHandlerTest.java:
##
@@ -706,459 +693,6 @@ public void testExprMetrics() throws Exception {
assertTrue(map.toString(), map.containsKey("count"));
}
- @Test
- @SuppressWarnings("unchecked")
- public void testPrometheusMetricsCore() throws Exception {
Review Comment:
Many of these Prometheus Tests are relating to the PrometheusFormatter
features that was removed. A version of this should come back eventually
further into the migration but with metric naming and attributes may change, I
didn't see a reason to currently just fix it if naming is subject to change.
##
solr/core/src/java/org/apache/solr/handler/RequestHandlerBase.java:
##
@@ -193,14 +203,79 @@ public static class HandlerMetrics {
public final Timer requestTimes;
public final Counter totalTime;
-public HandlerMetrics(SolrMetricsContext solrMetricsContext, String...
metricPath) {
+public AttributedLongCounter otelRequests;
+public AttributedLongCounter otelNumErrors;
+public AttributedLongCounter otelNumServerErrors;
+public AttributedLongCounter otelNumClientErrors;
+public AttributedLongCounter otelNumTimeouts;
+public AttributedLongTimer otelRequestTimes;
+
+public HandlerMetrics(
+SolrMetricsContext solrMetricsContext, Attributes attributes,
String... metricPath) {
+
+ // TODO SOLR-17458: To be removed
numErrors = solrMetricsContext.meter("errors", metricPath);
numServerErrors = solrMetricsContext.meter("serverErrors", metricPath);
numClientErrors = solrMetricsContext.meter("clientErrors", metricPath);
numTimeouts = solrMetricsContext.meter("timeouts", metricPath);
requests = solrMetricsContext.counter("requests", metricPath);
requestTimes = solrMetricsContext.timer("requestTimes", metricPath);
totalTime = solrMetricsContext.counter("totalTime", metricPath);
+
+ var baseRequestMetric =
+ solrMetricsContext.longCounter("solr_metrics_core_requests", "HTTP
Solr request counts");
+
+ var baseRequestTimeMetric =
+ solrMetricsContext.longHistogram(
+ "solr_metrics_core_requests_times", "HTTP Solr request times",
"ms");
+
+ otelRequests =
+ new AttributedLongCounter(
+ baseRequestMetric,
+
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
dsmiley commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2147630116
##
solr/core/src/java/org/apache/solr/blockcache/Metrics.java:
##
@@ -54,8 +55,10 @@ public class Metrics extends SolrCacheBase implements
SolrInfoBean {
private SolrMetricsContext solrMetricsContext;
private long previous = System.nanoTime();
+ // TODO SOLR-17458: Migrate to Otel
Review Comment:
The Lucene & Solr projects use the magic word "NOCOMMIT" not TODO. TODO
means someday maybe. NOCOMMIT means, do NOT merge this into a release branch.
"precommit" will fail.
But since you've written this so many times already, we could leave the
existing ones as-is and remember to search for "TODO SOLR-17458"
##
solr/core/src/java/org/apache/solr/handler/RequestHandlerBase.java:
##
@@ -193,14 +203,79 @@ public static class HandlerMetrics {
public final Timer requestTimes;
public final Counter totalTime;
-public HandlerMetrics(SolrMetricsContext solrMetricsContext, String...
metricPath) {
+public AttributedLongCounter otelRequests;
+public AttributedLongCounter otelNumErrors;
+public AttributedLongCounter otelNumServerErrors;
+public AttributedLongCounter otelNumClientErrors;
+public AttributedLongCounter otelNumTimeouts;
+public AttributedLongTimer otelRequestTimes;
+
+public HandlerMetrics(
+SolrMetricsContext solrMetricsContext, Attributes attributes,
String... metricPath) {
+
+ // TODO SOLR-17458: To be removed
numErrors = solrMetricsContext.meter("errors", metricPath);
numServerErrors = solrMetricsContext.meter("serverErrors", metricPath);
numClientErrors = solrMetricsContext.meter("clientErrors", metricPath);
numTimeouts = solrMetricsContext.meter("timeouts", metricPath);
requests = solrMetricsContext.counter("requests", metricPath);
requestTimes = solrMetricsContext.timer("requestTimes", metricPath);
totalTime = solrMetricsContext.counter("totalTime", metricPath);
+
+ var baseRequestMetric =
+ solrMetricsContext.longCounter("solr_metrics_core_requests", "HTTP
Solr request counts");
+
+ var baseRequestTimeMetric =
+ solrMetricsContext.longHistogram(
+ "solr_metrics_core_requests_times", "HTTP Solr request times",
"ms");
+
+ otelRequests =
+ new AttributedLongCounter(
+ baseRequestMetric,
+ Attributes.builder()
+ .putAll(attributes)
+ .put(AttributeKey.stringKey("type"), "requests")
+ .build());
+
+ otelNumErrors =
Review Comment:
I wonder if we're using OTEL the right way here to handle this scenario --
basically there are errors, and some are server and some are client. I see you
simply vary the "type" for each. Maybe... instead we should only have the
server & client, both have "type=errors", but differentiate by another
attribute like "errorSource=client|server". This still allows users to easily
look at metrics undifferentiated if they don't care (i.e. errors of whatever
source). Presumably this would also work?
##
solr/core/src/java/org/apache/solr/metrics/otel/instruments/AttributedLongTimer.java:
##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.metrics.otel.instruments;
+
+import io.opentelemetry.api.common.Attributes;
+import io.opentelemetry.api.metrics.LongHistogram;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * This class records the elapsed time (in milliseconds) to a {@link
LongHistogram}. Each thread
+ * measures its own timer allowing concurrent timing operations on separate
threads..
+ */
+public class AttributedLongTimer extends AttributedLongHistogram {
+
+ private final ThreadLocal startTimeNanos = new ThreadLocal<>();
Review Comment:
The use of ThreadLocal concerns me; seems like an over-use of ThreadLocal.
I could be convinced otherwise, though. Instead of conceptually changing the
state of this timer (scoped to this thread, granted), couldn't we return a
Timer object that the caller closes? Note that Solr has RTimer / RTimerTree
Re: [PR] SOLR-17785: Integrate foundational OTEL meter instruments [solr]
mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2141160310
##
solr/core/src/java/org/apache/solr/handler/RequestHandlerBase.java:
##
@@ -193,14 +203,79 @@ public static class HandlerMetrics {
public final Timer requestTimes;
public final Counter totalTime;
-public HandlerMetrics(SolrMetricsContext solrMetricsContext, String...
metricPath) {
+public AttributedLongCounter otelRequests;
+public AttributedLongCounter otelNumErrors;
+public AttributedLongCounter otelNumServerErrors;
+public AttributedLongCounter otelNumClientErrors;
+public AttributedLongCounter otelNumTimeouts;
+public AttributedLongTimer otelRequestTimes;
+
+public HandlerMetrics(
+SolrMetricsContext solrMetricsContext, Attributes attributes,
String... metricPath) {
+
+ // TODO SOLR-17458: To be removed
numErrors = solrMetricsContext.meter("errors", metricPath);
numServerErrors = solrMetricsContext.meter("serverErrors", metricPath);
numClientErrors = solrMetricsContext.meter("clientErrors", metricPath);
numTimeouts = solrMetricsContext.meter("timeouts", metricPath);
requests = solrMetricsContext.counter("requests", metricPath);
requestTimes = solrMetricsContext.timer("requestTimes", metricPath);
totalTime = solrMetricsContext.counter("totalTime", metricPath);
+
+ var baseRequestMetric =
+ solrMetricsContext.longCounter("solr_metrics_core_requests", "HTTP
Solr request counts");
+
+ var baseRequestTimeMetric =
+ solrMetricsContext.longHistogram(
+ "solr_metrics_core_requests_times", "HTTP Solr request times",
"ms");
+
+ otelRequests =
+ new AttributedLongCounter(
+ baseRequestMetric,
+ Attributes.builder()
+ .putAll(attributes)
+ .put(AttributeKey.stringKey("type"), "requests")
+ .build());
Review Comment:
These are the base metrics. Then you bind them to attributes so we don't
need to keep rebuilding them every time we record an event.
##
solr/core/src/java/org/apache/solr/handler/admin/MetricsHandler.java:
##
@@ -127,11 +126,12 @@ private void handleRequest(SolrParams params,
BiConsumer consume
return;
}
+// TODO SOLR-17458: Make this the default option after dropwizard removal
if (PROMETHEUS_METRICS_WT.equals(params.get(CommonParams.WT))) {
- response = handlePrometheusExport(params);
- consumer.accept("metrics", response);
+ consumer.accept("metrics", metricManager.getMetricReader().collect());
Review Comment:
Instead of going through the whole dropwizard -> prometheus formatter logic.
Very simply the metric reader does the OTEL -> prometheus for us through the
SDK.
##
solr/core/src/test/org/apache/solr/response/TestPrometheusResponseWriter.java:
##
@@ -127,54 +131,6 @@ public void testPrometheusStructureOutput() throws
Exception {
}
}
- public void testPrometheusDummyOutput() throws Exception {
Review Comment:
Pretty much all Prometheus Formmater tests go away with the exception of
`testPrometheusStructureOutput` which tests for prometheus exposition format
from the OTEL sdk.
##
solr/core/src/java/org/apache/solr/handler/RequestHandlerBase.java:
##
@@ -193,14 +203,79 @@ public static class HandlerMetrics {
public final Timer requestTimes;
public final Counter totalTime;
-public HandlerMetrics(SolrMetricsContext solrMetricsContext, String...
metricPath) {
+public AttributedLongCounter otelRequests;
+public AttributedLongCounter otelNumErrors;
+public AttributedLongCounter otelNumServerErrors;
+public AttributedLongCounter otelNumClientErrors;
+public AttributedLongCounter otelNumTimeouts;
+public AttributedLongTimer otelRequestTimes;
Review Comment:
These are essentially equivalent to Dropwizards metrics above. Once we do
deprecations, we can just change the name to the ones above and remove them.
##
solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java:
##
@@ -152,18 +153,28 @@ public void init(PluginInfo info) {
}
}
+ // TODO SOLR-17458: Fix metric Attributes
@Override
- public void initializeMetrics(SolrMetricsContext parentContext, String
scope) {
-super.initializeMetrics(parentContext, scope);
+ public void initializeMetrics(
+ SolrMetricsContext parentContext, Attributes attributes, String scope) {
+super.initializeMetrics(
+parentContext,
+Attributes.builder()
+.putAll(attributes)
+.put(AttributeKey.stringKey("category"), getCategory().toString())
+.put(AttributeKey.stringKey("internal"), "false")
Review Comment:
Instead of having the handler do something like `/select[shard]`, we use a
tag saying `internal` true/false
##
solr/
