mlbiscoc commented on code in PR #3665:
URL: https://github.com/apache/solr/pull/3665#discussion_r2361245873


##########
solr/core/src/java/org/apache/solr/security/AuditLoggerPlugin.java:
##########
@@ -261,39 +260,70 @@ public void setFormatter(AuditEventFormatter formatter) {
     this.formatter = formatter;
   }
 
-  // TODO SOLR-17458: Migrate to Otel
   @Override
   public void initializeMetrics(
       SolrMetricsContext parentContext, Attributes attributes, final String 
scope) {
     solrMetricsContext = parentContext.getChildContext(this);
     String className = this.getClass().getSimpleName();
     log.debug("Initializing metrics for {}", className);
-    numErrors = solrMetricsContext.meter("errors", getCategory().toString(), 
scope, className);
-    numLost = solrMetricsContext.meter("lost", getCategory().toString(), 
scope, className);
-    numLogged = solrMetricsContext.meter("count", getCategory().toString(), 
scope, className);
+
+    Attributes attrsWithCategory =
+        Attributes.builder()
+            .putAll(attributes)
+            .put(CATEGORY_ATTR, Category.SECURITY.toString())
+            .put(PLUGIN_NAME_ATTR, this.getClass().getSimpleName())
+            .build();
+
+    numLogged =
+        new AttributedLongCounter(
+            solrMetricsContext.longCounter(
+                "solr_auditlogger_count",
+                "The number of audit events that were successfully logged."),
+            attrsWithCategory);
+    numErrors =
+        new AttributedLongCounter(
+            solrMetricsContext.longCounter(
+                "solr_auditlogger_errors", "The number of audit events that 
resulted in errors."),
+            attrsWithCategory);
+    numLost =
+        new AttributedLongCounter(
+            solrMetricsContext.longCounter(
+                "solr_auditlogger_lost", "The number of audit events that were 
lost."),

Review Comment:
   Can put a additional info what lost means here? Looks like something is lost 
if the async queue was full so it failed to insert.



##########
solr/core/src/java/org/apache/solr/metrics/SolrMetricManager.java:
##########
@@ -773,7 +794,19 @@ public SdkMeterProvider meterProvider(String providerName) 
{
               var reader = new FilterablePrometheusMetricReader(true, null);
               // NOCOMMIT: We need to add a Periodic Metric Reader here if we 
want to push with OTLP
               // with an exporter
-              var provider = 
SdkMeterProvider.builder().registerMetricReader(reader);
+              var provider =
+                  SdkMeterProvider.builder()
+                      .registerMetricReader(reader)
+                      .registerView(
+                          InstrumentSelector.builder()

Review Comment:
   Thanks finding this. Most of our timers were milliseconds so OTELs default 
buckets were fine but nanoseconds doesn't apply here. I think defining our own 
set of large buckets here is good for now. Probably down the road we should 
have this configurable for the user to pick the bucket sizes instead. I would 
add that as a TODO item somewhere here.



##########
solr/core/src/java/org/apache/solr/security/AuditLoggerPlugin.java:
##########
@@ -261,39 +260,70 @@ public void setFormatter(AuditEventFormatter formatter) {
     this.formatter = formatter;
   }
 
-  // TODO SOLR-17458: Migrate to Otel
   @Override
   public void initializeMetrics(
       SolrMetricsContext parentContext, Attributes attributes, final String 
scope) {
     solrMetricsContext = parentContext.getChildContext(this);
     String className = this.getClass().getSimpleName();
     log.debug("Initializing metrics for {}", className);
-    numErrors = solrMetricsContext.meter("errors", getCategory().toString(), 
scope, className);
-    numLost = solrMetricsContext.meter("lost", getCategory().toString(), 
scope, className);
-    numLogged = solrMetricsContext.meter("count", getCategory().toString(), 
scope, className);
+
+    Attributes attrsWithCategory =
+        Attributes.builder()
+            .putAll(attributes)
+            .put(CATEGORY_ATTR, Category.SECURITY.toString())
+            .put(PLUGIN_NAME_ATTR, this.getClass().getSimpleName())
+            .build();
+
+    numLogged =
+        new AttributedLongCounter(
+            solrMetricsContext.longCounter(
+                "solr_auditlogger_count",
+                "The number of audit events that were successfully logged."),
+            attrsWithCategory);
+    numErrors =
+        new AttributedLongCounter(
+            solrMetricsContext.longCounter(
+                "solr_auditlogger_errors", "The number of audit events that 
resulted in errors."),
+            attrsWithCategory);
+    numLost =
+        new AttributedLongCounter(
+            solrMetricsContext.longCounter(
+                "solr_auditlogger_lost", "The number of audit events that were 
lost."),
+            attrsWithCategory);
     requestTimes =
-        solrMetricsContext.timer("requestTimes", getCategory().toString(), 
scope, className);
-    totalTime = solrMetricsContext.counter("totalTime", 
getCategory().toString(), scope, className);
+        new AttributedLongTimer(
+            this.solrMetricsContext.longHistogram(
+                "solr_auditlogger_request_times",
+                "Distribution of audit event request durations",
+                OtelUnit.NANOSECONDS),
+            attrsWithCategory);
+
     if (async) {
-      solrMetricsContext.gauge(
-          () -> blockingQueueSize,
-          true,
-          "queueCapacity",
-          getCategory().toString(),
-          scope,
-          className);
-      solrMetricsContext.gauge(
-          () -> blockingQueueSize - queue.remainingCapacity(),
-          true,
-          "queueSize",
-          getCategory().toString(),
-          scope,
-          className);
+      solrMetricsContext.observableLongGauge(
+          "solr_auditlogger_queue",
+          "Metrics around the audit logger queue when running in async mode",
+          (observableLongMeasurement -> {
+            observableLongMeasurement.record(
+                blockingQueueSize,
+                attrsWithCategory.toBuilder().put(TYPE_ATTR, 
"capacity").build());
+            observableLongMeasurement.record(
+                blockingQueueSize - queue.remainingCapacity(),
+                attrsWithCategory.toBuilder().put(TYPE_ATTR, "size").build());
+          }));
       queuedTime =
-          solrMetricsContext.timer("queuedTime", getCategory().toString(), 
scope, className);
+          new AttributedLongTimer(
+              solrMetricsContext.longHistogram(
+                  "solr_auditlogger_queued_time",
+                  "Distribution of time events spend queued before processing",
+                  OtelUnit.NANOSECONDS),
+              attrsWithCategory);
     }
-    solrMetricsContext.gauge(
-        () -> async, true, "async", getCategory().toString(), scope, 
className);
+
+    solrMetricsContext.observableLongGauge(
+        "solr_auditlogger_async_enabled",
+        "Whether the audit logger is running in async mode (1) or not (0)",
+        (observableLongMeasurement ->
+            observableLongMeasurement.record(async ? 1 : 0, 
attrsWithCategory)));

Review Comment:
   I'm assuming this doesn't really change? Lets make this a synchronous gauge 
instead (regular gauge and not observable) so a callback doesn't need to be 
called every time just to get a 1 or 0 as this doesn't change much.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to