[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866878#comment-17866878
 ] 

ASF subversion and git services commented on SOLR-10654:


Commit 728f8b57deb6c828725e187d8d354224a9d14ab7 in solr's branch 
refs/heads/branch_9x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=728f8b57deb ]

SOLR-10654: PrometheusResponseWriter test tweaks

 - Fix MetricsHandlerTest

 - AwaitsFix TestPrometheusResponseWriter pending SOLR-17368

(cherry picked from commit 6967c7b695c39b045e1ac49eb346ab49ceab2a13)


> Expose Metrics in Prometheus format DIRECTLY from Solr
> --
>
> Key: SOLR-10654
> URL: https://issues.apache.org/jira/browse/SOLR-10654
> Project: Solr
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Keith Laban
>Priority: Major
> Attachments: prometheus_metrics.txt
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Expose metrics via a `wt=prometheus` response type.
> Example scrape_config in prometheus.yml:
> {code:java}
> scrape_configs:
>   - job_name: 'solr'
>     metrics_path: '/solr/admin/metrics'
>     params:
>       wt: ["prometheus"]
>     static_configs:
>       - targets: ['localhost:8983']
> {code}
> [Rationale|https://issues.apache.org/jira/browse/SOLR-11795?focusedCommentId=17261423=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17261423]
>  for having this despite the "Prometheus Exporter".  They have different 
> strengths and weaknesses.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-17 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866877#comment-17866877
 ] 

ASF subversion and git services commented on SOLR-10654:


Commit a8c4a90e9fdb72fab6b1bfbc5bb76b8c72f020d0 in solr's branch 
refs/heads/branch_9x from Matthew Biscocho
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=a8c4a90e9fd ]

SOLR-10654: Metrics: New wt=prometheus option (#2405)

An alternative to the "Prometheus Exporter" in which each Solr node can return 
the Prometheus format natively from the MetricsHandler using `wt=prometheus` 
param.  It's much faster and architecturally simpler, albeit less flexible.

Co-authored-by: mbiscocho 
Co-authored-by: Christine Poerschke 
Co-authored-by: David Smiley 

(cherry picked from commit fd7d44771e0c35edb5373d22a499929b3e828469)





[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-15 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866143#comment-17866143
 ] 

Chris M. Hostetter commented on SOLR-10654:
---

{quote}Is this really specifically about Gauges (vs say Timers)?  Anyway, my 
understanding is that we can call SolrMetricManager.removeRegistry on the 
various registries.  Perhaps tests generally or maybe just this test isn't 
doing that.  As we have test infrastructure that detects leaks of things, maybe 
we need to detect a metric registry leak.
{quote}
I am not an expert on the dropwizard stuff, and it's been a while since I 
looked at it, but yes: Gauges are particularly tricky to deal with from a 
lifecycle standpoint, which is why Solr has its own GaugeWrapper.  But that 
doesn't help us when other libraries register gauges.

SOLR-16918 (which I linked to previously) is an example of problematic gauge 
lifecycle situations that (I believe) could leak between tests.  Any gauges 
created directly by jetty are also likely to leak between tests – which is, I 
believe, exactly what is happening here.




[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866008#comment-17866008
 ] 

ASF subversion and git services commented on SOLR-10654:


Commit 6967c7b695c39b045e1ac49eb346ab49ceab2a13 in solr's branch 
refs/heads/pr/2550 from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=6967c7b695c ]

SOLR-10654: PrometheusResponseWriter test tweaks

 - Fix MetricsHandlerTest

 - AwaitsFix TestPrometheusResponseWriter pending SOLR-17368





[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17866001#comment-17866001
 ] 

ASF subversion and git services commented on SOLR-10654:


Commit fd7d44771e0c35edb5373d22a499929b3e828469 in solr's branch 
refs/heads/pr/2550 from Matthew Biscocho
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=fd7d44771e0 ]

SOLR-10654: Metrics: New wt=prometheus option (#2405)

An alternative to the "Prometheus Exporter" in which each Solr node can return 
the Prometheus format natively from the MetricsHandler using `wt=prometheus` 
param.  It's much faster and architecturally simpler, albeit less flexible.

Co-authored-by: mbiscocho 
Co-authored-by: Christine Poerschke 
Co-authored-by: David Smiley 




[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-12 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865642#comment-17865642
 ] 

David Smiley commented on SOLR-10654:
-

Thanks for jumping in, Hoss.

I agree it'd be a good idea to test that the values match the format (e.g. is 
an integer vs string vs float).  If the content needs to be scrubbed, we should 
*replace* values with a nominal value (e.g. 42 becomes 0) instead of with 
nothing.  We agree that the regex in there is too complex! I complained in the 
PR but should have insisted on further improvements.  In the JVM category, we 
should also blank out certain label values that are JVM specific, like 
{{space="CodeCache"}}.
{quote}then WTF is the point of having the test
{quote}
New/removed metrics *might* require code updates.  This is one of the bigger 
concerns I have with the underlying approach; it's a rather hard-coded 
translation because we decided not to switch out DropWizard for a Prometheus 
native metrics backend.  Whoever updates the TXT file should do it as a 
thinking person who considers whether the change makes sense: that it has a 
name, label, label-value, and metric format that make sense, and that it 
matches the conventions of the other metrics, for example.  A metric could 
inexplicably disappear; that should be investigated if not intended.
{quote}Unless/until the lifecycle of Gauges is drastically overhauled in a 
future version of dropwizard
{quote}
Is this really specifically about Gauges (vs say Timers)?  Anyway, my 
understanding is that we can call SolrMetricManager.removeRegistry on the 
various registries.  Perhaps tests generally or maybe just this test isn't 
doing that.  As we have test infrastructure that detects leaks of things, maybe 
we need to detect a metric registry leak.
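
The scrubbing idea in this comment (swap each value for a nominal 0 and blank 
JVM-specific label values) could be sketched roughly as follows.  This is only 
an illustration: the class name, method name, and both regexes are assumptions 
and not code from the PR, and it assumes one sample per line with a plain 
numeric value and no timestamp.

```java
import java.util.regex.Pattern;

/** Hypothetical test helper, not code from the PR. */
final class PrometheusScrubber {

  // A non-comment line that ends in " <number>"; group 1 is everything before the value.
  private static final Pattern VALUE =
      Pattern.compile("(?m)^([^#\\n].*) [-+]?[0-9][0-9.eE+-]*$");

  // The JVM-specific "space" label; the lookbehind avoids matching e.g. namespace="...".
  private static final Pattern SPACE_LABEL =
      Pattern.compile("(?<![A-Za-z0-9_])space=\"[^\"]*\"");

  /** Replaces every sample value with 0 and blanks the value of the "space" label. */
  static String scrub(String body) {
    String nominalValues = VALUE.matcher(body).replaceAll("$1 0");
    return SPACE_LABEL.matcher(nominalValues).replaceAll("space=\"\"");
  }

  public static void main(String[] args) {
    String line =
        "solr_metrics_jvm_memory_pools_bytes{item=\"committed\",space=\"CodeCache\"} 42";
    System.out.println(scrub(line));
  }
}
```

Keeping a nominal value in place (rather than deleting it) means the scrubbed 
text still has the shape of a sample line, so a later structural check can 
still tell a sample apart from a line that lost its value.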




[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-12 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865589#comment-17865589
 ] 

Chris M. Hostetter commented on SOLR-10654:
---

Since we've established that both problems seem to be test specific, "rolling 
back" doesn't seem necessary.

I went ahead and did the quick fix to MetricsHandlerTest and opened SOLR-17368 
to track improving TestPrometheusResponseWriter so that we can re-resolve this 
jira.




[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865587#comment-17865587
 ] 

ASF subversion and git services commented on SOLR-10654:


Commit 6967c7b695c39b045e1ac49eb346ab49ceab2a13 in solr's branch 
refs/heads/main from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=6967c7b695c ]

SOLR-10654: PrometheusResponseWriter test tweaks

 - Fix MetricsHandlerTest

 - AwaitsFix TestPrometheusResponseWriter pending SOLR-17368





[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-12 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865546#comment-17865546
 ] 

Chris M. Hostetter commented on SOLR-10654:
---

Ok, but your stated goal...
{quote}The test using solr-prometheus-output.txt, which I asked for, is 
intended to ensure we don't accidentally remove an exported metric due to some 
internal renaming choice or a change in the type of metric.  That's it.  It 
also serves as an integration test. 
{quote}
Is 100% compatible with my suggestion...
{quote}The test needs to parse the "actual" response and make some basic 
assertions about the _structure_ it finds, and if there are certain metrics 
that should *always* be present, assert they exist with values that are _valid_ 
 (ie: can be parsed as an int, or can be parsed as a float) w/o asserting 
exactly what those values are.
{quote}
...if you want to keep that list of "mandatory" expected metrics that should 
*always* be present in a txt file because you think that's easier to maintain 
than a static {{Arrays.asList(...)}}, so be it.  But the way it works now, 
asserting that the entire actual response, with a confusing regex applied, 
exactly equals the contents of a text file, is seriously problematic...
 # It's Brittle AF, and is going to be tedious to update anytime we add metrics 
– or anytime a library we use (like jetty or dropwizard's executor service) 
adds metrics.  And if the dev process is "update this expected txt file to match 
the actual response anytime the test fails" ... then WTF is the point of having 
the test?
 # As seen in the 100% jenkins failure rate: Unless/until the lifecycle of 
Gauges is drastically overhauled in a future version of dropwizard, you can 
never predict the full list of (jetty, maybe other) metrics that will be in 
the actual response, since it depends on what other tests ran before this one 
in the same JVM.

#2 is really my biggest concern right now – this isn't a question of what to 
do *_IF_* the test fails in the future – this test has literally *_never 
passed!_*




[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-11 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865296#comment-17865296
 ] 

David Smiley commented on SOLR-10654:
-

The test using solr-prometheus-output.txt, which I asked for, is intended to 
ensure we don't accidentally remove an exported metric due to some internal 
renaming choice or a change in the type of metric.  That's it.  It also serves 
as an integration test.  The test may fail when metrics are added, yes, but we 
need only re-save the file.  Ideally this test would have docs to state such 
things, and maybe cater to the situation of needing to re-save the file (such 
as by dumping the output to a file that can then be saved over).




[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-11 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865249#comment-17865249
 ] 

Chris M. Hostetter commented on SOLR-10654:
---

Unrelated to the test issues – I'm no prometheus expert, but this can't 
possibly be correct...
{noformat}
$ curl -sS -I 'http://localhost:8983/solr/admin/metrics?wt=prometheus' | grep ^Content-Type
Content-Type: application/json;charset=utf-8
{noformat}
 

I suspect PrometheusResponseWriter needs to override the {{getContentType}} 
method?
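
For reference, the Prometheus text exposition format is conventionally served 
as {{text/plain; version=0.0.4}} rather than JSON.  A minimal sketch of a check 
a test could make on the header value follows; the class and method names are 
hypothetical and not part of Solr, and requiring the {{version}} parameter is 
an assumption about how strict such a check should be.

```java
import java.util.Locale;

/** Hypothetical helper, not part of Solr: inspects a Content-Type header value. */
final class PrometheusContentTypeCheck {

  /** Returns true if the media type is text/plain and a version=0.0.4 parameter is present. */
  static boolean looksLikePrometheusText(String contentType) {
    if (contentType == null) {
      return false;
    }
    String[] parts = contentType.toLowerCase(Locale.ROOT).split(";");
    if (!parts[0].trim().equals("text/plain")) {
      return false;
    }
    // Parameters such as charset may come in any order, so scan all of them.
    for (int i = 1; i < parts.length; i++) {
      if (parts[i].trim().equals("version=0.0.4")) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // The header value seen in the curl output in this comment.
    System.out.println(looksLikePrometheusText("application/json;charset=utf-8"));
    // The conventional header value for the text exposition format.
    System.out.println(looksLikePrometheusText("text/plain; version=0.0.4; charset=utf-8"));
  }
}
```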




[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-11 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865247#comment-17865247
 ] 

Chris M. Hostetter commented on SOLR-10654:
---

{quote}In the PR, I have a regex to try and remove some of the JVM 
implementations but seems as it's not enough.
{quote}
For {{TestPrometheusResponseWriter}}...

You need to just plain kill the "equals" assertion against 
{{solr-prometheus-output.txt}} ... no matter how many regex tricks you try, 
that's way too brittle to survive future additions/tweaks to the metrics solr 
supports.  The test needs to parse the "actual" response and make some basic 
assertions about the _structure_ it finds, and if there are certain metrics 
that should *always* be present, assert they exist with values that are _valid_ 
 (ie: can be parsed as an int, or can be parsed as a float) w/o asserting 
exactly what those values are.

But under no circumstances should you assert that you know the full list of 
every metric that _might_ someday exist in solr/jetty.

(this would also be a good place to register dummy metrics with fake names that 
you can easily assert have _specific_ expected values)
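
A rough sketch of the kind of structural check described above, using only the 
JDK.  The class and method names are hypothetical, the sample values in 
{{main}} are made up for illustration, and it assumes one sample per line with 
no timestamp.  Note that {{Double.parseDouble}} does not accept the 
{{+Inf}}/{{-Inf}} spellings, so those would need extra handling.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical test helper, not Solr code: structural checks on Prometheus text output. */
final class PrometheusStructureCheck {

  /** Returns the metric names found; throws if a sample line has no parseable numeric value. */
  static Set<String> metricNames(String body) {
    Set<String> names = new LinkedHashSet<>();
    for (String raw : body.split("\n")) {
      String line = raw.trim();
      if (line.isEmpty() || line.startsWith("#")) {
        continue; // skip blank lines and "# TYPE" / "# HELP" comments
      }
      // The value follows the last space; label values may themselves contain spaces.
      int lastSpace = line.lastIndexOf(' ');
      if (lastSpace < 0) {
        throw new IllegalArgumentException("No value on line: " + line);
      }
      // Throws NumberFormatException (an IllegalArgumentException) if not numeric.
      Double.parseDouble(line.substring(lastSpace + 1));
      String series = line.substring(0, lastSpace).trim();
      int brace = series.indexOf('{');
      names.add(brace < 0 ? series : series.substring(0, brace));
    }
    return names;
  }

  public static void main(String[] args) {
    // Illustrative body; the numeric values are invented.
    String body =
        "# TYPE solr_metrics_jvm_heap gauge\n"
            + "solr_metrics_jvm_heap{item=\"committed\",memory=\"heap\"} 5.36870912E8\n"
            + "solr_metrics_jvm_buffers{item=\"Count\",pool=\"mapped - 'non-volatile memory'\"} 0.0\n";
    Set<String> names = metricNames(body);
    // Only assert on metrics that must always be present, never on the full list.
    for (String required : List.of("solr_metrics_jvm_heap", "solr_metrics_jvm_buffers")) {
      if (!names.contains(required)) {
        throw new AssertionError("missing metric: " + required);
      }
    }
    System.out.println(names);
  }
}
```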

 

For {{MetricsHandlerTest.testPrometheusMetricsJvm}}...

{code}
s/assertEquals(0, /assertNotNull(/
{code}




[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-11 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865246#comment-17865246
 ] 

Chris M. Hostetter commented on SOLR-10654:
---

This is the assert that (sometimes) fails in 
{{MetricsHandlerTest.testPrometheusMetricsJvm}}...
{code:java}
actualSnapshot = getMetricSnapshot(actualSnapshots, "solr_metrics_jvm_memory_pools_bytes");
actualGaugeDataPoint =
    getGaugeDatapointSnapshot(
        actualSnapshot, Labels.of("item", "committed", "space", "CodeHeap-'non-nmethods'"));
assertEquals(0, actualGaugeDataPoint.getValue(), 0);
{code}
...I couldn't fathom how that assert could ever be successful, on any JVM, and 
went down a rabbit hole of trying to figure out what it is about committed 
CodeHeap sizes that might prevent them from being used in test JVMs ... until I 
realized that you are (attempting) to stub out those gauges in your test setup 
with dummies that always return 0!

*EXCEPT:*
 * As mentioned, the lifecycle of gauges is problematic – if any other test (or 
code path) in this JVM creates those same gauges (and doesn't explicitly 
unregister them) your stubs will be ignored.
 * This is the *only* one of those dummy gauges that you create where you also 
{{assertEquals(0, ...)}}...
 ** For all the other dummy gauges this test method looks at, the only 
assertion made is {{assertNotNull(...)}}
 ** Almost as if [you already ran into this problem with the gauges, but 
overlooked fixing this assert when you fixed all the 
others|https://github.com/mlbiscoc/solr/commit/835c946ecac04900f840ccd25cf278a6a6a2aef4]
 and added this comment...

{noformat}
// Some JVM metrics are non-deterministic due to testing environment such as
// availableProcessors. We confirm snapshot exists and is nonNull instead.
{noformat}
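
The "stubs will be ignored" failure mode can be illustrated with a toy model.  
This is deliberately NOT Dropwizard's {{MetricRegistry}} or Solr's 
{{SolrMetricManager}}; it is a hypothetical registry that keeps the first gauge 
registered under a name, which is the behaviour the comment above describes.  
The metric name and the non-zero value are arbitrary.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Toy model only; neither Dropwizard's MetricRegistry nor SolrMetricManager. */
final class ToyGaugeRegistry {
  private final Map<String, Supplier<Long>> gauges = new ConcurrentHashMap<>();

  /** Registers the gauge unless the name is taken; returns whichever gauge is active. */
  Supplier<Long> registerIfAbsent(String name, Supplier<Long> gauge) {
    Supplier<Long> existing = gauges.putIfAbsent(name, gauge);
    return existing != null ? existing : gauge;
  }

  /** Explicit cleanup, the step a leaking test forgets. */
  void remove(String name) {
    gauges.remove(name);
  }

  long read(String name) {
    return gauges.get(name).get();
  }

  public static void main(String[] args) {
    ToyGaugeRegistry jvmWide = new ToyGaugeRegistry();
    // An earlier "test" registers a real gauge and never removes it.
    jvmWide.registerIfAbsent("memory.pools.committed", () -> 2_555_904L);
    // A later "test" tries to stub the same name with a dummy that returns 0.
    jvmWide.registerIfAbsent("memory.pools.committed", () -> 0L);
    // The stub was ignored, so an assertEquals(0, ...) on this value would fail.
    System.out.println(jvmWide.read("memory.pools.committed"));
  }
}
```

Under this model the outcome depends on which tests ran earlier in the same 
JVM, which is why asserting non-null is more robust than asserting the 
stubbed value.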




[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-11 Thread Matthew Biscocho (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865239#comment-17865239
 ] 

Matthew Biscocho commented on SOLR-10654:
-

Thanks Chris, I agree. My intention wasn't just to fix the test failures but to 
reproduce a failure on my end so I could remove as many environmental 
variables as possible.  In the PR, I have a regex to try and remove some of the 
JVM implementations but seems as it's not enough. The gauge issue is 
interesting, and I plan on taking a deeper look to fix these up.




[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-11 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17865236#comment-17865236
 ] 

Chris M. Hostetter commented on SOLR-10654:
---

{quote}These tests must be failing depending on environment or type of JVM being 
used as I am unable to reproduce this on my end. Do you have more information 
on how the Jenkins builds are being run?
{quote}
I don't, but even if I/you did, "fixing" the test to pass on jenkins shouldn't 
be the goal – you'd still be leaving a time bomb for some future dev whose JVM 
doesn't exactly match yours.

 

The bottom line is that every test we have should be able to pass on *ANY* 
valid JVM – tests that are known to be problematic on _specific_ JVM 
implementations (like IBM's OpenJ9) should have specific {{assume()}} calls to 
prevent them from running.
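
As a sketch of what such a guard might key off: the helper below inspects the 
standard {{java.vm.name}} system property.  The class and method names are 
hypothetical, and the assumption that OpenJ9-based JVMs report a VM name 
containing "J9" should be verified against the JVMs actually in use.

```java
import java.util.Locale;

/** Hypothetical helper for guarding JVM-specific tests; not part of Solr. */
final class JvmNameCheck {

  /** Assumption: OpenJ9-based JVMs report a java.vm.name that contains "J9". */
  static boolean looksLikeOpenJ9(String vmName) {
    return vmName != null && vmName.toUpperCase(Locale.ROOT).contains("J9");
  }

  public static void main(String[] args) {
    String vmName = System.getProperty("java.vm.name");
    System.out.println(vmName + " -> looksLikeOpenJ9=" + looksLikeOpenJ9(vmName));
  }
}
```

In a JUnit 4 test the result could be passed to 
{{org.junit.Assume.assumeFalse(...)}} at the top of the test method, so the 
test is skipped rather than failed on that JVM.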

If test assertions are based on what metrics are available, they should _only_ 
look at metrics that are guaranteed to be available on every JVM, and not make 
assumptions about what "extra" metrics may or may not exist on _some_ JVMs (and 
may or may not have values that differ from what you see on *your* JVM)
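
One way to follow this advice, sketched under assumptions: rather than comparing the whole Prometheus response against a hardcoded string, a test could pull the metric family names out of the {{# TYPE}} lines and assert only that a small required set is present, ignoring extra families and all values. The class and method names below are hypothetical (not part of Solr's test framework), the sample values are made up, and the "required" set is illustrative rather than a list Solr guarantees on every JVM.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper; not part of Solr's test framework. */
public class PrometheusOutputCheck {

  /** Collects metric family names from "# TYPE <name> <type>" lines. */
  public static Set<String> metricFamilies(String prometheusText) {
    Set<String> names = new LinkedHashSet<>();
    for (String line : prometheusText.split("\n")) {
      String trimmed = line.trim();
      if (trimmed.startsWith("# TYPE ")) {
        String[] parts = trimmed.split("\\s+");
        if (parts.length >= 4) {
          names.add(parts[2]);
        }
      }
    }
    return names;
  }

  /** Returns the required families that are absent; extra families are ignored. */
  public static Set<String> missingFamilies(String prometheusText, Set<String> required) {
    Set<String> missing = new LinkedHashSet<>(required);
    missing.removeAll(metricFamilies(prometheusText));
    return missing;
  }

  public static void main(String[] args) {
    // Made-up sample: the jetty family is "extra" and must not break the check.
    String sample = String.join("\n",
        "# TYPE solr_metrics_jetty_dispatches_total counter",
        "solr_metrics_jetty_dispatches_total 0.0",
        "# TYPE solr_metrics_jvm_buffers gauge",
        "solr_metrics_jvm_buffers{item=\"Count\",pool=\"direct\"} 5.0",
        "# TYPE solr_metrics_jvm_heap gauge",
        "solr_metrics_jvm_heap{item=\"used\",memory=\"heap\"} 3604480.0");
    Set<String> required = Set.of("solr_metrics_jvm_buffers", "solr_metrics_jvm_heap");
    System.out.println("missing: " + missingFamilies(sample, required));
  }
}
```

A test built this way would assert that {{missingFamilies(...)}} is empty, so jetty metrics created by an earlier test in the same JVM, or a non-zero buffer value on a long-running JVM, would not cause a failure.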

 

 

In the case of {{TestPrometheusResponseWriter}} – it's pretty easy to reproduce 
the problem on _any_ JVM by running multiple "jetty" based tests in the same 
test JVM: the jetty dispatches/requests metrics that aren't accounted for in 
your hardcoded "expected" value seem to get created on the fly once there are 
some requests, and they seem to stick around beyond the lifecycle of the tests 
(I suspect they are gauges, which are known to have problematic lifecycles, 
but I haven't checked) ...

 
{noformat}
$ ./gradlew test -Ptests.jvms=1 -Ptests.verbose=true --tests TestPrometheusResponseWriter --tests \*Jetty\*
...
org.apache.solr.response.TestPrometheusResponseWriter > testPrometheusOutput FAILED
    org.junit.ComparisonFailure: expected:<... TYPE solr_metrics_j[vm_buffers gauge
    solr_metrics_jvm_buffers{item="Count",pool="direct"}
    solr_metrics_jvm_buffers{item="Count",pool="mapped"}
    solr_metrics_jvm_buffers{item="Count",pool="mapped - 'non-volatile memory'"}
    # TYPE solr_metrics_jvm_buffers_bytes gauge
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="direct"}
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped"}
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped - 'non-volatile memory'"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="direct"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped - 'non-volatile memory'"}
    # TYPE solr_metrics_jvm_gc gauge
    solr_metrics_jvm_gc{}
    solr_metrics_jvm_gc{}
    # TYPE solr_metrics_jvm_gc_seconds gauge
    solr_metrics_jvm_gc_seconds{}
    solr_metrics_jvm_gc_seconds{}
    # TYPE solr_metrics_jvm_heap gauge
    solr_metrics_jvm_heap{item="committed",memory="heap"}
    solr_metrics_jvm_heap{item="committed",memory="non-heap"}
    solr_metrics_jvm_heap{item="committed",memory="total"}
    solr_metrics_jvm_heap{item="init",memory="heap"}
    solr_metrics_jvm_heap{item="init",memory="non-heap"}
    solr_metrics_jvm_heap{item="init",memory="total"}
    solr_metrics_jvm_heap{item="max",memory="heap"}
    solr_metrics_jvm_heap{item="max",memory="non-heap"}
    solr_metrics_jvm_heap{item="max",memory="total"}
    solr_metrics_jvm_heap{item="usage",memory="heap"}
    solr_metrics_jvm_heap{item="usage",memory="non-heap"}
    solr_metrics_jvm_heap{item="used",memory="heap"}
    solr_metrics_jvm_heap{item="used",memory="non-heap"}
    solr_metrics_jvm_heap{item="used",memory="total"}
    # TYPE solr_metrics_jvm_memory_pools_bytes gauge
    solr_metrics_jvm_memory_pools_bytes{item="committed",space="CodeCache]"}
    solr_metrics_jvm_...> but was:<... TYPE solr_metrics_j[etty_dispatches_total counter
    solr_metrics_jetty_dispatches_total 0.0
    # TYPE solr_metrics_jetty_requests_total counter
    solr_metrics_jetty_requests_total{method="active"}
    # TYPE solr_metrics_jetty_response_total counter
    solr_metrics_jetty_response_total{status="2xx"}
    # TYPE solr_metrics_jvm_buffers gauge
    solr_metrics_jvm_buffers{item="Count",pool="direct"}
    solr_metrics_jvm_buffers{item="Count",pool="mapped"}
    solr_metrics_jvm_buffers{item="Count",pool="mapped - 'non-volatile memory'"}
    # TYPE solr_metrics_jvm_buffers_bytes gauge
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="direct"}
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped"}
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped - 'non-volatile memory'"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="direct"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped - 'non-volatile memory'"}
    # TYPE 

[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-10 Thread Matthew Biscocho (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17864665#comment-17864665
 ] 

Matthew Biscocho commented on SOLR-10654:
-

These tests must be failing depending on the environment or type of JVM being 
used, as I am unable to reproduce this on my end. Do you have more information 
on how the Jenkins builds are being run?







[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-09 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17864416#comment-17864416
 ] 

Chris M. Hostetter commented on SOLR-10654:
---

{{TestPrometheusResponseWriter.testPrometheusOutput}} has failed in 100% of the 
Jenkins builds that have run it since being added yesterday ... the problem 
appears to be some hardcoded assumptions about metric ordering?
{noformat}
org.apache.solr.response.TestPrometheusResponseWriter > testPrometheusOutput FAILED
    org.junit.ComparisonFailure: expected:<... TYPE solr_metrics_j[]vm_buffers gauge
    sol...> but was:<... TYPE solr_metrics_j[etty_dispatches_total counter
    solr_metrics_jetty_dispatches_total 0.0
    # TYPE solr_metrics_jetty_requests_total counter
    solr_metrics_jetty_requests_total{method="active"}
    # TYPE solr_metrics_jetty_response_total counter
    solr_metrics_jetty_response_total{status="2xx"}
    # TYPE solr_metrics_j]vm_buffers gauge
    sol...>
        at __randomizedtesting.SeedInfo.seed([76D2036CF55247FD:F7C26E458CF7A80B]:0)
        at org.junit.Assert.assertEquals(Assert.java:117)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at org.apache.solr.response.TestPrometheusResponseWriter.testPrometheusOutput(TestPrometheusResponseWriter.java:86)
{noformat}
 

Likewise {{MetricsHandlerTest.testPrometheusMetricsJvm}} has failed 50% of the 
Jenkins builds that have run it since being added yesterday ... notably always 
on Uwe's Jenkins machine, suggesting some hardcoded assumptions about metrics 
that may not be true in long running JVMs...

{noformat}
org.apache.solr.handler.admin.MetricsHandlerTest > testPrometheusMetricsJvm FAILED
    java.lang.AssertionError: expected:<0.0> but was:<3604480.0>
        at __randomizedtesting.SeedInfo.seed([5F3548D00F329FF4:577F9BC55C62D284]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:555)
        at org.junit.Assert.assertEquals(Assert.java:685)
        at org.apache.solr.handler.admin.MetricsHandlerTest.testPrometheusMetricsJvm(MetricsHandlerTest.java:925)
{noformat}







[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-07-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17863994#comment-17863994
 ] 

ASF subversion and git services commented on SOLR-10654:


Commit fd7d44771e0c35edb5373d22a499929b3e828469 in solr's branch 
refs/heads/main from Matthew Biscocho
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=fd7d44771e0 ]

SOLR-10654: Metrics: New wt=prometheus option (#2405)

An alternative to the "Prometheus Exporter" in which each Solr node can return 
the Prometheus format natively from the MetricsHandler using `wt=prometheus` 
param.  It's much faster and architecturally simpler, albeit less flexible.

Co-authored-by: mbiscocho 
Co-authored-by: Christine Poerschke 
Co-authored-by: David Smiley 







[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-05-21 Thread Matthew Biscocho (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848396#comment-17848396
 ] 

Matthew Biscocho commented on SOLR-10654:
-

The post-processing overhead due to JQ is not the main problem, but it certainly 
is one. I would say running the Prometheus Exporter can be costly, especially 
at scale and when running multiple instances. It offers flexibility through 
configuration, but I don't think that solves everyone's use case, as it is not 
free to just run the Prometheus Exporter. I think the overhead of aggregating 
metrics should happen at the Grafana or Prometheus level, while the exposed 
metrics themselves should just be raw values. With this PR, Prometheus can just 
scrape Solr and the aggregation can be done in Grafana directly, skipping the 
extra HTTP hops through the Prometheus Exporter and its JQ processing.

I took a bit of time to measure the performance of my PR against the 
Prometheus Exporter. I created a cloud with 2 nodes and 50 collections to get a 
bunch of metrics. For the cloud, I curl'd each node individually and captured 
the response time of each node. I'm not sure whether Prometheus scrapes 
sequentially or in parallel, but each node takes about 0.6s locally.

I modified the Prometheus Exporter config to only scrape the same metrics my PR 
currently exports (Core registry) and added a few lines of code to capture the 
time taken for scraping and JQ processing. It was taking around 4-5 seconds per 
collection interval, which is significantly longer.

`My PR:`

`curl -o /dev/null -s -w 'Total: %{time_total}s\n' 'localhost:8983/solr/admin/metrics?wt=prometheus'`
`Total: 0.614125s`
`curl -o /dev/null -s -w 'Total: %{time_total}s\n' 'localhost:7574/solr/admin/metrics?wt=prometheus'`
`Total: 0.597078s`

 

`Prometheus Exporter:`

INFO  - 2024-05-21 18:10:28.930; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO  - 2024-05-21 18:11:28.923; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
PT4.355627S

I want to say this is due to the HTTP calls and JQ processing the Prometheus 
Exporter needs to do, while my PR does a straight internal conversion. 
Although it does the conversion on every call, it doesn't seem to be as costly 
as the Prometheus Exporter.
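
For anyone who wants to repeat the curl measurement several times in a row, here is a minimal Java sketch. It assumes a Solr node listening on localhost:8983 with the `wt=prometheus` writer from this PR available; the class and method names are hypothetical and nothing here is part of Solr.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical timing harness; not part of Solr. */
public class ScrapeTimer {

  /** Runs the task once and returns the elapsed wall-clock time. */
  public static Duration time(Callable<?> task) throws Exception {
    long start = System.nanoTime();
    task.call();
    return Duration.ofNanos(System.nanoTime() - start);
  }

  public static void main(String[] args) throws Exception {
    // Assumed endpoint, taken from the curl commands above; override via args[0].
    String url = args.length > 0
        ? args[0]
        : "http://localhost:8983/solr/admin/metrics?wt=prometheus";
    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
    for (int i = 0; i < 5; i++) {
      // The body is discarded, like curl's "-o /dev/null".
      Duration elapsed =
          time(() -> client.send(request, HttpResponse.BodyHandlers.discarding()));
      System.out.println("Total: " + elapsed.toMillis() + " ms");
    }
  }
}
```

The first iteration includes connection setup and JIT warm-up, so later iterations are probably the fairer comparison against the curl numbers.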







[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-05-21 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848325#comment-17848325
 ] 

David Smiley commented on SOLR-10654:
-

A possible way to have it both ways would be to +embed+ the Prometheus Exporter 
_partially_, without the caching aspect -- it'd be a request handler that 
fetches metrics locally (would talk to MetricsHandler in a direct way) and 
post-processes via JQ.  I don't love JQ but... hey, some do.  XSLT/XQuery is 
more my thing.  No new dependencies to add directly to Solr; people would just 
add the Exporter's as if it's a module.  Regardless of some details, there 
would still be *some* overhead in this post processing due to JQ.  I'm not sure 
that's the pain point we're solving for here?  I haven't measured lately.  It 
could be interesting to compare the performance of the Prometheus Exporter and 
this patch.







[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-05-20 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847998#comment-17847998
 ] 

David Smiley commented on SOLR-10654:
-

Overall I wonder what others think about maintaining two parallel yet 
consistent metrics mappings -- one in the "Prometheus Exporter", configured 
using lots of "jq" and very flexible (intended for users to configure/hack as 
needed); the second in what this PR does, basically as hard-coded as can be. 
For example, if we add a new metric, we then probably need to update the 
exporter's config and also edit the source code being added here.  This could 
be helped by having the Prometheus Exporter fetch certain metrics pass-through 
on-demand.  But based on the design of the Prometheus Exporter, I think that 
could be tricky/awkward.







[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-04-17 Thread Matthew Biscocho (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838265#comment-17838265
 ] 

Matthew Biscocho commented on SOLR-10654:
-

Created a draft [PR|https://github.com/apache/solr/pull/2405] for exposing 
prometheus metrics. Going to work on writing tests, but put up the draft to get 
some initial feedback on implementation. Output looks similar to the prometheus 
exporter.







[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-04-11 Thread Matthew Biscocho (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836390#comment-17836390
 ] 

Matthew Biscocho commented on SOLR-10654:
-

Thanks for the info, [~dsmiley]! Actually came back to this recently since I 
wrote that last comment and I agree, parallel registry is way too much overhead 
and would probably make things much more difficult. Currently I have a lot of 
work written to transform the dropwizard registries to Prometheus with relevant 
tags for Solr! Looks pretty similar to the Prometheus exporter. Eager to post 
the first PR to get some feedback on my implementation. Hoping to get it out 
soon. 







[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2024-04-11 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836368#comment-17836368
 ] 

David Smiley commented on SOLR-10654:
-

Cool to see [your PR|https://github.com/apache/solr/pull/2375] Matthew :-)
You didn't title that PR starting with this JIRA issue, so it didn't auto-link. 
Also, it's good to comment in the JIRA issue about a PR anyway, since there is 
no notification whatsoever to those watching this JIRA issue even when the link 
is auto-added.

Disclaimer:  I haven't been looking closely at metrics lately.
A parallel registry seems painful -- overhead and the synchronization risks.
Moving to Micrometer -- do you think it would affect most metrics publishers in 
Solr (thus touch tons of source files) or only the metrics internals/plumbing?  
Either way, probably for Solr 10 if we go this way.
Maybe there could be a hard-coded algorithmic approach that can convert the raw 
name to a tagged/labelled metric?
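
To make that last idea concrete, here is a rough sketch of what such a conversion might look like. It assumes core metric names of the form {{CATEGORY.handler.metric}}, such as {{QUERY./select.requests}}; the class name, the {{solr_core_}} prefix and the label keys are all made up for illustration and are not what any PR emits.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical converter; metric prefix and label keys are illustrative only. */
public class RawNameToLabels {

  /** Converts e.g. "QUERY./select.requests" into a labelled Prometheus series name. */
  public static String convert(String coreName, String rawName) {
    // Assumed shape: CATEGORY.handler.metric (the metric part may contain dots).
    String[] parts = rawName.split("\\.", 3);
    if (parts.length != 3) {
      throw new IllegalArgumentException("expected CATEGORY.handler.metric: " + rawName);
    }
    Map<String, String> labels = new LinkedHashMap<>();
    labels.put("core", coreName);
    labels.put("category", parts[0]);
    labels.put("handler", parts[1]);

    // Prometheus metric names may only contain [a-zA-Z0-9_:].
    String metric = "solr_core_" + parts[2].replaceAll("[^a-zA-Z0-9_:]", "_");

    StringJoiner joined = new StringJoiner(",", "{", "}");
    labels.forEach((k, v) -> joined.add(k + "=\"" + escape(v) + "\""));
    return metric + joined;
  }

  /** Escapes backslash, double quote and newline in a label value. */
  static String escape(String value) {
    return value.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
  }

  public static void main(String[] args) {
    System.out.println(convert("collection1", "QUERY./select.requests"));
  }
}
```

The point of the sketch is only that the handler and category move out of the metric name and into labels, which is what makes aggregation across handlers and nodes possible; a real mapping would need per-registry rules.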







[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr

2023-12-20 Thread Matthew Biscocho (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799191#comment-17799191
 ] 

Matthew Biscocho commented on SOLR-10654:
-

So I found some time to look more into a solution to implement these metrics 
and ran into a few problems, mostly around Solr metric collection with 
Dropwizard.

My first approach was to take the Solr Dropwizard Registry, convert it to a 
Prometheus Registry with a util function from the Prometheus client library, 
and use that output in a new response writer I made. Doing so gave output that 
looks like [^prometheus_metrics.txt]. 

The first issue I ran into is that 1-to-1 naming from Dropwizard isn't very 
useful and doesn't follow the 
[Prometheus recommended naming convention|https://prometheus.io/docs/practices/naming/]. 
The bigger issue is that Dropwizard doesn't support tagging metrics, and 
without tags/labels these metrics are essentially useless for aggregation 
across Solr nodes.

So based on this, the only other ways I can find to expose these metrics from 
Solr are to either:
1. Create a separate 
[Prometheus Registry|https://prometheus.github.io/client_java/getting-started/registry/] 
next to the existing Dropwizard Registry and collect these Solr metrics in 
parallel, but with Prometheus having its own naming convention and applying 
tags/labels

2. Possibly migrate off of Dropwizard to Micrometer as the new default 
registry, as it supports metric tagging. I did some research on Micrometer and 
it looks to support 
[Prometheus and a lot of other monitoring systems|https://micrometer.io/docs/concepts#_supported_monitoring_systems], 
making it possible to output Micrometer metrics in Prometheus format.

I'm still newish to the Solr code base but was hoping to get someone who has 
experience with metrics in Solr ( maybe [~dsmiley] ) to discuss possible other 
approaches before going forward and help choose the best?



