Re: Improving documentation about observability

2024-05-13 Thread Wilfred Spiegelenburg
Please file jiras for any of the issues mentioned, or one jira if it
can all be handled from one.
All your remarks make sense.

You can even open a PR for the changes that you would like to make.
The documentation lives in the yunikorn-site repository.
The sample deployment is located in the yunikorn-k8shim repository [1].
Contributions are always welcome. We should document and/or set
sensible configuration values if we provide any.

Wilfred

[1] https://github.com/apache/yunikorn-k8shim/blob/master/deployments/scheduler/prometheus.yml#L18

On Mon, 13 May 2024 at 19:31, Wiard van Rij wrote:
>
> Hello everyone,
>
> I'm getting in touch through the mailing list since I haven't set up my Jira 
> account yet.
>
> I'd like to discuss the content found at 
> https://yunikorn.apache.org/docs/user_guide/prometheus/. It seems that out of 
> the box, it doesn't offer sensible default values. Typically, Prometheus is 
> deployed as a comprehensive solution, not just for a single service like 
> yunikorn. Thus, suggesting a configuration change that alters the global 
> interval rate to 3 seconds might not be the most advisable approach. Instead, 
> I'd argue that adjusting this interval isn't necessary, especially 
> considering you're recommending adding another job to the static config.
>
> Specifically, I propose the following adjustments:
>
>   *   Eliminate the global block from the configuration.
>   *   If an evaluation_interval is suggested, ensure its value matches the 
> scrape interval.
>   *   Set the scrape_interval to either 15 seconds or 30 seconds. I lean 
> towards 15 seconds as it should be more than adequate.
>   *   Encourage users to avoid using overrides in the scrape_configs. Instead, 
> they could use annotations on the service or implement a ServiceMonitor when 
> using Prometheus Operator.
>   *   This is honestly an easier solution that doesn't involve changing 
> Prometheus 'core' configuration.
>
> Thanks in advance,
>
> Wiard
>




Improving documentation about observability

2024-05-13 Thread Wiard van Rij
Hello everyone,

I'm getting in touch through the mailing list since I haven't set up my Jira 
account yet.

I'd like to discuss the content found at 
https://yunikorn.apache.org/docs/user_guide/prometheus/. It seems that out of 
the box, it doesn't offer sensible default values. Typically, Prometheus is 
deployed as a comprehensive solution, not just for a single service like 
yunikorn. Thus, suggesting a configuration change that alters the global 
interval rate to 3 seconds might not be the most advisable approach. Instead, 
I'd argue that adjusting this interval isn't necessary, especially considering 
you're recommending adding another job to the static config.

Specifically, I propose the following adjustments:

  *   Eliminate the global block from the configuration.
  *   If an evaluation_interval is suggested, ensure its value matches the 
scrape interval.
  *   Set the scrape_interval to either 15 seconds or 30 seconds. I lean 
towards 15 seconds as it should be more than adequate.
  *   Encourage users to avoid using overrides in the scrape_configs. Instead, 
they could use annotations on the service or implement a ServiceMonitor when 
using Prometheus Operator.
  *   This is honestly an easier solution that doesn't involve changing 
Prometheus 'core' configuration.
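
As a rough sketch of the ServiceMonitor option above, assuming Prometheus 
Operator is installed: the resource name, namespace, label selector, port name 
and metrics path below are illustrative placeholders, not values confirmed 
against the yunikorn charts, so they would need to be checked against the 
actual Service before use.

```yaml
# Hypothetical ServiceMonitor for yunikorn; all names and labels are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: yunikorn-metrics          # placeholder name
  namespace: yunikorn             # assumed namespace of the yunikorn Service
spec:
  selector:
    matchLabels:
      app: yunikorn               # assumed label on the yunikorn Service
  endpoints:
    - port: yunikorn-core         # assumed name of the Service port exposing metrics
      path: /ws/v1/metrics        # assumed metrics path
      interval: 15s               # per-endpoint interval; global settings stay untouched
```

With something along these lines, the scrape interval is set only for yunikorn, 
and the shared Prometheus configuration does not need a global block or an 
extra static job.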

Thanks in advance,

Wiard