Issue#1
My YACE config looks like the one below, but sometimes CPU utilization or some
other metric reports its value as NaN (not a number). I don't understand why,
because the underlying CloudWatch data points are correct. What is wrong here?
apiVersion: v1alpha1
discovery:
  jobs:
    - type: AWS/RDS
      regions:
        - us-east-1
        - us-west-2
      roles:
        - roleArn: "arn:aws:iam::xxxxxxxxxx:role/yyyyy"
        - roleArn: "arn:aws:iam::xxxxxxxxxx:role/yyyyy"
        - roleArn: "arn:aws:iam::xxxxxxxxxx:role/yyyyy"
        - roleArn: "arn:aws:iam::xxxxxxxxxx:role/yyyyy"
      period: 60
      length: 60
      metrics:
        - name: CPUUtilization
          statistics: [Average]
        - name: DatabaseConnections
          statistics: [Average]
        - name: FreeableMemory
          statistics: [Average]
        - name: FreeStorageSpace
          statistics: [Average]
        - name: ReadThroughput
          statistics: [Average]
        - name: WriteThroughput
          statistics: [Average]
        - name: ReadLatency
          statistics: [Average]
        - name: WriteLatency
          statistics: [Average]
        - name: ReadIOPS
          statistics: [Average]
Issue#2
The exporter-to-CloudWatch scrape interval is 60 seconds, the
Prometheus-to-exporter scrape interval is 20 seconds, and the rule evaluation
interval in Prometheus is 20 seconds.
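For reference, the Prometheus side is configured roughly like this (the job
name and exporter address are placeholders, not my real values):

global:
  scrape_interval: 20s       # Prometheus scrapes the exporter every 20s
  evaluation_interval: 20s   # rules are evaluated every 20s

scrape_configs:
  - job_name: "yace"                     # placeholder job name
    static_configs:
      - targets: ["yace-exporter:5000"]  # placeholder exporter address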
My following rules are always stuck in the pending state; how do I adjust my
configuration to make things work?
(aws_rds_write_latency_average{account_id!="",dimension_DBInstanceIdentifier!=""})
* 1000 > 20
<https://prometheus-cieops-monitoring.us-east-1.prod.cieops.aws.athenahealth.com/graph?g0.expr=%28aws_rds_write_latency_average%7Baccount_id%21%3D%22973732892259%22%2Cdimension_DBInstanceIdentifier%21%3D%22%22%7D%29+%2A+1000+%3E+20&g0.tab=1>
avg_over_time(aws_rds_write_latency_average{account_id!="973732892259",dimension_DBInstanceIdentifier!=""}[10m])
* 1000 > 10
<https://prometheus-cieops-monitoring.us-east-1.prod.cieops.aws.athenahealth.com/graph?g0.expr=avg_over_time%28aws_rds_write_latency_average%7Baccount_id%21%3D%22973732892259%22%2Cdimension_DBInstanceIdentifier%21%3D%22%22%7D%5B10m%5D%29+%2A+1000+%3E+10&g0.tab=1>
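The rules themselves are defined roughly like this (the group name, alert
names, and the for: duration are placeholders; the expressions are exactly the
ones above):

groups:
  - name: rds-write-latency            # placeholder group name
    rules:
      - alert: RDSWriteLatencyHigh     # placeholder alert name
        expr: (aws_rds_write_latency_average{account_id!="",dimension_DBInstanceIdentifier!=""}) * 1000 > 20
        for: 5m                        # placeholder duration
      - alert: RDSWriteLatencyAvgHigh  # placeholder alert name
        expr: avg_over_time(aws_rds_write_latency_average{account_id!="973732892259",dimension_DBInstanceIdentifier!=""}[10m]) * 1000 > 10
        for: 5m                        # placeholder duration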
Issue#3
Sometimes, when I manually compare values between CloudWatch and the
Prometheus exporter, I see fluctuations in the CloudWatch data points that
Prometheus does not reflect.
For example, every few minutes CloudWatch reports write latency of 80 ms,
then it drops to 10 ms, stays there for 3-4 minutes, and then goes back up,
whereas the Grafana dashboard never shows this fluctuation and sits at
60-80 milliseconds for the entire hour.
Please help me to fix my configuration.
Regards,
Niranjan