I think it should (for now) go into the reference for the ec2_sd_config:
https://github.com/prometheus/prometheus/blob/master/docs/configuration/configuration.md#ec2_sd_config
Right now it doesn't explain anything about authentication, so a
paragraph on that in general would be helpful, I think.

/MR

On Tue, Apr 14, 2020 at 10:52 PM William Findley <[email protected]> wrote:

> I fixed my problem. I was using the wrong service account and my angry
> eyes failed to notice. I blame Helm for making 2 service accounts that
> look almost exactly alike. ;-) However, you were *also* correct that I
> needed to declare fewer things to induce the default SDK behavior. Where
> would it be appropriate to update the AWS service discovery docs with the
> advice that the SDK usually picks up the proper things? It seems like
> that bit of guidance should go *someplace*.
>
> On Friday, April 10, 2020 at 8:21:19 PM UTC-4, William Findley wrote:
>>
>> I'm having trouble getting EC2 service discovery to work using an IAM
>> role bound to an EKS service account. Here's what I have.
>>
>> I have a pod that has successfully had a web identity token projected
>> into it. I'm fairly confident that there's no problem with this. I have
>> customers on this EKS cluster that I've rigged up with IAM roles and
>> Kubernetes service accounts, and they're happily using services.
>>
>> /prometheus $ ls -la /var/run/secrets/eks.amazonaws.com/serviceaccount
>> total 0
>> drwxrwsrwt 3 root 2000 100 Apr 10 17:23 .
>> drwxr-xr-x 3 root root  28 Apr 10 17:49 ..
>> drwxr-sr-x 2 root 2000  60 Apr 10 17:23 ..2020_04_10_17_23_59.145300320
>> lrwxrwxrwx 1 root root  31 Apr 10 17:23 ..data -> ..2020_04_10_17_23_59.145300320
>> lrwxrwxrwx 1 root root  12 Apr 10 17:23 token -> ..data/token
>>
>> The information about which role/token to use is exposed in the
>> following env vars:
>>
>> AWS_ROLE_ARN: arn:aws:iam::2XXXXXXXXXX0:role/prometheus-service-discovery-eks
>> AWS_WEB_IDENTITY_TOKEN_FILE: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
>>
>> Here's my scrape config.
>> I'm trying to discover and scrape node exporter on a box that I've
>> tagged with prometheus.io/discover and that has a name beginning with
>> the prefix I expect.
>>
>> scrape_configs:
>> - ec2_sd_configs:
>>   - filters:
>>     - name: tag-key
>>       values:
>>       - prometheus.io/discover
>>     role_arn: arn:aws:iam::2XXXXXXXXXX0:role/prometheus-service-discovery-eks
>>   job_name: service-ec2
>>   relabel_configs:
>>   - action: keep
>>     regex: ^mycoolnameprefix-.*
>>     source_labels:
>>     - __meta_ec2_tag_Name
>>   - replacement: $1:9100
>>     source_labels:
>>     - __meta_ec2_private_ip
>>     target_label: __address__
>>
>> My assumption from the docs, and from using the latest version of
>> Prometheus and the AWS SDK it depends on, was that it would use these
>> env variables in the way that it needed to discover the role and go out
>> and bind it. However, these logs indicate otherwise:
>>
>> level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:224 component="discovery manager scrape" msg="Starting provider" provider=*ec2.SDConfig/0 subs=[service-ec2]
>> level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:224 component="discovery manager notify" msg="Starting provider" provider=string/0 subs=[config-0]
>> level=info ts=2020-04-10T21:08:03.271Z caller=main.go:816 msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
>> level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:242 component="discovery manager notify" msg="discoverer channel closed" provider=string/0
>> level=error ts=2020-04-10T21:08:03.493Z caller=refresh.go:79 component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 3317a2e2-5357-4535-9b53-085209fdfb5c"
>> level=error ts=2020-04-10T21:09:03.502Z caller=refresh.go:98 component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 455fddb6-9b42-449b-b603-d7f453923a7b"
>>
>> Any tips on where I might have gone wrong? I made the best effort I
>> could to follow the existing documentation, but I don't feel like it's
>> telling me everything I need to know.
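
For reference, here is a rough sketch of what the trimmed-down scrape config from this thread might look like. It assumes that the explicit role_arn was the setting that had to go; the thread only says that fewer things needed to be declared, not which ones. With role_arn left out, the AWS SDK's default credential handling is expected to pick up AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE from the pod's environment on its own. Everything else is copied from the original config.

```yaml
scrape_configs:
- job_name: service-ec2
  ec2_sd_configs:
  # Assumption: no role_arn, access_key, or secret_key here, so the SDK
  # falls back to its default credential lookup (in this thread, the web
  # identity token projected into the pod by EKS).
  - filters:
    - name: tag-key
      values:
      - prometheus.io/discover
  relabel_configs:
  # Keep only instances whose Name tag starts with the expected prefix.
  - action: keep
    regex: ^mycoolnameprefix-.*
    source_labels:
    - __meta_ec2_tag_Name
  # Scrape node exporter on port 9100 of the instance's private IP.
  - replacement: $1:9100
    source_labels:
    - __meta_ec2_private_ip
    target_label: __address__
```

The service account the pod runs under still has to be the one that the IAM role's trust policy allows, which was the other half of the problem reported above.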

