On the off chance, I fired up a pod running a container with the AWS CLI under the service account I'm using, and it was able to make the ec2:DescribeInstances API call just fine. I'm not sure how to track down what's happening here. Maybe I've run into a bug?
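One way to narrow this down might be to look at the claims inside the projected web identity token and compare them by hand with the conditions in the role's trust policy. The sketch below is only an illustration: it assumes the token is a standard three-part JWT, it does not verify the signature, and the helper names (decode_jwt_claims, summarize_claims, read_token_claims) are hypothetical, not part of Prometheus or the AWS SDK.

```python
import base64
import json
import os

# Fallback path taken from the directory listing in the quoted message.
DEFAULT_TOKEN_FILE = "/var/run/secrets/eks.amazonaws.com/serviceaccount/token"


def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT without verifying its signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    # JWT segments are unpadded base64url; restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def summarize_claims(claims: dict) -> dict:
    """Keep only the claims an IAM trust policy usually conditions on."""
    return {key: claims.get(key) for key in ("iss", "sub", "aud")}


def read_token_claims(path: str = "") -> dict:
    """Read the token named by AWS_WEB_IDENTITY_TOKEN_FILE and decode it."""
    path = path or os.environ.get("AWS_WEB_IDENTITY_TOKEN_FILE", DEFAULT_TOKEN_FILE)
    with open(path) as handle:
        return decode_jwt_claims(handle.read())
```

Run from a pod under the same service account (for example the AWS CLI pod, if Python is available there), print(summarize_claims(read_token_claims())) shows the issuer, subject, and audience. On EKS the subject normally has the form system:serviceaccount:NAMESPACE:NAME, so a trust policy written for a different namespace or service account name would be one possible reason for an AccessDenied on sts:AssumeRoleWithWebIdentity.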
On Friday, April 10, 2020 at 8:21:19 PM UTC-4, William Findley wrote:
>
> I'm having trouble getting EC2 service discovery to work using an IAM role
> bound to an EKS service account. Here's what I have.
>
> I have a pod that has successfully had a web identity token projected into
> it. I'm fairly confident that there's no problem with this: I have
> customers on this EKS cluster that I've set up with IAM roles and
> Kubernetes service accounts, and they're happily using services.
>
> /prometheus $ ls -la /var/run/secrets/eks.amazonaws.com/serviceaccount
> total 0
> drwxrwsrwt 3 root 2000 100 Apr 10 17:23 .
> drwxr-xr-x 3 root root  28 Apr 10 17:49 ..
> drwxr-sr-x 2 root 2000  60 Apr 10 17:23 ..2020_04_10_17_23_59.145300320
> lrwxrwxrwx 1 root root  31 Apr 10 17:23 ..data -> ..2020_04_10_17_23_59.145300320
> lrwxrwxrwx 1 root root  12 Apr 10 17:23 token -> ..data/token
>
> The information about which role and token to use is exposed in the
> following env vars:
>
> AWS_ROLE_ARN: arn:aws:iam::2XXXXXXXXXX0:role/prometheus-service-discovery-eks
> AWS_WEB_IDENTITY_TOKEN_FILE: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
>
> Here's my scrape config. I'm trying to discover and scrape node exporter
> on a box that I've tagged with prometheus.io/discover and whose name
> begins with the prefix I expect.
>
> scrape_configs:
> - ec2_sd_configs:
>   - filters:
>     - name: tag-key
>       values:
>       - prometheus.io/discover
>     role_arn: arn:aws:iam::2XXXXXXXXXX0:role/prometheus-service-discovery-eks
>   job_name: service-ec2
>   relabel_configs:
>   - action: keep
>     regex: ^mycoolnameprefix-.*
>     source_labels:
>     - __meta_ec2_tag_Name
>   - replacement: $1:9100
>     source_labels:
>     - __meta_ec2_private_ip
>     target_label: __address__
>
> My assumption from the docs, and from using the latest version of
> Prometheus and the AWS SDK it depends on, was that it would use these env
> variables to discover the role and assume it.
> However, these logs indicate otherwise:
>
> level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:224 component="discovery manager scrape" msg="Starting provider" provider=*ec2.SDConfig/0 subs=[service-ec2]
> level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:224 component="discovery manager notify" msg="Starting provider" provider=string/0 subs=[config-0]
> level=info ts=2020-04-10T21:08:03.271Z caller=main.go:816 msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
> level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:242 component="discovery manager notify" msg="discoverer channel closed" provider=string/0
> level=error ts=2020-04-10T21:08:03.493Z caller=refresh.go:79 component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 3317a2e2-5357-4535-9b53-085209fdfb5c"
> level=error ts=2020-04-10T21:09:03.502Z caller=refresh.go:98 component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 455fddb6-9b42-449b-b603-d7f453923a7b"
>
> Any tips on where I might have gone wrong? I made the best effort I could
> to follow the existing documentation, but I don't feel like it's telling me
> everything I need to know.

