I'm having trouble getting EC2 service discovery to work using an IAM role 
bound to an EKS service account.  Here's what I have.

I have a pod that has successfully had a web identity token projected into 
it.  I'm fairly confident that there's no problem with this part: I have 
customers on this EKS cluster that I've set up with IAM roles and Kubernetes 
service accounts, and they're happily using services.

/prometheus $ ls -la /var/run/secrets/eks.amazonaws.com/serviceaccount
total 0
drwxrwsrwt    3 root     2000           100 Apr 10 17:23 .
drwxr-xr-x    3 root     root            28 Apr 10 17:49 ..
drwxr-sr-x    2 root     2000            60 Apr 10 17:23 ..2020_04_10_17_23_59.145300320
lrwxrwxrwx    1 root     root            31 Apr 10 17:23 ..data -> ..2020_04_10_17_23_59.145300320
lrwxrwxrwx    1 root     root            12 Apr 10 17:23 token -> ..data/token
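
To double-check the token itself, something like the following could be run 
against the projected file.  This is only a sketch: it assumes the token is a 
standard three-part JWT, it does not verify the signature, and the helper 
names (decode_jwt_claims, summarize_token) are made up for illustration.  The 
point is just to see the iss, sub and aud claims that the role's trust policy 
would have to match.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    # JWT segments are unpadded base64url; restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def summarize_token(path: str) -> dict:
    """Read a token file and return only the claims relevant to a trust policy."""
    with open(path) as fh:
        claims = decode_jwt_claims(fh.read())
    return {key: claims.get(key) for key in ("iss", "sub", "aud", "exp")}
```

Calling summarize_token() with the path from AWS_WEB_IDENTITY_TOKEN_FILE 
should show whether the issuer and the system:serviceaccount:... subject are 
what the IAM trust relationship expects.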


The information about which role/token to use is exposed in the following 
env vars:

      AWS_ROLE_ARN:                 arn:aws:iam::2XXXXXXXXXX0:role/prometheus-service-discovery-eks
      AWS_WEB_IDENTITY_TOKEN_FILE:  /var/run/secrets/eks.amazonaws.com/serviceaccount/token
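
As far as I understand it, these two variables are what the AWS SDK reads to 
build an sts:AssumeRoleWithWebIdentity call.  The sketch below only assembles 
the request parameters and does not send anything.  It is not the SDK's 
actual code; the function names and the "prometheus-debug" session name are 
placeholders I picked for illustration.

```python
import os
from urllib.parse import urlencode

STS_API_VERSION = "2011-06-15"  # API version used by STS query requests


def build_web_identity_request(environ=os.environ, session_name="prometheus-debug"):
    """Assemble AssumeRoleWithWebIdentity parameters from the two env vars.

    session_name is a hypothetical value chosen for this sketch.
    """
    role_arn = environ["AWS_ROLE_ARN"]
    token_file = environ["AWS_WEB_IDENTITY_TOKEN_FILE"]
    with open(token_file) as fh:
        token = fh.read().strip()
    return {
        "Action": "AssumeRoleWithWebIdentity",
        "Version": STS_API_VERSION,
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "WebIdentityToken": token,
    }


def encode_request_body(params: dict) -> str:
    """Form-encode the parameters as they would appear in a POST body."""
    return urlencode(params)
```

If either variable is missing, build_web_identity_request() raises a KeyError, 
which at least separates "the env vars are not set" from "STS rejected the 
request".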

Here's my scrape config.  I'm trying to discover and scrape node exporter 
on an instance that I've tagged with prometheus.io/discover and whose Name 
tag begins with the prefix I expect.

scrape_configs:
- ec2_sd_configs:
  - filters:
    - name: tag-key
      values:
      - prometheus.io/discover
    role_arn: arn:aws:iam::2XXXXXXXXXX0:role/prometheus-service-discovery-eks
  job_name: service-ec2
  relabel_configs:
  - action: keep
    regex: ^mycoolnameprefix-.*
    source_labels:
    - __meta_ec2_tag_Name
  - replacement: $1:9100
    source_labels:
    - __meta_ec2_private_ip
    target_label: __address__

My assumption, based on the docs and on using the latest version of Prometheus 
and the AWS SDK it depends on, was that it would pick up these env vars, 
discover the role, and assume it on its own.  However, these logs indicate 
otherwise:

level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:224 component="discovery manager scrape" msg="Starting provider" provider=*ec2.SDConfig/0 subs=[service-ec2]
level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:224 component="discovery manager notify" msg="Starting provider" provider=string/0 subs=[config-0]
level=info ts=2020-04-10T21:08:03.271Z caller=main.go:816 msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:242 component="discovery manager notify" msg="discoverer channel closed" provider=string/0
level=error ts=2020-04-10T21:08:03.493Z caller=refresh.go:79 component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 3317a2e2-5357-4535-9b53-085209fdfb5c"
level=error ts=2020-04-10T21:09:03.502Z caller=refresh.go:98 component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 455fddb6-9b42-449b-b603-d7f453923a7b"

Any tips on where I might have gone wrong?  I made the best effort I could 
to follow the existing documentation, but I don't feel like it's telling me 
everything I need to know.

