I fixed my problem.  I was using the wrong service account and my angry 
eyes failed to notice.  I blame Helm for making 2 service accounts that 
look almost exactly alike.  ;-)  However, you were *also* correct that I 
needed to declare fewer things to induce the default SDK behavior.  Where 
would it be appropriate to update the AWS service discovery docs with the 
advice that the SDK usually picks up the proper things?  It seems like that 
bit of guidance should go *someplace*.
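
For the archive, a minimal sketch of what "declaring fewer things" could 
look like.  This is an assumption, not a verbatim copy of the final config: 
it assumes the extra declaration was the explicit role_arn, and that with it 
removed the SDK's default credential chain reads AWS_ROLE_ARN and 
AWS_WEB_IDENTITY_TOKEN_FILE from the pod's environment on its own.

```yaml
scrape_configs:
- job_name: service-ec2
  ec2_sd_configs:
  # Assumption: no role_arn here, so the SDK resolves credentials
  # from the environment variables injected into the pod.
  - filters:
    - name: tag-key
      values:
      - prometheus.io/discover
  relabel_configs:
  - action: keep
    regex: ^mycoolnameprefix-.*
    source_labels:
    - __meta_ec2_tag_Name
  - replacement: $1:9100
    source_labels:
    - __meta_ec2_private_ip
    target_label: __address__
```

Everything else is unchanged from the config quoted below.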

On Friday, April 10, 2020 at 8:21:19 PM UTC-4, William Findley wrote:
>
>
> I'm having trouble getting ec2 service discovery to work using an IAM role 
> bound to an EKS service account.  Here's what I have.
>
> I have a pod that has successfully had a web identity token projected into 
> it.  I'm fairly confident that there's no problem with this.  I have 
> customers on this EKS cluster that I've set up with IAM roles and 
> Kubernetes service accounts, and they're happily using services.
>
> /prometheus $ ls -la /var/run/secrets/eks.amazonaws.com/serviceaccount 
> total 0
> drwxrwsrwt    3 root     2000           100 Apr 10 17:23 .
> drwxr-xr-x    3 root     root            28 Apr 10 17:49 ..
> drwxr-sr-x    2 root     2000            60 Apr 10 17:23 ..2020_04_10_17_23_59.145300320
> lrwxrwxrwx    1 root     root            31 Apr 10 17:23 ..data -> ..2020_04_10_17_23_59.145300320
> lrwxrwxrwx    1 root     root            12 Apr 10 17:23 token -> ..data/token
>
>
> The information about which role/token to use is exposed in the 
> following env vars:
>
>       AWS_ROLE_ARN:                 arn:aws:iam::2XXXXXXXXXX0:role/prometheus-service-discovery-eks
>       AWS_WEB_IDENTITY_TOKEN_FILE:  /var/run/secrets/eks.amazonaws.com/serviceaccount/token
>
> Here's my scrape config.  I'm trying to discover and scrape node exporter 
> on a box that I've tagged with prometheus.io/discover and whose name 
> begins with the prefix I expect.
>
> scrape_configs:
> - ec2_sd_configs:
>   - filters:
>     - name: tag-key
>       values:
>       - prometheus.io/discover
>     role_arn: arn:aws:iam::2XXXXXXXXXX0:role/prometheus-service-discovery-eks
>   job_name: service-ec2
>   relabel_configs:
>   - action: keep
>     regex: ^mycoolnameprefix-.*
>     source_labels:
>     - __meta_ec2_tag_Name
>   - replacement: $1:9100
>     source_labels:
>     - __meta_ec2_private_ip
>     target_label: __address__
>
> My assumption, from the docs and from using the latest version of 
> Prometheus and the AWS SDK it depends on, was that it would use these env 
> variables to discover the role and assume it.  However, these logs 
> indicate otherwise:
>
> level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:224 component="discovery manager scrape" msg="Starting provider" provider=*ec2.SDConfig/0 subs=[service-ec2]
> level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:224 component="discovery manager notify" msg="Starting provider" provider=string/0 subs=[config-0]
> level=info ts=2020-04-10T21:08:03.271Z caller=main.go:816 msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
> level=debug ts=2020-04-10T21:08:03.271Z caller=manager.go:242 component="discovery manager notify" msg="discoverer channel closed" provider=string/0
> level=error ts=2020-04-10T21:08:03.493Z caller=refresh.go:79 component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 3317a2e2-5357-4535-9b53-085209fdfb5c"
> level=error ts=2020-04-10T21:09:03.502Z caller=refresh.go:98 component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 455fddb6-9b42-449b-b603-d7f453923a7b"
>
> Any tips on where I might have gone wrong?  I made the best effort I could 
> to follow the existing documentation, but I don't feel like it's telling me 
> everything I need to know.
>
>
>
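
One way to catch the wrong-service-account mistake sooner: the projected 
token is a JWT, and its "sub" claim names the service account it was issued 
for, in the form system:serviceaccount:<namespace>:<name>.  Below is a 
minimal sketch that reads that claim without verifying the signature.  The 
function name is made up for illustration, and it assumes Python is 
available somewhere the token can be read, which may not be the case inside 
the Prometheus container itself.

```python
import base64
import json


def service_account_from_token(token: str) -> str:
    """Return the `sub` claim of a JWT (signature is NOT verified)."""
    # A JWT is three base64url segments: header.payload.signature
    payload_b64 = token.strip().split(".")[1]
    # base64url encoding in JWTs drops the trailing '=' padding; restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return claims["sub"]
```

Feed it the contents of the file named by AWS_WEB_IDENTITY_TOKEN_FILE and 
compare the result with the "sub" condition in the role's trust policy.  A 
mismatch there would be consistent with the AccessDenied on 
sts:AssumeRoleWithWebIdentity in the logs above.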
