That looks interesting! I've also found the full list of S3 properties[1] for the version of presto-hive bundled with Flink 1.12 (see [2]), which includes an option for a KMS key (hive.s3.kms-key-id).
(also, adding back the user list)

[1]: https://prestodb.io/docs/0.187/connector/hive.html#amazon-s3-configuration
[2]: https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/filesystems/s3.html#hadooppresto-s3-file-systems-plugins

On Mon, Apr 5, 2021 at 4:21 PM Swagat Mishra <swaga...@gmail.com> wrote:

> Btw, there is also an option to provide a custom credential provider,
> what are your thoughts on this?
>
> presto.s3.credentials-provider
>
>
> On Tue, Apr 6, 2021 at 12:43 AM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>
>> I've confirmed that for the bundled + shaded AWS dependency, the only way
>> to upgrade it is to build a flink-s3-fs-presto jar with the updated
>> dependency. Let me know if this is feasible for you, if the KMS key
>> solution doesn't work.
>>
>> Best,
>> Austin
>>
>> On Mon, Apr 5, 2021 at 2:18 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>
>>> Hi Swagat,
>>>
>>> I don't believe there is an explicit configuration option for the KMS
>>> key – please let me know if you're able to make that work!
>>>
>>> Best,
>>> Austin
>>>
>>> On Mon, Apr 5, 2021 at 1:45 PM Swagat Mishra <swaga...@gmail.com> wrote:
>>>
>>>> Hi Austin,
>>>>
>>>> Let me know what you think of my latest email, if the approach might
>>>> work, or if it is already supported and I am not using the configurations
>>>> properly.
>>>>
>>>> Thanks for your interest and support.
>>>>
>>>> Regards,
>>>> Swagat
>>>>
>>>> On Mon, Apr 5, 2021 at 10:39 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>>
>>>>> Hi Swagat,
>>>>>
>>>>> It looks like Flink 1.6 bundles the 1.11.165 version of the
>>>>> aws-java-sdk-core with the Presto implementation (transitively from
>>>>> Presto 0.185[1]).
>>>>> The minimum supported version for the ServiceAccount authentication
>>>>> approach is 1.11.704 (see [2]), which was released on Jan 9th, 2020[3],
>>>>> long after Flink 1.6 was released.
>>>>> It looks like even the most recent Presto is
>>>>> on a version below that, concretely 1.11.697 in the master branch[4], so I
>>>>> don't think even upgrading Flink to 1.6+ will solve this, though it looks
>>>>> to me like the AWS dependency is managed better in more recent Flink
>>>>> versions.
>>>>> I'll have more for you on that front tomorrow, after the Easter break.
>>>>>
>>>>> I think what you would have to do to make this authentication approach
>>>>> work for Flink 1.6 is build a custom version of the flink-s3-fs-presto
>>>>> jar, replacing the bundled AWS dependency with the 1.11.704 version, and
>>>>> then shading it the same way.
>>>>>
>>>>> In the meantime, would you mind creating a JIRA ticket with this use
>>>>> case? That'll give you the best insight into the status of fixing this :)
>>>>>
>>>>> Let me know if that makes sense,
>>>>> Austin
>>>>>
>>>>> [1]: https://github.com/prestodb/presto/blob/1d4ee196df4327568c0982811d8459a44f1792b9/pom.xml#L53
>>>>> [2]: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html
>>>>> [3]: https://github.com/aws/aws-sdk-java/releases/tag/1.11.704
>>>>> [4]: https://github.com/prestodb/presto/blob/master/pom.xml#L52
>>>>>
>>>>> On Sun, Apr 4, 2021 at 3:32 AM Swagat Mishra <swaga...@gmail.com> wrote:
>>>>>
>>>>>> Austin -
>>>>>>
>>>>>> In my case the setup is such that services are deployed on
>>>>>> Kubernetes with Docker, running on EKS. There is also an Istio service
>>>>>> mesh. So all the services communicate and access AWS resources like S3
>>>>>> using the service account. The service account is associated with IAM
>>>>>> roles. I have verified that the service account has access to S3 by
>>>>>> running a program that connects to S3 to read a file; the aws client,
>>>>>> when packaged into the pod, is also able to access S3. So that means
>>>>>> the roles and policies are good.
>>>>>>
>>>>>> When I am running Flink, I am following the same configuration for
>>>>>> job manager and task manager as provided here:
>>>>>>
>>>>>> https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/standalone/kubernetes.html
>>>>>>
>>>>>> The exception we are getting is -
>>>>>> org.apache.flink.fs.s3presto.shaded.com.amazonaws.SDKClientException:
>>>>>> Unable to load credentials from service end point.
>>>>>>
>>>>>> This happens in the EC2CredentialFetcher class method
>>>>>> fetchCredentials - line number 66, when it tries to read the resource,
>>>>>> effectively executing
>>>>>> CURL 169.254.170.2/AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
>>>>>>
>>>>>> I am not setting the variable AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
>>>>>> because it's not the right way to do it for us, we are on EKS. Similarly,
>>>>>> any of the ~/.aws/credentials file approaches will also not work for us.
>>>>>>
>>>>>>
>>>>>> Atm, I haven't tried the Kubernetes service account property you
>>>>>> mentioned above. I will try and let you know how it goes.
>>>>>>
>>>>>> Question - do I need to provide any parameters while building the
>>>>>> Docker image, or any configuration in the Flink config, to tell Flink
>>>>>> that for all purposes it should be using the service account and not
>>>>>> try to get into the EC2CredentialFetcher class?
>>>>>>
>>>>>> One more thing - we were trying this on the 1.6 version of Flink and
>>>>>> not the 1.12 version.
>>>>>>
>>>>>> Regards,
>>>>>> Swagat
>>>>>>
>>>>>> On Sun, Apr 4, 2021 at 8:56 AM Sameer Wadkar <sam...@axiomine.com> wrote:
>>>>>>
>>>>>>> Kube2Iam needs to modify iptables to proxy calls to ec2 metadata to
>>>>>>> a daemonset which runs privileged pods which map the IP address of a
>>>>>>> pod and its associated service account to make STS calls and return
>>>>>>> temporary AWS credentials.
>>>>>>> Your pod “thinks” the ec2 metadata url works locally,
>>>>>>> like in an ec2 instance.
>>>>>>>
>>>>>>> I have found that mutating webhooks are easier to deploy (when you
>>>>>>> have no control over the Kubernetes environment - say you cannot change
>>>>>>> iptables or run privileged pods). These can configure the
>>>>>>> ~/.aws/credentials file. The webhook can make the STS call for the
>>>>>>> service account to role mapping. A sidecar container to which the main
>>>>>>> container has no access can even renew credentials, because STS returns
>>>>>>> temp credentials.
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>> On Apr 3, 2021, at 10:29 PM, Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> If you’re just looking to attach a service account to a pod using
>>>>>>> the native AWS EKS IAM mapping[1], you should be able to attach the
>>>>>>> service account to the pod via the `kubernetes.service-account`
>>>>>>> configuration option[2].
>>>>>>>
>>>>>>> Let me know if that works for you!
>>>>>>>
>>>>>>> Best,
>>>>>>> Austin
>>>>>>>
>>>>>>> [1]: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
>>>>>>> [2]: https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#kubernetes-service-account
>>>>>>>
>>>>>>> On Sat, Apr 3, 2021 at 10:18 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Can you describe your setup a little bit more? And perhaps how you
>>>>>>>> use this setup to grant access to other non-Flink pods?
>>>>>>>>
>>>>>>>> On Sat, Apr 3, 2021 at 2:29 PM Swagat Mishra <swaga...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Yes I looked at kube2iam, I haven't experimented with it.
>>>>>>>>>
>>>>>>>>> Given that the service account has access to S3, shouldn't we have
>>>>>>>>> a simpler mechanism to connect to underlying resources based on the
>>>>>>>>> service account authorization?
>>>>>>>>>
>>>>>>>>> On Sat, Apr 3, 2021, 10:10 PM Austin Cawley-Edwards <austin.caw...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Swagat,
>>>>>>>>>>
>>>>>>>>>> I’ve used kube2iam[1] for granting AWS access to Flink pods in
>>>>>>>>>> the past with good results. It’s all based on mapping pod
>>>>>>>>>> annotations to AWS IAM roles. Is this something that might work
>>>>>>>>>> for you?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Austin
>>>>>>>>>>
>>>>>>>>>> [1]: https://github.com/jtblin/kube2iam
>>>>>>>>>>
>>>>>>>>>> On Sat, Apr 3, 2021 at 10:40 AM Swagat Mishra <swaga...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> No, we are running on AWS. The mechanisms supported by Flink to
>>>>>>>>>>> connect to resources like S3 need us to make changes that will
>>>>>>>>>>> impact all services, something that we don't want to do. So
>>>>>>>>>>> providing the AWS secret key ID and passcode upfront, or IAM rules
>>>>>>>>>>> where it connects by executing curl/HTTP calls to connect to S3,
>>>>>>>>>>> don't work for me.
>>>>>>>>>>>
>>>>>>>>>>> I want to be able to connect to S3 using AWS APIs, and if that
>>>>>>>>>>> connection can be leveraged by the Presto library, that is what I
>>>>>>>>>>> am looking for.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Swagat
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Apr 3, 2021, 7:37 PM Israel Ekpo <israele...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Are you running on Azure Kubernetes Service?
>>>>>>>>>>>>
>>>>>>>>>>>> You should be able to do it because the identity can be mapped
>>>>>>>>>>>> to the labels of the pods, not necessarily Flink.
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Apr 3, 2021 at 6:31 AM Swagat Mishra <swaga...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think Flink doesn't support pod identity. Are there any plans
>>>>>>>>>>>>> to achieve it in a subsequent release?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Swagat