That looks interesting! I've also found the full list of S3 properties[1]
for the version of presto-hive bundled with Flink 1.12 (see [2]), which
includes an option for a KMS key (hive.s3.kms-key-id).
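
As a rough, untested sketch of how that could look in flink-conf.yaml: this
assumes Flink forwards keys with the `presto.s3.` prefix to the bundled Presto
file system (the same pattern as the `presto.s3.credentials-provider` key
mentioned below), and that the Hive connector property `hive.s3.kms-key-id`
corresponds to `presto.s3.kms-key-id` there. The key ID is a placeholder.

```yaml
# flink-conf.yaml (sketch, not verified against Flink 1.12)
# Assumption: Presto's hive.s3.kms-key-id is exposed as presto.s3.kms-key-id.
# <your-kms-key-id> is a placeholder for the actual KMS key ID.
presto.s3.kms-key-id: <your-kms-key-id>
```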

(also, adding back the user list)

[1]:
https://prestodb.io/docs/0.187/connector/hive.html#amazon-s3-configuration
[2]:
https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/filesystems/s3.html#hadooppresto-s3-file-systems-plugins

On Mon, Apr 5, 2021 at 4:21 PM Swagat Mishra <swaga...@gmail.com> wrote:

> Btw, there is also an option to provide a custom credential provider,
> what are your thoughts on this?
>
> presto.s3.credentials-provider
>
>
> On Tue, Apr 6, 2021 at 12:43 AM Austin Cawley-Edwards <
> austin.caw...@gmail.com> wrote:
>
>> I've confirmed that for the bundled + shaded aws dependency, the only way
>> to upgrade it is to build a flink-s3-fs-presto jar with the updated
>> dependency. Let me know if this is feasible for you, if the KMS key
>> solution doesn't work.
>>
>> Best,
>> Austin
>>
>> On Mon, Apr 5, 2021 at 2:18 PM Austin Cawley-Edwards <
>> austin.caw...@gmail.com> wrote:
>>
>>> Hi Swagat,
>>>
>>> I don't believe there is an explicit configuration option for the KMS
>>> key – please let me know if you're able to make that work!
>>>
>>> Best,
>>> Austin
>>>
>>> On Mon, Apr 5, 2021 at 1:45 PM Swagat Mishra <swaga...@gmail.com> wrote:
>>>
>>>> Hi Austin,
>>>>
>>>> Let me know what you think of my latest email, whether the approach might
>>>> work, or whether it is already supported and I am not using the
>>>> configurations properly.
>>>>
>>>> Thanks for your interest and support.
>>>>
>>>> Regards,
>>>> Swagat
>>>>
>>>> On Mon, Apr 5, 2021 at 10:39 PM Austin Cawley-Edwards <
>>>> austin.caw...@gmail.com> wrote:
>>>>
>>>>> Hi Swagat,
>>>>>
>>>>> It looks like Flink 1.6 bundles version 1.11.165 of
>>>>> aws-java-sdk-core with the Presto implementation (transitively from Presto
>>>>> 0.185[1]).
>>>>> The minimum supported version for the ServiceAccount authentication
>>>>> approach is 1.11.704 (see [2]), which was released on Jan 9th, 2020[3],
>>>>> long after Flink 1.6 was released. It looks like even the most recent
>>>>> Presto is on a version below that, concretely 1.11.697 in the master
>>>>> branch[4], so I don't think even upgrading Flink to 1.6+ will solve this,
>>>>> though it looks to me like the AWS dependency is managed better in more
>>>>> recent Flink versions.
>>>>> I'll have more for you on that front tomorrow, after the Easter break.
>>>>>
>>>>> I think what you would have to do to make this authentication approach
>>>>> work for Flink 1.6 is build a custom version of the flink-s3-fs-presto
>>>>> jar, replacing the bundled AWS dependency with the 1.11.704 version, and
>>>>> then shade it the same way.
>>>>>
>>>>> In the meantime, would you mind creating a JIRA ticket with this use
>>>>> case? That'll give you the best insight into the status of fixing this :)
>>>>>
>>>>> Let me know if that makes sense,
>>>>> Austin
>>>>>
>>>>> [1]:
>>>>> https://github.com/prestodb/presto/blob/1d4ee196df4327568c0982811d8459a44f1792b9/pom.xml#L53
>>>>> [2]:
>>>>> https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html
>>>>> [3]: https://github.com/aws/aws-sdk-java/releases/tag/1.11.704
>>>>> [4]: https://github.com/prestodb/presto/blob/master/pom.xml#L52
>>>>>
>>>>> On Sun, Apr 4, 2021 at 3:32 AM Swagat Mishra <swaga...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Austin -
>>>>>>
>>>>>> In my case the setup is such that services are deployed on
>>>>>> Kubernetes with Docker, running on EKS. There is also an Istio service
>>>>>> mesh. All the services communicate and access AWS resources like S3
>>>>>> using the service account, which is associated with IAM roles. I have
>>>>>> verified that the service account has access to S3 by running a
>>>>>> program that connects to S3 to read a file. The AWS client, when
>>>>>> packaged into the pod, is also able to access S3. So the roles and
>>>>>> policies are good.
>>>>>>
>>>>>> When I am running flink, I am following the same configuration for
>>>>>> job manager and task manager as provided here:
>>>>>>
>>>>>>
>>>>>> https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/standalone/kubernetes.html
>>>>>>
>>>>>> The exception we are getting is -
>>>>>> org.apache.flink.fs.s3presto.shaded.com.amazonaws.SDKClientException:
>>>>>> Unable to load credentials from service end point.
>>>>>>
>>>>>> This happens in the EC2CredentialFetcher class method
>>>>>> fetchCredentials - line number 66, when it tries to read resource,
>>>>>> effectively executing
>>>>>> CURL 169.254.170.2/AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
>>>>>>
>>>>>> I am not setting the variable AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
>>>>>> because it's not the right way to do it for us; we are on EKS. Similarly,
>>>>>> the ~/.aws/credentials file approach will not work for us either.
>>>>>>
>>>>>>
>>>>>> Atm, I haven't tried the Kubernetes service account property you
>>>>>> mentioned above. I will try it and let you know how it goes.
>>>>>>
>>>>>> Question - do I need to provide any parameters while building the
>>>>>> Docker image, or any configuration in the Flink config, to tell Flink that
>>>>>> for all purposes it should be using the service account and not try to
>>>>>> get into the EC2CredentialFetcher class?
>>>>>>
>>>>>> One more thing - we were trying this on the 1.6 version of Flink and
>>>>>> not the 1.12 version.
>>>>>>
>>>>>> Regards,
>>>>>> Swagat
>>>>>>
>>>>>> On Sun, Apr 4, 2021 at 8:56 AM Sameer Wadkar <sam...@axiomine.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Kube2Iam needs to modify iptables to proxy calls to the EC2 metadata
>>>>>>> endpoint to a daemonset that runs privileged pods. The daemonset uses
>>>>>>> the IP address of the pod and its associated service account to make
>>>>>>> STS calls and return temporary AWS credentials. Your pod “thinks” the
>>>>>>> EC2 metadata URL works locally, like on an EC2 instance.
>>>>>>>
>>>>>>> I have found that mutating webhooks are easier to deploy (when you
>>>>>>> have no control over the Kubernetes environment - say you cannot change
>>>>>>> iptables or run privileged pods). These can configure the
>>>>>>> ~/.aws/credentials file. The webhook can make the STS call for the
>>>>>>> service account to role mapping. A sidecar container to which the main
>>>>>>> container has no access can even renew the credentials, because STS
>>>>>>> returns temporary credentials.
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>> On Apr 3, 2021, at 10:29 PM, Austin Cawley-Edwards <
>>>>>>> austin.caw...@gmail.com> wrote:
>>>>>>>
>>>>>>> If you’re just looking to attach a service account to a pod using
>>>>>>> the native AWS EKS IAM mapping[1], you should be able to attach the
>>>>>>> service account to the pod via the `kubernetes.service-account`
>>>>>>> configuration option[2].
>>>>>>>
>>>>>>> Let me know if that works for you!
>>>>>>>
>>>>>>> Best,
>>>>>>> Austin
>>>>>>>
>>>>>>> [1]:
>>>>>>> https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
>>>>>>> [2]:
>>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#kubernetes-service-account
>>>>>>>
>>>>>>> On Sat, Apr 3, 2021 at 10:18 PM Austin Cawley-Edwards <
>>>>>>> austin.caw...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Can you describe your setup a little bit more? And perhaps how you
>>>>>>>> use this setup to grant access to other non-Flink pods?
>>>>>>>>
>>>>>>>> On Sat, Apr 3, 2021 at 2:29 PM Swagat Mishra <swaga...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Yes I looked at kube2iam, I haven't experimented with it.
>>>>>>>>>
>>>>>>>>> Given that the service account has access to S3, shouldn't we have
>>>>>>>>> a simpler mechanism to connect to underlying resources based on the
>>>>>>>>> service account authorization?
>>>>>>>>>
>>>>>>>>> On Sat, Apr 3, 2021, 10:10 PM Austin Cawley-Edwards <
>>>>>>>>> austin.caw...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Swagat,
>>>>>>>>>>
>>>>>>>>>> I’ve used kube2iam[1] for granting AWS access to Flink pods in
>>>>>>>>>> the past with good results. It’s all based on mapping pod
>>>>>>>>>> annotations to AWS IAM roles. Is this something that might work for you?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Austin
>>>>>>>>>>
>>>>>>>>>> [1]: https://github.com/jtblin/kube2iam
>>>>>>>>>>
>>>>>>>>>> On Sat, Apr 3, 2021 at 10:40 AM Swagat Mishra <swaga...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> No, we are running on AWS. The mechanisms supported by Flink to
>>>>>>>>>>> connect to resources like S3 need us to make changes that will
>>>>>>>>>>> impact all services, something that we don't want to do. So providing
>>>>>>>>>>> the AWS access key ID and secret key upfront, or IAM roles where it
>>>>>>>>>>> connects by executing curl/HTTP calls to connect to S3, doesn't work
>>>>>>>>>>> for me.
>>>>>>>>>>>
>>>>>>>>>>> I want to be able to connect to S3 using the AWS APIs, and if that
>>>>>>>>>>> connection can be leveraged by the Presto library, that is what I am
>>>>>>>>>>> looking for.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Swagat
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Apr 3, 2021, 7:37 PM Israel Ekpo <israele...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Are you running on Azure Kubernetes Service?
>>>>>>>>>>>>
>>>>>>>>>>>> You should be able to do it because the identity can be mapped
>>>>>>>>>>>> to the labels of the pods, not necessarily Flink.
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Apr 3, 2021 at 6:31 AM Swagat Mishra <
>>>>>>>>>>>> swaga...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think Flink doesn't support pod identity. Are there any plans to
>>>>>>>>>>>>> achieve it in a subsequent release?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Swagat
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
