RE: MSI Auth to Azure Storage Account with Flink Apache Operator not working

2023-05-19 Thread Ivan Webber via user
I will provide more details on how I was able to use AKV with CSI. Also, I 
looked at the ADLS FileSystem factory in the Flink source, and I think that, 
despite what the docs say, configuration options prefixed with flink.hadoop 
won’t get forwarded.

You can expose the key vault contents as Kubernetes secrets, which can in turn 
be exposed as environment variables (see the docs I previously sent). Then you 
can provide a KeyProvider class (org.apache.hadoop.fs.azure.KeyProvider) that 
reads from the environment variables. In flink-conf.yaml you can configure 
Hadoop to use the KeyProvider (all keys with the fs.azure prefix are forwarded 
to Hadoop). A jar containing the KeyProvider should be placed in the same 
directory as the ADLS plugin.

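```yaml
# flink-conf.yaml
fs.azure.account.keyprovider.<STORAGE ACCOUNT NAME>.dfs.core.windows.net: <KEYPROVIDER CLASS NAME>
fs.azure.account.keyprovider.<STORAGE ACCOUNT NAME>.blob.core.windows.net: <KEYPROVIDER CLASS NAME>
```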

Keep in mind that all the available releases of Flink have one of two bugs 
causing problems reading and/or writing to ADLS, so you will need to re-build 
the ADLS plugin from source by checking out the release commit (probably 
1.17.0) and cherry-picking the bug fix (or wait for 1.17.1 or 1.18.0 which will 
have the fixes).

I’m new to Flink, and it took me a while to figure this out, but hopefully it 
is helpful to you. I get the sense that few people are using ADLS with newer 
Flink versions, because the docs and support seem half-baked.

Let me know if you make progress using MSI.

Best of luck,

Ivan


Re: MSI Auth to Azure Storage Account with Flink Apache Operator not working

2023-05-17 Thread Surendra Singh Lilhore
Hi Derocco,

Good to hear that it is working. Let me create a Jira ticket and update the
document.

-Surendra



RE: MSI Auth to Azure Storage Account with Flink Apache Operator not working

2023-05-17 Thread DEROCCO, CHRISTOPHER
Surendra,

Your recommended config change fixed my issue. Azure Managed Service Identity 
works for me now and I can write checkpoints to ADLS Gen2 storage. The client 
ID I use is that of the managed identity attached to the Azure Kubernetes node 
pools. For anyone else facing this issue, the configuration that got this 
working in the Kubernetes YAML is:

flinkConfiguration:
  fs.azure.createRemoteFileSystemDuringInitialization: "true"
  fs.azure.account.oauth.provider.type.<STORAGE ACCOUNT NAME>.dfs.core.windows.net: org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider
  fs.azure.account.oauth2.msi.tenant.<STORAGE ACCOUNT NAME>.dfs.core.windows.net: <TENANT ID>
  fs.azure.account.oauth2.client.id.<STORAGE ACCOUNT NAME>.dfs.core.windows.net: <CLIENT ID>
  fs.azure.account.oauth2.client.endpoint.<STORAGE ACCOUNT NAME>.dfs.core.windows.net: https://login.microsoftonline.com/<TENANT ID>/oauth2/token

Also, this environment variable has to be added to the Kubernetes YAML 
configuration:

  containers:
    # Do not change the main container name
    - name: flink-main-container
      env:
        - name: ENABLE_BUILT_IN_PLUGINS
          value: flink-azure-fs-hadoop-1.16.1.jar


This Azure managed service identity configuration should be added to the Flink 
docs. I couldn’t find it documented anywhere that 
fs.azure.account.oauth.provider.type has to be set to 
org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider
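
For reference, here is a minimal sketch of how the per-account keys in this 
message fit together. The bash function prints them for one storage account. 
The function name and its three arguments are hypothetical. The position of the 
storage account name in each key and the tenant ID in the endpoint URL are 
inferred from the key names, not confirmed by the author. The provider class is 
the shaded name quoted in this thread.

```bash
# Print the MSI-related Flink settings from this message for one storage account.
# Arguments (all caller-supplied, hypothetical names): account, tenant ID, client ID.
msi_flink_conf() {
  local account="$1" tenant="$2" client_id="$3"
  # ADLS Gen2 endpoint host used as the suffix of every per-account key
  local host="${account}.dfs.core.windows.net"
  # Shaded class name as quoted in the thread
  local provider="org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider"
  cat <<EOF
fs.azure.createRemoteFileSystemDuringInitialization: "true"
fs.azure.account.oauth.provider.type.${host}: ${provider}
fs.azure.account.oauth2.msi.tenant.${host}: ${tenant}
fs.azure.account.oauth2.client.id.${host}: ${client_id}
fs.azure.account.oauth2.client.endpoint.${host}: https://login.microsoftonline.com/${tenant}/oauth2/token
EOF
}
```

The printed lines could then be pasted under the Flink configuration section of 
the deployment YAML, for example after running 
`msi_flink_conf "$ACCOUNT" "$TENANT_ID" "$CLIENT_ID"` with your own values.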



RE: MSI Auth to Azure Storage Account with Flink Apache Operator not working

2023-05-17 Thread DEROCCO, CHRISTOPHER
Ivan,

How did you use Azure Key Vault with CSI, given that the Flink operator uses a 
ConfigMap and not a Kubernetes Secret to create the flink-conf file? I have 
also tried pod identities as well as the new workload identity 
(https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview), to no 
avail. It seems to be an issue with configuring 
flink-azure-fs-hadoop-1.16.0.jar when using the Flink operator.


Re: MSI Auth to Azure Storage Account with Flink Apache Operator not working

2023-05-16 Thread Surendra Singh Lilhore
Hi DEROCCO,

Flink uses shaded jars for the Hadoop Azure Storage plugin, so to fix the
ClassNotFoundException you need to adjust the configuration. Please
configure the MsiTokenProvider as shown below.

fs.azure.account.oauth.provider.type: org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider


Thanks
Surendra
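
To illustrate the renaming described above, here is a small bash sketch that 
derives the shaded class name from the upstream Hadoop name, plus the path the 
class file would have inside a jar. The helper names are hypothetical. The 
relocation prefix is copied from the class name quoted in this message and may 
differ in other Flink versions.

```bash
# Relocation prefix copied from the class name quoted in this thread (assumption:
# it applies to the plugin version you are running).
SHADE_PREFIX="org.apache.flink.fs.shaded.hadoop3"

# Print the relocated (shaded) name for an upstream Hadoop class name.
shaded_class_name() {
  printf '%s.%s\n' "$SHADE_PREFIX" "$1"
}

# Print the path a class file would have inside a jar (dots become slashes).
class_to_jar_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}
```

One way to confirm the name against a real plugin jar is to list the jar's 
contents with a tool such as `unzip -l` or `jar tf` and search for the output 
of `class_to_jar_entry "$(shaded_class_name org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider)"`.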



RE: MSI Auth to Azure Storage Account with Flink Apache Operator not working

2023-05-16 Thread Ivan Webber via user
When you create your cluster, you probably need to ensure the following 
settings are set. I briefly looked into MSI but ended up using Azure Key Vault 
with the CSI secrets store driver for my initial prototype 
(https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/aks/csi-secrets-store-driver.md#upgrade-an-existing-aks-cluster-with-azure-key-vault-provider-for-secrets-store-csi-driver-support).

For me it helped to think about it as Hadoop configuration.

If you do get MSI working I would be interested in hearing what made it work 
for you, so be sure to update the docs or put it on this thread.

 To create from scratch
Create an AKS cluster with the required settings.
```bash
# create an AKS cluster with pod-managed identity and Azure CNI
az aks create --resource-group "$RESOURCE_GROUP" --name "$CLUSTER" \
  --enable-managed-identity --network-plugin azure --enable-pod-identity
```

I hope that is somehow helpful.

Best of luck,

Ivan


RE: MSI Auth to Azure Storage Account with Flink Apache Operator not working

2023-05-08 Thread DEROCCO, CHRISTOPHER
Shammon,



I’m still having trouble setting up the package in my cluster environment. I 
have added these lines to my Dockerfile:


mkdir ./plugins/azure-fs-hadoop

cp ./opt/flink-azure-fs-hadoop-1.16.0.jar ./plugins/azure-fs-hadoop/

according to the Flink docs here 
(https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/filesystems/azure/).
This should make the flink-azure-fs-hadoop jar available in the environment; 
it has the classes needed for ADLS Gen2 MSI authentication.
I also have the following dependency in my pom to add it to the fat jar.


<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-azure-fs-hadoop</artifactId>
  <version>${flink.version}</version>
</dependency>


However, I still get the class-not-found error, and the Flink job is not able 
to authenticate to the Azure storage account to store its checkpoints. I’m not 
sure what other configuration pieces I’m missing. Has anyone had success 
writing checkpoints to Azure ADLS Gen2 storage with managed service identity 
(MSI) authentication?
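
For reference, the two Dockerfile lines above amount to the following step, 
sketched here as a bash function with a guard for a missing jar. The function 
name and its arguments are hypothetical, and the layout (`opt/` and `plugins/` 
under the Flink home directory) is taken from the commands in this message. The 
jar name has to match the file actually shipped in `opt/`; this thread mentions 
both a 1.16.0 and a 1.16.1 jar.

```bash
# Copy a bundled plugin jar from <flink_home>/opt into its own plugins subdirectory.
# Arguments (hypothetical): Flink home directory, jar file name, plugin directory name.
enable_flink_plugin() {
  local flink_home="$1" jar_name="$2" plugin_dir="$3"
  local src="${flink_home}/opt/${jar_name}"
  local dest="${flink_home}/plugins/${plugin_dir}"
  # Fail early if the jar name (including its version) does not exist in opt/
  if [ ! -f "$src" ]; then
    echo "plugin jar not found: $src" >&2
    return 1
  fi
  mkdir -p "$dest" && cp "$src" "$dest/"
}
```

A call such as `enable_flink_plugin "$FLINK_HOME" flink-azure-fs-hadoop-1.16.0.jar azure-fs-hadoop` 
would reproduce the two lines above. The ENABLE_BUILT_IN_PLUGINS environment 
variable mentioned elsewhere in this thread is an alternative way to request 
the same plugin.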



From: Shammon FY 
Sent: Friday, May 5, 2023 8:38 PM
To: DEROCCO, CHRISTOPHER 
Cc: user@flink.apache.org
Subject: Re: MSI Auth to Azure Storage Account with Flink Apache Operator not 
working

Hi DEROCCO,

I think you can check the startup command of the job on k8s to see if the jar 
file is in the classpath.

If your job is DataStream, you need to add hadoop azure dependency in your 
project, and if it is an SQL job, you need to include this jar file in your 
Flink release package. Or you can also add this package in your cluster 
environment.

Best,
Shammon FY


On Fri, May 5, 2023 at 10:21 PM DEROCCO, CHRISTOPHER wrote:
How can I add the package to the flink job or check if it is there?

From: Shammon FY
Sent: Thursday, May 4, 2023 9:59 PM
To: DEROCCO, CHRISTOPHER
Cc: user@flink.apache.org
Subject: Re: MSI Auth to Azure Storage Account with Flink Apache Operator not 
working

Hi DEROCCO,

I think you need to check whether there is a hadoop-azure jar file in the 
classpath of your flink job. From an error message 'Caused by: 
java.lang.ClassNotFoundException: Class 
org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider not found.', your flink 
job may be missing this package.

Best,
Shammon FY


On Fri, May 5, 2023 at 4:40 AM DEROCCO, CHRISTOPHER wrote:

I receive the error:  Caused by: java.lang.ClassNotFoundException: Class 
org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider not found.
I’m using flink 1.16 running in Azure Kubernetes using the Flink Apache 
Kubernetes Operator.
I have the following specified in the spec.flinkConfiguration: as per the 
Apache Kubernetes operator documentation.

fs.azure.createRemoteFileSystemDuringInitialization: "true"
fs.azure.account.auth.type.storageaccountname.dfs.core.windows.net: OAuth
fs.azure.account.oauth.provider.type.<STORAGE ACCOUNT NAME>.dfs.core.windows.net: org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider
fs.azure.account.oauth2.msi.tenant.<STORAGE ACCOUNT NAME>.dfs.core.windows.net: <TENANT ID>
fs.azure.account.oauth2.client.id.<STORAGE ACCOUNT NAME>.dfs.core.windows.net: <CLIENT ID>
fs.azure.account.oauth2.client.endpoint.<STORAGE ACCOUNT NAME>.dfs.core.windows.net: https://login.microsoftonline.com/<TENANT ID>/oauth2/token

I also have this specified in the container environment variables:
- name: ENABLE_BUILT_IN_PLUGINS
  value: flink-azure-fs-hadoop-1.16.1.jar

I think I’m missing a configuration step because the MsiTokenProvider class is 
not found based on the logs. Any help would be appreciated.


Chris deRocco
Senior – Cybersecurity
Chief Security Office | STORM Threat Analytics

AT&T
Middletown, NJ
Phone: 732-639-9342
Email: cd9...@att.com



Re: MSI Auth to Azure Storage Account with Flink Apache Operator not working

2023-05-05 Thread Shammon FY
Hi DEROCCO,

I think you can check the startup command of the job on k8s to see if the
jar file is in the classpath.

If your job is a DataStream job, you need to add the hadoop-azure dependency
to your project; if it is an SQL job, you need to include this jar file in
your Flink release package. Alternatively, you can add this package to your
cluster environment.

Best,
Shammon FY
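
As a rough sketch of the classpath check suggested above, the following bash 
function tests whether a colon-separated classpath string contains a jar with a 
given file name. The function name is hypothetical, and the classpath string is 
assumed to have been copied from the JVM command line of the running job. Note 
that jars placed under Flink's `plugins/` directory are loaded by separate 
plugin class loaders, so a plugin jar being absent from this classpath does not 
by itself mean the plugin is missing.

```bash
# Return 0 if any entry of a colon-separated classpath has the given jar file name.
classpath_has_jar() {
  local classpath="$1" jar_name="$2" entry
  local -a entries
  # Split on ':' without triggering glob expansion of entries such as lib/*
  IFS=':' read -r -a entries <<< "$classpath"
  for entry in "${entries[@]}"; do
    # Compare only the file name part of each entry
    [ "${entry##*/}" = "$jar_name" ] && return 0
  done
  return 1
}
```

For example, `classpath_has_jar "$CP" flink-azure-fs-hadoop-1.16.1.jar` returns 
a zero exit status only when an entry with exactly that file name is present in 
`$CP`.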


On Fri, May 5, 2023 at 10:21 PM DEROCCO, CHRISTOPHER  wrote:

> How can I add the package to the flink job or check if it is there?
>
>
>
> *From:* Shammon FY 
> *Sent:* Thursday, May 4, 2023 9:59 PM
> *To:* DEROCCO, CHRISTOPHER 
> *Cc:* user@flink.apache.org
> *Subject:* Re: MSI Auth to Azure Storage Account with Flink Apache
> Operator not working
>
>
>
> Hi DEROCCO,
>
>
>
> I think you need to check whether there is a hadoop-azure jar file in the
> classpath of your flink job. From an error message '*Caused by:
> java.lang.ClassNotFoundException: Class
> org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider not found.*', your
> flink job may be missing this package.
>
>
>
> Best,
>
> Shammon FY
>
>
>
>
>
> On Fri, May 5, 2023 at 4:40 AM DEROCCO, CHRISTOPHER 
> wrote:
>
>
>
> I receive the error:  *Caused by: java.lang.ClassNotFoundException: Class
> org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider not found.*
>
> I’m using flink 1.16 running in Azure Kubernetes using the Flink Apache
> Kubernetes Operator.
>
> I have the following specified in the spec.flinkConfiguration: as per the
> Apache Kubernetes operator documentation.
>
>
>
> fs.azure.createRemoteFileSystemDuringInitialization: "true"
>
> fs.azure.account.auth.type.storageaccountname.dfs.core.windows.net: OAuth
>
> fs.azure.account.oauth.provider.type..dfs.core.windows.net: org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider
>
> fs.azure.account.oauth2.msi.tenant..dfs.core.windows.net:
>
> fs.azure.account.oauth2.client.id..dfs.core.windows.net:
>
> fs.azure.account.oauth2.client.endpoint..dfs.core.windows.net: https://login.microsoftonline.com/<TENANT ID>/oauth2/token
>
>
>
> I also have this specified in the container environment variables.
>
> - name: ENABLE_BUILT_IN_PLUGINS
>   value: flink-azure-fs-hadoop-1.16.1.jar
>
>
>
> I think I’m missing a configuration step because the MsiTokenProvider
> class is not found based on the logs. Any help would be appreciated.
>
>
>
>
>
> *Chris deRocco*
>
> Senior – Cybersecurity
>
> Chief Security Office | STORM Threat Analytics
>
>
>
> *AT&T*
>
> Middletown, NJ
>
> Phone: 732-639-9342
>
> Email: cd9...@att.com
>
>
>
>


RE: MSI Auth to Azure Storage Account with Flink Apache Operator not working

2023-05-05 Thread DEROCCO, CHRISTOPHER
How can I add the package to the flink job or check if it is there?




Re: MSI Auth to Azure Storage Account with Flink Apache Operator not working

2023-05-04 Thread Shammon FY
Hi DEROCCO,

I think you need to check whether there is a hadoop-azure jar file in the
classpath of your flink job. From the error message '*Caused by:
java.lang.ClassNotFoundException: Class
org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider not found.*', your
flink job may be missing this package.
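
One way to look for the jar, sketched under some assumptions: walk a
directory of jars and report which ones contain the class from the error.
JarClassFinder and findJarsContaining are illustrative names, and the default
directory /opt/flink is only a guess at where the image keeps its lib and
plugins folders. If a plugin relocates (shades) Hadoop classes, the class
would sit under a different package name and this lookup would not match it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Stream;

public class JarClassFinder {
    /** Returns the jars under root (searched recursively) that contain the given class. */
    public static List<Path> findJarsContaining(Path root, String className) throws IOException {
        // Inside a jar, a class is stored as a path entry such as a/b/C.class.
        String entryName = className.replace('.', '/') + ".class";
        List<Path> hits = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (!Files.isRegularFile(p) || !p.toString().endsWith(".jar")) {
                    continue;
                }
                try (JarFile jar = new JarFile(p.toFile())) {
                    if (jar.getJarEntry(entryName) != null) {
                        hits.add(p);
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass the real directory as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "/opt/flink");
        String cls = "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider";
        for (Path jar : findJarsContaining(root, cls)) {
            System.out.println(jar);
        }
    }
}
```

If nothing is printed, none of the jars under that directory contain the
class under its original package name.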

Best,
Shammon FY

