[ 
https://issues.apache.org/jira/browse/HADOOP-19178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sneha Vijayarajan updated HADOOP-19178:
---------------------------------------
    Description: 
*WASB Driver*

The WASB driver was developed to support FNS (Flat Namespace) Azure Storage 
accounts. FNS accounts do not honor file-folder semantics, so the WASB driver 
mimics HDFS folder operations on the client side. Certain folder operations, 
such as Rename and Delete, can generate a large number of IOPs because the 
client must enumerate the blobs and orchestrate the rename or delete blob by 
blob. Other APIs suffer as well, because the initial check of whether a path is 
a file or a folder takes multiple metadata calls. Together, these led to 
degraded performance.
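
To illustrate why folder operations are costly on a flat namespace, here is a 
toy model, not WASB code. It assumes an in-memory dict stands in for the blob 
store, and the function name and operation counting are hypothetical. Renaming 
a "folder" means listing every key under the prefix and then copying and 
deleting each blob individually.

```python
def rename_folder_flat(store: dict, src: str, dst: str) -> int:
    """Rename a "folder" in a flat key -> bytes store.

    Returns the number of store operations issued (toy accounting:
    one listing call, then one copy and one delete per blob).
    """
    src_prefix = src.rstrip("/") + "/"
    dst_prefix = dst.rstrip("/") + "/"
    ops = 1  # a single listing call; real listings are paged
    # Materialize the key list first so the dict can be mutated safely.
    for key in sorted(k for k in store if k.startswith(src_prefix)):
        new_key = dst_prefix + key[len(src_prefix):]
        store[new_key] = store[key]  # copy the blob to its new name
        del store[key]               # delete the source blob
        ops += 2
    return ops


store = {"logs/2024/a.txt": b"a", "logs/2024/b.txt": b"b", "data/c.txt": b"c"}
ops = rename_folder_flat(store, "logs", "archive")
```

The cost grows with the number of blobs under the folder, whereas a 
hierarchical namespace can treat the folder as a single entry.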

To provide better service to Analytics customers, Microsoft released ADLS Gen2, 
which offers HNS (Hierarchical Namespace) accounts, i.e. a file-folder aware 
store. The ABFS driver was designed to overcome the inherent deficiencies of 
WASB, and customers were advised to migrate to the ABFS driver.

*Customers who still use the legacy WASB driver and the challenges they face* 

Some of our customers have not migrated to the ABFS driver yet and continue to 
use the legacy WASB driver with FNS accounts.  

These customers face the following challenges: 
 * They cannot leverage the optimizations and benefits of the ABFS driver.
 * They need to deal with compatibility issues if files and folders are 
modified by the legacy WASB driver and the ABFS driver concurrently during a 
phased transition.
 * There are differences in the features supported for FNS and HNS over the 
ABFS driver.
 * In certain cases, they must perform a significant amount of re-work on their 
workloads to migrate to the ABFS driver, which is available only on HNS enabled 
accounts in a fully tested and supported scenario.

*Deprecation plans for WASB*

We are introducing a new feature that will enable the ABFS driver to support 
FNS accounts (over BlobEndpoint) using the ABFS scheme. This feature will 
enable customers to use the ABFS driver to interact with data stored in GPv2 
(General Purpose v2) storage accounts. 

With this feature, the customers who still use the legacy WASB driver will be 
able to migrate to the ABFS driver without much re-work on their workloads. 
They will however need to change the URIs from the WASB scheme to the ABFS 
scheme. 
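
As a rough illustration of the URI change, the sketch below rewrites only the 
scheme. The mapping of `wasb` to `abfs` and `wasbs` to `abfss`, the helper 
name, and the sample account and container names are assumptions for 
illustration. The description does not say whether the host part of the URI 
must also change, so the sketch leaves it untouched.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed scheme mapping: plain and TLS variants of each driver's scheme.
SCHEME_MAP = {"wasb": "abfs", "wasbs": "abfss"}


def to_abfs_uri(uri: str) -> str:
    """Return the URI with a WASB scheme swapped for the ABFS equivalent.

    URIs with any other scheme are returned unchanged.
    """
    parts = urlsplit(uri)
    new_scheme = SCHEME_MAP.get(parts.scheme)
    if new_scheme is None:
        return uri
    return urlunsplit(
        (new_scheme, parts.netloc, parts.path, parts.query, parts.fragment)
    )


# Hypothetical container and account names.
converted = to_abfs_uri("wasbs://mycontainer@myaccount.blob.core.windows.net/data/part-0")
```

A helper like this could be run over job configurations or table locations to 
find and rewrite the paths that still use the WASB scheme.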

Once the ABFS driver has built the FNS support needed to migrate WASB 
customers, the WASB driver will be declared deprecated in the OSS documentation 
and marked for removal in the next major release. This will remove any 
ambiguity for newly onboarding customers, as there will be only one Microsoft 
driver for Azure Storage, and migrating customers will get SLA-bound support 
for driver and service, which was not guaranteed over WASB.

We anticipate that this feature will serve as a stepping stone for customers 
to move to HNS enabled accounts with the ABFS driver, which is our recommended 
stack for big data analytics on ADLS Gen2. 

*Any impact for existing customers who are using ADLS Gen2 (HNS enabled 
account) with ABFS driver?*

This feature does not impact the existing customers who are using ADLS Gen2 
(HNS enabled account) with ABFS driver.

They do not need to make any changes to their workloads or configurations. They 
will still enjoy the benefits of HNS, such as atomic operations, fine-grained 
access control, scalability, and performance. 

*Official recommendation*

Microsoft continues to recommend that all Big Data and Analytics customers use 
Azure Data Lake Gen2 (ADLS Gen2) with the ABFS driver and will continue to 
optimize this scenario in future. We believe that this new option will help all 
those customers transition to a supported scenario immediately, while they 
plan to ultimately move to ADLS Gen2 (HNS enabled account).

*New authentication options available to customers migrating from the WASB 
driver to the ABFS driver*

The auth types below, which WASB provides, will continue to work on the new FNS 
over ABFS driver through configuration that accepts these SAS types (similar to 
WASB):
 * SharedKey
 * Account SAS
 * Service/Container SAS

The authentication types below, which were not supported by the WASB driver but 
are supported by the ABFS driver, will continue to be available for the new FNS 
over ABFS driver:
 * OAuth 2.0 Client Credentials
 * OAuth 2.0: Refresh Token
 * Azure Managed Identity
 * Custom OAuth 2.0 Token Provider

The ABFS driver SAS Token Provider plugin that exists today for UserDelegation 
SAS and Directory SAS will continue to work only for HNS accounts.
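
For orientation, the sketch below builds the Hadoop properties for two of the 
listed auth types, SharedKey and OAuth 2.0 Client Credentials, using the 
account-scoped configuration keys the ABFS driver documents today. Whether FNS 
support will use exactly the same keys is an assumption. The helper names, the 
account host, and all credential values are hypothetical placeholders.

```python
def shared_key_conf(account_host: str, key: str) -> dict:
    """Properties for SharedKey auth, scoped to one storage account host."""
    return {
        f"fs.azure.account.auth.type.{account_host}": "SharedKey",
        f"fs.azure.account.key.{account_host}": key,
    }


def oauth_client_credentials_conf(
    account_host: str, client_id: str, client_secret: str, token_endpoint: str
) -> dict:
    """Properties for OAuth 2.0 Client Credentials auth for one account host."""
    provider = "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
    return {
        f"fs.azure.account.auth.type.{account_host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{account_host}": provider,
        f"fs.azure.account.oauth2.client.id.{account_host}": client_id,
        f"fs.azure.account.oauth2.client.secret.{account_host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{account_host}": token_endpoint,
    }


# Placeholder values only; real secrets belong in a credential provider.
host = "myaccount.dfs.core.windows.net"
key_conf = shared_key_conf(host, "<account-key>")
oauth_conf = oauth_client_credentials_conf(
    host, "<client-id>", "<client-secret>", "<token-endpoint-url>"
)
```

The resulting key-value pairs map one-to-one onto `core-site.xml` properties 
or onto `--conf` style job settings.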


> WASB Driver Deprecation and eventual removal
> --------------------------------------------
>
>                 Key: HADOOP-19178
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19178
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.4.0
>            Reporter: Sneha Vijayarajan
>            Assignee: Sneha Vijayarajan
>            Priority: Major
>             Fix For: 3.4.1


