Sneha Vijayarajan created HADOOP-19178:
------------------------------------------

             Summary: WASB Driver Deprecation and eventual removal
                 Key: HADOOP-19178
                 URL: https://issues.apache.org/jira/browse/HADOOP-19178
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/azure
    Affects Versions: 3.4.0
            Reporter: Sneha Vijayarajan
            Assignee: Sneha Vijayarajan
             Fix For: 3.4.1


*WASB Driver*

The WASB driver was developed to support FNS (Flat Namespace) Azure Storage 
accounts. FNS accounts do not honor file-folder semantics, so the WASB driver 
mimics HDFS folder operations on the client side. Certain folder operations, 
such as Rename and Delete, can generate a large number of IOPS because the 
client enumerates the blobs and orchestrates the rename/delete blob by blob. 
Other APIs were not ideal either, because the initial check of whether a path 
is a file or a folder takes multiple metadata calls. Together, these led to 
degraded performance.
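
A minimal sketch of why a folder rename is expensive on a flat namespace. 
FlatBlobStore and rename_folder are hypothetical, in-memory stand-ins written 
for illustration only; they are not the WASB driver's code or the Azure Storage 
API. The point is only that the request count grows with the number of blobs 
under the folder.

```python
class FlatBlobStore:
    """Hypothetical in-memory stand-in for a flat-namespace blob container."""

    def __init__(self):
        self.blobs = {}      # full blob name -> data; "folders" are only name prefixes
        self.requests = 0    # counts simulated service requests

    def put(self, name, data):
        self.requests += 1
        self.blobs[name] = data

    def list_prefix(self, prefix):
        self.requests += 1
        return sorted(n for n in self.blobs if n.startswith(prefix))

    def copy(self, src, dst):
        self.requests += 1
        self.blobs[dst] = self.blobs[src]

    def delete(self, name):
        self.requests += 1
        del self.blobs[name]


def rename_folder(store, src, dst):
    """Emulate a folder rename client-side: enumerate, then copy and delete each blob."""
    src_prefix = src.rstrip("/") + "/"
    dst_prefix = dst.rstrip("/") + "/"
    for name in store.list_prefix(src_prefix):
        store.copy(name, dst_prefix + name[len(src_prefix):])
        store.delete(name)


store = FlatBlobStore()
for i in range(3):
    store.put(f"logs/2024/part-{i}", b"data")

store.requests = 0                      # measure only the rename
rename_folder(store, "logs/2024", "archive/2024")
print(store.requests)                   # 7: one list, three copies, three deletes
```

On a hierarchical namespace the same rename can be a single operation on the 
directory, which is the difference the paragraph above describes.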

 

To provide better service to Analytics customers, Microsoft released ADLS Gen2, 
an HNS (Hierarchical Namespace) store, i.e. one that is file-folder aware. The 
ABFS driver was designed to overcome the inherent deficiencies of WASB, and 
customers were advised to migrate to the ABFS driver.

 

*Customers who still use the legacy WASB driver and the challenges they face* 

Some of our customers have not migrated to the ABFS driver yet and continue to 
use the legacy WASB driver with FNS accounts.  

These customers face the following challenges: 
 *  They cannot leverage the optimizations and benefits of the ABFS driver.
 *  They need to deal with compatibility issues if files and folders are 
modified concurrently with the legacy WASB driver and the ABFS driver during a 
phased transition.
 *  The features supported for FNS and HNS accounts over the ABFS driver 
differ.
 *  In certain cases, they must perform a significant amount of re-work on 
their workloads to migrate to the ABFS driver, which is fully tested and 
supported only on HNS enabled accounts.


*Deprecation plans for WASB* 

We are introducing a new feature that will enable the ABFS driver to support 
FNS accounts (over BlobEndpoint) using the ABFS scheme. This feature will 
enable customers to use the ABFS driver to interact with data stored in GPv2 
(General Purpose v2) storage accounts. 

With this feature, the customers who still use the legacy WASB driver will be 
able to migrate to the ABFS driver without much re-work on their workloads. 
They will however need to change the URIs from the WASB scheme to the ABFS 
scheme. 
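
As a rough illustration of the URI change, the sketch below rewrites only the 
scheme (wasb to abfs, wasbs to abfss) and passes the rest of the URI through 
unchanged. The helper name to_abfs_uri and the example container and account 
names are hypothetical. This issue does not say whether the host part of the 
URI also needs to change, so the sketch deliberately leaves it untouched.

```python
# Hypothetical helper: maps the WASB schemes to their ABFS counterparts.
SCHEME_MAP = {"wasb": "abfs", "wasbs": "abfss"}


def to_abfs_uri(uri: str) -> str:
    """Return the URI with its WASB scheme replaced by the matching ABFS scheme."""
    scheme, sep, rest = uri.partition("://")
    new_scheme = SCHEME_MAP.get(scheme.lower())
    if not sep or new_scheme is None:
        raise ValueError(f"not a WASB URI: {uri!r}")
    # Authority and path are kept exactly as given (assumption, see lead-in).
    return f"{new_scheme}://{rest}"


# Placeholder container and account names.
print(to_abfs_uri("wasbs://mycontainer@myaccount.blob.core.windows.net/data/events"))
```

A helper like this could be applied to job configurations or table locations 
in bulk, with any required host change handled separately once it is 
documented.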

Once the ABFS driver has built the FNS support needed to migrate WASB 
customers, the WASB driver will be declared deprecated in the OSS documentation 
and marked for removal in the next major release. This will remove any 
ambiguity for newly onboarding customers, as there will be only one Microsoft 
driver for Azure Storage, and migrating customers will get SLA-bound support 
for the driver and the service, which was not guaranteed over WASB.

 We anticipate that this feature will serve as a stepping stone for customers 
to move to HNS enabled accounts with the ABFS driver, which is our recommended 
stack for big data analytics on ADLS Gen2. 

*Any impact for existing customers who are using ADLS Gen2 (HNS enabled 
account) with ABFS driver?*

This feature does not impact the existing customers who are using ADLS Gen2 
(HNS enabled account) with ABFS driver. 

They do not need to make any changes to their workloads or configurations. They 
will still enjoy the benefits of HNS, such as atomic operations, fine-grained 
access control, scalability, and performance. 

*Official recommendation*

Microsoft continues to recommend that all Big Data and Analytics customers use 
Azure Data Lake Gen2 (ADLS Gen2) with the ABFS driver and will continue to 
optimize this scenario in the future. We believe that this new option will help 
all those customers transition to a supported scenario immediately, while they 
plan to ultimately move to ADLS Gen2 (HNS enabled account).

 *New Authentication options that a WASB to ABFS Driver migrating customer will 
get*

The authentication types below, which WASB provides, will continue to work on 
the new FNS over ABFS driver through configuration that accepts these SAS types 
(similar to WASB):
 * SharedKey
 * Account SAS
 * Service/Container SAS

The authentication types below, which were not supported by the WASB driver but 
are supported by the ABFS driver, will continue to be available for the new FNS 
over ABFS driver:
 * OAuth 2.0 Client Credentials
 * OAuth 2.0: Refresh Token
 * Azure Managed Identity
 * Custom OAuth 2.0 Token Provider
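
For orientation, the sketch below maps each of the four options above to the 
two Hadoop properties that, to my understanding of the existing ABFS driver 
documentation, select the authentication type and token provider. The function 
name oauth_settings and the option labels are hypothetical, and the property 
and class names come from that documentation rather than from this issue, so 
they should be checked against the release in use. Credentials such as client 
IDs and secrets are omitted.

```python
OAUTH2_PACKAGE = "org.apache.hadoop.fs.azurebfs.oauth2"

# Token provider classes as named in the existing ABFS docs (assumption: unchanged
# for FNS accounts over the ABFS driver).
PROVIDERS = {
    "client_credentials": f"{OAUTH2_PACKAGE}.ClientCredsTokenProvider",
    "refresh_token": f"{OAUTH2_PACKAGE}.RefreshTokenBasedTokenProvider",
    "managed_identity": f"{OAUTH2_PACKAGE}.MsiTokenProvider",
}


def oauth_settings(option, custom_provider_class=None):
    """Return the auth-type and provider-type properties for one option."""
    if option == "custom":
        if not custom_provider_class:
            raise ValueError("a custom token provider class name is required")
        return {
            "fs.azure.account.auth.type": "Custom",
            "fs.azure.account.oauth.provider.type": custom_provider_class,
        }
    if option not in PROVIDERS:
        raise ValueError(f"unknown option: {option!r}")
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type": PROVIDERS[option],
    }


for key, value in oauth_settings("managed_identity").items():
    print(f"{key}={value}")
```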

 

The ABFS driver SAS Token Provider plugin that exists today for UserDelegation 
SAS and Directory SAS will continue to work only for HNS accounts.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
