[
https://issues.apache.org/jira/browse/HADOOP-19178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran resolved HADOOP-19178.
-------------------------------------
Fix Version/s: 3.3.9, 3.5.0
Assignee: Anuj Modi (was: Sneha Vijayarajan)
Resolution: Fixed
> WASB Driver Deprecation and eventual removal
> --------------------------------------------
>
> Key: HADOOP-19178
> URL: https://issues.apache.org/jira/browse/HADOOP-19178
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.4.0
> Reporter: Sneha Vijayarajan
> Assignee: Anuj Modi
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> *WASB Driver*
> The WASB driver was developed to support FNS (Flat Namespace) Azure Storage
> accounts. FNS accounts do not honor file-folder semantics, so the WASB
> driver mimics HDFS folder operations on the client side. Folder operations
> such as rename and delete can generate a large number of IOPs, because the
> client enumerates the blobs and orchestrates the rename/delete blob by
> blob. Other APIs suffer too: the initial check of whether a path is a file
> or a folder takes multiple metadata calls. Together these degrade
> performance.
> To provide better service to Analytics customers, Microsoft released ADLS
> Gen2, which offers HNS (Hierarchical Namespace) accounts, i.e. a file-folder
> aware store. The ABFS driver was designed to overcome the inherent
> deficiencies of WASB, and customers were advised to migrate to it.
> *Customers who still use the legacy WASB driver and the challenges they face*
> Some of our customers have not migrated to the ABFS driver yet and continue
> to use the legacy WASB driver with FNS accounts.
> These customers face the following challenges:
> * They cannot leverage the optimizations and benefits of the ABFS driver.
> * They need to deal with compatibility issues if files and folders are
> modified with the legacy WASB driver and the ABFS driver concurrently
> during a phased transition.
> * The features supported for FNS and HNS accounts over the ABFS driver
> differ.
> * In certain cases, they must perform a significant amount of re-work on
> their workloads to migrate to the ABFS driver, which is available only on HNS
> enabled accounts in a fully tested and supported scenario.
> *Deprecation plans for WASB*
> We are introducing a new feature that will enable the ABFS driver to support
> FNS accounts (over BlobEndpoint) using the ABFS scheme. This feature will
> enable customers to use the ABFS driver to interact with data stored in GPv2
> (General Purpose v2) storage accounts.
> With this feature, the customers who still use the legacy WASB driver will be
> able to migrate to the ABFS driver without much re-work on their workloads.
> They will however need to change the URIs from the WASB scheme to the ABFS
> scheme.
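The scheme change described above can be sketched in a few lines. This is a minimal illustration, not part of the Hadoop codebase: the helper name `wasb_to_abfs` and the sample account and container names are hypothetical, and the sketch assumes that only the scheme changes (`wasb` to `abfs`, and the TLS variant `wasbs` to `abfss`). Whether the host part of a given URI also needs adjusting is not stated in this message, so the helper leaves it untouched.

```python
# Hypothetical helper: rewrite only the scheme of a WASB URI to its ABFS
# counterpart. The host, container, and path are kept exactly as given.
SCHEME_MAP = {"wasb": "abfs", "wasbs": "abfss"}


def wasb_to_abfs(uri: str) -> str:
    """Return the URI with a wasb/wasbs scheme swapped for abfs/abfss.

    URIs with any other scheme (or no scheme) are returned unchanged.
    """
    scheme, sep, rest = uri.partition("://")
    new_scheme = SCHEME_MAP.get(scheme.lower())
    if not sep or new_scheme is None:
        return uri
    return new_scheme + sep + rest


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(wasb_to_abfs("wasbs://data@example.blob.core.windows.net/logs/2024"))
```

A helper like this could be applied to job configurations or table locations in bulk, which is why it passes non-WASB URIs through unchanged instead of raising.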
> Once the ABFS driver has the FNS support needed to migrate WASB customers,
> the WASB driver will be declared deprecated in the OSS documentation and
> marked for removal in the next major release. This removes any ambiguity
> for newly onboarding customers, as there will be only one Microsoft driver
> for Azure Storage, and migrating customers will get SLA-bound support for
> the driver and service, which was not guaranteed over WASB.
> We anticipate that this feature will serve as a stepping stone for customers
> to move to HNS enabled accounts with the ABFS driver, which is our
> recommended stack for big data analytics on ADLS Gen2.
> *Any impact for existing customers who are using ADLS Gen2 (HNS enabled
> account) with ABFS driver?*
> This feature does not impact the existing customers who are using ADLS Gen2
> (HNS enabled account) with ABFS driver.
> They do not need to make any changes to their workloads or configurations.
> They will still enjoy the benefits of HNS, such as atomic operations,
> fine-grained access control, scalability, and performance.
> *Official recommendation*
> Microsoft continues to recommend that all Big Data and Analytics customers
> use Azure Data Lake Gen2 (ADLS Gen2) with the ABFS driver, and will continue
> to optimize this scenario in future. We believe that this new option will
> help all those customers transition to a supported scenario immediately,
> while they plan to ultimately move to ADLS Gen2 (HNS enabled account).
> *New Authentication options that a WASB to ABFS Driver migrating customer
> will get*
> The auth types below, which WASB provides, will continue to work on the new
> FNS over ABFS Driver through configuration that accepts these SAS types
> (similar to WASB):
> * SharedKey
> * Account SAS
> * Service/Container SAS
> The authentication types below, which the WASB driver did not support but
> the ABFS driver does, will continue to be available for the new FNS over
> ABFS Driver:
> * OAuth 2.0 Client Credentials
> * OAuth 2.0: Refresh Token
> * Azure Managed Identity
> * Custom OAuth 2.0 Token Provider
> ABFS Driver SAS Token Provider plugin present today for UserDelegation SAS
> and Directly SAS will continue to work only for HNS accounts.
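
The authentication options listed above can be summarized as a small lookup, which may help when auditing which options a migrating workload can rely on. This is only a restatement of the lists in the description: the constant and function names are hypothetical, and nothing here corresponds to actual Hadoop configuration keys.

```python
# Auth types as listed in the description above. All names in this sketch
# are illustrative and are not Hadoop configuration keys or APIs.
CARRIED_OVER_FROM_WASB = (
    "SharedKey",
    "Account SAS",
    "Service/Container SAS",
)
NEW_WITH_ABFS = (
    "OAuth 2.0 Client Credentials",
    "OAuth 2.0: Refresh Token",
    "Azure Managed Identity",
    "Custom OAuth 2.0 Token Provider",
)
# Served by the SAS Token Provider plugin; stated to work only for HNS accounts.
HNS_ONLY = (
    "UserDelegation SAS",
    "Directly SAS",
)


def fns_over_abfs_auth_options() -> list:
    """Auth types the description says are usable with FNS over ABFS."""
    return list(CARRIED_OVER_FROM_WASB + NEW_WITH_ABFS)


def is_available_on_fns(auth_type: str) -> bool:
    """True if the description lists this auth type for FNS over ABFS."""
    return auth_type in fns_over_abfs_auth_options()


if __name__ == "__main__":
    for name in fns_over_abfs_auth_options():
        print(name)
```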