[
https://issues.apache.org/jira/browse/HADOOP-19343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17905010#comment-17905010
]
Arunkumar C commented on HADOOP-19343:
--------------------------------------
Thank you [[email protected]] [~mthakur] for your feedback.
I am a Google employee and am part of the GCS Connector team.
What we will be contributing is a stripped-down, simplified version that keeps
feature parity with the existing open source GCS connector. If we end up merging
the change, this will become the primary HCFS connector for GCS and we will
sunset the existing connector. It will be a simplified version because the
existing connector has two separate implementations based on two different GCS
SDKs:
[JSON](https://github.com/googleapis/google-api-java-client-services/tree/main/clients/google-api-services-storage/v1)
and [Java Storage Client](https://github.com/googleapis/java-storage). The Java
Storage Client has richer features, and we get several of them (e.g. retries)
for free, which simplifies the GCS connector. The implementation based on the
Java Storage Client already exists in the open source GCS connector. Our plan is
to refactor it, take only that part, and contribute it.
I agree that there is a risk of bugs since this is new code. Please note that we
are not adding any additional risk from the GCS connector's perspective, since
we plan to do this refactoring anyway (i.e. removing the implementation based on
the older GCS client) in the open source connector if this plan does not work
out. We will reduce the risk by:
# Making use of our existing benchmarking/testing frameworks to catch bugs. I
think this by itself should get us a good-quality connector.
# Making sure that we have at least 80% code coverage from unit tests.
# Making sure that all the integration test scenarios in the OSS GCS connector
are tested.
CC: [~abmodi] [~cnauroth]
> Add native support for GCS connector
> ------------------------------------
>
> Key: HADOOP-19343
> URL: https://issues.apache.org/jira/browse/HADOOP-19343
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Affects Versions: 3.5.0
> Reporter: Abhishek Modi
> Priority: Major
> Attachments: GCS connector for Hadoop.pdf
>
>