[
https://issues.apache.org/jira/browse/HADOOP-19343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17901218#comment-17901218
]
Steve Loughran commented on HADOOP-19343:
-----------------------------------------
[~anujmodi] [~arunchacko]
I don't want you to think your offer is being dismissed, only that given the
stability and quality of the Google connector, I don't think the effort is
justified. That effort includes the original code base, all the testing which
will be required, maintaining and continually testing the code, and fielding
escalations.
* Mukund and I support production services deployed in GCS and Google's
connector has been very stable.
* They release at a faster rate than Hadoop does, so can be more agile in
features and fixes.
* We gain from all the testing they do.
Getting broad and deep test coverage of the cloud connectors is a real
challenge. Those of us working on the ABFS connector run the tests locally or
in Azure, with Microsoft engineers and their build system doing the majority of
it. For S3A, it is split between Amazon engineers and ourselves.
I don't know about the cloud provider engineers, but we test the code through
Hive, Spark, HBase, Impala and more, working with data in ORC, Parquet, Iceberg
files/tables, along with other things. And, because we ship, we get to discover
one of the under-tested troublespots: the rare and usually transient failures
of the remote store services highlight where both the libraries we use and our
own code lack resilience. HADOOP-19317 is an example of this.
Putting this all together then, I just do not want to see this in the codebase.
Do not feel disappointed. I am confident there were things you could've done
that would've been different from Google's solution and that you could still
contribute. What I'm going to suggest is that you actually look at the Google
connector and its outstanding issues and think about how it can be improved. I
know it is Google's project, but it is a very small team and I'm sure they
would welcome extra contributions.
> Add native support for GCS connector
> ------------------------------------
>
> Key: HADOOP-19343
> URL: https://issues.apache.org/jira/browse/HADOOP-19343
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Affects Versions: 3.5.0
> Reporter: Abhishek Modi
> Priority: Major
> Attachments: GCS connector for Hadoop.pdf
>
>