[ 
https://issues.apache.org/jira/browse/HADOOP-19343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17901218#comment-17901218
 ] 

Steve Loughran commented on HADOOP-19343:
-----------------------------------------

[~anujmodi] [~arunchacko]

I don't want you to think your offer is being dismissed, only that given the 
stability and quality of the Google connector, I don't think the effort is 
justified. That effort covers the original code base, all the testing which 
will be required, maintaining and continually testing the code, and fielding 
escalations.

* Mukund and I support production services deployed in GCS and Google's 
connector has been very stable.
* They release at a faster rate than Hadoop does, so can be more agile in 
features and fixes.
* We gain from all the testing they do.

Getting broad and deep test coverage of the cloud connectors is a real 
challenge. Those of us working on the ABFS connector run tests locally or in 
Azure, with Microsoft engineers and their build system doing the majority of 
the testing.

For S3A, it is split between Amazon engineers and ourselves. 

I don't know about the cloud provider engineers, but we test the code through 
Hive, Spark, HBase, Impala and more, working with data in ORC, Parquet, Iceberg 
files/tables, along with other things. And, because we ship, we get to discover 
one of the under-tested trouble spots: the rare and usually transient failures 
of the remote store services highlight where both the libraries we use and our 
own code lack resilience. HADOOP-19317 is an example of this.

Putting this all together, then, I just do not want to see this in the codebase.

Do not feel disappointed. I am confident there were things you could've done 
that would've been different from Google's solution and could contribute. What 
I'm going to suggest is that you actually look at the Google connector and its 
outstanding issues and think about how it can be improved. I know it is 
Google's project, but it is a very small team and I'm sure they would welcome 
extra contributions.


> Add native support for GCS connector
> ------------------------------------
>
>                 Key: HADOOP-19343
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19343
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 3.5.0
>            Reporter: Abhishek Modi
>            Priority: Major
>         Attachments: GCS connector for Hadoop.pdf
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
