Hi,

Thanks for reaching out. It would be great to see this in the Hadoop ecosystem.

In Hadoop we already have AWS S3 support. IMO the two address similar
use cases, so I think it should be relatively straightforward to adopt
the code.

The only catch I can think of right now is properly isolating
dependencies. Not only does the code need to be put into a separate
module, but many Hadoop applications also depend on different versions
of Guava. I think this might be a problem that needs some attention at
the very beginning.

Please feel free to reach out if you have any other questions.

Regards,
Haohui


On Mon, Dec 7, 2015 at 2:35 PM, James Malone
<jamesmal...@google.com.invalid> wrote:
> Hello,
>
> We're from a team within Google Cloud Platform focused on OSS and data
> technologies, especially Hadoop (and Spark.) Before we cut a JIRA for
> something we’d like to do, we wanted to reach out to this list to ask two
> quick questions, describe our proposed action, and check for any major
> objections.
>
> Proposed action:
> We have a Hadoop connector[1] (more info[2]) for Google Cloud Storage (GCS)
> which we have been building and maintaining for some time. After we clean
> up our code and tests to conform (to these[3] and other requirements) we
> would like to contribute it to Hadoop. We have many customers using the
> connector in high-throughput production Hadoop clusters; we’d like to make
> it easier and faster to use Hadoop and GCS.
>
> Timeline:
> Presently, we are working on the beta of Google Cloud Dataproc[4] which
> limits our time a bit, so we’re targeting late Q1 2016 for creating a JIRA
> issue and adapting our connector code as needed.
>
> Our (quick) questions:
> * Do we need to take any (non-coding) action for this beyond submitting a
> JIRA when we are ready?
> * Are there any up-front concerns or questions which we can (or will need
> to) address?
>
> Thank you!
>
> James Malone
> On behalf of the Google Big Data OSS Engineering Team
>
> Links:
> [1] - https://github.com/GoogleCloudPlatform/bigdata-interop/tree/master/gcs
> [2] - https://cloud.google.com/hadoop/google-cloud-storage-connector
> [3] - https://github.com/GoogleCloudPlatform/bigdata-interop/tree/master/gcs
> [4] - https://cloud.google.com/dataproc
