Hello,

We're from a team within Google Cloud Platform focused on OSS and data technologies, especially Hadoop (and Spark). Before we cut a JIRA for something we’d like to do, we wanted to reach out to this list to ask two quick questions, describe our proposed action, and check for any major objections.

Proposed action:
We have a Hadoop connector[1] (more info[2]) for Google Cloud Storage (GCS) which we have been building and maintaining for some time. After we clean up our code and tests to conform to these[3] and other requirements, we would like to contribute it to Hadoop. We have many customers using the connector in high-throughput production Hadoop clusters; we’d like to make it easier and faster to use Hadoop and GCS.

Timeline:
Presently, we are working on the beta of Google Cloud Dataproc[4], which limits our time a bit, so we’re targeting late Q1 2016 for creating a JIRA issue and adapting our connector code as needed.

Our (quick) questions:
* Do we need to take any (non-coding) action for this beyond submitting a JIRA when we are ready?
* Are there any up-front concerns or questions which we can (or will need to) address?

Thank you!

James Malone
On behalf of the Google Big Data OSS Engineering Team

Links:
[1] - https://github.com/GoogleCloudPlatform/bigdata-interop/tree/master/gcs
[2] - https://cloud.google.com/hadoop/google-cloud-storage-connector
[3] - https://github.com/GoogleCloudPlatform/bigdata-interop/tree/master/gcs
[4] - https://cloud.google.com/dataproc