Hi all,

I would like to propose a new, more community-friendly governance model for community-contributed and community-maintained extensions to the DataFusion project.

Over the last year, many DataFusion extensions have been proposed and created by the community, including the Java binding and the S3 and HDFS[1] object storage implementations. Right now this code is, or will be, hosted in individual GitHub namespaces for the following reasons:

* Most of these extensions are not considered part of the DataFusion core, so the current maintainers prefer not to have them managed in the main repository. The current Python binding and Ballista code base already add a decent amount of overhead to our development process. Adding more dependent crates will slow us down further without much upside.
* Considering the overhead of the official Apache release process, the current DataFusion PMC members don't have the bandwidth to manage individual releases for these extensions. None of the authors of these extensions are Arrow PMC members, so they won't have the access to drive the Apache releases by themselves.

Therefore, I am proposing that we create an unofficial shared GitHub organization to host these DataFusion contrib-type projects that are maintained only by non-PMC community members. I think this is strictly better than hosting these extension projects in personal GitHub namespaces. If any of these extensions ends up getting significant involvement or interest from DataFusion committers, then we can promote it to an official project and provide official Apache-style release support.

Other alternatives I have considered are:

* Keep these projects under personal namespaces and only link to them in DataFusion's documentation.
* Manage these extensions using experimental repos. But as far as I know, the code owner still needs to be a PMC member in order to perform crates.io releases, and experimental repos are not intended for long-running projects with no goal of eventual archival.
* Create a dedicated mono repo named apache/datafusion-contrib to host these extensions. However, this approach also requires PMC members to get involved for crates.io releases, if I understand it correctly.

I am curious whether this is something that could be done under the Apache governance model. My main goal is to create an unofficial incubator-type space for community members to develop and collaborate on extensions that may or may not be adopted as official extensions in the future.

[1]: https://github.com/apache/arrow-datafusion/pull/1223

Thanks,
QP