[ https://issues.apache.org/jira/browse/ARROW-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083433#comment-17083433 ]
Wes McKinney edited comment on ARROW-8451 at 4/14/20, 4:57 PM:
---------------------------------------------------------------

The issues cited are orthogonal to whether or not DataFusion is developed as part of Apache Arrow.

* **longer build times**: DataFusion need not be built as part of the core Rust Arrow library.
* **more frequent updates (creates noise)**: I don't understand this point. If it relates to builds, then these can be built separately.
* **its roadmap can be quite independent of that of Arrow**: I don't understand this either. The project contains many components, and there are dozens of active developers working toward agendas that are focused on different programming languages. I see no problem with different components of the project evolving as there is community will.

Apache projects are primarily about community and governance. We have a saying: "Community over Code". Issues relating to the code (which are what is cited here) are not a good reason to fracture or fragment the community, or to cause a part of the community to leave the Apache project. The project's open, community governance model attracts contributors and enterprises to the project because they know that the PMC has an obligation to protect them and their contributions from bad actors. If the DataFusion code is a nuisance to Rust core users, then its build process can be made more separate from the core project.

> [Rust] [Datafusion] Why is DataFusion part of the Arrow repo?
> -------------------------------------------------------------
>
> Key: ARROW-8451
> URL: https://issues.apache.org/jira/browse/ARROW-8451
> Project: Apache Arrow
> Issue Type: Wish
> Components: Rust - DataFusion
> Reporter: Remi Dettai
> Assignee: Andy Grove
> Priority: Minor
>
> Datafusion is a great example of how to use Arrow. But having Datafusion
> inside the Arrow project has several drawbacks:
> * longer build times (rust build already slooooow)
> * more frequent updates (creates noise)
> * its roadmap can be quite independent of that of Arrow
> What is the actual benefit of having Datafusion inside the Arrow repo?

--
This message was sent by Atlassian Jira (v8.3.4#803005)