[ 
https://issues.apache.org/jira/browse/ARROW-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083433#comment-17083433
 ] 

Wes McKinney edited comment on ARROW-8451 at 4/14/20, 4:57 PM:
---------------------------------------------------------------

The issues cited are orthogonal to whether or not DataFusion is developed as 
part of Apache Arrow.

* **longer build times**: DataFusion need not be built as part of the core Rust 
Arrow library
* **more frequent updates (creates noise)**: don't understand this point. If it 
relates to builds then these can be built separately
* **its roadmap can be quite independent of that of Arrow**: don't understand 
this either. The project contains many components and there are dozens of 
active developers working toward agendas that are focused in different 
programming languages. I see no problem with different components of the 
project evolving as there is community will

Apache projects are primarily about community and governance. We have a saying 
"Community over Code". Issues relating to the code (which are what is cited here) 
are not a good reason to fracture or fragment the community, or cause a part of 
the community to leave the Apache project. The project's open, community 
governance model attracts contributors and enterprises to the project because 
they know that the PMC has an obligation to protect them and their 
contributions from bad actors. 

If the DataFusion code is a nuisance to Rust core users, then its build process 
can be made more separate from the core project. 



> [Rust] [Datafusion] Why is DataFusion part of the Arrow repo?
> -------------------------------------------------------------
>
>                 Key: ARROW-8451
>                 URL: https://issues.apache.org/jira/browse/ARROW-8451
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: Rust - DataFusion
>            Reporter: Remi Dettai
>            Assignee: Andy Grove
>            Priority: Minor
>
> Datafusion is a great example of how to use Arrow. But having Datafusion 
> inside the Arrow project has several drawbacks:
>  * longer build times (rust build already slooooow)
>  * more frequent updates (creates noise)
>  * its roadmap can be quite independent of that of Arrow
> What is the actual benefit of having Datafusion inside the Arrow repo?



