alamb opened a new issue, #16886:
URL: https://github.com/apache/datafusion/issues/16886

   ### Is your feature request related to a problem or challenge?
   
   
   @viirya says in 
https://github.com/apache/datafusion/issues/16800#issuecomment-3084789737:
   
   > Sometimes, I feel that some important proposals in DataFusion lack 
sufficient context, or that the relevant context is scattered across various 
issues and PR comments. This makes it difficult to fully understand the 
proposals or to trace their motivations and evaluate their soundness. As a 
result, we sometimes see large PRs — hundreds or even thousands of lines — that 
are based on these proposals, making the review process even more challenging. 
Only the author or those who were involved in the initial discussions seem to 
be in a position to effectively review them.
   >
   > For example, Spark has the SPIP (Spark Project Improvement Proposal) 
mechanism, where contributors submit formal documents for review when proposing 
significant changes. These documents typically consolidate the technical 
details, motivation, and background of the proposal into a single place. This 
approach helps the community better understand and participate in discussions 
around major changes.
   >
   > I wonder if it would be beneficial for DataFusion to adopt a similar 
lightweight proposal process for major design changes — something that allows 
ideas and context to be collected and reviewed before implementation begins. It 
could help improve transparency, facilitate broader community involvement, and 
make the review process more accessible.
   >
   > If the full SPIP process — including voting and formal approval — feels 
too heavy or unnecessary for our context, perhaps we could at least establish a 
lightweight template for major change proposals. This template could include 
sections for motivation, background, technical details, and other relevant 
context. Having a consistent format would make it easier for the community to 
follow and engage with significant design discussions.
   
   My opinions:
   1. Finding the outstanding proposals and discussions is difficult. They are 
all public but there are lots of them going on
   2. The context for proposals is often scattered across issues and PRs
   3. It is hard to know when "enough" communication has been done for a 
proposal to move forward and when it needs more work
   4. Improving the communication around major changes is becoming more 
important as the project grows and we have more users and contributors
   
   For example, there are several recent discussions that could benefit from 
this more formal proposal process, including but not limited to the discussion 
quoted above:
   - https://github.com/apache/datafusion/issues/16800 itself (along with 
this one, actually)
   - https://github.com/apache/datafusion/pull/16625 from @findepi
   - https://github.com/apache/datafusion/issues/13704 in general, and 
https://github.com/apache/datafusion/issues/13704#issuecomment-3109180176 
recently with @berkasynnada
   - https://github.com/apache/datafusion/issues/16841 from @gabotechs
   - https://github.com/apache/datafusion/issues/16677 with @findepi
   
   
   
   ### Describe the solution you'd like
   
   Some sort of "process" that
   
   1. Makes it easy to find outstanding community improvement proposals 
   2. Makes it easy to know the steps to create a new improvement proposal
   3. Is documented
   
   ### Describe alternatives you've considered
   
   Here is a strawman (for discussion) proposal:
   
   1. Add a new tag in the DataFusion repo ("DIP - DataFusion Improvement 
Proposal")
   2. Add a new 
[ISSUE_TEMPLATE](https://github.com/apache/datafusion/tree/main/.github/ISSUE_TEMPLATE)
 for proposal issues, based on the SPIP one and the current DataFusion issue 
template
   3. Add a section to the site documentation describing the process
   
   I personally worry that DataFusion is not at a point where formal voting / 
formal approval would add a lot of value, but I do think formalizing the 
proposal format and making proposals easier to find would be beneficial. 
   
   I propose starting with more formalization around the communication of 
proposals, and we can add more explicit approval / consensus standards if and 
when they become necessary.
   
   ### Additional context
   
   
   Here is the documentation for the Spark process: 
https://spark.apache.org/improvement-proposals.html
   
   I looked through the [list of 
SPIPs](https://issues.apache.org/jira/browse/SPARK-51162?jql=project%20%3D%20SPARK%20AND%20status%20in%20(Open%2C%20Reopened%2C%20%22In%20Progress%22)%20AND%20(labels%20%3D%20SPIP%20OR%20summary%20~%20%22SPIP%22)%20ORDER%20BY%20createdDate%20DESC)
 in Spark and the few I looked at didn't have huge amounts of discussion. They 
often linked to a Google Doc with more details.
   
   
   

