This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git

The following commit(s) were added to refs/heads/main by this push:
     new a2f84f3a8d Docs: Consolidate feature proposal content into roadmap (#17156)
a2f84f3a8d is described below

commit a2f84f3a8d3b939cb824ebd51bfae611df25e1e5
Author: Andrew Lamb <and...@nerdnetworks.org>
AuthorDate: Fri Aug 15 13:36:02 2025 -0700

    Docs: Consolidate feature proposal content into roadmap (#17156)

    * Docs: Consolidate feature proposal content into roadmap

    * demote to proper headings
---
 docs/source/contributor-guide/index.md   | 71 ++-----------------------------
 docs/source/contributor-guide/roadmap.md | 72 +++++++++++++++++++++++++++++++-
 2 files changed, 73 insertions(+), 70 deletions(-)

diff --git a/docs/source/contributor-guide/index.md b/docs/source/contributor-guide/index.md
index a4e3e4cb94..383827893c 100644
--- a/docs/source/contributor-guide/index.md
+++ b/docs/source/contributor-guide/index.md
@@ -37,14 +37,15 @@ You can find how to setup build and testing environment [here](https://datafusio
 ## Finding and Creating Issues to Work On

 You can find a curated [good-first-issue] list to help you get started.
+You can read about how we plan larger projects in the [Roadmap and Improvement Proposals](roadmap.md) section.

 [good-first-issue]: https://github.com/apache/datafusion/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22

 ### Open Contribution and Assigning tickets

 DataFusion is an open contribution project, and thus there is no particular
-project imposed deadline for completing any issue or any restriction on who can
-work on an issue, nor how many people can work on an issue at the same time.
+project imposed deadline for completing issues or restrictions on who can
+work on an issue, nor limits to how many people can work on an issue at the same time.

 Contributors drive the project forward based on their own priorities and
 interests and thus you are free to work on any issue that interests you.
@@ -62,72 +63,6 @@ unable to make progress you should unassign the issue by using the `unassign
 me` link at the top of the issue page (and ask for help if are stuck) so that
 someone else can get involved in the work.

-### Discussing New Features
-
-If you plan to work on a new feature that doesn't have an existing ticket, it is
-a good idea to open a ticket to discuss the feature. Advanced discussion often
-helps avoid wasted effort by determining early if the feature is a good fit for
-DataFusion before too much time is invested. Discussion on a ticket can help
-gather feedback from the community and is likely easier to discuss than a 1000
-line PR.
-
-If you open a ticket and it doesn't get any response, you can try `@`-mentioning
-recently active community members in the ticket to get their attention.
-
-### What Contributions are Good Fits?
-
-DataFusion is designed to be highly extensible, and many features can be
-implemented as extensions without changes or additions to the core. Support for
-new functions, data formats, and similar functionality can be added using those
-extension APIs, and there are already many existing community supported
-extensions listed in the [extensions list].
-
-Query engines are complex pieces of software to develop and maintain. Given our
-limited maintenance bandwidth, we try to keep the DataFusion core as simple and
-focused as possible, while still satisfying the [design goal] of an easy to
-start initial experience.
-
-With that in mind, contributions that meet the following criteria are more likely
-to be accepted:
-
-1. Bug fixes for existing features
-2. Test coverage for existing features
-3. Documentation improvements / examples
-4. Performance improvements to existing features (with benchmarks)
-5. "Small" functional improvements to existing features (if they don't change existing behavior)
-6. Additional APIs for extending DataFusion's capabilities
-7. CI improvements
-
-Contributions that will likely involve more discussion (see Discussing New
-Features above) prior to acceptance include:
-
-1. Major new functionality (even if it is part of the "standard SQL")
-2. New functions, especially if they aren't part of "standard SQL"
-3. New data sources (e.g. support for Apache ORC)
-
-[extensions list]: ../library-user-guide/extensions.md
-[design goal]: https://docs.rs/datafusion/latest/datafusion/index.html#design-goals
-
-### Design Build vs. Big Up Front Design
-
-Typically, the DataFusion community attacks large problems by solving them bit
-by bit and refining a solution iteratively on the `main` branch as a series of
-Pull Requests. This is different from projects which front-load the effort
-with a more comprehensive design process.
-
-By "advancing the front" the community always makes tangible progress, and the strategy is
-especially effective in a project that relies on individual contributors who may
-not have the time or resources to invest in a large upfront design effort.
-However, this "bit by bit approach" doesn't always succeed, and sometimes we get
-stuck or go down the wrong path and then change directions.
-
-Our process necessarily results in imperfect solutions being the "state of the
-code" in some cases, and larger visions are not yet fully realized. However, the
-community is good at driving things to completion in the long run. If you see
-something that needs improvement or an area that is not yet fully realized,
-please consider submitting an issue or PR to improve it. We are always looking
-for more contributions.
-
 # Developer's guide

 ## Pull Request Overview
diff --git a/docs/source/contributor-guide/roadmap.md b/docs/source/contributor-guide/roadmap.md
index 79add1b86f..7dad7fe5e9 100644
--- a/docs/source/contributor-guide/roadmap.md
+++ b/docs/source/contributor-guide/roadmap.md
@@ -17,7 +17,7 @@ specific language governing permissions and limitations
 under the License.
 -->

-# Roadmap
+# Roadmap and Improvement Proposals

 The [project introduction](../user-guide/introduction) explains the
 overview and goals of DataFusion, and our development efforts largely
@@ -44,7 +44,7 @@ make review efficient and avoid surprises.

 [The current list of `EPIC`s can be found here](https://github.com/apache/datafusion/issues?q=is%3Aissue+is%3Aopen+epic).

-# Quarterly Roadmap
+## Quarterly Roadmap

 The DataFusion roadmap is driven by the priorities of contributors rather than
 any single organization or coordinating committee. We typically discuss our
@@ -56,3 +56,71 @@ For more information:
 1. [Search for issues labeled `roadmap`](https://github.com/apache/datafusion/issues?q=is%3Aissue%20%20%20roadmap)
 2. [DataFusion Road Map: Q3-Q4 2025](https://github.com/apache/datafusion/issues/15878)
 3. [2024 Q4 / 2025 Q1 Roadmap](https://github.com/apache/datafusion/issues/13274)
+
+## Improvement Proposals
+
+### Discussing New Features
+
+If you plan to work on a new feature that doesn't have an existing ticket, it is
+a good idea to open a ticket to discuss the feature. Advanced discussion often
+helps avoid wasted effort by determining early if the feature is a good fit for
+DataFusion before too much time is invested. Discussion on a ticket can help
+gather feedback from the community and is likely easier to discuss than a 1000
+line PR.
+
+If you open a ticket and it doesn't get any response, you can try `@`-mentioning
+recently active community members in the ticket to get their attention.
+
+### What Contributions are Good Fits?
+
+DataFusion is designed to be highly extensible, and many features can be
+implemented as extensions without changes or additions to the core. Support for
+new functions, data formats, and similar functionality can be added using those
+extension APIs, and there are already many existing community supported
+extensions listed in the [extensions list].
+
+Query engines are complex pieces of software to develop and maintain. Given our
+limited maintenance bandwidth, we try to keep the DataFusion core as simple and
+focused as possible, while still satisfying the [design goal] of an easy to
+start initial experience.
+
+With that in mind, contributions that meet the following criteria are more likely
+to be accepted:
+
+1. Bug fixes for existing features
+2. Test coverage for existing features
+3. Documentation improvements / examples
+4. Performance improvements to existing features (with benchmarks)
+5. "Small" functional improvements to existing features (if they don't change existing behavior)
+6. Additional APIs for extending DataFusion's capabilities
+7. CI improvements
+
+Contributions that will likely involve more discussion (see Discussing New
+Features above) prior to acceptance include:
+
+1. Major new functionality (even if it is part of the "standard SQL")
+2. New functions, especially if they aren't part of "standard SQL"
+3. New data sources (e.g. support for Apache ORC)
+
+[extensions list]: ../library-user-guide/extensions.md
+[design goal]: https://docs.rs/datafusion/latest/datafusion/index.html#design-goals
+
+### Design Build vs. Big Up Front Design
+
+Typically, the DataFusion community attacks large problems by solving them bit
+by bit and refining a solution iteratively on the `main` branch as a series of
+Pull Requests. This is different from projects which front-load the effort
+with a more comprehensive design process.
+
+By "advancing the front" the community always makes tangible progress, and the strategy is
+especially effective in a project that relies on individual contributors who may
+not have the time or resources to invest in a large upfront design effort.
+However, this "bit by bit approach" doesn't always succeed, and sometimes we get
+stuck or go down the wrong path and then change directions.
+
+Our process necessarily results in imperfect solutions being the "state of the
+code" in some cases, and larger visions are not yet fully realized. However, the
+community is good at driving things to completion in the long run. If you see
+something that needs improvement or an area that is not yet fully realized,
+please consider submitting an issue or PR to improve it. We are always looking
+for more contributions.