Re: [PR] Add design process section to the docs [datafusion]
comphead merged PR #16397: URL: https://github.com/apache/datafusion/pull/16397 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add design process section to the docs [datafusion]
alamb commented on code in PR #16397:
URL: https://github.com/apache/datafusion/pull/16397#discussion_r2149689081

## docs/source/contributor-guide/index.md: ##
@@ -108,6 +108,26 @@ Features above) prior to acceptance include:
 [extensions list]: ../library-user-guide/extensions.md
 [design goal]: https://docs.rs/datafusion/latest/datafusion/index.html#design-goals
+### Design Build vs. Big Up Front Design
+
+Typically, the DataFusion community attacks large problems by solving them bit
+by bit and refining a solution iteratively on the `main` branch as a series of
+Pull Requests. This is different from projects which front-load the effort
+with a more comprehensive design process.
+
+By "advancing the front" we always make tangible progress, and the strategy is
+especially effective in a project that relies on individual contributors who may
+not have the time or resources to invest in a large upfront design effort.
+However, this "bit by bit approach" doesn't always succeed, and sometimes we get

Review Comment:
I think the idea behind this sentence is to acknowledge the tradeoffs inherent in "design / build" vs "big design all upfront" (it is this tension that actually sparked the original comment in the first place).
Re: [PR] Add design process section to the docs [datafusion]
comphead commented on code in PR #16397:
URL: https://github.com/apache/datafusion/pull/16397#discussion_r2147429880

## docs/source/contributor-guide/index.md: ##
@@ -108,6 +108,26 @@ Features above) prior to acceptance include:
 [extensions list]: ../library-user-guide/extensions.md
 [design goal]: https://docs.rs/datafusion/latest/datafusion/index.html#design-goals
+### Design Build vs. Big Up Front Design
+
+Typically, the DataFusion community attacks large problems by solving them bit
+by bit and refining a solution iteratively on the `main` branch as a series of
+Pull Requests. This is different from projects which front-load the effort
+with a more comprehensive design process.
+
+By "advancing the front" we always make tangible progress, and the strategy is
+especially effective in a project that relies on individual contributors who may
+not have the time or resources to invest in a large upfront design effort.
+However, this "bit by bit approach" doesn't always succeed, and sometimes we get

Review Comment:
```suggestion
However, this "bit by bit approach" doesn't always succeed, and sometimes we get
```
wondering if this is needed?
Re: [PR] Add design process section to the docs [datafusion]
comphead commented on code in PR #16397:
URL: https://github.com/apache/datafusion/pull/16397#discussion_r2147429475

## docs/source/contributor-guide/index.md: ##
@@ -108,6 +108,26 @@ Features above) prior to acceptance include:
 [extensions list]: ../library-user-guide/extensions.md
 [design goal]: https://docs.rs/datafusion/latest/datafusion/index.html#design-goals
+### Design Build vs. Big Up Front Design
+
+Typically, the DataFusion community attacks large problems by solving them bit
+by bit and refining a solution iteratively on the `main` branch as a series of
+Pull Requests. This is different from projects which front-load the effort
+with a more comprehensive design process.
+
+By "advancing the front" we always make tangible progress, and the strategy is

Review Comment:
```suggestion
By "advancing the front" the community always makes tangible progress, and the strategy is
```
Re: [PR] Add design process section to the docs [datafusion]
alamb commented on PR #16397:
URL: https://github.com/apache/datafusion/pull/16397#issuecomment-2971466770

> Relevant question to this text and the project is what the project's stance is wrt API stability? Merging fast means you're likely to ship something a little bit too quickly every now and then. I'm not saying it's a bad strategy, just wondering how you balance the tension between stability and velocity.

I would say we "try not to do API churn but it happens every release". Indeed it has come up as a challenge for downstream users, though I would say it has been less of a challenge over the last 6 months or so.

There is more detail here: https://github.com/apache/datafusion/issues/13525

The policy is documented here: https://datafusion.apache.org/contributor-guide/api-health.html

You can get a sense of the kinds of changes required by looking at https://datafusion.apache.org/library-user-guide/upgrading.html

Basically, at some point I expect users of DataFusion will care enough about non-breaking releases that they will want to contribute to helping make some release vehicle that has stable APIs (e.g. backporting changes to an LTS release).

But until that happens we just keep cranking on the code and change APIs every month.
Re: [PR] Add design process section to the docs [datafusion]
pepijnve commented on PR #16397:
URL: https://github.com/apache/datafusion/pull/16397#issuecomment-2971445102

Sorry to go a bit off topic for a sec, but there's some context I would like to add. I worked on API design of a commercial software library with tons of extension points for 10+ years where backwards compatibility of the public API was something we stuck to religiously because of the burden API breakage places on the entire user base. Doing that kind of work for an extended period of time makes you think three times about new API and all the hypothetical uses and abuses; perhaps a bit too much.

Relevant question to this text and the project is what the project's stance is wrt API stability? Merging fast means you're likely to ship something a little bit too quickly every now and then. I'm not saying it's a bad strategy, just wondering how you balance the tension between stability and velocity.
Re: [PR] Add design process section to the docs [datafusion]
alamb commented on PR #16397:
URL: https://github.com/apache/datafusion/pull/16397#issuecomment-2971392100

> This is really nice, thanks @alamb!

Thanks -- I was just channeling @ozankabak :)
Re: [PR] Add design process section to the docs [datafusion]
alamb commented on code in PR #16397:
URL: https://github.com/apache/datafusion/pull/16397#discussion_r2145138376

## docs/source/contributor-guide/index.md: ##
@@ -108,6 +108,26 @@ Features above) prior to acceptance include:
 [extensions list]: ../library-user-guide/extensions.md
 [design goal]: https://docs.rs/datafusion/latest/datafusion/index.html#design-goals
+### Design Build vs. Big Up Front Design
+
+Typically, the DataFusion community attacks large problems by solving them bit
+by bit and refining a solution iteratively on the `main` branch as a series of
+Pull Requests. This is different from projects which front-load the effort
+with a more comprehensive design process.
+
+By "advancing the front" we always make tangible progress, and the strategy is
+especially effective in a project that relies on individual contributors who may
+not have the time or resources to invest in a large upfront design effort.
+However, this "bit by bit approach" doesn't always succeed, and sometimes we get
+stuck or go down the wrong path and then change directions.
+
+Our process necessarily results in imperfect solutions being the "state of the
+code" in some cases, and larger visions are not yet fully realized. However, the
+community is good at driving things to completion in the long run. If you see
+something that needs improvement or an area that is not yet fully realized,
+please consider submitting an issue or PR to improve it. We are always looking
+for more contributions.

Review Comment:
Of course we always have to be 🎣
[PR] Add design process section to the docs [datafusion]
alamb opened a new pull request, #16397:
URL: https://github.com/apache/datafusion/pull/16397

## Which issue does this PR close?

- part of #7013

## Rationale for this change

While discussing a design for cancellation with @pepijnve, @zhuqi-lucas, and myself, @ozankabak wrote a great summary of how the DataFusion community handles larger projects:

- https://github.com/apache/datafusion/pull/16196#issuecomment-2956513724

> Look, I see that you are trying to help and we do want to take it. I suspect we might be facing a "culture" challenge here: Typically, DF community attacks large problems by solving them bit by bit and refining a solution iteratively. This is unlike some other projects which front-load the effort by going through a more comprehensive design process. We also do that for some tasks where this iterative approach is not applicable, but it is not very common.
>
> This "bit by bit approach" doesn't always succeed, every now and then it happens that we get stuck or go down the wrong path for a while, and then change tacks. However, we still typically prefer to "advance the front" and make progress in tangible ways as much as we can (if we see a way). This necessarily results in imperfect solutions being the "state of the code" in some cases, and they survive in the codebase for a while, but we are good at driving things to completion in the long run.

I really liked that description and think it captures the current state of the project well, so it would be valuable to make it part of the docs.

## What changes are included in this PR?

Add a description of the design process to the DataFusion docs site.

## Are these changes tested?

By CI

## Are there any user-facing changes?

New docs
