This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/main by this push:
new 8df75c3f04 Document guidance on how to evaluate breaking API changes (#20584)
8df75c3f04 is described below
commit 8df75c3f043722f989e1936566543595f02801aa
Author: Andrew Lamb <[email protected]>
AuthorDate: Sat Feb 28 09:16:28 2026 -0500
Document guidance on how to evaluate breaking API changes (#20584)
## Which issue does this PR close?
## Rationale for this change
DataFusion does make API changes from time to time, and that is a normal
part of software development. However, it is important to evaluate the
impact of those API changes on downstream users and to ensure that the
benefits of the change are clear to those users.
I found a few times where API changes were made with the justification
that "some APIs in DataFusion are cleaner" or "this is more consistent
with other APIs". While those may be valid justifications, it is painful
for downstream users who have to change their code to accommodate the API
change when they get nothing in return.
This most recently happened in this PR:
- https://github.com/apache/datafusion/pull/19790#pullrequestreview-3863480182
Thus I think the contributor guide should include some guidance on how
to evaluate breaking API changes and to ensure that the benefits of the
change are clear to downstream users.
## What changes are included in this PR?
Polish up the API guidance section
## Are these changes tested?
By CI
## Are there any user-facing changes?
Better / clearer docs
---
docs/source/contributor-guide/api-health.md | 71 ++++++++++++++++++++---------
1 file changed, 50 insertions(+), 21 deletions(-)
diff --git a/docs/source/contributor-guide/api-health.md b/docs/source/contributor-guide/api-health.md
index ec9314ee82..f950c7cc0b 100644
--- a/docs/source/contributor-guide/api-health.md
+++ b/docs/source/contributor-guide/api-health.md
@@ -19,39 +19,68 @@
# API health policy
-DataFusion is used extensively as a library and has a large public API, thus it
-is important that the API is well maintained. In general, we try to minimize
-breaking API changes, but they are sometimes necessary.
+DataFusion is used extensively as a library in other applications and has a
+large public API. We try to keep the API well maintained and minimize breaking
+changes to avoid issues for downstream users.
-When possible, rather than making breaking API changes, we prefer to deprecate
-APIs to give users time to adjust to the changes.
+## Breaking API Changes
-## Upgrade Guides
-
-When making changes that require DataFusion users to make changes to their code
-as part of an upgrade please consider adding documentation to the version
-specific [Upgrade Guide]
-
-[upgrade guide]: ../library-user-guide/upgrading/index
+### What is the public API and what is a breaking API change?
-## Breaking Changes
-
-In general, a function is part of the public API if it appears on the [docs.rs page]
+In general, an item is part of the public API if it appears on the [docs.rs page].
Breaking public API changes are those that _require_ users to change their code
for it to compile and execute, and are listed as "Major Changes" in the [SemVer
-Compatibility Section of the cargo book]. Common examples of breaking changes:
+Compatibility Section of the Cargo Book]. Common examples of breaking changes include:
- Adding new required parameters to a function (`foo(a: i32, b: i32)` -> `foo(a: i32, b: i32, c: i32)`)
- Removing a `pub` function
- Changing the return type of a function
+- Adding a new function to a `trait` without a default implementation
+
+Examples of non-breaking changes include:
+
+- Marking a function as deprecated (`#[deprecated]`)
+- Adding a new function to a `trait` with a default implementation
+
+### When to make breaking API changes?
+
+When possible, we prefer to avoid making breaking API changes. One common way to
+avoid such changes is to deprecate the old API, as described in the [Deprecation
+Guidelines](#deprecation-guidelines) section below.
+
+If you do want to propose a breaking API change, we must weigh the benefits of the
+change with the cost (impact on downstream users). It is often frustrating for
+downstream users to change their applications, and it is even more so if they
+do not gain improved capabilities.
+
+Examples of good reasons for making a breaking API change include:
-When making breaking public API changes, please add the `api-change` label to
-the PR so we can highlight the changes in the release notes.
+- The change allows new use cases that were not possible before
+- The change significantly enables improved performance
+
+Examples of potentially weak reasons for making breaking API changes include:
+
+- The change is an internal refactor to make DataFusion more consistent
+- The change is to remove an API that is not widely used but has not been marked as deprecated
+
+### What to do when making breaking API changes?
+
+When making breaking public API changes, please:
+
+1. Add the `api-change` label to the PR so we can highlight the changes in the release notes.
+2. Consider adding documentation to the version-specific [Upgrade Guide] if the required changes are non-trivial.
[docs.rs page]: https://docs.rs/datafusion/latest/datafusion/index.html
[semver compatibility section of the cargo book]: https://doc.rust-lang.org/cargo/reference/semver.html#change-categories
+## Upgrade Guides
+
+When a change requires DataFusion users to modify their code as part of an
+upgrade, please consider documenting it in the version-specific [Upgrade Guide].
+
+[upgrade guide]: ../library-user-guide/upgrading/index.rst
+
## Deprecation Guidelines
When deprecating a method:
@@ -59,8 +88,8 @@ When deprecating a method:
- Mark the API as deprecated using `#[deprecated]` and specify the exact DataFusion version in which it was deprecated
- Concisely describe the preferred API to help the user transition
-The deprecated version is the next version which contains the deprecation. For
-example, if the current version listed in [`Cargo.toml`] is `43.0.0` then the next
+The deprecated version is the next version that introduces the deprecation. For
+example, if the current version listed in [`Cargo.toml`] is `43.0.0`, then the next
version will be `44.0.0`.
[`cargo.toml`]: https://github.com/apache/datafusion/blob/main/Cargo.toml
@@ -76,4 +105,4 @@ pub fn api_to_deprecated(a: usize, b: usize) {}
Deprecated methods will remain in the codebase for a period of 6 major versions or 6 months, whichever is longer, to provide users ample time to transition away from them.
-Please refer to [DataFusion releases](https://crates.io/crates/datafusion/versions) to plan ahead API migration
+Please refer to [DataFusion releases](https://crates.io/crates/datafusion/versions) to plan API migration ahead of time.
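As an illustration of the two non-breaking patterns named in the patch above (a `trait` method added with a default implementation, and an old function kept as a `#[deprecated]` wrapper), here is a minimal Rust sketch. All names in it (`Describe`, `Table`, `add`, `add_with_offset`) are hypothetical and are not DataFusion APIs; the `since` value simply reuses the `44.0.0` example version from the guide.

```rust
// Hypothetical trait for illustration only; not part of DataFusion.
pub trait Describe {
    fn name(&self) -> String;

    // Imagine this method was added in a later release. Because it has a
    // default implementation, existing implementors keep compiling, so the
    // addition is non-breaking. Without the default body it would be breaking.
    fn describe(&self) -> String {
        format!("item: {}", self.name())
    }
}

// A downstream implementor written before `describe` existed.
pub struct Table;

impl Describe for Table {
    fn name(&self) -> String {
        "table".to_string()
    }
}

// Instead of changing the signature of `add` (a breaking change), the old
// function is kept, marked deprecated, and forwards to the new API.
#[deprecated(since = "44.0.0", note = "use `add_with_offset` instead")]
pub fn add(a: usize, b: usize) -> usize {
    add_with_offset(a, b, 0)
}

// Hypothetical replacement API with the extra parameter.
pub fn add_with_offset(a: usize, b: usize, offset: usize) -> usize {
    a + b + offset
}

fn main() {
    // The old implementor gets the new method for free.
    println!("{}", Table.describe()); // prints "item: table"

    // Callers of the deprecated function still compile (with a warning
    // unless they opt out, as done here).
    #[allow(deprecated)]
    let sum = add(1, 2);
    println!("{}", sum); // prints "3"
}
```

Downstream code that still calls `add` continues to build and only sees a deprecation warning pointing at the replacement, which is the transition period the deprecation guidelines describe.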