This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion-comet.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 682c9a222 Publish built docs triggered by 0b33b051ddab43d188e3637b635fed18330bccc5
682c9a222 is described below
commit 682c9a2228ae44b30e845af97c4f3e2563f58c39
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Tue Oct 28 17:22:37 2025 +0000
Publish built docs triggered by 0b33b051ddab43d188e3637b635fed18330bccc5
---
_sources/user-guide/latest/compatibility.md.txt | 4 ++++
_sources/user-guide/latest/tuning.md.txt | 6 ++++++
searchindex.js | 2 +-
user-guide/latest/compatibility.html | 3 +++
user-guide/latest/tuning.html | 6 ++++++
5 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/_sources/user-guide/latest/compatibility.md.txt b/_sources/user-guide/latest/compatibility.md.txt
index 562baabfd..6c3bab59d 100644
--- a/_sources/user-guide/latest/compatibility.md.txt
+++ b/_sources/user-guide/latest/compatibility.md.txt
@@ -97,6 +97,10 @@ because they are handled well in Spark (e.g., `SQLOrderingUtil.compareFloats`).
functions of arrow-rs used by DataFusion do not normalize NaN and zero (e.g., [arrow::compute::kernels::cmp::eq](https://docs.rs/arrow/latest/arrow/compute/kernels/cmp/fn.eq.html#)).
So Comet will add additional normalization expression of NaN and zero for comparison.
+Sorting on floating-point data types (or complex types containing floating-point values) is not compatible with
+Spark if the data contains both zero and negative zero. This is likely an edge case that is not of concern for many users,
+and sorting on floating-point data can be enabled by setting `spark.comet.expression.SortOrder.allowIncompatible=true`.
+
There is a known bug with using count(distinct) within aggregate queries, where each NaN value will be counted
separately [#1824](https://github.com/apache/datafusion-comet/issues/1824).
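As a minimal illustration of the zero/negative-zero caveat in the doc change above (not part of the commit): under IEEE 754, `0.0` and `-0.0` compare equal, but their bit patterns differ, so an ordering that distinguishes them (such as Rust's `f64::total_cmp`, which places `-0.0` before `0.0`) can disagree with Spark, which normalizes `-0.0` to `0.0` before comparing.

```python
import struct

# 0.0 and -0.0 compare equal under IEEE 754 arithmetic...
assert 0.0 == -0.0

# ...but only -0.0 has the sign bit set, so a byte-wise or
# total-order comparison can distinguish values that Spark's
# normalizing comparator treats as identical.
pos = struct.pack(">d", 0.0).hex()
neg = struct.pack(">d", -0.0).hex()
print(pos)  # 0000000000000000
print(neg)  # 8000000000000000
```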
diff --git a/_sources/user-guide/latest/tuning.md.txt b/_sources/user-guide/latest/tuning.md.txt
index cc0109526..21b1df652 100644
--- a/_sources/user-guide/latest/tuning.md.txt
+++ b/_sources/user-guide/latest/tuning.md.txt
@@ -100,6 +100,12 @@ Comet Performance
It may be possible to reduce Comet's memory overhead by reducing batch sizes or increasing number of partitions.
+## Optimizing Sorting on Floating-Point Values
+
+Sorting on floating-point data types (or complex types containing floating-point values) is not compatible with
+Spark if the data contains both zero and negative zero. This is likely an edge case that is not of concern for many users,
+and sorting on floating-point data can be enabled by setting `spark.comet.expression.SortOrder.allowIncompatible=true`.
+
## Optimizing Joins
Spark often chooses `SortMergeJoin` over `ShuffledHashJoin` for stability reasons. If the build-side of a
diff --git a/searchindex.js b/searchindex.js
index e3cf0355d..cd5455146 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles": {"1. Install Comet": [[16, "install-comet"]], "2. Clone Spark and Apply Diff": [[16, "clone-spark-and-apply-diff"]], "3. Run Spark SQL Tests": [[16, "run-spark-sql-tests"]], "ANSI Mode": [[19, "ansi-mode"], [32, "ansi-mode"], [72, "ansi-mode"]], "ANSI mode": [[45, "ansi-mode"], [58, "ansi-mode"]], "API Differences Between Spark Versions": [[3, "api-differences-between-spark-versions"]], "ASF Links": [[2, null], [2, null]], "Accelerating Apache Iceberg Parque [...]
\ No newline at end of file
+Search.setIndex({"alltitles": {"1. Install Comet": [[16, "install-comet"]], "2. Clone Spark and Apply Diff": [[16, "clone-spark-and-apply-diff"]], "3. Run Spark SQL Tests": [[16, "run-spark-sql-tests"]], "ANSI Mode": [[19, "ansi-mode"], [32, "ansi-mode"], [72, "ansi-mode"]], "ANSI mode": [[45, "ansi-mode"], [58, "ansi-mode"]], "API Differences Between Spark Versions": [[3, "api-differences-between-spark-versions"]], "ASF Links": [[2, null], [2, null]], "Accelerating Apache Iceberg Parque [...]
\ No newline at end of file
diff --git a/user-guide/latest/compatibility.html b/user-guide/latest/compatibility.html
index ee6f8181d..b55c99b35 100644
--- a/user-guide/latest/compatibility.html
+++ b/user-guide/latest/compatibility.html
@@ -548,6 +548,9 @@ However, one exception is comparison. Spark does not normalize NaN and zero when
because they are handled well in Spark (e.g., <code class="docutils literal notranslate"><span class="pre">SQLOrderingUtil.compareFloats</span></code>). But the comparison
functions of arrow-rs used by DataFusion do not normalize NaN and zero (e.g., <a class="reference external" href="https://docs.rs/arrow/latest/arrow/compute/kernels/cmp/fn.eq.html#">arrow::compute::kernels::cmp::eq</a>).
So Comet will add additional normalization expression of NaN and zero for comparison.</p>
+<p>Sorting on floating-point data types (or complex types containing floating-point values) is not compatible with
+Spark if the data contains both zero and negative zero. This is likely an edge case that is not of concern for many users,
+and sorting on floating-point data can be enabled by setting <code class="docutils literal notranslate"><span class="pre">spark.comet.expression.SortOrder.allowIncompatible=true</span></code>.</p>
<p>There is a known bug with using count(distinct) within aggregate queries, where each NaN value will be counted
separately <a class="reference external" href="https://github.com/apache/datafusion-comet/issues/1824">#1824</a>.</p>
</section>
diff --git a/user-guide/latest/tuning.html b/user-guide/latest/tuning.html
index caac28b3d..952dc3e9e 100644
--- a/user-guide/latest/tuning.html
+++ b/user-guide/latest/tuning.html
@@ -523,6 +523,12 @@ providing better performance than Spark for half the resource</p></li>
<p>It may be possible to reduce Comet’s memory overhead by reducing batch sizes or increasing number of partitions.</p>
</section>
</section>
+<section id="optimizing-sorting-on-floating-point-values">
+<h2>Optimizing Sorting on Floating-Point Values<a class="headerlink" href="#optimizing-sorting-on-floating-point-values" title="Link to this heading">#</a></h2>
+<p>Sorting on floating-point data types (or complex types containing floating-point values) is not compatible with
+Spark if the data contains both zero and negative zero. This is likely an edge case that is not of concern for many users,
+and sorting on floating-point data can be enabled by setting <code class="docutils literal notranslate"><span class="pre">spark.comet.expression.SortOrder.allowIncompatible=true</span></code>.</p>
+</section>
<section id="optimizing-joins">
<h2>Optimizing Joins<a class="headerlink" href="#optimizing-joins" title="Link to this heading">#</a></h2>
<p>Spark often chooses <code class="docutils literal notranslate"><span class="pre">SortMergeJoin</span></code> over <code class="docutils literal notranslate"><span class="pre">ShuffledHashJoin</span></code> for stability reasons. If the build-side of a
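For readers of the docs changed above: if the zero/negative-zero ordering difference is acceptable for a workload, the new config can be set like any other Spark conf. A sketch of a session-level configuration (assumes a Spark installation with the Comet plugin jar on the classpath, per the Comet installation docs; the app name is a placeholder, and this fragment is not runnable without that setup):

```python
from pyspark.sql import SparkSession

# Configuration sketch: opt in to Comet-accelerated sorts on
# floating-point columns, accepting the +0.0/-0.0 ordering
# difference described in the compatibility guide.
spark = (
    SparkSession.builder
    .appName("comet-fp-sort-example")  # placeholder name
    .config("spark.plugins", "org.apache.spark.CometPlugin")
    .config("spark.comet.enabled", "true")
    .config("spark.comet.expression.SortOrder.allowIncompatible", "true")
    .getOrCreate()
)
```

The same three settings can equally be passed as `--conf` arguments to `spark-submit` or placed in `spark-defaults.conf`.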
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]