andygrove commented on code in PR #3076:
URL: https://github.com/apache/datafusion-comet/pull/3076#discussion_r2710900388


##########
docs/source/user-guide/latest/compatibility.md:
##########
@@ -69,6 +69,31 @@ this can be overridden by setting `spark.comet.regexp.allowIncompatible=true`.
 Comet's support for window functions is incomplete and known to be incorrect. It is disabled by default and
 should not be used in production. The feature will be enabled in a future release. Tracking issue: [#2721](https://github.com/apache/datafusion-comet/issues/2721).
 
+## Round-Robin Partitioning
+
+Comet's native shuffle implementation of round-robin partitioning (`df.repartition(n)`) is not compatible with
+Spark's implementation and is disabled by default. It can be enabled by setting
+`spark.comet.native.shuffle.partitioning.roundrobin.enabled=true`.
+
+**Why the incompatibility exists:**
+
+Spark's round-robin partitioning sorts rows by their binary `UnsafeRow` representation before assigning them to
+partitions. This ensures deterministic output for fault tolerance (task retries produce identical results).
+Comet uses Arrow format internally, which has a completely different binary layout than `UnsafeRow`, making it
+impossible to match Spark's exact partition assignments.
+
+**Comet's approach:**
+
+Instead of true round-robin assignment, Comet implements round-robin as hash partitioning on ALL columns. This
+achieves the same semantic goals:
+
+- **Even distribution**: Rows are distributed evenly across partitions

Review Comment:
   This documentation needs updating. Hash partitioning gives a good distribution in some cases, but an even distribution is not guaranteed.
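
To illustrate the concern, here is a minimal Python sketch (not Comet's code) comparing true round-robin assignment with hashing the whole row. The function names are hypothetical, and `zlib.crc32` over `repr(row)` is only an assumed stand-in for whatever hash the native shuffle actually uses. The point it tries to show is that identical rows always hash to the same partition, so skewed input can produce skewed output.

```python
import zlib
from collections import Counter


def round_robin_partition(rows, num_partitions):
    """True round-robin: the i-th row goes to partition i mod n."""
    return [i % num_partitions for i, _ in enumerate(rows)]


def hash_all_columns_partition(rows, num_partitions):
    """Hash every column of the row (crc32 is a stand-in hash, an assumption).

    Identical rows always produce the same hash, so they always land
    in the same partition.
    """
    return [
        zlib.crc32(repr(row).encode("utf-8")) % num_partitions
        for row in rows
    ]


# Worst case for hashing: every row is identical.
rows = [("a", 1)] * 8

rr_counts = Counter(round_robin_partition(rows, 4))
hash_counts = Counter(hash_all_columns_partition(rows, 4))

print(sorted(rr_counts.values()))    # [2, 2, 2, 2]
print(sorted(hash_counts.values()))  # [8]
```

With mostly distinct rows the hashed assignment would likely spread out reasonably well, which matches "good distribution in some cases"; the duplicate-row case above is where the "evenly" wording in the docs does not hold.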



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

