dzcxzl created SPARK-44240:
------------------------------

             Summary: Setting the topKSortFallbackThreshold value may lead to inaccurate results
                 Key: SPARK-44240
                 URL: https://issues.apache.org/jira/browse/SPARK-44240
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.4.0, 3.3.0, 3.2.0, 3.1.0, 3.0.0, 2.4.0
            Reporter: dzcxzl

{code:sql}
set spark.sql.execution.topKSortFallbackThreshold=10000;
SELECT min(id) FROM (SELECT id FROM range(999999999) ORDER BY id LIMIT 10000) a;
{code}

The query above should return 0, but it may return a different value.

Because the limit (10000) is not below the configured threshold, the ORDER BY ... LIMIT is planned as a sort followed by a limit instead of TakeOrderedAndProjectExec. If GlobalLimitExec is not the final operator, the shuffle read that feeds it does not guarantee row order, so the rows kept by the limit may be arbitrary rather than the first 10000 in sort order. TakeOrderedAndProjectExec performs the ordering itself, so it does not have this problem.
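
The effect described above can be illustrated with a small model in plain Python. This is only a sketch and not Spark code: the helper names (range_partition, local_limit, global_limit_unordered, take_ordered) are hypothetical, the partition count and sizes are made up, and the arrival order of shuffle blocks is passed in explicitly to stand in for the order in which a shuffle read happens to return them.

```python
import heapq

def range_partition(n, num_partitions):
    """Split 0..n-1 into contiguous, individually sorted partitions
    (a stand-in for the output of a global sort)."""
    size = -(-n // num_partitions)  # ceiling division
    return [list(range(start, min(start + size, n))) for start in range(0, n, size)]

def local_limit(partitions, k):
    """Keep at most k rows from each partition."""
    return [p[:k] for p in partitions]

def global_limit_unordered(partitions, k, arrival_order):
    """Concatenate partitions in the order they arrive, then keep the first k rows.
    No ordering across partitions is enforced here."""
    rows = [row for idx in arrival_order for row in partitions[idx]]
    return rows[:k]

def take_ordered(partitions, k):
    """Keep the k smallest rows regardless of arrival order."""
    return heapq.nsmallest(k, (row for p in partitions for row in p))

parts = local_limit(range_partition(100, 4), k=10)

# Assumed arrival order: the block from partition 2 happens to be read first.
wrong = global_limit_unordered(parts, 10, arrival_order=[2, 0, 3, 1])
right = take_ordered(parts, 10)

print(min(wrong))  # 50
print(min(right))  # 0
```

In this model the limit-after-shuffle path is only correct when the blocks happen to arrive in partition order, whereas the ordered path gives the same answer for any arrival order.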