lsm1 commented on PR #5591:
URL: https://github.com/apache/kyuubi/pull/5591#issuecomment-1853401462
> Some questions:
>
> If the result needs to be ordered, and we save it to files and then read
> them back when the client fetches the result, I wonder whether the result
> returned to users will still be in the expected order.
1. Spark writes an ordered result to multiple `part-X` files in the filesystem,
in order of the keys:
https://github.com/lsm1/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala#L47
```
org.apache.spark.rdd.OrderedRDDFunctions#sortByKey
/**
 * Sort the RDD by key, so that each partition contains a sorted range of the elements. Calling
 * `collect` or `save` on the resulting RDD will return or output an ordered list of records
 * (in the `save` case, they will be written to multiple `part-X` files in the filesystem, in
 * order of the keys).
 */
```
2. When fetchOrcStatement reads the files, it sorts them by file name, so it
returns the result in the same order.
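
The two points above can be sketched without Spark. This is a minimal illustration only, not the Kyuubi or Spark implementation: `PartFile`, `writeSorted`, and `readInNameOrder` are hypothetical names, the even split into chunks stands in for Spark's range partitioning, and the `part-00000` style names are an assumption used for the example.

```scala
object OrderedPartFilesSketch {
  // Hypothetical stand-in for one part file written by one partition.
  final case class PartFile(name: String, rows: Vector[Int])

  // Simplified model of a range-partitioned sort: every chunk holds a
  // contiguous, sorted key range, and the chunk index is zero-padded
  // into the file name (assumed naming scheme for this sketch).
  def writeSorted(rows: Seq[Int], numPartitions: Int): Vector[PartFile] = {
    val sorted = rows.sorted.toVector
    val chunkSize =
      math.max(1, math.ceil(sorted.size.toDouble / numPartitions).toInt)
    sorted.grouped(chunkSize).zipWithIndex.map { case (chunk, i) =>
      PartFile(f"part-$i%05d", chunk)
    }.toVector
  }

  // Sort the files by name before concatenating their rows, so the
  // result does not depend on the order the filesystem lists them in.
  def readInNameOrder(files: Seq[PartFile]): Vector[Int] =
    files.sortBy(_.name).flatMap(_.rows).toVector

  def main(args: Array[String]): Unit = {
    val input = Seq(42, 7, 19, 3, 88, 61, 25, 14, 70, 5, 33, 96)
    val files = writeSorted(input, numPartitions = 12)
    // Simulate an arbitrary directory listing order.
    val listing = scala.util.Random.shuffle(files)
    println(readInNameOrder(listing).mkString(", "))
  }
}
```

The sketch relies on the zero padding: with unpadded names, `part-10` would sort before `part-2` lexicographically and the global order would break once there are more than ten files.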
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]