Hi Nicholas,

Your point:

"In SQL, the result order of any query is implementation-dependent without an explicit ORDER BY clause. Technically, you could run `SELECT * FROM table;` 10 times in a row and get 10 different orderings."

Yes, I concur; my understanding is the same. SQL does not guarantee any particular order for results unless an ORDER BY clause is used, so the database engine is free to return the rows in whatever order it sees fit.

HTH

Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom

view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.

On Mon, 18 Sept 2023 at 16:58, Reynold Xin <r...@databricks.com.invalid> wrote:

> It should be the same as SQL. Otherwise it takes away a lot of potential
> future optimization opportunities.
>
>
> On Mon, Sep 18 2023 at 8:47 AM, Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
>> I’ve always considered DataFrames to be logically equivalent to SQL
>> tables or queries.
>>
>> In SQL, the result order of any query is implementation-dependent without
>> an explicit ORDER BY clause. Technically, you could run `SELECT * FROM
>> table;` 10 times in a row and get 10 different orderings.
>>
>> I thought the same applied to DataFrames, but the docstring for the
>> recently added method DataFrame.offset
>> <https://github.com/apache/spark/pull/40873/files#diff-4ff57282598a3b9721b8d6f8c2fea23a62e4bc3c0f1aa5444527549d1daa38baR1293-R1301>
>> implies otherwise.
>>
>> This example will work fine in practice, of course. But if DataFrames are
>> technically unordered without an explicit ordering clause, then in theory a
>> future implementation change may result in “Bob” being the “first” row in
>> the DataFrame, rather than “Tom”. That would make the example incorrect.
>>
>> Is that not the case?
>>
>> Nick
>>
>
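
To make the ordering point in this thread concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for any SQL engine, so it is not Spark and says nothing about how DataFrame.offset is implemented. The `people` table and its rows are made up for illustration. The idea is simply that skipping the "first" row is only well defined once an ORDER BY fixes what "first" means.

```python
import sqlite3

# Hypothetical data, chosen only for illustration.
rows = [(14, "Tom"), (23, "Alice"), (16, "Bob")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (age INTEGER, name TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?)", rows)

# No ORDER BY: SQL does not promise which row is skipped by OFFSET 1.
# (In SQLite, LIMIT -1 means "no limit", which allows a bare OFFSET.)
unordered = conn.execute(
    "SELECT name FROM people LIMIT -1 OFFSET 1"
).fetchall()

# Explicit ORDER BY: the skipped row is the one with the lowest age,
# so the result is the same on every run and on every engine.
ordered = conn.execute(
    "SELECT name FROM people ORDER BY age LIMIT -1 OFFSET 1"
).fetchall()
ordered_names = [name for (name,) in ordered]

print("without ORDER BY (order not guaranteed):", unordered)
print("with ORDER BY age:", ordered_names)
```

Only the second query has a result one can safely put in a docstring; the first may happen to be stable on a given engine and version, but nothing in SQL requires it.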