I’ve always considered DataFrames to be logically equivalent to SQL tables or queries.
In SQL, the result order of any query is implementation-dependent without an explicit ORDER BY clause. Technically, you could run `SELECT * FROM table;` 10 times in a row and get 10 different orderings.

I thought the same applied to DataFrames, but the docstring for the recently added method DataFrame.offset <https://github.com/apache/spark/pull/40873/files#diff-4ff57282598a3b9721b8d6f8c2fea23a62e4bc3c0f1aa5444527549d1daa38baR1293-R1301> implies otherwise.

The example in that docstring will work fine in practice, of course. But if DataFrames are technically unordered without an explicit ordering clause, then in theory a future implementation change may result in “Bob” being the “first” row in the DataFrame, rather than “Tom”. That would make the example incorrect.

Is that not the case?

Nick