Hi Nicholas,

I agree with your point:

"In SQL, the result order of any query is implementation-dependent without
an explicit ORDER BY clause. Technically, you could run `SELECT * FROM
table;` 10 times in a row and get 10 different orderings."

That matches my understanding.

Without an explicit ORDER BY clause, the database engine is free to return
the rows in whatever order it sees fit, and that order may differ from one
run to the next. SQL guarantees a specific result order only when an ORDER
BY clause is used.
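
To make this concrete, here is a minimal sketch. It uses Python's built-in
sqlite3 module rather than Spark, purely as an assumption so that it runs
anywhere. The table `people`, its rows, and the helper
`first_row_after_offset` are hypothetical and only illustrate the idea: an
offset is well defined only once an explicit ordering is given.

```python
import sqlite3


def first_row_after_offset(conn, offset):
    """Return the single row found at `offset` under an explicit ordering."""
    # ORDER BY pins the row order, so LIMIT/OFFSET pick the same row every time.
    # The secondary key (name) breaks ties between rows with the same age.
    cur = conn.execute(
        "SELECT name, age FROM people ORDER BY age, name LIMIT 1 OFFSET ?",
        (offset,),
    )
    return cur.fetchone()


# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Tom", 14), ("Alice", 23), ("Bob", 16)],
)

# No ORDER BY: the engine may return these rows in any order it likes,
# so code must not rely on which row comes "first".
unordered = conn.execute("SELECT name, age FROM people").fetchall()

# With ORDER BY: the order is part of the query's contract.
ordered = conn.execute(
    "SELECT name, age FROM people ORDER BY age, name"
).fetchall()

print(ordered)                          # [('Tom', 14), ('Bob', 16), ('Alice', 23)]
print(first_row_after_offset(conn, 1))  # ('Bob', 16)
```

The unordered query happens to return rows in a stable order on a small
table like this, which is exactly why the issue is easy to miss in practice.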

HTH

Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom


View my LinkedIn profile:
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 18 Sept 2023 at 16:58, Reynold Xin <r...@databricks.com.invalid>
wrote:

> It should be the same as SQL. Otherwise it takes away a lot of potential
> future optimization opportunities.
>
>
> On Mon, Sep 18 2023 at 8:47 AM, Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
>> I’ve always considered DataFrames to be logically equivalent to SQL
>> tables or queries.
>>
>> In SQL, the result order of any query is implementation-dependent without
>> an explicit ORDER BY clause. Technically, you could run `SELECT * FROM
>> table;` 10 times in a row and get 10 different orderings.
>>
>> I thought the same applied to DataFrames, but the docstring for the
>> recently added method DataFrame.offset
>> <https://github.com/apache/spark/pull/40873/files#diff-4ff57282598a3b9721b8d6f8c2fea23a62e4bc3c0f1aa5444527549d1daa38baR1293-R1301>
>> implies otherwise.
>>
>> This example will work fine in practice, of course. But if DataFrames are
>> technically unordered without an explicit ordering clause, then in theory a
>> future implementation change may result in “Bob” being the “first” row in
>> the DataFrame, rather than “Tom”. That would make the example incorrect.
>>
>> Is that not the case?
>>
>> Nick
>>
>
