Am 30.05.2008 um 12:55 schrieb Jens Miltner:

>
> Am 30.05.2008 um 12:45 schrieb Jens Miltner:
>
>> Hi,
>>
>> I've got a question and I don't know exactly whether the behavior  
>> is standard or expected or not.
>>
>> Assuming the following schema and data:
>> CREATE TABLE t1 (id, name);
>> INSERT INTO t1 values (1, 'John');
>> INSERT INTO t1 values (2, 'Arnold');
>> INSERT INTO t1 values (3, 'Zork');
>>
>> We have some queries that do roughly this:
>>
>> CREATE VIEW v1 AS SELECT id FROM t1 ORDER BY name;
>> CREATE TABLE t2 AS SELECT id FROM v1;
>> SELECT id from t2;
>>
>> This produces the ids in the expected order:
>>
>> 2
>> 1
>> 3
>>
>>
>> However, when fetching DISTINCT values, the values are returned in  
>> sorted order, not in the 'native' order in the table:
>>
>> SELECT DISTINCT id from t2;
>>
>> 1
>> 2
>> 3
>>
>>
>>
>> Up until version 3.5.1 (well actually that was the last version  
>> before 3.5.9 that we used), sqlite would return the distinct values  
>> in the same order as in the query without the DISTINCT keyword.
>>
>> I'm suspecting that there is nothing that guarantees the order in  
>> which results are returned unless I explicitely specify an "ORDER  
>> BY" term, so this may be according to the standards, but it's  
>> nonetheless a change that hurts us.
>> What we did so far was to create a temporary table that contained  
>> record IDs in a specific order, e.g.
>>
>> CREATE TEMP TABLE foo AS SELECT record_id FROM bar WHERE <some  
>> condition> ORDER BY lastname;
>>
>> then, after some further processing, we'd do something like
>>
>> SELECT DISTINCT record_id FROM foo;
>>
>> to get the record IDs in the defined order and eventually something  
>> like
>>
>> SELECT lastname, firstname, contact_info
>> FROM foo
>> LEFT JOIN bar ON foo.record_id=bar.record_id
>> LEFT JOIN contacts ON contacts.bar_id=bar.record_id
>> WHERE foo.record_id=5;
>>
>> to get the information for a single record.
>>
>> The reasoning behind putting this sorted list of record IDs into a  
>> temporary table is not really obvious from this example, but our  
>> real code does a lot more processing and reuses some classes that  
>> use temporary or real tables to store record selections. These are  
>> mainly used as sets of records to display in lists for certain  
>> [dynamically constructed] filter conditions. The display code will  
>> then pick the record id and run a simple query to fetch the  
>> displayed properties for that record. Since we can't cache the  
>> information for all records in the displayed set (might be a really  
>> large set initially), we came up with this approach to reduce the  
>> size of the "cache" to the table containing just the record IDs.  
>> However, in order to get a fast access, we have to rely on the  
>> order of the record IDs in the table to match the order induced by  
>> the "ORDER BY" expression...
>>
>> So, to cut a long story short: is this change in the returned order  
>> for DISTINCT queries between version 3.5.1 and 3.5.9 a side effect  
>> of switching the query engine? If so, is it an undesired side  
>> effect or is it just something we have to live with?
>
> As a follow-up:
> One idea that occurred to me to work around this problem was to add  
> an "ORDER BY rowid" to my "SELECT DISTINCT record_id ..." statement  
> to enforce the "old" order, but apparently, one cannot order by  
> rowid (although it is a pseudo-column that can be SELECTed)?

Ahem - sorry for the wasted bandwidth, but I actually tried this on a  
joined statement and forgot to specify which table to fetch the rowid  
from.
Of course, ordering by rowid works (once I got the syntax right) and  
it produces the expected (i.e. 3.5.1-compatible) result list.
Mea culpa...

However, I'd still like to know whether the new behavior of returning  
the DISTINCT results in 'arbitrary' order is expected.

Thanks
</jum>


_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to