Suppose you have a table like this:

    CREATE TABLE tab(ID, A, B, C, ..., MTIME);

The ID entries are not unique.  For each ID occurrence, the values for A,
B, C, and so forth might (or might not) be different.  We do queries like
this:

    SELECT id, a, b, ..., mtime
    FROM tab
    ORDER BY mtime DESC;

*The problem to solve* is this:  We want to restrict the output of the
query above to show only the entry with the smallest MTIME for each
distinct ID.

As of SQLite version 3.7.11
<http://www.sqlite.org/releaselog/3_7_11.html>the problem can be
solved like this:

    SELECT id, a, b, ..., min(mtime)
    FROM tab
    GROUP BY id
    ORDER BY min(mtime) DESC;

SQLite, unique among SQL database engines as far as I know, will choose for
A, B, C, ... the values of those columns that correspond to the row that
satisfies the min(mtime) aggregate.  And so this actually works.  On SQLite
versions prior to 3.7.11, the values of A, B, C, .. would be from any
arbitrary row having the corresponding ID.  And on more strictly conforming
SQL implementations (ex: PostgreSQL) the query will not even parse because
the GROUP BY clause does not contain every column that is not within an
aggregate function.  But the query does work, and works efficiently and
well, in SQLite and seems very convenient.

*Question*:  Is there a (better) standard SQL way to do the same thing?



*Context*:  The actual application here is Fossil, and the display of all
changes to a particular file in a repository.  Queries like the above are
used to generate, for example, a complete history of the changes to the
date/time function logic in SQLite:

     http://www.sqlite.org/src/finfo?name=src/date.c

In this example, the "ID" column corresponds to a particular version (SHA1
hash) of a file.  That same version might appear in various branches and so
there are duplicate entries in the graph for the same file, which can be
confusing.  (Note, the actual
query<http://www.fossil-scm.org/fossil/artifact/65f708f825e?ln=245-270>to
render the "finfo" page is slightly more complex; "tab" is really a
join
of two tables, and the "a, b, c, .." columns are subqueries.  But these
complications do not bare upon the question of this email.)  To work around
the confused output, we'd like to only see a single row for each ID,
specifically the row with the smallest value of mtime, which means the
first time that a particular version of a file appears in the repository.
For example:

    http://www.sqlite.org/src/finfo?name=src/date.c&fco=1

The recent change to Fossil to add the "fco" query parameter to the "finfo"
page is the first time I have actually needed SQLite's min()/max() query
behavior myself, and I now find it *very* convenient.  But, I am slightly
troubled by depending on SQL behavior that is only found in SQLite in not
in other database engines.
-- 
D. Richard Hipp
d...@sqlite.org
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to