select * from tab id | a | mtime | +--+----+-------+-- 1 | 1 | 3 | 1 | 2 | 2 | 1 | 3 | 4 | 1 | 4 | 2 | 1 | 5 | 7 | 2 | 6 | 1 | 2 | 7 | 0 | 2 | 8 | 4 | 2 | 9 | 1 | 2 | 10 | 2 |
select id, a, min(mtime) over(partition by id order by mtime) m from tab qualify row_number() over(partition by id order by mtime) = 1 id | a | m | +--+---+---+-- 1 | 2 | 2 | 2 | 7 | 0 | qualify isnt't in standad. instead subquery can be used. > Message: 14 > Date: Wed, 5 Sep 2012 09:03:35 -0400 > From: Richard Hipp > To: General Discussion of SQLite Database > Subject: [sqlite] Finding rows with MIN(MTIME) for all IDs > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > > Suppose you have a table like this: > > CREATE TABLE tab(ID, A, B, C, ..., MTIME); > > The ID entries are not unique. For each ID occurrence, the values for A, > B, C, and so forth might (or might not) be different. We do queries like > this: > > SELECT id, a, b, ..., mtime > FROM tab > ORDER BY mtime DESC; > > *The problem to solve* is this: We want to restrict the output of the > query above to show only the entry with the smallest MTIME for each > distinct ID. > > As of SQLite version 3.7.11 > the problem can be > solved like this: > > SELECT id, a, b, ..., min(mtime) > FROM tab > GROUP BY id > ORDER BY min(mtime) DESC; > > SQLite, unique among SQL database engines as far as I know, will choose for > A, B, C, ... the values of those columns that correspond to the row that > satisfies the min(mtime) aggregate. And so this actually works. On SQLite > versions prior to 3.7.11, the values of A, B, C, .. would be from any > arbitrary row having the corresponding ID. And on more strictly conforming > SQL implementations (ex: PostgreSQL) the query will not even parse because > the GROUP BY clause does not contain every column that is not within an > aggregate function. But the query does work, and works efficiently and > well, in SQLite and seems very convenient. > > *Question*: Is there a (better) standard SQL way to do the same thing? > > > > *Context*: The actual application here is Fossil, and the display of all > changes to a particular file in a repository. Queries like the above are > used to generate, for example, a complete history of the changes to the > date/time function logic in SQLite: > > http://www.sqlite.org/src/finfo?name=src/date.c > > In this example, the "ID" column corresponds to a particular version (SHA1 > hash) of a file. That same version might appear in various branches and so > there are duplicate entries in the graph for the same file, which can be > confusing. (Note, the actual > queryto > render the "finfo" page is slightly more complex; "tab" is really a > join > of two tables, and the "a, b, c, .." columns are subqueries. But these > complications do not bare upon the question of this email.) To work around > the confused output, we'd like to only see a single row for each ID, > specifically the row with the smallest value of mtime, which means the > first time that a particular version of a file appears in the repository. > For example: > > http://www.sqlite.org/src/finfo?name=src/date.c&fco=1 > > The recent change to Fossil to add the "fco" query parameter to the "finfo" > page is the first time I have actually needed SQLite's min()/max() query > behavior myself, and I now find it *very* convenient. But, I am slightly > troubled by depending on SQL behavior that is only found in SQLite in not > in other database engines. > -- > D. Richard Hipp > d...@sqlite.org _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users