I have a dataset of events with timestamps, and want to extract a subset of them so that there is at least, say, an hour between items in the result. It would be trivial to solve this in an imperative language, e.g. the solution in Python would be:
result = [table[0]] for row in table[1:]: if row['timestamp'] >= result[-1]['timestamp'] + timedelta(hours=1): result.append(row) But I would like to have a solution in pure SQL. The closest I came to a solution was this code, with integers instead of datetimes, and a minimum distance of 3: CREATE TABLE test (x INT) INSERT INTO test VALUES(1) INSERT INTO test VALUES(3) INSERT INTO test VALUES(5) INSERT INTO test VALUES(6) INSERT INTO test VALUES(7) INSERT INTO test VALUES(8) WITH RECURSIVE breaks(t) AS ( SELECT 1 UNION SELECT min(x) FROM test, breaks WHERE x > t + 2 AND x < 10 ) SELECT t FROM breaks The result I would expect would be 1,5,8. The above query worked with the sqlite3 executable that ships with OSX (I think, its version is around 3.8.11), but when a tried it on a different machine with 3.9.2, there was an error having to do with aggregate function (min) in the recursive subquery. The solution I thought would work was WITH RECURSIVE breaks(t) AS ( SELECT 1 UNION SELECT x FROM test, breaks WHERE x > t + 2 LIMIT 1 ) SELECT t FROM breaks with the reasoning being that the recursive subquery would only return the first value that satisfies the condition of being greater than the last one plus 2, but "LIMIT 1" gets applied the the whole statement instead of just the subquery, limiting the output to just one item. A solution can be approximated by grouping items into bins and then selecting one value from each bin, but this will lead to some wrong results, e.g. in the example above, if we build bins by the division result of 3, and then take the smallest value from each bin, we'll have: [1,3], [5,6], [7,8] -> 1,5,7 with 5 and 7 being too close to each other. Can this be done?