On Wed, Jan 17, 2018 at 8:08 AM, Jacek Kołodziej <kolodzi...@gmail.com> wrote:
> Hello everyone, > > this is my first post here and I'm starting with asking a question about > data consistency between two consecutive SELECTs using PostgreSQL 9.6. > > I'm sorry if that's something that was already discussed - I couldn't find > it either in archives, nor in _general internet_. If it is, I would > appreciate pointing it out. > > I have an "append-only" events table - only INSERT and SELECT queries are > issued to it. It has an integer (sequence) ID as a primary key. > > I'm issuing following two queries (within the same READ COMMITTED > transaction) to that table: > - first query (A): SELECT id FROM events ORDER BY events.id DESC LIMIT 1 > - I'm saving the result as a "max ID" for the second query > - second query (B): SELECT id, ... FROM events WHERE id > "min ID" AND id > <= "max ID" AND ... > - I won't dig into what's "min ID" but you can assume it's at most 100 > less than "max ID" > > Concurrently, rows are being added to that table. > > Please note that there's enough data in the index for the first query (A) > to perform an index-only scan. I'm not sure if that's relevant but "B" > query does an index scan because of other conditions I haven't mentioned > but still needs to fetch data from the table's heap. > > Here's what happening to me: the "A" query occasionally (in my case: on > the order of tenths per day) returns an ID _higher_ than any ID present in > second query's result (other conditions I haven't specified do _not_ filter > any more rows than "id <= max ID") - as if some entries were visible for > the first query, but not for the second one. This is an inconsistency that > is very problematic for me. > > All I can think of is that it might be caused by the index-only-scan and > READ COMMITTED transaction isolation level but from what I gather from > documentation, it should not be possible due to the use of visibility > map[0][1][2]. > > And yet it happens, likely for some other reason but I can't think of any. > I've tried switching transaction isolation level to REPEATABLE READ (on the > basis that it's an faulty phenomenon occurring during to some bug) but that > didn't help. > > Where am I wrong? What am I missing? What information may I provide to > help with investigating this? > > [0] https://www.postgresql.org/docs/9.6/static/indexes-index- > only-scans.html > [1] https://www.postgresql.org/docs/9.6/static/storage-vm.html > [2] https://wiki.postgresql.org/wiki/Index-only_scans > > -- > Kind regards, > Jacek Kołodziej > http://kolodziejj.info > *>- first query (A): SELECT id FROM events ORDER BY events.id <http://events.id> DESC LIMIT 1 - I'm saving the result as a "max ID" for the second query* *Just a suggestion. The first query is not really needed.* *You can simply do:* *second query (B): * *SELECT id, ... FROM events WHERE id > MIN(ID) AND id <= MAX(ID) AND ...* *See https://www.postgresql.org/docs/9.6/static/functions-aggregate.html <https://www.postgresql.org/docs/9.6/static/functions-aggregate.html> * *MAX and MIN functions* -- *Melvin Davidson* I reserve the right to fantasize. Whether or not you wish to share my fantasy is entirely up to you.