> From: t...@sss.pgh.pa.us
> To: klaussfre...@gmail.com
> CC: hlinnakan...@vmware.com; johnlu...@hotmail.com; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
> Date: Thu, 29 May 2014 17:56:57 -0400
>
> Claudio Freire <klaussfre...@gmail.com> writes:
> > On Thu, May 29, 2014 at 6:43 PM, Claudio Freire <klaussfre...@gmail.com> wrote:
> >> On Thu, May 29, 2014 at 6:19 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> >>> "ampeeknexttuple"? That's a bit scary. It would certainly be unsafe
> >>> for non-MVCC snapshots (read about vacuum vs indexscan interlocks in
> >>> nbtree/README).
> >
> >> It's not really the tuple, just the tid
> >
> > And, furthermore, it's used only to do prefetching, so even if the tid
> > was invalid when the tuple needs to be accessed, it wouldn't matter,
> > because the indexam wouldn't use the result of ampeeknexttuple to do
> > anything at that time.
>
> Nonetheless, getting the next tid out of the index may involve stepping
> to the next index page, at which point you've lost your interlock

I think we are OK, as peeknexttuple (yes, bad name, sorry, can change it ...)
never advances to another page:

 *  btpeeknexttuple() -- peek at the next tuple different from any blocknum in pfch_list
 *      without reading a new index page
 *      and without causing any side-effects such as altering values in control blocks
 *      if found, store blocknum in next element of pfch_list

> guaranteeing that the *previous* tid will still mean something by the time
> you arrive at its heap page. I presume that the ampeeknexttuple call is
> issued before trying to visit the heap (otherwise you're not actually
> getting much I/O overlap), so I think there's a real risk here.
>
> Having said that, it's probably OK as long as this mode is only invoked
> for user queries (with MVCC snapshots) and not for system indexscans.
>
> regards, tom lane
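
As a rough illustration of the rule in the btpeeknexttuple() comment above, here is a minimal standalone sketch in C. It is not the patch's code: PageScan, PrefetchList, PFCH_MAX and peek_next_block() are hypothetical stand-ins, and the "index page" is just an array of heap block numbers. It only shows the two properties claimed: the peek never looks past the current page, and it does not alter the scan position.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PFCH_MAX 8              /* hypothetical capacity of the prefetch list */

/* Hypothetical model of a scan positioned on ONE index page. */
typedef struct {
    const unsigned *heap_blocks; /* heap block number of each item on this page */
    size_t          nitems;      /* number of items on this page */
    size_t          cur;         /* item the scan is currently positioned on */
} PageScan;

/* Hypothetical model of pfch_list: heap blocks already queued for prefetch. */
typedef struct {
    unsigned blocks[PFCH_MAX];
    size_t   count;
} PrefetchList;

static bool in_list(const PrefetchList *list, unsigned blk)
{
    for (size_t i = 0; i < list->count; i++)
        if (list->blocks[i] == blk)
            return true;
    return false;
}

/*
 * Look ahead from the current position, on the current page only, for the
 * first item whose heap block is not yet in the list. If found, append that
 * block number and return true. The scan is const: its position is never
 * changed, and reaching the end of the page returns false instead of
 * stepping to the next page.
 */
static bool peek_next_block(const PageScan *scan, PrefetchList *list)
{
    assert(list->count <= PFCH_MAX);
    if (list->count == PFCH_MAX)
        return false;            /* list full, nothing more to queue */

    for (size_t i = scan->cur + 1; i < scan->nitems; i++) {
        unsigned blk = scan->heap_blocks[i];
        if (!in_list(list, blk)) {
            list->blocks[list->count++] = blk;
            return true;
        }
    }
    return false;                /* end of page: give up, do not read a new page */
}
```

For example, with a page whose items point at heap blocks 7, 7, 9, 7, 12, the scan on the first item, and block 7 already in the list, successive calls queue 9, then 12, then return false, and the scan position is unchanged throughout.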