On Fri, 29 Mar 2019 at 16:12, Andres Freund <and...@anarazel.de> wrote:
> On 2019-03-29 15:58:14 +0000, Simon Riggs wrote:
> > On Fri, 29 Mar 2019 at 15:29, Andres Freund <and...@anarazel.de> wrote:
> > > That's far from a trivial feature imo. It seems quite possible that we'd
> > > end up with increased overhead, because the current logic can get away
> > > with only doing hint bit style writes - but would that be true if we
> > > started actually replacing the item pointers? Because I don't see any
> > > guarantee they couldn't cross a page boundary etc? So I think we'd need
> > > to do WAL logging during index searches, which seems prohibitively
> > > expensive.
> > >
> >
> > Don't see that.
> >
> > I was talking about reusing the first 4 bytes of an index tuple's
> > ItemPointerData,
> > which is the first field of an index tuple. Index tuples are MAXALIGNed, so
> > I can't see how that would ever cross a page boundary.
>
> They're 8 bytes, and MAXALIGN often is 4 bytes:
>

xids are 4 bytes, so we're good.

If MAXALIGN could ever be 2 bytes, we'd have a problem.

> So as a whole they definitely can cross sector boundaries. You might be
> able to argue your way out of that by saying that the blkid is going to
> be aligned, but that's not that trivial, as t_info isn't guaranteed
> that.
>
> But even so, you can't have unlogged changes that you then rely on. Even
> if there's no torn page issue. Currently BTP_HAS_GARBAGE and
> ItemIdMarkDead() are treated as hints - if we want to guarantee all
> these are accurate, I don't quite see how we'd get around WAL logging
> those.
>

You can have unlogged changes that you rely on - that is exactly how hints
work.

If the hint is lost, we do the I/O. Worst case it would be the same as what
you have now.

I'm talking about saving many I/Os - this doesn't need to provably avoid
all I/Os to work, it's incremental benefit all the way.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services