Hi,
On 2026-03-11 01:00:34 +0100, Tomas Vondra wrote:
> On 3/10/26 21:41, Andres Freund wrote:
> > On 2026-03-10 19:41:54 +0100, Tomas Vondra wrote:
> >> On 2/5/26 16:38, Peter Geoghegan wrote:
> >>> On Wed, Jan 14, 2026 at 1:31 AM Andreas Karlsson <[email protected]>
> >>> wrote:
> >>>> Yeah, that was a quite big thinko. I have a attached a patch with the
> >>>> thinko fixed but I am still not happy with it. I think I will try to use
> >>>> atomics.h and see if that makes the code nicer to read.
> >>>>
> >>>> Also will after that see what I can do about pageinspect.
> >>>
> >>> I believe that pageinspect's heap_page_items function needs to use
> >>> get_page_from_raw -- see commit 14e9b18fed.
> >
> > Maybe I'm slow today, but how's that related to the patch at hand?
> >
> > Regardless of that, it seems a bit confusing to fold the pageinspect changes
> > into the main commit.
> >
>
> That's because of alignment requirements, per this comment:
>
> * On machines with MAXALIGN = 8, the payload of a bytea is not
> * maxaligned, since it will start 4 bytes into a palloc'd value.
> * PageHeaderData requires 8 byte alignment, so always use this
> * function when accessing page header fields from a raw page bytea.
I guess we are now reading 8 bytes as one.
> AFAIK proper alignment is required to make the load atomic.
Sure, but it's a copy of the page, so that'd itself would not matter.
> >> +static inline PageXLogRecPtr
> >> +PageXLogRecPtrGet(PageXLogRecPtr pd_lsn)
> >> {
> >> - return (uint64) val.xlogid << 32 | val.xrecoff;
> >> +#ifdef WORDS_BIGENDIAN
> >> + return pd_lsn;
> >> +#else
> >> + return (pd_lsn << 32) | (pd_lsn >> 32);
> >> +#endif
> >> }
> >>
> >> -#define PageXLogRecPtrSet(ptr, lsn) \
> >> - ((ptr).xlogid = (uint32) ((lsn) >> 32), (ptr).xrecoff = (uint32) (lsn))
> >> -
> >> /*
> >> * disk page organization
> >> *
> >> @@ -387,10 +390,11 @@ PageGetLSN(const PageData *page)
> >> {
> >> return PageXLogRecPtrGet(((const PageHeaderData *) page)->pd_lsn);
> >> }
> >
> > I think this may need to actually use a volatile to force the read to happen
> > as the 64bit value, otherwise the compiler would be entirely free to
> > implement
> > it as two 32bit reads.
> >
>
> I did that in v5 (I think). But at the same time I'm now rather confused
> about the meaning of PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY. Because if a
> well-aligned load/store of 8B can be implemented as two 32-bit reads
> (with a "sequence point" in between), then how is that atomic?
All PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY says is that the hardware can do 8
byte reads/writes without tearing. But that still requires 8 bytes
reads/writes to be done as one, not multiple instructions.
> I'm also a bit ... unsure about "volatile" in general. Is that actually
> something the specification says is needed here, or is it just an
> attempt to "disable optimizations" (like the split into two stores)?
It's definitely suboptimal. We should use something C11's
atomic_load()/atomic_store(). But we can't rely on that yet (it's not enabled
without extra flags in at least msvc).
The volatile does prevent the compiler from just doing a read/write twice or
never, just because it'd be nicer for code generation. But it doesn't
actually guarantee that the right instructions for reading writing are
generated :(
Greetings,
Andres Freund