Re: XLog size reductions: Reduced XLog record header size for PG17

2024-06-05 Thread Mark Dilger



> On Jun 20, 2023, at 1:01 PM, Matthias van de Meent 
>  wrote:
> 
> 0001 is copied essentially verbatim from [1] and reduces overhead in
> the registered block's length field where possible. It is included to
> improve code commonality between varcoded integer fields. See [1] for
> more details.

Hi Matthias!  I am interested in seeing this patch move forward.  We seem to be 
stuck.

The disagreement on the other thread seems to be about whether we can 
generalize and reuse variable integer encoding.  Could you comment on whether 
perhaps we just need a few versions of that?  Perhaps one version where the 
number of length bytes is encoded in the length itself (such as is used for 
varlena and by Andres' patch) and one where the number of length bytes is 
stored elsewhere?  You are clearly using the "elsewhere" form, but perhaps you 
could pull out the logic of that into src/common?  In struct 
XLogRecordBlockHeader.id , you are reserving 
two bits for the size class.  (The code comments aren't clear about this, by 
the way.)  Perhaps if the generalized length encoding logic could take a couple 
arguments to represent where and how the size class bits are to be stored, and 
where the length itself is stored?  I doubt you need to sacrifice any 
performance gains of this patch to make that happen.  You'd just need to 
restructure the patch.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: XLog size reductions: Reduced XLog record header size for PG17

2024-04-06 Thread Andrey M. Borodin



> On 3 Jan 2024, at 15:15, Pavel Borisov  wrote:
> 
> Hi and Happy New Year!
> 
> I've looked through the patches and the change seems quite small and 
> justified. But at the second round, some doubt arises on whether this long 
> patchset indeed introduces enough performance gain? I may be wrong, but it 
> saves only several bytes and the performance gain would be only in some 
> specific artificial workload.  Did you do some measurements? Do we have 
> several percent performance-wise?
> 
> Kind regards,
> Pavel Borisov

Hi Matthias!

This is a kind reminder that the thread is waiting for your reply. Are you 
interesting in CF entry [0]?

Thanks!


Best regards, Andrey Borodin.
[0] https://commitfest.postgresql.org/47/4386/



Re: XLog size reductions: Reduced XLog record header size for PG17

2024-01-03 Thread Pavel Borisov
Hi and Happy New Year!

I've looked through the patches and the change seems quite small and
justified. But at the second round, some doubt arises on whether this long
patchset indeed introduces enough performance gain? I may be wrong, but it
saves only several bytes and the performance gain would be only in some
specific artificial workload.  Did you do some measurements? Do we have
several percent performance-wise?

Kind regards,
Pavel Borisov


Re: XLog size reductions: Reduced XLog record header size for PG17

2023-09-26 Thread Michael Paquier
On Mon, Sep 25, 2023 at 07:40:00PM +0200, Matthias van de Meent wrote:
> On Wed, 20 Sept 2023 at 07:06, Michael Paquier  wrote:
>>  #define COPY_HEADER_FIELD(_dst, _size)\
>>  do {\
>> -if (remaining < _size)\
>> +if (remaining < (_size))\
>>  goto shortdata_err;\
>>
>> There are a couple of stylistic changes like this one, that I guess
>> could just use their own patch to make these macros easier to use.
> 
> They actually fix complaints of my IDE, but are otherwise indeed stylistic.

Oh, OK.  I just use an old-school terminal, but no objections in
changing these if they make life easier for some hackers.  Still, that
feels independant of what you are proposing here.
--
Michael


signature.asc
Description: PGP signature


Re: XLog size reductions: Reduced XLog record header size for PG17

2023-09-25 Thread Matthias van de Meent
On Wed, 20 Sept 2023 at 07:06, Michael Paquier  wrote:
>
> On Tue, Sep 19, 2023 at 12:07:07PM +0200, Matthias van de Meent wrote:
> > V5 is a rebased version of v4, and includes the latest patch from
> > "smaller XLRec block header" [0] as 0001.
>
> 0001 and 0007 are the meat of the changes.

Correct.

> -#define XLR_CHECK_CONSISTENCY  0x02
> +#define XLR_CHECK_CONSISTENCY  (0x20)
>
> I can't help but notice that there are a few stylistic choices like
> this one that are part of the patch.  Using parenthesis in the case of
> hexa values is inconsistent with the usual practices I've seen in the
> tree.

Yes, I'll take another look at that.

>  #define COPY_HEADER_FIELD(_dst, _size)\
>  do {\
> -if (remaining < _size)\
> +if (remaining < (_size))\
>  goto shortdata_err;\
>
> There are a couple of stylistic changes like this one, that I guess
> could just use their own patch to make these macros easier to use.

They actually fix complaints of my IDE, but are otherwise indeed stylistic.

> -#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info)
> +#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info & 
> XLR_INFO_MASK)
> +#define XLogRecGetRmgrInfo(decoder) (((decoder)->record->header.xl_info) & 
> XLR_RMGR_INFO_MASK)
>
> This stuff in 0002 is independent of 0001, am I right?  Doing this
> split with an extra macro is okay by me, reducing the presence of
> XLR_INFO_MASK and bitwise operations based on it.

Yes, that change is to stop making use of (~XLR_INFO_MASK) where
XLR_RMGR_INFO_MASK is the correct bitmask (whilst also being quite
useful in the later patch).

> 0003 is also mechanical, but if you begin to enforce the use of
> XLR_RMGR_INFO_MASK as the bits allowed to be passed down to the RMGR
> identity callback, we should have at least a validity check to make
> sure that nothing, even custom RMGRs, pass down unexpected bits?

I think that's already handled in XLogInsert(), but I'll make sure to
add more checks if they're not in place yet.

> I am not convinced that XLOG_INCLUDE_XID is a good interface, TBH, and
> I fear that people are going to forget to set it.  Wouldn't it be
> better to use an option where the XID is excluded instead, making the
> inclusing the an XID the default?

Most rmgrs don't actually use the XID. Only XACT, MULTIXACT, HEAP,
HEAP2, and LOGICALMSG use the xid, so I thought it would be easier to
just find the places where those RMGR's records were being logged than
to update all other places.

I don't mind changing how we decide to log the XID, but I don't think
EXCLUDE_XID is a good alternative: most records just don't need the
transaction ID. There are many more index AMs with logging than table
AMs, so I don't think it is that weird to default to 'not included'.

> > The resource manager has ID = 0, thus requiring some special
> > handling in other code. Apart from being generally useful, it is
> > used in future patches to detect the end of wal in lieu of a zero-ed
> > fixed-size xl_tot_len field.
>
> Err, no, that may not be true.  See for example this thread where the
> topic of improving the checks of xl_tot_len and rely on this value on
> when a record header has been validated, even across page borders:
> https://www.postgresql.org/message-id/17928-aa92416a70ff4...@postgresql.org

Yes, there are indeed exceptions when reusing WAL segments, but it's
still a good canary, like xl_tot_len before this patch.

> Except that, in which cases could an invalid RMGR be useful?

A sentinel value that is obviously invalid is available for several
types, e.g. BlockNumber, TransactionId, XLogRecPtr, Buffer, and this
is quite useful if you want to check if something is definitely
invalid. I think that's fine in principle, we're already "wasting"
some IDs in the gap between RM_MAX_BUILTIN_ID and RM_MIN_CUSTOM_ID.

In the current xlog infrastructure, we use xl_tot_len as that sentinel
to detect whether a new record may exist, but in this patch that can't
be used because the field may not exist and depends on other bytes. So
I used xl_rmgr_id as the field to base the 'may a next record exist'
checks on, which required the 0 rmgr ID to be invalid.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)




Re: XLog size reductions: Reduced XLog record header size for PG17

2023-09-20 Thread Michael Paquier
On Tue, Sep 19, 2023 at 12:07:07PM +0200, Matthias van de Meent wrote:
> V5 is a rebased version of v4, and includes the latest patch from
> "smaller XLRec block header" [0] as 0001.

0001 and 0007 are the meat of the changes.

-#define XLR_CHECK_CONSISTENCY  0x02
+#define XLR_CHECK_CONSISTENCY  (0x20)

I can't help but notice that there are a few stylistic choices like
this one that are part of the patch.  Using parenthesis in the case of
hexa values is inconsistent with the usual practices I've seen in the
tree.

 #define COPY_HEADER_FIELD(_dst, _size)\
 do {\
-if (remaining < _size)\
+if (remaining < (_size))\
 goto shortdata_err;\

There are a couple of stylistic changes like this one, that I guess
could just use their own patch to make these macros easier to use.

-#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info)
+#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info & 
XLR_INFO_MASK)
+#define XLogRecGetRmgrInfo(decoder) (((decoder)->record->header.xl_info) & 
XLR_RMGR_INFO_MASK)

This stuff in 0002 is independent of 0001, am I right?  Doing this
split with an extra macro is okay by me, reducing the presence of
XLR_INFO_MASK and bitwise operations based on it.

0003 is also mechanical, but if you begin to enforce the use of
XLR_RMGR_INFO_MASK as the bits allowed to be passed down to the RMGR
identity callback, we should have at least a validity check to make
sure that nothing, even custom RMGRs, pass down unexpected bits?

I am not convinced that XLOG_INCLUDE_XID is a good interface, TBH, and 
I fear that people are going to forget to set it.  Wouldn't it be
better to use an option where the XID is excluded instead, making the
inclusing the an XID the default?

> The resource manager has ID = 0, thus requiring some special
> handling in other code. Apart from being generally useful, it is
> used in future patches to detect the end of wal in lieu of a zero-ed
> fixed-size xl_tot_len field.

Err, no, that may not be true.  See for example this thread where the
topic of improving the checks of xl_tot_len and rely on this value on
when a record header has been validated, even across page borders:
https://www.postgresql.org/message-id/17928-aa92416a70ff4...@postgresql.org

Except that, in which cases could an invalid RMGR be useful?
--
Michael


signature.asc
Description: PGP signature