Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-31 Thread Palak Chaturvedi
Hii,
Thanks for your feedback. We have decided to add a role for the
extension to solve that problem.
And concerning to pg_unwarm table I think we can create a new function
to do that but I think a general user would not require to clear a
table from buffercache.
We can use bufferid and where statements to do the same if a
superuser/(specific role) requests it.

Thanks.

On Sat, 29 Jul 2023 at 02:55, Cary Huang  wrote:
>
>  Hello
>
> I had a look at the patch and tested it on CI bot, it compiles and tests fine 
> both autoconf and meson. I noticed that the v2 patch contains the v1 patch 
> file as well. Not sure if intended but put there my accident.
>
> > I don't think "invalidating" is the right terminology. Note that we already
>  > have InvalidateBuffer() - but it's something we can't allow users to do, 
> as it
>  > throws away dirty buffer contents (it's used for things like dropping a
>  > table).
>  >
>  > How about using "discarding" for this functionality?
>
> I think "invalidating" is the right terminology here, it is exactly what the 
> feature is doing, it tries to invalidate a buffer ID by calling 
> InvalidateBuffer() routine inside buffer manager and calls FlushBuffer() 
> before invalidating if marked dirty.
>
> The problem here is that InvalidateBuffer() could be dangerous because it 
> allows a user to invalidate buffer that may have data in other tables not 
> owned by the current user,
>
> I think it all comes down to the purpose of this feature. Based on the 
> description in this email thread, I feel like this feature should be 
> categorized as a developer-only feature, to be used by PG developer to 
> experiment and observe some development works by invalidating one more more 
> specific buffers. If this is the case, it may be helpful to add a 
> "DEVELOPER_OPTIONS" in GUC, which allows or disallows the 
> TryInvalidateBuffer() to run or to return error if user does not have this 
> developer option enabled.
>
> If the purpose of this feature is for general users, then it would make sense 
> to have something like pg_unwarm (exactly opposite of pg_prewarm) that takes 
> table name (instead of buffer ID) and drop all buffers associated with that 
> table name. There will be permission checks as well so a user cannot 
> pg_unwarm a table owned by someone else. User in this case won't be able to 
> invalidate a particular buffer, but he/she should not have to as a regular 
> user anyway.
>
> thanks!
>
> Cary Huang
> -
> HighGo Software Inc. (Canada)
> cary.hu...@highgo.ca
> www.highgo.ca
>




Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-11 Thread Palak Chaturvedi
Can you please review the new patch of the extension with implemented
force variable.

On Tue, 11 Jul 2023 at 18:08, Palak Chaturvedi
 wrote:
>
> Hey Nitin,
> >Will
> >there be a scenario where the buffer is dirty and its reference count
> >is zero?
> There might be a buffer that has been dirtied but is not pinned or
> being used currently by a process. So checking the refcount and then
> dirty buffers helps.
> >First, The TryInvalidateBuffer() tries to flush the buffer if it is
> dirty and then tries to invalidate it if it meets the requirement.
> Instead of directly doing this can we provide an option to the caller
> to mention whether to invalidate the dirty buffers or not.
> Yes that can be implemented with a default value of force. Will
> implement it in the next patch.
>
> On Wed, 5 Jul 2023 at 17:53, Nitin Jadhav  
> wrote:
> >
> > +1 for the idea. It's going to be more useful to test and understand
> > the buffer management of PostgreSQL and it can be used to explicitly
> > free up the buffers if there are any such requirements.
> >
> > I had a quick look over the patch. Following are the comments.
> >
> > First, The TryInvalidateBuffer() tries to flush the buffer if it is
> > dirty and then tries to invalidate it if it meets the requirement.
> > Instead of directly doing this can we provide an option to the caller
> > to mention whether to invalidate the dirty buffers or not. For
> > example, TryInvalidateBuffer(Buffer bufnum, bool force), if the force
> > is set to FALSE, then ignore invalidating dirty buffers. Otherwise,
> > flush the dirty buffer and try to invalidate.
> >
> > Second, In TryInvalidateBuffer(), it first checks if the reference
> > count is greater than zero and then checks for dirty buffers. Will
> > there be a scenario where the buffer is dirty and its reference count
> > is zero? Can you please provide more information on this or adjust the
> > code accordingly.
> >
> > > +/*
> > > +Try Invalidating a buffer using bufnum.
> > > +If the buffer is invalid, the function returns false.
> > > +The function checks for dirty buffer and flushes the dirty buffer before 
> > > invalidating.
> > > +If the buffer is still dirty it returns false.
> > > +*/
> > > +bool
> >
> > The star(*) and space are missing here. Please refer to the style of
> > function comments and change accordingly.
> >
> > Thanks & Regards,
> > Nitin Jadhav
> >
> > On Fri, Jun 30, 2023 at 4:17 PM Palak Chaturvedi
> >  wrote:
> > >
> > > I hope this email finds you well. I am excited to share that I have
> > > extended the functionality of the `pg_buffercache` extension by
> > > implementing buffer invalidation capability, as requested by some
> > > PostgreSQL contributors for improved testing scenarios.
> > >
> > > This marks my first time submitting a patch to pgsql-hackers, and I am
> > > eager to receive your expert feedback on the changes made. Your
> > > insights are invaluable, and any review or comments you provide will
> > > be greatly appreciated.
> > >
> > > The primary objective of this enhancement is to enable explicit buffer
> > > invalidation within the `pg_buffercache` extension. By doing so, we
> > > can simulate scenarios where buffers are invalidated and observe the
> > > resulting behavior in PostgreSQL.
> > >
> > > As part of this patch, a new function or mechanism has been introduced
> > > to facilitate buffer invalidation. I would like to hear your thoughts
> > > on whether this approach provides a good user interface for this
> > > functionality. Additionally, I seek your evaluation of the buffer
> > > locking protocol employed in the extension to ensure its correctness
> > > and efficiency.
> > >
> > > Please note that I plan to add comprehensive documentation once the
> > > details of this enhancement are agreed upon. This documentation will
> > > serve as a valuable resource for users and contributors alike. I
> > > believe that your expertise will help uncover any potential issues and
> > > opportunities for further improvement.
> > >
> > > I have attached the patch file to this email for your convenience.
> > > Your valuable time and consideration in reviewing this extension are
> > > sincerely appreciated.
> > >
> > > Thank you for your continued support and guidance. I am looking
> > > forward to your feedback and collaboration in enhancing the Postgre

Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-11 Thread Palak Chaturvedi
Hey Nitin,
>Will
>there be a scenario where the buffer is dirty and its reference count
>is zero?
There might be a buffer that has been dirtied but is not pinned or
being used currently by a process. So checking the refcount and then
dirty buffers helps.
>First, The TryInvalidateBuffer() tries to flush the buffer if it is
dirty and then tries to invalidate it if it meets the requirement.
Instead of directly doing this can we provide an option to the caller
to mention whether to invalidate the dirty buffers or not.
Yes that can be implemented with a default value of force. Will
implement it in the next patch.

On Wed, 5 Jul 2023 at 17:53, Nitin Jadhav  wrote:
>
> +1 for the idea. It's going to be more useful to test and understand
> the buffer management of PostgreSQL and it can be used to explicitly
> free up the buffers if there are any such requirements.
>
> I had a quick look over the patch. Following are the comments.
>
> First, The TryInvalidateBuffer() tries to flush the buffer if it is
> dirty and then tries to invalidate it if it meets the requirement.
> Instead of directly doing this can we provide an option to the caller
> to mention whether to invalidate the dirty buffers or not. For
> example, TryInvalidateBuffer(Buffer bufnum, bool force), if the force
> is set to FALSE, then ignore invalidating dirty buffers. Otherwise,
> flush the dirty buffer and try to invalidate.
>
> Second, In TryInvalidateBuffer(), it first checks if the reference
> count is greater than zero and then checks for dirty buffers. Will
> there be a scenario where the buffer is dirty and its reference count
> is zero? Can you please provide more information on this or adjust the
> code accordingly.
>
> > +/*
> > +Try Invalidating a buffer using bufnum.
> > +If the buffer is invalid, the function returns false.
> > +The function checks for dirty buffer and flushes the dirty buffer before 
> > invalidating.
> > +If the buffer is still dirty it returns false.
> > +*/
> > +bool
>
> The star(*) and space are missing here. Please refer to the style of
> function comments and change accordingly.
>
> Thanks & Regards,
> Nitin Jadhav
>
> On Fri, Jun 30, 2023 at 4:17 PM Palak Chaturvedi
>  wrote:
> >
> > I hope this email finds you well. I am excited to share that I have
> > extended the functionality of the `pg_buffercache` extension by
> > implementing buffer invalidation capability, as requested by some
> > PostgreSQL contributors for improved testing scenarios.
> >
> > This marks my first time submitting a patch to pgsql-hackers, and I am
> > eager to receive your expert feedback on the changes made. Your
> > insights are invaluable, and any review or comments you provide will
> > be greatly appreciated.
> >
> > The primary objective of this enhancement is to enable explicit buffer
> > invalidation within the `pg_buffercache` extension. By doing so, we
> > can simulate scenarios where buffers are invalidated and observe the
> > resulting behavior in PostgreSQL.
> >
> > As part of this patch, a new function or mechanism has been introduced
> > to facilitate buffer invalidation. I would like to hear your thoughts
> > on whether this approach provides a good user interface for this
> > functionality. Additionally, I seek your evaluation of the buffer
> > locking protocol employed in the extension to ensure its correctness
> > and efficiency.
> >
> > Please note that I plan to add comprehensive documentation once the
> > details of this enhancement are agreed upon. This documentation will
> > serve as a valuable resource for users and contributors alike. I
> > believe that your expertise will help uncover any potential issues and
> > opportunities for further improvement.
> >
> > I have attached the patch file to this email for your convenience.
> > Your valuable time and consideration in reviewing this extension are
> > sincerely appreciated.
> >
> > Thank you for your continued support and guidance. I am looking
> > forward to your feedback and collaboration in enhancing the PostgreSQL
> > ecosystem.
> >
> > The working of the extension:
> >
> > 1. Creating the extension pg_buffercache and then call select query on
> > a table and note the buffer to be cleared.
> > pgbench=# create extension pg_buffercache;
> > CREATE EXTENSION
> > pgbench=# select count(*) from pgbench_accounts;
> >  count
> > 
> >  10
> > (1 row)
> >
> > pgbench=# SELECT *
> > FROM pg_buffercache
> > WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
> >  buff

Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-04 Thread Palak Chaturvedi
hi,
I don't think we need to check the usage count. Because we are
clearing all the buffers that are not pinned.
Checking the usage count is for buffer replacement since we are not
replacing it does not matter.
On Mon, 3 Jul 2023 at 21:16, jian he  wrote:
>
> On Mon, Jul 3, 2023 at 4:26 PM Palak Chaturvedi
>  wrote:
> >
> > Hi Thomas,
> > Thank you for your suggestions. I have added the sql in the meson
> > build as well.
> >
> > On Sat, 1 Jul 2023 at 03:39, Thomas Munro  wrote:
> > >
> > > On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
> > >  wrote:
> > > > pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
> > > > pg_buffercache where relfilenode =
> > > > pg_relation_filenode('pgbench_accounts'::regclass);
> > >
> > > Hi Palak,
> > >
> > > Thanks for working on this!  I think this will be very useful for
> > > testing existing workloads but also for testing future work on
> > > prefetching with AIO (and DIO), work on putting SLRUs (or anything
> > > else) into the buffer pool, nearby proposals for caching buffer
> > > mapping information, etc etc.
> > >
> > > Palak and I talked about this idea a bit last week (stimulated by a
> > > recent thread[1], but the topic has certainly come up before), and we
> > > discussed some different ways one could specify which pages are
> > > dropped.  For example, perhaps the pg_prewarm extension could have an
> > > 'unwarm' option instead.  I personally thought the buffer ID-based
> > > approach was quite good because it's extremely simple, while giving
> > > the user the full power of SQL to say which buffers.   Half a table?
> > > Visibility map?  Everything?  Root page of an index?  I think that's
> > > probably better than something that requires more code and
> > > complication but is less flexible in the end.  It feels like the right
> > > level of rawness for something primarily of interest to hackers and
> > > advanced users.  I don't think it matters that there is a window
> > > between selecting a buffer ID and invalidating it, for the intended
> > > use cases.  That's my vote, anyway, let's see if others have other
> > > ideas...
> > >
> > > We also talked a bit about how one might control the kernel page cache
> > > in more fine-grained ways for testing purposes, but it seems like the
> > > pgfincore project has that covered with its pgfadvise_willneed() and
> > > pgfadvise_dontneed().  IMHO that project could use more page-oriented
> > > operations (instead of just counts and coarse grains operations) but
> > > that's something that could be material for patches to send to the
> > > extension maintainers.  This work, in contrast, is more tangled up
> > > with bufmgr.c internals, so it feels like this feature belongs in a
> > > core contrib module.
> > >
> > > Some initial thoughts on the patch:
> > >
> > > I wonder if we should include a simple exercise in
> > > contrib/pg_buffercache/sql/pg_buffercache.sql.  One problem is that
> > > it's not guaranteed to succeed in general.  It doesn't wait for pins
> > > to go away, and it doesn't retry cleaning dirty buffers after one
> > > attempt, it just returns false, which I think is probably the right
> > > approach, but it makes the behaviour too non-deterministic for simple
> > > tests.  Perhaps it's enough to include an exercise where we call it a
> > > few times to hit a couple of cases, but not verify what effect it has.
> > >
> > > It should be restricted by role, but I wonder which role it should be.
> > > Testing for superuser is now out of fashion.
> > >
> > > Where the Makefile mentions 1.4--1.5.sql, the meson.build file needs
> > > to do the same.  That's because PostgreSQL is currently in transition
> > > from autoconf/gmake to meson/ninja[2], so for now we have to maintain
> > > both build systems.  That's why it fails to build in some CI tasks[3].
> > > You can enable CI in your own GitHub account if you want to run test
> > > builds on several operating systems, see [4] for info.
> > >
> > > [1] 
> > > https://www.postgresql.org/message-id/flat/CAFSGpE3y_oMK1uHhcHxGxBxs%2BKrjMMdGrE%2B6HHOu0vttVET0UQ%40mail.gmail.com
> > > [2] https://wiki.postgresql.org/wiki/Meson
> > > [3] http://cfbot.cputube.org/palak-chaturvedi.html
> > > [4] 
> > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;

Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-07-03 Thread Palak Chaturvedi
Hi Thomas,
Thank you for your suggestions. I have added the sql in the meson
build as well.

On Sat, 1 Jul 2023 at 03:39, Thomas Munro  wrote:
>
> On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
>  wrote:
> > pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
> > pg_buffercache where relfilenode =
> > pg_relation_filenode('pgbench_accounts'::regclass);
>
> Hi Palak,
>
> Thanks for working on this!  I think this will be very useful for
> testing existing workloads but also for testing future work on
> prefetching with AIO (and DIO), work on putting SLRUs (or anything
> else) into the buffer pool, nearby proposals for caching buffer
> mapping information, etc etc.
>
> Palak and I talked about this idea a bit last week (stimulated by a
> recent thread[1], but the topic has certainly come up before), and we
> discussed some different ways one could specify which pages are
> dropped.  For example, perhaps the pg_prewarm extension could have an
> 'unwarm' option instead.  I personally thought the buffer ID-based
> approach was quite good because it's extremely simple, while giving
> the user the full power of SQL to say which buffers.   Half a table?
> Visibility map?  Everything?  Root page of an index?  I think that's
> probably better than something that requires more code and
> complication but is less flexible in the end.  It feels like the right
> level of rawness for something primarily of interest to hackers and
> advanced users.  I don't think it matters that there is a window
> between selecting a buffer ID and invalidating it, for the intended
> use cases.  That's my vote, anyway, let's see if others have other
> ideas...
>
> We also talked a bit about how one might control the kernel page cache
> in more fine-grained ways for testing purposes, but it seems like the
> pgfincore project has that covered with its pgfadvise_willneed() and
> pgfadvise_dontneed().  IMHO that project could use more page-oriented
> operations (instead of just counts and coarse grains operations) but
> that's something that could be material for patches to send to the
> extension maintainers.  This work, in contrast, is more tangled up
> with bufmgr.c internals, so it feels like this feature belongs in a
> core contrib module.
>
> Some initial thoughts on the patch:
>
> I wonder if we should include a simple exercise in
> contrib/pg_buffercache/sql/pg_buffercache.sql.  One problem is that
> it's not guaranteed to succeed in general.  It doesn't wait for pins
> to go away, and it doesn't retry cleaning dirty buffers after one
> attempt, it just returns false, which I think is probably the right
> approach, but it makes the behaviour too non-deterministic for simple
> tests.  Perhaps it's enough to include an exercise where we call it a
> few times to hit a couple of cases, but not verify what effect it has.
>
> It should be restricted by role, but I wonder which role it should be.
> Testing for superuser is now out of fashion.
>
> Where the Makefile mentions 1.4--1.5.sql, the meson.build file needs
> to do the same.  That's because PostgreSQL is currently in transition
> from autoconf/gmake to meson/ninja[2], so for now we have to maintain
> both build systems.  That's why it fails to build in some CI tasks[3].
> You can enable CI in your own GitHub account if you want to run test
> builds on several operating systems, see [4] for info.
>
> [1] 
> https://www.postgresql.org/message-id/flat/CAFSGpE3y_oMK1uHhcHxGxBxs%2BKrjMMdGrE%2B6HHOu0vttVET0UQ%40mail.gmail.com
> [2] https://wiki.postgresql.org/wiki/Meson
> [3] http://cfbot.cputube.org/palak-chaturvedi.html
> [4] 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/ci/README;hb=HEAD


v1-0001-Invalidate-Buffer-By-Bufnum.patch
Description: Binary data


Extension Enhancement: Buffer Invalidation in pg_buffercache

2023-06-30 Thread Palak Chaturvedi
I hope this email finds you well. I am excited to share that I have
extended the functionality of the `pg_buffercache` extension by
implementing buffer invalidation capability, as requested by some
PostgreSQL contributors for improved testing scenarios.

This marks my first time submitting a patch to pgsql-hackers, and I am
eager to receive your expert feedback on the changes made. Your
insights are invaluable, and any review or comments you provide will
be greatly appreciated.

The primary objective of this enhancement is to enable explicit buffer
invalidation within the `pg_buffercache` extension. By doing so, we
can simulate scenarios where buffers are invalidated and observe the
resulting behavior in PostgreSQL.

As part of this patch, a new function or mechanism has been introduced
to facilitate buffer invalidation. I would like to hear your thoughts
on whether this approach provides a good user interface for this
functionality. Additionally, I seek your evaluation of the buffer
locking protocol employed in the extension to ensure its correctness
and efficiency.

Please note that I plan to add comprehensive documentation once the
details of this enhancement are agreed upon. This documentation will
serve as a valuable resource for users and contributors alike. I
believe that your expertise will help uncover any potential issues and
opportunities for further improvement.

I have attached the patch file to this email for your convenience.
Your valuable time and consideration in reviewing this extension are
sincerely appreciated.

Thank you for your continued support and guidance. I am looking
forward to your feedback and collaboration in enhancing the PostgreSQL
ecosystem.

The working of the extension:

1. Creating the extension pg_buffercache and then call select query on
a table and note the buffer to be cleared.
pgbench=# create extension pg_buffercache;
CREATE EXTENSION
pgbench=# select count(*) from pgbench_accounts;
 count

 10
(1 row)

pgbench=# SELECT *
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
 bufferid | relfilenode | reltablespace | reldatabase | relforknumber
| relblocknumber | isdirty | usagecount | pinning_backends
--+-+---+-+---++-++--
  233 |   16397 |  1663 |   16384 | 0
|  0 | f   |  1 |0
  234 |   16397 |  1663 |   16384 | 0
|  1 | f   |  1 |0
  235 |   16397 |  1663 |   16384 | 0
|  2 | f   |  1 |0
  236 |   16397 |  1663 |   16384 | 0
|  3 | f   |  1 |0
  237 |   16397 |  1663 |   16384 | 0
|  4 | f   |  1 |0


2. Clearing a single buffer by entering the bufferid.
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
 count
---
  1660
(1 row)

pgbench=# select pg_buffercache_invalidate(233);
 pg_buffercache_invalidate
---
 t
(1 row)

pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
 count
---
  1659
(1 row)

3. Clearing the entire buffer for a relation using the function.
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
 count
---
  1659
(1 row)

pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('pgbench_accounts'::regclass);
 count
---
  1659
(1 row)

pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
 count
---
 0
(1 row)


Best regards,
Palak


v1-0001-Invalidate-Buffer-By-Bufnum.patch
Description: Binary data