Re: The ultimate extension hook.

2020-10-25 Thread Daniel Wood
> On 10/23/2020 9:31 AM Jehan-Guillaume de Rorthais wrote: > [...] > * useless with encrypted traffic > > So, +1 for such hooks. > > Regards, Ultimately PostgreSQL is supposed to be extensible. I don't see an API hook as being some crazy idea even if some may not like what I might want to use

Re: The ultimate extension hook.

2020-09-23 Thread Daniel Wood
> On 09/23/2020 9:26 PM Tom Lane wrote: > ... > > The hook I'd like to see would be in the PostgresMain() loop > > for the API "firstchar" messages. > > What, to invent your own protocol? Where will you find client libraries > buying into that? No API/client changes are needed for: 1)

The ultimate extension hook.

2020-09-23 Thread Daniel Wood
Hooks exist all over PG for extensions to cover various specific usages. The hook I'd like to see would be in the PostgresMain() loop for the API "firstchar" messages. While I started just wanting the hook for the absolute minimum overhead to execute a function, even faster than fastpath, and
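
A rough sketch of what a hook on the "firstchar" message type could look like, modelled outside the server. Everything here is an assumption for illustration: firstchar_hook, firstchar_hook_type, handle_message() and the Handler enum are hypothetical names, not existing PostgreSQL symbols, and the real PostgresMain() loop does much more than this switch. Only the message type bytes ('Q' simple query, 'F' fastpath function call, 'X' terminate) come from the wire protocol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Who ended up handling a message in this toy model. */
typedef enum
{
    HANDLED_BY_EXTENSION,
    HANDLED_SIMPLE_QUERY,
    HANDLED_FASTPATH,
    HANDLED_TERMINATE,
    HANDLED_UNKNOWN
} Handler;

/* Hypothetical hook type: return true if the extension consumed the message. */
typedef bool (*firstchar_hook_type) (int firstchar, const char *body, size_t len);

/* Hypothetical global; NULL means no extension installed a hook. */
static firstchar_hook_type firstchar_hook = NULL;

/* Stand-in for the built-in switch on the message type byte. */
static Handler
builtin_dispatch(int firstchar)
{
    switch (firstchar)
    {
        case 'Q':
            return HANDLED_SIMPLE_QUERY;
        case 'F':
            return HANDLED_FASTPATH;
        case 'X':
            return HANDLED_TERMINATE;
        default:
            return HANDLED_UNKNOWN;
    }
}

/* One iteration of the simplified loop: the hook gets the first look. */
static Handler
handle_message(int firstchar, const char *body, size_t len)
{
    if (firstchar_hook != NULL && firstchar_hook(firstchar, body, len))
        return HANDLED_BY_EXTENSION;
    return builtin_dispatch(firstchar);
}

/* Example extension callback: intercepts only fastpath ('F') messages. */
static bool
example_hook(int firstchar, const char *body, size_t len)
{
    (void) body;
    (void) len;
    return firstchar == 'F';
}

/* Usage: behaviour without and with the hook installed. */
static void
hook_demo(void)
{
    assert(handle_message('F', "", 0) == HANDLED_FASTPATH);
    firstchar_hook = example_hook;
    assert(handle_message('F', "", 0) == HANDLED_BY_EXTENSION);
    assert(handle_message('Q', "", 0) == HANDLED_SIMPLE_QUERY);
    firstchar_hook = NULL;
}
```

With no hook installed the cost is a single NULL check per message, which is the same pattern the existing hooks in the server follow.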

Re: Reduce/eliminate the impact of FPW

2020-08-03 Thread Daniel Wood
> On 08/03/2020 8:26 AM Robert Haas wrote: ... > I think this is what's called a double-write buffer, or what was tried > some years ago under that name. A significant problem is that you > have to fsync() the double-write buffer before you can write the WAL. I don't think it does need to be

Reduce/eliminate the impact of FPW

2020-08-03 Thread Daniel Wood
I thought that the biggest reason for the pgbench RW slowdown during a checkpoint was the flood of dirty page writes increasing the COMMIT latency. It turns out that the documentation, which states that FPWs start "after a checkpoint", really means after a checkpoint starts. And this is the really
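
To make the "after a checkpoint starts" point concrete, here is a minimal model of the decision as I understand it: a page gets a full page image on its first modification once its LSN is at or behind the redo pointer, and the redo pointer moves when the checkpoint begins, not when it completes. The Lsn typedef, the function name and the numeric LSN values are made up for this sketch; it is not the server's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* simplified stand-in for a WAL position */

/*
 * A page needs a full page image if full_page_writes is on and the page
 * has not been modified since the current redo pointer was established.
 */
static bool
needs_full_page_image(Lsn page_lsn, Lsn redo_ptr, bool full_page_writes)
{
    return full_page_writes && page_lsn <= redo_ptr;
}

/* Usage: one page followed across the start of a checkpoint. */
static void
fpw_demo(void)
{
    Lsn         page_lsn = 100; /* last modification of the page */
    Lsn         redo_ptr = 50;  /* redo pointer of the previous checkpoint */

    /* Page already modified since the old redo pointer: no image needed. */
    assert(!needs_full_page_image(page_lsn, redo_ptr, true));

    /* A checkpoint STARTS: the redo pointer advances immediately. */
    redo_ptr = 200;

    /* First modification after the start needs a full page image ... */
    assert(needs_full_page_image(page_lsn, redo_ptr, true));
    page_lsn = 250;

    /* ... later modifications of the same page do not. */
    assert(!needs_full_page_image(page_lsn, redo_ptr, true));
}
```

In this model the burst of full page images begins the moment the redo pointer moves, so it overlaps the whole checkpoint rather than following it.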

Wait profiling

2020-07-10 Thread Daniel Wood
After nearly 5 years does something like the following yet exist? https://www.postgresql.org/message-id/559d4729.9080...@postgrespro.ru I feel that it would be useful to have the following two things. One PG enhancement and one standard extension. 1) An option to "explain" to produce a wait

Re: 'Invalid lp' during heap_xlog_delete

2019-12-10 Thread Daniel Wood
> On December 6, 2019 at 3:06 PM Andres Freund wrote: ... > > crash > > smgrtruncate - Not reached > > This seems like a somewhat confusing description to me, because > smgrtruncate() is what calls DropRelFileNodeBuffers(). I assume what you > mean by "smgrtruncate" is not the function, but the

Re: 'Invalid lp' during heap_xlog_delete

2019-11-18 Thread Daniel Wood
> <mich...@paquier.xyz> wrote: > > > On Thu, Nov 14, 2019 at 07:38:19PM -0800, Daniel Wood wrote: > > > > Sorry I missed one thing. Turn off full page writes. > > > > > Hmm. Linux FSes use typically 4kB pages. I'

Re: 'Invalid lp' during heap_xlog_delete

2019-11-14 Thread Daniel Wood
Sorry I missed one thing. Turn off full page writes. I'm running in an env. with atomic 8K writes. > On November 12, 2019 at 6:23 PM Daniel Wood wrote: > > It's been tedious to get it exactly right but I think I got it. FYI, I > was delayed because today we had yet another

Re: 'Invalid lp' during heap_xlog_delete

2019-11-12 Thread Daniel Wood
<mich...@paquier.xyz> wrote: > > > On Fri, Nov 08, 2019 at 06:44:08PM -0800, Daniel Wood wrote: > > > > I repro'ed on PG11 and PG10 STABLE but several months old. > > I looked at 6d05086 but it doesn't address the core issue

Re: 'Invalid lp' during heap_xlog_delete

2019-11-08 Thread Daniel Wood
the truncate seems to plug the hole. > On November 8, 2019 at 5:39 PM Michael Paquier <mich...@paquier.xyz> wrote: > > > On Fri, Nov 08, 2019 at 12:46:51PM -0800, Daniel Wood wrote: > > > > Is DropRelFileN

'Invalid lp' during heap_xlog_delete

2019-11-08 Thread Daniel Wood
Page on disk has empty lp 1
* Insert into page lp 1
checkpoint START. Redo eventually starts here.
** Delete all rows on page.
autovac truncate
DropRelFileNodeBuffers - dirty page NOT written. lp 1 on disk still empty
checkpoint completes
crash
smgrtruncate - Not reached
heap_xlog_delete

Re: BTP_DELETED leaf still in tree

2019-10-10 Thread Daniel Wood
> On October 10, 2019 at 1:18 PM Peter Geoghegan wrote: > > > On Thu, Oct 10, 2019 at 12:48 PM Daniel Wood wrote: > > Update query stuck in a loop. Looping in _bt_moveright(). > > You didn't say which PostgreSQL versions were involved, and if the > database was eve

pgbench prints suspect tps numbers

2019-06-24 Thread Daniel Wood
Short benchmark runs are bad if the runs aren't long enough to produce consistent results. Having to do long runs because a benchmarking tool 'converges to reality' over time in reporting a tps number, due to miscalculation, is also bad. I want to measure TPS at a particular connection count.

Re: Skylake-S warning

2018-10-03 Thread Daniel Wood
One other thought. Could we update pgxact->xmin less often? What would be the impact of this lower bound being lower than it would normally be with the existing scheme? Yes, it needs to be moved forward "occasionally". FYI, be careful with padding PGXACTs to a full cache line. With 1024

Re: Skylake-S warning

2018-10-03 Thread Daniel Wood
> On October 3, 2018 at 3:55 PM Andres Freund wrote: > In the thread around > https://www.postgresql.org/message-id/20160411214029.ce3fw6zxim5k6...@alap3.anarazel.de > I'd found doing more aggressive padding helped a lot. Unfortunately I > didn't pursue this further :( Interesting. Looks

Skylake-S warning

2018-10-03 Thread Daniel Wood
If you are running benchmarks, or are a customer currently impacted by GetSnapshotData() on high-end multisocket systems, be wary of Skylake-S. Performance differences of nearly 2X can be seen on select-only pgbench due to nothing but unlucky choices for max_connections. Scale 1000,

GetSnapshotData round two(for me)

2018-09-24 Thread Daniel Wood
I was about to suggest creating a single shared snapshot instead of having multiple backends compute what is essentially the same snapshot. Luckily, before posting, I discovered Avoiding repeated snapshot computation

Re: On the need for a snapshot in exec_bind_message()

2018-09-05 Thread Daniel Wood
> > Queries stop getting re-optimized after 5 times, unless better plans are to > > be had. In the absence of schema changes or changing search path why is > > the snapshot needed? > > The snapshot has little to do with the query plan, usually. It's about > what view of the database the

Re: On the need for a snapshot in exec_bind_message()

2018-09-05 Thread Daniel Wood
> > exec_bind_message() > > PushActiveSnapshot(GetTransactionSnapshot()); > > > If there were no input functions, that needed this, nor reparsing or > > reanalyzing needed, and we knew this up front, it'd be a huge win. > > Unfortunately, that's not the case, so I think trying to get

On the need for a snapshot in exec_bind_message()

2018-09-05 Thread Daniel Wood
In particular: exec_bind_message() PushActiveSnapshot(GetTransactionSnapshot()); Suppressing this, I've achieved over 1.9 M TXNs a second on select-only pgbench on a 48-core box. It is about 50% faster with this change. The CPU usage of GetSnapshotData drops from about 22% to

First steps to being a contributer

2018-08-27 Thread Daniel Wood
Having quit Amazon, where I was doing Postgres development, I've started looking at various things I might work on for fun. One thought is to start with something easy like the scalability of GetSnapshotData(). :-) I recently found it interesting to examine performance while running near 1