On Tue, May 14, 2024 at 8:19 PM Robert Haas wrote:
>
> I looked at your version and wrote something that is shorter and
> doesn't touch any existing text. Here it is.
Hi Robert, you are a real tactician here - thanks for whatever
references the original problem! :) Maybe just a slight hint nearby
Hi Peter!
On Sun, May 12, 2024 at 10:33 PM Peter Eisentraut wrote:
>
> On 07.05.24 09:43, Jakub Wartak wrote:
> > NOTE: in case one will be testing this: one cannot ./configure with
> > --enable-debug as it prevents the compiler optimizations that actually
> > end up
Hi Tom and -hackers!
On Thu, Mar 28, 2024 at 7:36 PM Tom Lane wrote:
>
> Jakub Wartak writes:
> > While chasing some other bug I've learned that backtrace_functions
> > might be misleading with top elog/ereport() address.
>
> That was understood from the beginning
Hi Ashutosh & hackers,
On Mon, Apr 15, 2024 at 9:00 AM Ashutosh Bapat
wrote:
>
> Here's patch with
>
[..]
> Adding to the next commitfest but better to consider this for the next set of
> minor releases.
1. The patch does not pass cfbot -
https://cirrus-ci.com/task/5486258451906560 on master
Hi,
> My understanding of Majid's use-case for tuning MAX_SEND_SIZE is that the
> bottleneck is storage, not network. The reason MAX_SEND_SIZE affects that is
> that it determines the max size passed to WALRead(), which in turn determines
> how much we read from the OS at once. If the storage
On Tue, Apr 23, 2024 at 2:24 AM Michael Paquier wrote:
>
> On Mon, Apr 22, 2024 at 03:40:01PM +0200, Majid Garoosi wrote:
> > Any news, comments, etc. about this thread?
>
> FWIW, I'd still be in favor of doing a GUC-ification of this part, but
> at this stage I'd need more time to do a proper
On Thu, Apr 4, 2024 at 9:11 PM Tomas Vondra
wrote:
>
> On 4/4/24 19:38, Robert Haas wrote:
> > Hi,
> >
> > Yesterday, Tomas Vondra reported to me off-list that he was seeing
> > what appeared to be data corruption after taking and restoring an
> > increme
fast ability to "restore" the clone rather than copying the
data from somewhere else)
- pg_basebackup without that would be unusable without space savings
(e.g. imagine daily backups @ 10+TB DWHs)
> On 4/3/24 15:39, Jakub Wartak wrote:
> > On Mon, Apr 1, 2024 at 9:46 PM Tomas Vondra
On Mon, Apr 1, 2024 at 9:46 PM Tomas Vondra
wrote:
>
> Hi,
>
> I've been running some benchmarks and experimenting with various stuff,
> trying to improve the poor performance on ZFS, and the regression on XFS
> when using copy_file_range. And oh boy, did I find interesting stuff ...
[..]
Hi Andrey,
On Thu, Mar 28, 2024 at 1:09 PM Andrey M. Borodin wrote:
>
>
>
> > On 8 Aug 2023, at 12:31, John Naylor wrote:
> >
> > > > Also the shared counter is the cause of the slowdown, but not the
> > > > reason for the numeric limit.
> > >
> > > Isn't it both? typedef Oid is unsigned int =
[iostat excerpt for nvme0n1; columns (f/s, f_await, aqu-sz, %util) misaligned in extraction, %util ~85.2]
So in short it looks good to me.
-Jakub Wartak.
Hi -hackers,
While chasing some other bug I've learned that backtrace_functions
might be misleading with top elog/ereport() address.
Reproducer:
# using Tom's reproducer on master:
wget
https://www.postgresql.org/message-id/attachment/112394/ri-collation-bug-example.sql
echo '' >>
On Tue, Mar 26, 2024 at 7:03 PM Tomas Vondra
wrote:
[..]
>
> That's really strange.
Hi Tomas, but it looks like it's fixed now :)
> > --manifest-checksums=NONE --copy-file-range without v20240323-2-0002:
> > 27m23.887s
> > --manifest-checksums=NONE --copy-file-range with v20240323-2-0002 and
>
On Sat, Mar 23, 2024 at 6:57 PM Tomas Vondra
wrote:
> On 3/23/24 14:47, Tomas Vondra wrote:
> > On 3/23/24 13:38, Robert Haas wrote:
> >> On Fri, Mar 22, 2024 at 8:26 PM Thomas Munro
> >> wrote:
[..]
> > Yeah, that's in write_reconstructed_file() and the patch does not touch
> > that at all. I
Hi Tomas,
> I took a quick look at the remaining part adding copy_file_range to
> pg_combinebackup. The patch no longer applies, so I had to rebase it.
> Most of the issues were trivial, but I had to fix a couple missing
> prototypes - I added them to copy_file.h/c, mostly.
>
> 0001 is the
On Fri, Mar 1, 2024 at 3:58 PM Tomas Vondra
wrote:
[..]
> TBH I don't have a clear idea what to do. It'd be cool to have at least
> some benefits in v17, but I don't know how to do that in a way that
> would be useful in the future.
>
> For example, the v20240124 patch implements this in the
fadvise64(45, 335872, 8192, POSIX_FADV_WILLNEED) = 0
pread64(45, "\0\0\0\0\250\233r\4\0\0\4\0\370\1\0\2\0 \4
\0\0\0\0\300\237t\0\200\237t\0"..., 8192, 360448) = 8192
fadvise64(45, 524288, 8192, POSIX_FADV_WILLNEED) = 0
fadvise64(45, 352256, 8192, POSIX_FADV_WILLNEED) = 0
pread64(45, "\0\0\0\0\2
Hi Daniel,
On Tue, Jan 30, 2024 at 3:29 PM Daniel Verite wrote:
> PFA a rebased version.
Thanks for the patch! I've tested it using my original reproducer and
it now works great against the original problem description. I've
taken a quick look at the patch; it looks good to me. I've tested
On Fri, Jan 12, 2024 at 7:33 AM Bharath Rupireddy
wrote:
>
> On Wed, Jan 10, 2024 at 11:43 AM Tom Lane wrote:
> >
> > Bharath Rupireddy writes:
> > > On Wed, Jan 10, 2024 at 10:00 AM Tom Lane wrote:
> > >> Maybe. I bet just bumping up the constant by 2X or 4X or so would get
> > >> most of
Hi Robert,
On Tue, Dec 19, 2023 at 9:36 PM Robert Haas wrote:
>
> On Fri, Dec 15, 2023 at 5:36 AM Jakub Wartak
> wrote:
> > I've played with initdb/pg_upgrade (17->17) and I don't get a DBID
> > mismatch (of course they do differ after initdb), bu
Hi Robert,
On Wed, Dec 13, 2023 at 2:16 PM Robert Haas wrote:
>
>
> > > not even in case of an intervening
> > > timeline switch. So, all of the errors in this function are warning
> > > you that you've done something that you really should not have done.
> > > In this particular case, you've
Hi Robert,
On Mon, Dec 11, 2023 at 6:08 PM Robert Haas wrote:
>
> On Fri, Dec 8, 2023 at 5:02 AM Jakub Wartak
> wrote:
> > While we are at it, maybe around the below in PrepareForIncrementalBackup()
> >
> > if (tlep[i] == NULL)
> >
On Thu, Dec 7, 2023 at 4:15 PM Robert Haas wrote:
Hi Robert,
> On Thu, Dec 7, 2023 at 9:42 AM Jakub Wartak
> wrote:
> > Comment: I was wondering if it wouldn't make some sense to teach
> > pg_resetwal to actually delete all WAL summaries after any
> > WAL/contr
On Tue, Dec 5, 2023 at 7:11 PM Robert Haas wrote:
[..v13 patchset]
The results with v13 patchset are following:
* - requires checkpoint on primary when doing incremental on standby
when it's too idle, this was explained by Robert in [1], something AKA
too-fast-incremental backup due to
On Mon, Nov 20, 2023 at 4:43 PM Robert Haas wrote:
>
> On Fri, Nov 17, 2023 at 5:01 AM Alvaro Herrera
> wrote:
> > I made a pass over pg_combinebackup for NLS. I propose the attached
> > patch.
>
> This doesn't quite compile for me so I changed a few things and
> incorporated it. Hopefully I
Hi Robert,
[..spotted the v9 patchset..]
so I've spent some time playing still with patchset v8 (without the
6/6 testing patch related to wal_level=minimal), with the exception of
- patchset v9 - marked otherwise.
1. At compile time there were 2 warnings about variable shadowing (at
least with gcc
On Mon, Oct 30, 2023 at 6:46 PM Robert Haas wrote:
>
> On Thu, Sep 28, 2023 at 6:22 AM Jakub Wartak
> wrote:
> > If that is still an area open for discussion: wouldn't it be better to
> > just specify LSN as it would allow resyncing standby across major lag
> > wh
Hi Robert,
On Wed, Oct 4, 2023 at 10:09 PM Robert Haas wrote:
>
> On Tue, Oct 3, 2023 at 2:21 PM Robert Haas wrote:
> > Here's a new patch set, also addressing Jakub's observation that
> > MINIMUM_VERSION_FOR_WAL_SUMMARIES needed updating.
>
> Here's yet another new version.[..]
Okay, so
On Fri, Sep 29, 2023 at 4:00 AM Michael Paquier wrote:
>
> On Thu, Sep 28, 2023 at 11:01:14AM +0200, Jakub Wartak wrote:
> > v3 attached. I had a problem coming out with a better error message,
> > so suggestions are welcome. The cast still needs to be present as per
> >
On Wed, Aug 30, 2023 at 4:50 PM Robert Haas wrote:
[..]
I've played a little bit more with this second batch of patches on
e8d74ad625f7344f6b715254d3869663c1569a51 @ 31Aug (days before the wait
events refactor):
test_across_wallevelminimal.sh
test_many_incrementals_dbcreate.sh
test_many_incrementals.sh
On Thu, Sep 28, 2023 at 12:53 AM Michael Paquier wrote:
>
> On Wed, Sep 27, 2023 at 10:29:25AM -0700, Andres Freund wrote:
> > I don't think going for size_t is a viable path for fixing this. I'm pretty
> > sure the initial patch would trigger a type mismatch from guc_tables.c - we
> > don't have
On Wed, Sep 27, 2023 at 10:08 AM Michael Paquier wrote:
>
> On Wed, Sep 27, 2023 at 08:41:55AM +0200, Jakub Wartak wrote:
> > Attached patch adjusts pgstat_track_activity_query_size to be of
> > size_t from int and fixes the issue.
>
> This cannot be backpatched, and us
to (int) * (int), while
MemoryContextAllocHuge() allows taking Size(size_t) as parameter. I
get similar behaviour with:
size_t val = (int)1048576 * (int)3022;
Attached patch adjusts pgstat_track_activity_query_size to be of
size_t from int and fixes the issue.
Regards,
-Jakub Wartak.
0001-Adjust
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed
I've tested the patch on 17devel/master and it is my feeling -
On Mon, Jul 10, 2023 at 6:24 PM Andres Freund wrote:
>
> Hi,
>
> On 2023-07-03 11:53:56 +0200, Jakub Wartak wrote:
> > Out of curiosity I've tried and it is reproducible as you have stated : XFS
> > @ 4.18.0-425.10.1.el8_7.x86_64:
> >...
> > According t
On Tue, Jun 13, 2023 at 10:20 AM John Naylor
wrote:
Hi John,
v3 is attached for review.
> >
> >-
> >+ see note below on TOAST
>
> Maybe:
> "further limited by the number of TOAST-ed values; see note below"
Fixed.
> > I've put it wrongly, I meant that pg_largeobject also
Hi Masahiko,
Out of curiosity I've tried and it is reproducible as you have stated: XFS
@ 4.18.0-425.10.1.el8_7.x86_64:
[root@rockyora ~]# time ./test test.1 1
total 20
fallocate 20
filewrite 0
real    0m5.868s
user    0m0.035s
sys     0m5.716s
[root@rockyora ~]# time
to hack a palloc()
a little, but that probably has too big an overhead, right? (just
thinking out loud).
-Jakub Wartak.
tch passed all my very limited tests along with
make check-world. Patch looks good to me on the surface from a
usability point of view. I haven't looked at the code, so the patch
might still need an in-depth review.
Regards,
-Jakub Wartak.
t pg_largeobject also consume OID
and as such are subject to 32TB limit.
>
> +
> + large objects number
>
> "large objects per database"
Fixed.
> + subject to the same limitations as rows per
> table
>
> That implies table size is the only factor. Max OID is also a factor, which
> was your stated reason to include LOs here in the first place.
Exactly..
Regards,
-Jakub Wartak.
v2-0001-doc-Add-some-OID-TOAST-related-limitations-to-the.patch
Description: Binary data
Hi,
>> These 2 discussions show that it's a painful experience to run into
>> this problem, and that the hackers have ideas on how to fix it, but
>> those fixes haven't materialized for years. So I would say that, yes,
>> this info belongs in the hard-limits section, because who knows how
>> long
Hi -hackers,
I would like to ask if it wouldn't be a good idea to copy the
https://wiki.postgresql.org/wiki/TOAST#Total_table_size_limit
discussion (out-of-line OID usage per TOAST-ed column / potential
limitation) to the official "Appendix K. PostgreSQL Limits", with also a
little bonus mentioning
Hi, I've tested the attached patch by Justin and it applied almost
cleanly to master, but there was a tiny typo and make
postgres-A4.pdf didn't want to run:
Note that creating a partition using PARTITION OF
=> (note lack of closing literal) =>
Note that creating a partition using PARTITION OF
On Thu, Feb 2, 2023 at 11:03 AM Tomas Vondra
wrote:
> > I agree that some other concurrent backend's
> > COMMIT could fsync it, but I was wondering if that's sensible
> > optimization to perform (so that issue_fsync() would be called for
> > only commit/rollback records). I can imagine a
On Wed, Feb 1, 2023 at 2:14 PM Tomas Vondra
wrote:
> > Maybe we should avoid calling fsyncs for WAL throttling? (by teaching
> > HandleXLogDelayPending()->XLogFlush()->XLogWrite() to NOT to sync when
> > we are flushing just because of WAL thortting ?) Would that still be
> > safe?
>
> It's not
On Mon, Jan 30, 2023 at 9:16 AM Bharath Rupireddy
wrote:
Hi Bharath, thanks for reviewing.
> I think measuring the number of WAL flushes with and without this
> feature that the postgres generates is great to know this feature
> effects on IOPS. Probably it's even better with variations in
>
Hi Bharath,
On Fri, Jan 27, 2023 at 12:04 PM Bharath Rupireddy
wrote:
>
> On Fri, Jan 27, 2023 at 2:03 PM Alvaro Herrera
> wrote:
> >
> > On 2023-Jan-27, Bharath Rupireddy wrote:
> >
> > > Looking at the patch, the feature, in its current shape, focuses on
> > > improving replication lag (by
Hi,
v2 is attached.
On Thu, Jan 26, 2023 at 4:49 PM Andres Freund wrote:
> Huh? Why did you remove the GUC?
After reading previous threads, my optimism about ever getting it into
a shape that would be widely accepted degraded significantly (mainly due
to the discussion of the wider category of 'WAL I/O
> On 1/25/23 20:05, Andres Freund wrote:
> > Hi,
> >
> > Such a feature could be useful - but I don't think the current place of
> > throttling has any hope of working reliably:
[..]
> > You're blocking in the middle of an XLOG insertion.
[..]
> Yeah, I agree the sleep would have to happen
0
Time: 22737.729 ms (00:22.738)
Without this feature (or with synchronous_commit_flush_wal_after=0)
the TCP SendQ on the walsender-->walreceiver socket keeps growing, so
any subsequent sendto() by OLTP backends/walwriter ends up queued for
too long, causing stalls of activity.
-Jakub Warta
priority of this, how about
adding it as a TODO wiki item then and maybe adding just some warning
instead? I've intentionally avoided parsing grammar and regexp so it's
not perfect (not that I do care about this too much either, as web
crawlers already have indexed this $thread). BTW I've found tw
repeat('a', 100) ||
data.Total || repeat('b', 800) as total_pat from data;" | wc -l
2000
postgres@hive:~$
Regards,
-Jakub Wartak.
0001-psql-allow-CTE-queries-to-be-executed-also-using-cur.patch
Description: Binary data
d=indexRelationId@entry=0,
parentIndexId=parentIndexId@entry=0
-Jakub Wartak.
On Fri, Nov 25, 2022 at 9:48 AM Tomas Vondra
wrote:
>
>
>
> On 11/18/22 15:43, Tom Lane wrote:
> > David Geier writes:
> >> On a different note: are we frequently running our tests suites with
> >
Hi all,
apologies, the patch was rushed too quickly - my bad. I'm attaching a
fixed one as v0004 (as it is the 4th patch floating around here).
-Jakub Wartak
On Mon, Nov 21, 2022 at 9:55 PM Robert Haas wrote:
>
> On Mon, Nov 21, 2022 at 1:17 PM Andres Freund wrote:
> > On Novembe
Hi,
Draft version of the patch attached (it is based on Simon's)
I would be happier if we could make that #define into a GUC (just in
case), although I do understand the effort to reduce the number of
various knobs (as their high count causes its own complexity).
-Jakub Wartak.
On Mon, Nov 21
hints cleaning was not kicking in, but maybe it was, but given
the scale of the problem it was not helping much).
-Jakub Wartak.
[1] -
https://www.postgresql.org/message-id/flat/54446AE2.6080909%40BlueTreble.com#f436bb41cf044b30eeec29472a13631e
[2] -
https://www.postgresql.org/message-id/flat/db7111
Hey Andrey,
> > On 23 Jun 2022, at 13:50, Jakub Wartak
> wrote:
> >
> > Thoughts?
> The patch leaves the 1st 128KB chunk unprefetched. Is it worth adding an extra
> branch for 120KB after the 1st block when readOff==0?
> Or maybe do
> + posix_fadvis
>> > On 21 Jun 2022, at 16:59, Jakub Wartak wrote:
>> Oh, wow, your benchmarks show really impressive improvement.
>>
>> > I think that 1 additional syscall is not going to be cheap just for
>> > non-standard OS configurations
>> Also we ca
> On Tue, Jun 21, 2022 at 10:33 PM Jakub Wartak
> wrote:
> > > > Maybe the important question is why would be readahead mechanism
> > > > be
> > > disabled in the first place via /sys | blockdev ?
> > >
> > > Because database should kn
> > Maybe the important question is why would be readahead mechanism be
> disabled in the first place via /sys | blockdev ?
>
> Because database should know better than OS which data needs to be
> prefetched and which should not. Big OS readahead affects index scan
> performance.
OK fair point,
>> > On 21 Jun 2022, at 12:35, Amit Kapila wrote:
>> >
>> > I wonder if the newly introduced "recovery_prefetch" [1] for PG-15 can
>> > help your case?
>>
>> AFAICS recovery_prefetch tries to prefetch main fork, but does not try to
>> prefetch WAL itself before reading it. Kirill is trying to
> On 6/9/22 13:23, Jakub Wartak wrote:
> >>>>>>> The really
> >>>>>> puzzling thing is why is the filesystem so much slower for
> >>>>>> smaller pages. I mean, why would writing 1K be 1/3 of writing 4K?
> >>
> > The really
> puzzling thing is why is the filesystem so much slower for smaller
> pages. I mean, why would writing 1K be 1/3 of writing 4K?
> Why would a filesystem have such effect?
> >>>
> >>> Ha! I don't care at this point as 1 or 2kB seems too small to handle
> >>> many
Hi, got some answers!
TL;DR: for fio it would make sense to use many stress files (instead of 1)
and the same for numjobs ~ vCPU count, to avoid various pitfalls.
> >> The really
> >> puzzling thing is why is the filesystem so much slower for smaller
> >> pages. I mean, why would writing 1K be 1/3 of
> >> The attached patch is a trivial version that waits until we're at
> >> least
> >> 32 pages behind the target, and then prefetches all of them. Maybe give it
> >> a
> try?
> >> (This pretty much disables prefetching for e_i_c below 32, but for an
> >> experimental patch that's enough.)
> >
>
Hi,
> The really
> puzzling thing is why is the filesystem so much slower for smaller pages. I
> mean,
> why would writing 1K be 1/3 of writing 4K?
> Why would a filesystem have such effect?
Ha! I don't care at this point as 1 or 2kB seems too small to handle many real
world scenarios ;)
> >
Hi Tomas,
> > I have a machine here with 1 x PCIe 3.0 NVMe SSD and also 1 x PCIe 4.0
> > NVMe SSD. I ran a few tests to see how different values of
> > effective_io_concurrency would affect performance. I tried to come up
> > with a query that did little enough CPU processing to ensure that I/O
>
[..]
>I doubt we could ever
> make the default smaller than it is today, as nobody would be able to
> insert rows larger than 4 kilobytes into a table anymore.
Add error "values larger than 1/3 of a buffer page cannot be indexed" to that
list...
-J.
Hi Tomas,
> Well, there's plenty of charts in the github repositories, including the
> charts I
> think you're asking for:
Thanks.
> I also wonder how is this related to filesystem page size - in all the
> benchmarks I
> did I used the default (4k), but maybe it'd behave if the filesystem
Hi Tomas,
> Hi,
>
> At on of the pgcon unconference sessions a couple days ago, I presented a
> bunch of benchmark results comparing performance with different data/WAL
> block size. Most of the OLTP results showed significant gains (up to 50%) with
> smaller (4k) data pages.
Nice. I just saw
Hi Nathan,
> > NVMe devices have a maximum queue length of 64k:
[..]
> > but our effective_io_concurrency maximum is 1,000:
[..]
> > Should we increase its maximum to 64k? Backpatched? (SATA has a
> > maximum queue length of 256.)
>
> If there are demonstrable improvements with higher values,
Hi Pavel,
> I don't have debug symbols, so I have no more details now
> Breakpoint 1 at 0x7f557f0c16c0
> (gdb) c
> Continuing.
> Breakpoint 1, 0x7f557f0c16c0 in mmap64 () from /lib64/libc.so.6
> (gdb) bt
> #0 0x7f557f0c16c0 in mmap64 () from /lib64/libc.so.6
> #1 0x7f557f04dd91 in
> I do agree that the perf report does indicate that the extra time is taken
> due to
> some large amount of memory being allocated. I just can't quite see how that
> would happen in Memoize given that
> estimate_num_groups() clamps the distinct estimate as the number of input
> rows, which is 91
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
I've retested v15 of the patch with everything that came to my
Hi Kyotaro,
> At Tue, 21 Dec 2021 13:07:28 +0000, Jakub Wartak
> wrote in
> > So what's suspicious is that 122880 -> 0 file size truncation. I've
> > investigated WAL and it seems to contain TRUNCATE records after logged
> FPI images, so when the crash recovery would ki
Hi Kyotaro,
> I took a bit too long detour but the patch gets to pass make-world for me.
Good news, v10 passes all the tests for me (including TAP recover ones).
There's a major problem I think:
drop table t6;
create unlogged table t6 (id bigint, t text);
create sequence s1;
insert into t6
Hi Kyotaro,
> At Mon, 20 Dec 2021 17:39:27 +0900 (JST), Kyotaro Horiguchi
> wrote in
> > At Mon, 20 Dec 2021 07:59:29 +, Jakub Wartak
> > wrote in
> > > BTW fast feedback regarding that ALTER patch (there were 4 unlogged
> tables):
> > > # ALTER
Hi Kyotaro, I'm glad you are still into this
> I didn't register for some reasons.
Right now in v8 there's a typo in ./src/backend/catalog/storage.c:
storage.c: In function 'RelationDropInitFork':
storage.c:385:44: error: expected statement before ')' token
pending->unlink_forknum !=
> Justin wrote:
> On Fri, Dec 17, 2021 at 09:10:30AM +, Jakub Wartak wrote:
> > As the thread didn't get a lot of traction, I've registered it into current
> commitfest
> https://commitfest.postgresql.org/36/
dy for review' state.
I think it behaves as an almost-finished one, and after reading all those
discussions about this feature, going back over a 10+ year span, and a
lot of failed effort towards wal_level=noWAL, I think it would be nice to
finally start getting some of it into core.
-Jakub Wartak.
v7-0001-In-place-table-persistence-change-with-new-comman.patch
Description: v7-0001-In-place-table-persistence-change-with-new-comman.patch
doesn't feel like it is
going to make stuff crash, so again I think it is a good idea.
-Jakub Wartak.
On 2021-Sep-25, Alvaro Herrera wrote:
>> On 2021-Sep-24, Alvaro Herrera wrote:
>>
>> > Here's the set for all branches, which I think are really final, in
>> > case somebody wants to play and reproduce their respective problem
>> scenarios.
>>
>> I forgot to mention that I'll wait until 14.0 is
Hi Álvaro, -hackers,
> I attach the patch with the change you suggested.
I've given the v02 patch a shot on top of REL_12_STABLE (already including
5065aeafb0b7593c04d3bc5bc2a86037f32143fc). Previously (yesterday), without
the v02 patch, I was getting standby corruption always via
> On Fri, Jul 30, 2021 at 4:00 PM Andres Freund wrote:
> > I don't agree with that? If (user+system) << wall then it is very
> > likely that recovery is IO bound. If system is a large percentage of
> > wall, then shared buffers is likely too small (or we're replacing the
> > wrong
> > buffers)
Hi, asking out of pure technical curiosity about "the rhinoceros" - what kind
of animal is it? Physical box or VM? How could one get dmidecode(1) / dmesg(1)
/ mcelog(1) output from what's out there (e.g. does it run ECC or not?)
-J.
> -Original Message-
> From: Alvaro Herrera
> Sent:
Hey David,
> I think you'd have to batch by filenode and transaction in that case. Each
> batch might be pretty small on a typical OLTP workload, so it might not help
> much there, or it might hinder.
True, it is very workload dependent (I was chasing mainly INSERTs multiValues,
Hi David, Alvaro, -hackers
> Hi David,
>
> You're probably aware of this, but just to make it explicit: Jakub Wartak was
> testing performance of recovery, and one of the bottlenecks he found in
> some of his cases was dynahash as used by SMgr. It seems quite possible
> th
Hi Ray,
> So can we delete the limit of ArchiveRecoveryRequested, and enable launch
> bgwriter in master node ?
Please take a look on https://commitfest.postgresql.org/29/2706/ and the
related email thread.
-J.
> > I'm drawing a blank on trivial candidate uses for preadv(), without
> > infrastructure from later patches.
>
> Can't immediately think of something either.
This might not be that trivial, but maybe acquire_sample_rows() from
analyze.c?
Please note however there's patch
posix_fadvise(POSIX_FADV_WILLNEED) is such a cheap syscall.
-J.
From: Stephen Frost
Sent: Tuesday, November 3, 2020 6:47 PM
To: Jakub Wartak
Cc: pgsql-hackers
Subject: Re: automatic analyze: readahead - add "IO read time" log message
Greetings,
* Jak
Hi Stephen, hackers,
>> > With all those 'readahead' calls it certainly makes one wonder if the
>> > Linux kernel is reading more than just the block we're looking for
>> > because it thinks we're doing a sequential read and will therefore want
>> > the next few blocks when, in reality, we're
Hi Stephen, hackers,
> The analyze is doing more-or-less random i/o since it's skipping through
> the table picking out select blocks, not doing regular sequential i/o.
VS
>> Breakpoint 1, heapam_scan_analyze_next_block (scan=0x10c8098,
>> blockno=19890910, bstrategy=0x1102278) at
Greetings hackers,
I have, I hope, an interesting observation (and a nano patch proposal) on a
system where statistics freshness is a critical factor. Autovacuum/statistics
autogathering was tuned to be very aggressive:
autovacuum_vacuum_cost_delay=0 (makes autovacuum_vacuum_cost_limit
Li Japin wrote:
> If we can improve the efficiency of replay, then we can shorten the database
> recovery time (streaming replication or database crash recovery).
(..)
> For streaming replication, we may need to improve the transmission of WAL
> logs to improve the entire recovery process.
>
David Rowley wrote:
> I've attached patches in git format-patch format. I'm proposing to commit
> these in about 48 hours time unless there's some sort of objection before
> then.
Hi David, no objections at all, I've just got reaffirming results here, as per
[1] (SLRU thread but combined
ybe on NUMA boxes), not just WAL recovery as it
seems relatively easy to improve.
-J.
[1] - https://github.com/macdice/redo-bench
[2] - https://fuhrwerks.com/csrg/info/93c40a660b6cdf74
From: Thomas Munro
Sent: Tuesday, September 8, 2020 2:55 AM
To: Alvaro Herrera
untEntry() -> dynahash that could be called pretty
often, I have no idea what kind of pgbench stress test could be used to
demonstrate the gain (or lack of it).
-Jakub Wartak.
L at once (b) then issuing
preadv() to get all the DB blocks into s_b going from the same rel/fd (c)
applying WAL. Sounds like a major refactor just to save syscalls :(
- mmap() - even more unrealistic
- io_uring - gives a lot of promise here I think; is it even planned to be
shown for PgSQL14
ore complex to reproduce what I'm
after and involves a lot of reading about LogStandbySnapshot() / standby
recovery points on my side.
Now, back to smgropen() hash_search_by_values() reproducer...
-Jakub Wartak.
SERTs with plenty of data in VALUES() thrown as one commit, real
primary->hot-standby replication [not closed DB in recovery], sorted not random
UUIDs) - I'm going to try nail down these differences and maybe I manage to
produce more realistic "pgbench reproducer" (this may take some time though).
-Jakub Wartak.