Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns

2024-05-20 Thread Jakub Wartak
On Tue, May 14, 2024 at 8:19 PM Robert Haas wrote: > > I looked at your version and wrote something that is shorter and > doesn't touch any existing text. Here it is. Hi Robert, you are a real tactician here - thanks for whatever references the original problem! :) Maybe just slight hint nearby

Re: elog/ereport VS misleading backtrace_function function address

2024-05-14 Thread Jakub Wartak
Hi Peter! On Sun, May 12, 2024 at 10:33 PM Peter Eisentraut wrote: > > On 07.05.24 09:43, Jakub Wartak wrote: > > NOTE: in case one will be testing this: one cannot ./configure with > > --enable-debug as it prevents the compiler optimizations that actually > > end up

Re: elog/ereport VS misleading backtrace_function function address

2024-05-07 Thread Jakub Wartak
Hi Tom and -hackers! On Thu, Mar 28, 2024 at 7:36 PM Tom Lane wrote: > > Jakub Wartak writes: > > While chasing some other bug I've learned that backtrace_functions > > might be misleading with top elog/ereport() address. > > That was understood from the beginning

Re: apply_scanjoin_target_to_paths and partitionwise join

2024-05-06 Thread Jakub Wartak
Hi Ashutosh & hackers, On Mon, Apr 15, 2024 at 9:00 AM Ashutosh Bapat wrote: > > Here's patch with > [..] > Adding to the next commitfest but better to consider this for the next set of > minor releases. 1. The patch does not pass cfbot - https://cirrus-ci.com/task/5486258451906560 on master

Re: GUC-ify walsender MAX_SEND_SIZE constant

2024-04-24 Thread Jakub Wartak
Hi, > My understanding of Majid's use-case for tuning MAX_SEND_SIZE is that the > bottleneck is storage, not network. The reason MAX_SEND_SIZE affects that is > that it determines the max size passed to WALRead(), which in turn determines > how much we read from the OS at once. If the storage

Re: GUC-ify walsender MAX_SEND_SIZE constant

2024-04-23 Thread Jakub Wartak
On Tue, Apr 23, 2024 at 2:24 AM Michael Paquier wrote: > > On Mon, Apr 22, 2024 at 03:40:01PM +0200, Majid Garoosi wrote: > > Any news, comments, etc. about this thread? > > FWIW, I'd still be in favor of doing a GUC-ification of this part, but > at this stage I'd need more time to do a proper

Re: incremental backup breakage in BlockRefTableEntryGetBlocks

2024-04-05 Thread Jakub Wartak
On Thu, Apr 4, 2024 at 9:11 PM Tomas Vondra wrote: > > On 4/4/24 19:38, Robert Haas wrote: > > Hi, > > > > Yesterday, Tomas Vondra reported to me off-list that he was seeing > > what appeared to be data corruption after taking and restoring an > > increme

Re: pg_combinebackup --copy-file-range

2024-04-04 Thread Jakub Wartak
fast ability to "restore" the clone rather than copying the data from somewhere else) - pg_basebackup without that would be unusable without space savings (e.g. imagine daily backups @ 10+TB DWHs) > On 4/3/24 15:39, Jakub Wartak wrote: > > On Mon, Apr 1, 2024 at 9:46 PM Tomas Vondra

Re: pg_combinebackup --copy-file-range

2024-04-03 Thread Jakub Wartak
On Mon, Apr 1, 2024 at 9:46 PM Tomas Vondra wrote: > > Hi, > > I've been running some benchmarks and experimenting with various stuff, > trying to improve the poor performance on ZFS, and the regression on XFS > when using copy_file_range. And oh boy, did I find interesting stuff ... [..]
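
For context on the syscall this thread benchmarks: copy_file_range(2) asks the kernel to copy (and, on filesystems that support it, reflink/clone) a byte range between two file descriptors without pushing the data through user space. Below is a minimal, hedged sketch of such a copy loop; it is illustrative only and is not the pg_combinebackup implementation.

    /* Minimal sketch of an in-kernel copy loop using Linux copy_file_range(2).
     * Illustrative only -- not the pg_combinebackup code; error handling and
     * the read/write fallback a real tool needs are mostly omitted. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 3)
        {
            fprintf(stderr, "usage: %s <src> <dst>\n", argv[0]);
            return 1;
        }

        int src = open(argv[1], O_RDONLY);
        int dst = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        struct stat st;
        if (src < 0 || dst < 0 || fstat(src, &st) != 0)
        {
            perror("open/fstat");
            return 1;
        }

        off_t remaining = st.st_size;
        while (remaining > 0)
        {
            /* On filesystems with reflink support this may clone blocks
             * instead of copying them, which is where the big savings are. */
            ssize_t n = copy_file_range(src, NULL, dst, NULL, remaining, 0);
            if (n < 0)
            {
                perror("copy_file_range");
                return 1;
            }
            if (n == 0)
                break;          /* source shrank underneath us */
            remaining -= n;
        }
        close(src);
        close(dst);
        return 0;
    }

A real tool also needs a plain read()/write() fallback for kernels and filesystems where the call is unsupported.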

Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns

2024-04-03 Thread Jakub Wartak
Hi Andrey, On Thu, Mar 28, 2024 at 1:09 PM Andrey M. Borodin wrote: > > > > > On 8 Aug 2023, at 12:31, John Naylor wrote: > > > > > > Also the shared counter is the cause of the slowdown, but not the > > > > reason for the numeric limit. > > > > > > Isn't it both? typedef Oid is unsigned int =

Re: Use streaming read API in ANALYZE

2024-04-03 Thread Jakub Wartak
[truncated iostat -x output for the nvme0n1 device] So in short it looks good to me. -Jakub Wartak.

elog/ereport VS misleading backtrace_function function address

2024-03-28 Thread Jakub Wartak
Hi -hackers, While chasing some other bug I've learned that backtrace_functions might be misleading with top elog/ereport() address. Reproducer: # using Tom's reproducer on master: wget https://www.postgresql.org/message-id/attachment/112394/ri-collation-bug-example.sql echo '' >>

Re: pg_combinebackup --copy-file-range

2024-03-27 Thread Jakub Wartak
On Tue, Mar 26, 2024 at 7:03 PM Tomas Vondra wrote: [..] > > That's really strange. Hi Tomas, but it looks like it's fixed now :) > > --manifest-checksums=NONE --copy-file-range without v20240323-2-0002: > > 27m23.887s > > --manifest-checksums=NONE --copy-file-range with v20240323-2-0002 and >

Re: pg_upgrade --copy-file-range

2024-03-26 Thread Jakub Wartak
On Sat, Mar 23, 2024 at 6:57 PM Tomas Vondra wrote: > On 3/23/24 14:47, Tomas Vondra wrote: > > On 3/23/24 13:38, Robert Haas wrote: > >> On Fri, Mar 22, 2024 at 8:26 PM Thomas Munro > >> wrote: [..] > > Yeah, that's in write_reconstructed_file() and the patch does not touch > > that at all. I

Re: pg_upgrade --copy-file-range

2024-03-20 Thread Jakub Wartak
Hi Tomas, > I took a quick look at the remaining part adding copy_file_range to > pg_combinebackup. The patch no longer applies, so I had to rebase it. > Most of the issues were trivial, but I had to fix a couple missing > prototypes - I added them to copy_file.h/c, mostly. > > 0001 is the

Re: index prefetching

2024-03-05 Thread Jakub Wartak
On Fri, Mar 1, 2024 at 3:58 PM Tomas Vondra wrote: [..] > TBH I don't have a clear idea what to do. It'd be cool to have at least > some benefits in v17, but I don't know how to do that in a way that > would be useful in the future. > > For example, the v20240124 patch implements this in the

Re: index prefetching

2024-03-01 Thread Jakub Wartak
64(45, 335872, 8192, POSIX_FADV_WILLNEED) = 0 pread64(45, "\0\0\0\0\250\233r\4\0\0\4\0\370\1\0\2\0 \4 \0\0\0\0\300\237t\0\200\237t\0"..., 8192, 360448) = 8192 fadvise64(45, 524288, 8192, POSIX_FADV_WILLNEED) = 0 fadvise64(45, 352256, 8192, POSIX_FADV_WILLNEED) = 0 pread64(45, "\0\0\0\0\2

Re: psql's FETCH_COUNT (cursor) is not being respected for CTEs

2024-02-08 Thread Jakub Wartak
Hi Daniel, On Tue, Jan 30, 2024 at 3:29 PM Daniel Verite wrote: > PFA a rebased version. Thanks for the patch! I've tested it using my original reproducer and it works great now against the original problem description. I've taken a quick look at the patch, it looks good to me. I've tested

Re: Make NUM_XLOGINSERT_LOCKS configurable

2024-01-15 Thread Jakub Wartak
On Fri, Jan 12, 2024 at 7:33 AM Bharath Rupireddy wrote: > > On Wed, Jan 10, 2024 at 11:43 AM Tom Lane wrote: > > > > Bharath Rupireddy writes: > > > On Wed, Jan 10, 2024 at 10:00 AM Tom Lane wrote: > > >> Maybe. I bet just bumping up the constant by 2X or 4X or so would get > > >> most of

Re: trying again to get incremental backup

2023-12-20 Thread Jakub Wartak
Hi Robert, On Tue, Dec 19, 2023 at 9:36 PM Robert Haas wrote: > > On Fri, Dec 15, 2023 at 5:36 AM Jakub Wartak > wrote: > > I've played with with initdb/pg_upgrade (17->17) and i don't get DBID > > mismatch (of course they do differ after initdb), bu

Re: trying again to get incremental backup

2023-12-15 Thread Jakub Wartak
Hi Robert, On Wed, Dec 13, 2023 at 2:16 PM Robert Haas wrote: > > > > > not even in case of an intervening > > > timeline switch. So, all of the errors in this function are warning > > > you that you've done something that you really should not have done. > > > In this particular case, you've

Re: trying again to get incremental backup

2023-12-13 Thread Jakub Wartak
Hi Robert, On Mon, Dec 11, 2023 at 6:08 PM Robert Haas wrote: > > On Fri, Dec 8, 2023 at 5:02 AM Jakub Wartak > wrote: > > While we are at it, maybe around the below in PrepareForIncrementalBackup() > > > > if (tlep[i] == NULL) > >

Re: trying again to get incremental backup

2023-12-08 Thread Jakub Wartak
On Thu, Dec 7, 2023 at 4:15 PM Robert Haas wrote: Hi Robert, > On Thu, Dec 7, 2023 at 9:42 AM Jakub Wartak > wrote: > > Comment: I was wondering if it wouldn't make some sense to teach > > pg_resetwal to actually delete all WAL summaries after any any > > WAL/contr

Re: trying again to get incremental backup

2023-12-07 Thread Jakub Wartak
On Tue, Dec 5, 2023 at 7:11 PM Robert Haas wrote: [..v13 patchset] The results with v13 patchset are following: * - requires checkpoint on primary when doing incremental on standby when it's too idle, this was explained by Robert in [1], something AKA too-fast-incremental backup due to

Re: trying again to get incremental backup

2023-11-21 Thread Jakub Wartak
On Mon, Nov 20, 2023 at 4:43 PM Robert Haas wrote: > > On Fri, Nov 17, 2023 at 5:01 AM Alvaro Herrera > wrote: > > I made a pass over pg_combinebackup for NLS. I propose the attached > > patch. > > This doesn't quite compile for me so I changed a few things and > incorporated it. Hopefully I

Re: trying again to get incremental backup

2023-11-15 Thread Jakub Wartak
Hi Robert, [..spotted the v9 patchset..] so I've spent some time still playing with patchset v8 (without the 6/6 testing patch related to wal_level=minimal), with the exception of - patchset v9 - marked otherwise. 1. At compile time there were 2 warnings about a shadowed variable (at least with gcc

Re: trying again to get incremental backup

2023-11-01 Thread Jakub Wartak
On Mon, Oct 30, 2023 at 6:46 PM Robert Haas wrote: > > On Thu, Sep 28, 2023 at 6:22 AM Jakub Wartak > wrote: > > If that is still an area open for discussion: wouldn't it be better to > > just specify LSN as it would allow resyncing standby across major lag > > wh

Re: trying again to get incremental backup

2023-10-20 Thread Jakub Wartak
Hi Robert, On Wed, Oct 4, 2023 at 10:09 PM Robert Haas wrote: > > On Tue, Oct 3, 2023 at 2:21 PM Robert Haas wrote: > > Here's a new patch set, also addressing Jakub's observation that > > MINIMUM_VERSION_FOR_WAL_SUMMARIES needed updating. > > Here's yet another new version.[..] Okay, so

Re: pg_stat_get_activity(): integer overflow due to (int) * (int) for MemoryContextAllocHuge()

2023-10-02 Thread Jakub Wartak
On Fri, Sep 29, 2023 at 4:00 AM Michael Paquier wrote: > > On Thu, Sep 28, 2023 at 11:01:14AM +0200, Jakub Wartak wrote: > > v3 attached. I had a problem coming out with a better error message, > > so suggestions are welcome. The cast still needs to be present as per > >

Re: trying again to get incremental backup

2023-09-28 Thread Jakub Wartak
On Wed, Aug 30, 2023 at 4:50 PM Robert Haas wrote: [..] I've played a little bit more with this second batch of patches on e8d74ad625f7344f6b715254d3869663c1569a51 @ 31Aug (days before the wait events refactor): test_across_wallevelminimal.sh test_many_incrementals_dbcreate.sh test_many_incrementals.sh

Re: pg_stat_get_activity(): integer overflow due to (int) * (int) for MemoryContextAllocHuge()

2023-09-28 Thread Jakub Wartak
On Thu, Sep 28, 2023 at 12:53 AM Michael Paquier wrote: > > On Wed, Sep 27, 2023 at 10:29:25AM -0700, Andres Freund wrote: > > I don't think going for size_t is a viable path for fixing this. I'm pretty > > sure the initial patch would trigger a type mismatch from guc_tables.c - we > > don't have

Re: pg_stat_get_activity(): integer overflow due to (int) * (int) for MemoryContextAllocHuge()

2023-09-27 Thread Jakub Wartak
On Wed, Sep 27, 2023 at 10:08 AM Michael Paquier wrote: > > On Wed, Sep 27, 2023 at 08:41:55AM +0200, Jakub Wartak wrote: > > Attached patch adjusts pgstat_track_activity_query_size to be of > > size_t from int and fixes the issue. > > This cannot be backpatched, and us

pg_stat_get_activity(): integer overflow due to (int) * (int) for MemoryContextAllocHuge()

2023-09-27 Thread Jakub Wartak
to (int) * (int), while MemoryContextAllocHuge() allows taking Size(size_t) as parameter. I get similar behaviour with: size_t val = (int)1048576 * (int)3022; Attached patch adjusts pgstat_track_activity_query_size to be of size_t from int and fixes the issue. Regards, -Jakub Wartak. 0001-Adjust
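
The overflow described above is easy to reproduce outside the server. Below is a standalone C sketch (not the patch and not the pg_stat_get_activity() code) showing why an int * int product wraps before it is ever widened to size_t, and how casting one operand first avoids it:

    /* Standalone illustration of the (int) * (int) overflow described above;
     * not PostgreSQL source code.  Signed int overflow is undefined behaviour,
     * but in practice the product wraps and then gets sign-extended into a
     * huge size_t, which is what makes the allocation request look invalid. */
    #include <stddef.h>
    #include <stdio.h>

    int main(void)
    {
        int query_size = 1048576;           /* e.g. 1 MB per backend slot */
        int num_slots  = 3022;

        size_t broken = query_size * num_slots;            /* multiply done in int */
        size_t fixed  = (size_t) query_size * num_slots;   /* multiply done in size_t */

        printf("broken = %zu\nfixed  = %zu\n", broken, fixed);
        return 0;
    }

Changing the variable's type versus casting at the call site is essentially the trade-off the rest of this thread discusses.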

Re: pg_stat_statements and "IN" conditions

2023-09-21 Thread Jakub Wartak
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation: tested, passed I've tested the patch on 17devel/master and it is my feeling -

Re: Performance degradation on concurrent COPY into a single relation in PG16.

2023-07-11 Thread Jakub Wartak
On Mon, Jul 10, 2023 at 6:24 PM Andres Freund wrote: > > Hi, > > On 2023-07-03 11:53:56 +0200, Jakub Wartak wrote: > > Out of curiosity I've tried and it is reproducible as you have stated : XFS > > @ 4.18.0-425.10.1.el8_7.x86_64: > >... > > According t

Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns

2023-07-05 Thread Jakub Wartak
On Tue, Jun 13, 2023 at 10:20 AM John Naylor wrote: Hi John, v3 is attached for review. > > > >- > >+ see note below on TOAST > > Maybe: > "further limited by the number of TOAST-ed values; see note below" Fixed. > > I've wrongly put it, I've meant that pg_largeobject also

Re: Performance degradation on concurrent COPY into a single relation in PG16.

2023-07-03 Thread Jakub Wartak
Hi Masahiko, Out of curiosity I've tried and it is reproducible as you have stated: XFS @ 4.18.0-425.10.1.el8_7.x86_64: [root@rockyora ~]# time ./test test.1 1 total 20 fallocate 20 filewrite 0 real 0m5.868s user 0m0.035s sys 0m5.716s [root@rockyora ~]# time

Re: memory leak in trigger handling (since PG12)

2023-05-24 Thread Jakub Wartak
to hack a palloc() a little, but that has probably too big overhead, right? (just thinking loud). -Jakub Wartak.

Re: In-placre persistance change of a relation

2023-04-27 Thread Jakub Wartak
tch passed all my very limited tests along with make check-world. Patch looks good to me on the surface from a usability point of view. I haven't looked at the code, so the patch might still need an in-depth review. Regards, -Jakub Wartak.

Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns

2023-04-27 Thread Jakub Wartak
t pg_largeobject also consume OID and as such are subject to 32TB limit. > > + > + large objects number > > "large objects per database" Fixed. > + subject to the same limitations as rows per > table > > That implies table size is the only factor. Max OID is also a factor, which > was your stated reason to include LOs here in the first place. Exactly.. Regards, -Jakub Wartak. v2-0001-doc-Add-some-OID-TOAST-related-limitations-to-the.patch Description: Binary data

Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns

2023-04-26 Thread Jakub Wartak
Hi, >> These 2 discussions show that it's a painful experience to run into >> this problem, and that the hackers have ideas on how to fix it, but >> those fixes haven't materialized for years. So I would say that, yes, >> this info belongs in the hard-limits section, because who knows how >> long

Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns

2023-04-21 Thread Jakub Wartak
Hi -hackers, I would like to ask if it wouldn't be a good idea to copy the https://wiki.postgresql.org/wiki/TOAST#Total_table_size_limit discussion (out-of-line OID usage per TOAST-ed columns / potential limitation) to the official "Appendix K. PostgreSQL Limits", with also a little bonus mentioning

Re: doc: mentioned CREATE+ATTACH PARTITION as an alternative to CREATE TABLE..PARTITION OF

2023-03-14 Thread Jakub Wartak
Hi, I've tested the attached patch by Justin and it applied almost cleanly to the master, but there was a tiny typo and make postgres-A4.pdf didn't want to run: Note that creating a partition using PARTITION OF => (note lack of closing literal) => Note that creating a partition using PARTITION OF

Re: Syncrep and improving latency due to WAL throttling

2023-02-02 Thread Jakub Wartak
On Thu, Feb 2, 2023 at 11:03 AM Tomas Vondra wrote: > > I agree that some other concurrent backend's > > COMMIT could fsync it, but I was wondering if that's sensible > > optimization to perform (so that issue_fsync() would be called for > > only commit/rollback records). I can imagine a

Re: Syncrep and improving latency due to WAL throttling

2023-02-01 Thread Jakub Wartak
On Wed, Feb 1, 2023 at 2:14 PM Tomas Vondra wrote: > > Maybe we should avoid calling fsyncs for WAL throttling? (by teaching > > HandleXLogDelayPending()->XLogFlush()->XLogWrite() to NOT to sync when > > we are flushing just because of WAL thortting ?) Would that still be > > safe? > > It's not

Re: Syncrep and improving latency due to WAL throttling

2023-02-01 Thread Jakub Wartak
On Mon, Jan 30, 2023 at 9:16 AM Bharath Rupireddy wrote: Hi Bharath, thanks for reviewing. > I think measuring the number of WAL flushes with and without this > feature that the postgres generates is great to know this feature > effects on IOPS. Probably it's even better with variations in >

Re: Syncrep and improving latency due to WAL throttling

2023-01-27 Thread Jakub Wartak
Hi Bharath, On Fri, Jan 27, 2023 at 12:04 PM Bharath Rupireddy wrote: > > On Fri, Jan 27, 2023 at 2:03 PM Alvaro Herrera > wrote: > > > > On 2023-Jan-27, Bharath Rupireddy wrote: > > > > > Looking at the patch, the feature, in its current shape, focuses on > > > improving replication lag (by

Re: Syncrep and improving latency due to WAL throttling

2023-01-27 Thread Jakub Wartak
Hi, v2 is attached. On Thu, Jan 26, 2023 at 4:49 PM Andres Freund wrote: > Huh? Why did you remove the GUC? After reading previous threads, my optimism level of getting it ever in shape of being widely accepted degraded significantly (mainly due to the discussion of wider category of 'WAL I/O

Re: Syncrep and improving latency due to WAL throttling

2023-01-26 Thread Jakub Wartak
> On 1/25/23 20:05, Andres Freund wrote: > > Hi, > > > > Such a feature could be useful - but I don't think the current place of > > throttling has any hope of working reliably: [..] > > You're blocking in the middle of an XLOG insertion. [..] > Yeah, I agree the sleep would have to happen

Syncrep and improving latency due to WAL throttling

2023-01-25 Thread Jakub Wartak
0 Time: 22737.729 ms (00:22.738) Without this feature (or with synchronous_commit_flush_wal_after=0) the TCP's SendQ on socket walsender-->walreceiver is growing and as such any next sendto() by OLTP backends/walwriter ends being queued too much causing stalls of activity. -Jakub Warta

Re: psql's FETCH_COUNT (cursor) is not being respected for CTEs

2023-01-10 Thread Jakub Wartak
priority of this, how about adding it as a TODO wiki item then and maybe adding just some warning instead? I've intentionally avoided parsing grammar and regexp so it's not perfect (not that I do care about this too much either, as web crawlers already have indexed this $thread). BTW I've found tw

psql's FETCH_COUNT (cursor) is not being respected for CTEs

2023-01-04 Thread Jakub Wartak
at('a', 100) || data.Total || repeat('b', 800) as total_pat from data;" | wc -l 2000 postgres@hive:~$ Regards, -Jakub Wartak. 0001-psql-allow-CTE-queries-to-be-executed-also-using-cur.patch Description: Binary data

Re: CREATE UNLOGGED TABLE seq faults when debug_discard_caches=1

2022-11-25 Thread Jakub Wartak
d=indexRelationId@entry=0, parentIndexId=parentIndexId@entry=0 -Jakub Wartak. On Fri, Nov 25, 2022 at 9:48 AM Tomas Vondra wrote: > > > > On 11/18/22 15:43, Tom Lane wrote: > > David Geier writes: > >> On a different note: are we frequently running our tests suites with > >

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-22 Thread Jakub Wartak
Hi all, apologies the patch was rushed too quickly - my bad. I'm attaching a fixed one as v0004 (as it is the 4th patch floating around here). -Jakub Wartak On Mon, Nov 21, 2022 at 9:55 PM Robert Haas wrote: > > On Mon, Nov 21, 2022 at 1:17 PM Andres Freund wrote: > > On Novembe

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Jakub Wartak
Hi, Draft version of the patch attached (it is based on Simon's). I would be happier if we could make that #define into a GUC (just in case), although I do understand the effort to reduce the number of various knobs (as their high count causes its own complexity). -Jakub Wartak. On Mon, Nov 21

Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Jakub Wartak
hints cleaning was not kicking in, but maybe it was, but given the scale of the problem it was not helping much). -Jakub Wartak. [1] - https://www.postgresql.org/message-id/flat/54446AE2.6080909%40BlueTreble.com#f436bb41cf044b30eeec29472a13631e [2] - https://www.postgresql.org/message-id/flat/db7111

RE: Use fadvise in wal replay

2022-06-23 Thread Jakub Wartak
Hey Andrey, > > On 23 Jun 2022, at 13:50, Jakub Wartak > wrote: > > > > Thoughts? > The patch leaves the 1st 128KB chunk unprefetched. Is it worth adding an extra > branch for 120KB after the 1st block when readOff==0? > Or maybe do > + posix_fadvis
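
For context, the mechanism being discussed is issuing posix_fadvise(POSIX_FADV_WILLNEED) for the next chunk of a WAL segment before reading the current one, so the kernel can fetch it in the background even when device readahead is disabled. Below is a hedged sketch with hypothetical names (PREFETCH_CHUNK, read_wal_chunk); it is not the actual patch:

    /* Hedged sketch of prefetching ahead of sequential WAL reads during replay.
     * All names here are illustrative; this is not the proposed patch. */
    #include <fcntl.h>
    #include <unistd.h>

    #define PREFETCH_CHUNK (128 * 1024)   /* prefetch distance, illustrative */

    static void
    read_wal_chunk(int wal_fd, char *buf, size_t len, off_t read_off, off_t seg_size)
    {
        /* Hint the kernel about the next chunk; POSIX_FADV_WILLNEED is
         * asynchronous and cheap when the pages are already cached. */
        if (read_off + PREFETCH_CHUNK < seg_size)
            (void) posix_fadvise(wal_fd, read_off + PREFETCH_CHUNK,
                                 PREFETCH_CHUNK, POSIX_FADV_WILLNEED);

        (void) pread(wal_fd, buf, len, read_off);   /* error handling omitted */
    }

The question quoted above is whether, when readOff==0, it is also worth issuing an extra hint so the first 128KB window is covered as well.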

RE: Use fadvise in wal replay

2022-06-23 Thread Jakub Wartak
>> > On 21 Jun 2022, at 16:59, Jakub Wartak wrote: >> Oh, wow, your benchmarks show really impressive improvement. >> >> > I think that 1 additional syscall is not going to be cheap just for >> > non-standard OS configurations >> Also we ca

RE: Use fadvise in wal replay

2022-06-21 Thread Jakub Wartak
> On Tue, Jun 21, 2022 at 10:33 PM Jakub Wartak > wrote: > > > > Maybe the important question is why would be readahead mechanism > > > > be > > > disabled in the first place via /sys | blockdev ? > > > > > > Because database should kn

RE: Use fadvise in wal replay

2022-06-21 Thread Jakub Wartak
> > Maybe the important question is why the readahead mechanism would be > disabled in the first place via /sys | blockdev? > > Because the database should know better than the OS which data needs to be > prefetched and which should not. Big OS readahead affects index scan > performance. OK, fair point,

RE: Use fadvise in wal replay

2022-06-21 Thread Jakub Wartak
>> > On 21 Jun 2022, at 12:35, Amit Kapila wrote: >> > >> > I wonder if the newly introduced "recovery_prefetch" [1] for PG-15 can >> > help your case? >> >> AFAICS recovery_prefetch tries to prefetch main fork, but does not try to >> prefetch WAL itself before reading it. Kirill is trying to

RE: pgcon unconference / impact of block size on performance

2022-06-10 Thread Jakub Wartak
> On 6/9/22 13:23, Jakub Wartak wrote: > >>>>>>> The really > >>>>>> puzzling thing is why is the filesystem so much slower for > >>>>>> smaller pages. I mean, why would writing 1K be 1/3 of writing 4K? > >>

RE: pgcon unconference / impact of block size on performance

2022-06-09 Thread Jakub Wartak
> > The really > puzzling thing is why is the filesystem so much slower for smaller > pages. I mean, why would writing 1K be 1/3 of writing 4K? > Why would a filesystem have such effect? > >>> > >>> Ha! I don't care at this point as 1 or 2kB seems too small to handle > >>> many

RE: pgcon unconference / impact of block size on performance

2022-06-08 Thread Jakub Wartak
Hi, got some answers! TL;DR for fio it would make sense to use many stressfiles (instead of 1) and the same for numjobs ~ VCPU, to avoid various pitfalls. > >> The really > >> puzzling thing is why is the filesystem so much slower for smaller > >> pages. I mean, why would writing 1K be 1/3 of

RE: effective_io_concurrency and NVMe devices

2022-06-08 Thread Jakub Wartak
> >> The attached patch is a trivial version that waits until we're at > >> least > >> 32 pages behind the target, and then prefetches all of them. Maybe give it > >> a > try? > >> (This pretty much disables prefetching for e_i_c below 32, but for an > >> experimental patch that's enough.) > > >

RE: pgcon unconference / impact of block size on performance

2022-06-07 Thread Jakub Wartak
Hi, > The really > puzzling thing is why is the filesystem so much slower for smaller pages. I > mean, > why would writing 1K be 1/3 of writing 4K? > Why would a filesystem have such effect? Ha! I don't care at this point as 1 or 2kB seems too small to handle many real world scenarios ;) > >

RE: effective_io_concurrency and NVMe devices

2022-06-07 Thread Jakub Wartak
Hi Tomas, > > I have a machine here with 1 x PCIe 3.0 NVMe SSD and also 1 x PCIe 4.0 > > NVMe SSD. I ran a few tests to see how different values of > > effective_io_concurrency would affect performance. I tried to come up > > with a query that did little enough CPU processing to ensure that I/O >

RE: pgcon unconference / impact of block size on performance

2022-06-07 Thread Jakub Wartak
[..] >I doubt we could ever > make the default smaller than it is today as nobody would be able to > insert rows larger than 4 kilobytes into a table anymore. Add error "values larger than 1/3 of a buffer page cannot be indexed" to that list... -J.

RE: pgcon unconference / impact of block size on performance

2022-06-07 Thread Jakub Wartak
Hi Tomas, > Well, there's plenty of charts in the github repositories, including the > charts I > think you're asking for: Thanks. > I also wonder how is this related to filesystem page size - in all the > benchmarks I > did I used the default (4k), but maybe it'd behave if the filesystem

RE: pgcon unconference / impact of block size on performance

2022-06-06 Thread Jakub Wartak
Hi Tomas, > Hi, > > At on of the pgcon unconference sessions a couple days ago, I presented a > bunch of benchmark results comparing performance with different data/WAL > block size. Most of the OLTP results showed significant gains (up to 50%) with > smaller (4k) data pages. Nice. I just saw

RE: effective_io_concurrency and NVMe devices

2022-06-02 Thread Jakub Wartak
Hi Nathan, > > NVMe devices have a maximum queue length of 64k: [..] > > but our effective_io_concurrency maximum is 1,000: [..] > > Should we increase its maximum to 64k? Backpatched? (SATA has a > > maximum queue length of 256.) > > If there are demonstrable improvements with higher values,

RE: strange slow query - lost lot of time somewhere

2022-05-05 Thread Jakub Wartak
Hi Pavel, > I have not debug symbols, so I have not more details now > Breakpoint 1 at 0x7f557f0c16c0 > (gdb) c > Continuing. > Breakpoint 1, 0x7f557f0c16c0 in mmap64 () from /lib64/libc.so.6 > (gdb) bt > #0  0x7f557f0c16c0 in mmap64 () from /lib64/libc.so.6 > #1  0x7f557f04dd91 in

RE: strange slow query - lost lot of time somewhere

2022-05-04 Thread Jakub Wartak
> I do agree that the perf report does indicate that the extra time is taken > due to > some large amount of memory being allocated. I just can't quite see how that > would happen in Memoize given that > estimate_num_groups() clamps the distinct estimate as the number of input > rows, which is 91

Re: In-placre persistance change of a relation

2022-01-11 Thread Jakub Wartak
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: not tested I've retested v15 of the patch with everything that came to my

RE: In-placre persistance change of a relation

2021-12-22 Thread Jakub Wartak
Hi Kyotaro, > At Tue, 21 Dec 2021 13:07:28 +0000, Jakub Wartak > wrote in > > So what's suspicious is that 122880 -> 0 file size truncation. I've > > investigated WAL and it seems to contain TRUNCATE records after logged > FPI images, so when the crash recovery would ki

RE: In-placre persistance change of a relation

2021-12-21 Thread Jakub Wartak
Hi Kyotaro, > I took a bit too long a detour but the patch gets to pass make-world for me. Good news, v10 passes all the tests for me (including the TAP recovery ones). There's a major problem I think: drop table t6; create unlogged table t6 (id bigint, t text); create sequence s1; insert into t6

RE: In-placre persistance change of a relation

2021-12-20 Thread Jakub Wartak
Hi Kyotaro, > At Mon, 20 Dec 2021 17:39:27 +0900 (JST), Kyotaro Horiguchi > wrote in > > At Mon, 20 Dec 2021 07:59:29 +, Jakub Wartak > > wrote in > > > BTW fast feedback regarding that ALTER patch (there were 4 unlogged > tables): > > > # ALTER

RE: In-placre persistance change of a relation

2021-12-19 Thread Jakub Wartak
Hi Kyotaro, I'm glad you are still into this > I didn't register for some reasons. Right now in v8 there's a typo in ./src/backend/catalog/storage.c : storage.c: In function 'RelationDropInitFork': storage.c:385:44: error: expected statement before ')' token pending->unlink_forknum !=

RE: In-placre persistance change of a relation

2021-12-17 Thread Jakub Wartak
> Justin wrote: > On Fri, Dec 17, 2021 at 09:10:30AM +, Jakub Wartak wrote: > > As the thread didn't get a lot of traction, I've registered it into current > commitfest > https://commitfest.postgresql.org/36/

RE: In-placre persistance change of a relation

2021-12-17 Thread Jakub Wartak
dy for review' state. I think it behaves as almost finished one and apparently after reading all those discussions that go back over 10years+ time span about this feature, and lot of failed effort towards wal_level=noWAL I think it would be nice to finally start getting some of that of it into the core. -Jakub Wartak. v7-0001-In-place-table-persistence-change-with-new-comman.patch Description: v7-0001-In-place-table-persistence-change-with-new-comman.patch

RE: track_io_timing default setting

2021-12-10 Thread Jakub Wartak
doesn't feel like it is going to make stuff crash, so again I think it is a good idea. -Jakub Wartak.

RE: prevent immature WAL streaming

2021-10-13 Thread Jakub Wartak
On 2021-Sep-25, Alvaro Herrera wrote: >> On 2021-Sep-24, Alvaro Herrera wrote: >> >> > Here's the set for all branches, which I think are really final, in >> > case somebody wants to play and reproduce their respective problem >> scenarios. >> >> I forgot to mention that I'll wait until 14.0 is

RE: prevent immature WAL streaming

2021-08-25 Thread Jakub Wartak
Hi Álvaro, -hackers, > I attach the patch with the change you suggested. I gave a shot to the v02 patch on top of REL_12_STABLE (already including 5065aeafb0b7593c04d3bc5bc2a86037f32143fc). Previously (yesterday), without the v02 patch, I was getting standby corruption always via

RE: Background writer and checkpointer in crash recovery

2021-08-02 Thread Jakub Wartak
> On Fri, Jul 30, 2021 at 4:00 PM Andres Freund wrote: > > I don't agree with that? If (user+system) << wall then it is very > > likely that recovery is IO bound. If system is a large percentage of > > wall, then shared buffers is likely too small (or we're replacing the > > wrong > > buffers)

RE: Cosmic ray hits integerset

2021-07-07 Thread Jakub Wartak
Hi, Asking out of pure technical curiosity about "the rhinoceros" - what kind of animal is it? Physical box or VM? How could one get dmidecode(1) / dmesg(1) / mcelog(1) from what's out there (e.g. does it run ECC or not?) -J. > -Original Message- > From: Alvaro Herrera > Sent:

RE: Use simplehash.h instead of dynahash in SMgr

2021-05-05 Thread Jakub Wartak
Hey David, > I think you'd have to batch by filenode and transaction in that case. Each > batch might be pretty small on a typical OLTP workload, so it might not help > much there, or it might hinder. True, it is very workload dependent (I was chasing mainly INSERTs multiValues,

RE: Use simplehash.h instead of dynahash in SMgr

2021-05-05 Thread Jakub Wartak
Hi David, Alvaro, -hackers > Hi David, > > You're probably aware of this, but just to make it explicit: Jakub Wartak was > testing performance of recovery, and one of the bottlenecks he found in > some of his cases was dynahash as used by SMgr. It seems quite possible > th

RE: Improve the performance to create END_OF_RECOVERY checkpoint

2020-12-22 Thread Jakub Wartak
Hi Ray, > So can we delete the limit of ArchiveRecoveryRequested, and enable launching > bgwriter on the master node? Please take a look at https://commitfest.postgresql.org/29/2706/ and the related email thread. -J.

RE: pg_preadv() and pg_pwritev()

2020-12-20 Thread Jakub Wartak
> > I'm drawing a blank on trivial candidate uses for preadv(), without > > infrastructure from later patches. > > Can't immediately think of something either. This might not be that trivial, but maybe acquire_sample_rows() from analyze.c? Please note however there's patch
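
To make the acquire_sample_rows() suggestion concrete: preadv(2) fills several destination buffers from one contiguous file range in a single call, so a run of consecutive 8kB blocks could be read with one syscall instead of one pread() per block. Below is a hedged, standalone sketch (hypothetical names; not analyze.c and not PostgreSQL's pg_preadv wrapper):

    /* Hedged illustration of batching block reads with preadv(2); names and
     * constants are illustrative, this is not PostgreSQL code. */
    #define _DEFAULT_SOURCE
    #include <sys/uio.h>
    #include <unistd.h>

    #define BLCKSZ  8192
    #define NBLOCKS 4

    static ssize_t
    read_block_run(int fd, char bufs[NBLOCKS][BLCKSZ], off_t start_blkno)
    {
        struct iovec iov[NBLOCKS];

        for (int i = 0; i < NBLOCKS; i++)
        {
            iov[i].iov_base = bufs[i];
            iov[i].iov_len  = BLCKSZ;
        }

        /* One syscall instead of NBLOCKS pread() calls, but it only helps
         * when the sampled blocks happen to be consecutive on disk. */
        return preadv(fd, iov, NBLOCKS, start_blkno * (off_t) BLCKSZ);
    }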

Re: automatic analyze: readahead - add "IO read time" log message

2020-11-04 Thread Jakub Wartak
posix_fadvise(POSIX_FADV_WILLNEED) is such a cheap syscall. -J. From: Stephen Frost Sent: Tuesday, November 3, 2020 6:47 PM To: Jakub Wartak Cc: pgsql-hackers Subject: Re: automatic analyze: readahead - add "IO read time" log message Greetings, * Jak

Re: automatic analyze: readahead - add "IO read time" log message

2020-11-03 Thread Jakub Wartak
Hi Stephen, hackers, >> > With all those 'readahead' calls it certainly makes one wonder if the >> > Linux kernel is reading more than just the block we're looking for >> > because it thinks we're doing a sequential read and will therefore want >> > the next few blocks when, in reality, we're

Re: automatic analyze: readahead - add "IO read time" log message

2020-10-27 Thread Jakub Wartak
Hi Stephen, hackers, > The analyze is doing more-or-less random i/o since it's skipping through > the table picking out select blocks, not doing regular sequential i/o. VS >> Breakpoint 1, heapam_scan_analyze_next_block (scan=0x10c8098, >> blockno=19890910, bstrategy=0x1102278) at

automatic analyze: readahead - add "IO read time" log message

2020-10-26 Thread Jakub Wartak
Greetings hackers, I have, I hope, an interesting observation (and nano patch proposal) on a system where statistics freshness is a critical factor. Autovacuum/autogathering of statistics was tuned to be very aggressive: autovacuum_vacuum_cost_delay=0 (makes autovacuum_vacuum_cost_limit

Re: Parallelize stream replication process

2020-09-17 Thread Jakub Wartak
Li Japin wrote: > If we can improve the efficiency of replay, then we can shorten the database > recovery time (streaming replication or database crash recovery). (..) > For streaming replication, we may need to improve the transmission of WAL > logs to improve the entire recovery process. >

Re: Optimising compactify_tuples()

2020-09-15 Thread Jakub Wartak
David Rowley wrote: > I've attached patches in git format-patch format. I'm proposing to commit > these in about 48 hours time unless there's some sort of objection before > then. Hi David, no objections at all, I've just got reaffirming results here, as per [1] (SLRU thread but combined

Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-08 Thread Jakub Wartak
ybe on NUMA boxes), not just WAL recovery as it seems relatively easy to improve. -J. [1] - https://github.com/macdice/redo-bench [2] - https://fuhrwerks.com/csrg/info/93c40a660b6cdf74 From: Thomas Munro Sent: Tuesday, September 8, 2020 2:55 AM To: Alvaro Herrera

Division in dynahash.c due to HASH_FFACTOR

2020-09-04 Thread Jakub Wartak
untEntry() -> dynahash that could be called pretty often I have no idea what kind of pgbench stresstest could be used to demonstrate the gain (or lack of it). -Jakub Wartak.
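
The hot-path cost referred to here is an integer division performed on each hash insertion to check the fill factor. Below is a hedged sketch of the general shape of that check and of a division-free alternative; the struct and field names are illustrative, not dynahash's own:

    /* Illustrative only -- not dynahash.c.  Shows why a per-insert division
     * is avoidable: the same expansion decision can be made with a multiply
     * (assuming nbuckets * ffactor cannot overflow). */
    typedef struct
    {
        long nentries;   /* current number of entries */
        long nbuckets;   /* current number of buckets */
        long ffactor;    /* target fill factor */
    } htab_sketch;

    /* before: one integer division on every insertion */
    static int
    needs_expand_div(const htab_sketch *h)
    {
        return h->nentries / h->nbuckets >= h->ffactor;
    }

    /* after: same decision, no division in the hot path */
    static int
    needs_expand_mul(const htab_sketch *h)
    {
        return h->nentries >= h->nbuckets * h->ffactor;
    }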

Re: Handing off SLRU fsyncs to the checkpointer

2020-08-31 Thread Jakub Wartak
L at once (b) then issuing preadv() to get all the DB blocks into s_b going from the same rel/fd (c) applying WAL. Sounds like a major refactor just to save syscalls :( - mmap() - even more unrealistic - IO_URING - gives a lot of promise here I think, is it even planned to be shown for PgSQL14

Re: Handing off SLRU fsyncs to the checkpointer

2020-08-28 Thread Jakub Wartak
ore complex to reproduce what I'm after and involves a lot of reading about LogStandbySnapshot() / standby recovery points on my side. Now, back to smgropen() hash_search_by_values() reproducer... -Jakub Wartak.

Re: Handing off SLRU fsyncs to the checkpointer

2020-08-27 Thread Jakub Wartak
SERTs with plenty of data in VALUES() thrown as one commit, real primary->hot-standby replication [not a closed DB in recovery], sorted rather than random UUIDs) - I'm going to try to nail down these differences and maybe I'll manage to produce a more realistic "pgbench reproducer" (this may take some time though). -Jakub Wartak.
