Re: Large files for relations

2024-05-13 Thread Peter Eisentraut
On 06.03.24 22:54, Thomas Munro wrote: Rebased. I had intended to try to get this into v17, but a couple of unresolved problems came up while rebasing over the new incremental backup stuff. You snooze, you lose. Hopefully we can sort these out in time for the next commitfest: * should pg_comb

Re: Large files for relations

2024-03-06 Thread Thomas Munro
Rebased. I had intended to try to get this into v17, but a couple of unresolved problems came up while rebasing over the new incremental backup stuff. You snooze, you lose. Hopefully we can sort these out in time for the next commitfest: * should pg_combinebasebackup read the control file to fe

Re: Large files for relations

2023-07-03 Thread Thomas Munro
On Mon, Jun 12, 2023 at 8:53 PM David Steele wrote: > + if (strcmp(endptr, "kB") == 0) > > Why kB here instead of KB to match MB, GB, TB below? Those are SI prefixes[1], and we use kB elsewhere too. ("K" was used for kelvins, so they went with "k" for kilo. Obviously these aren't
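For readers following the unit-suffix discussion, here is a minimal sketch of how an option value such as "64kB" or "1GB" could be parsed with strtoull() followed by a strcmp() on the remaining suffix. The function name parse_size_with_unit and the table layout are hypothetical and are not taken from the patch. The binary multipliers (1 kB = 1024 bytes) are an assumption, chosen to match how PostgreSQL treats memory units elsewhere.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the patch's code): parse "<digits><unit>" into a
 * byte count. Accepted units are kB, MB, GB and TB, matched case-sensitively,
 * so "KB" is rejected. Multipliers are assumed to be powers of 1024.
 */
bool
parse_size_with_unit(const char *arg, uint64_t *bytes)
{
    static const struct
    {
        const char *suffix;
        uint64_t    multiplier;
    } units[] = {
        {"kB", UINT64_C(1) << 10},
        {"MB", UINT64_C(1) << 20},
        {"GB", UINT64_C(1) << 30},
        {"TB", UINT64_C(1) << 40},
    };
    char       *endptr;
    unsigned long long value;

    assert(arg != NULL && bytes != NULL);

    /* Require a leading digit: rejects empty input, signs and whitespace. */
    if (arg[0] < '0' || arg[0] > '9')
        return false;

    errno = 0;
    value = strtoull(arg, &endptr, 10);
    if (errno != 0)
        return false;           /* number out of range */

    /* Whatever follows the digits must be exactly one known unit. */
    for (size_t i = 0; i < sizeof(units) / sizeof(units[0]); i++)
    {
        if (strcmp(endptr, units[i].suffix) == 0)
        {
            if (value > UINT64_MAX / units[i].multiplier)
                return false;   /* product would overflow */
            *bytes = (uint64_t) value * units[i].multiplier;
            return true;
        }
    }
    return false;               /* missing or unknown unit */
}
```

Because the comparison is case-sensitive, "64kB" is accepted and "64KB" is rejected, which is the behaviour the question above is about.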

Re: Large files for relations

2023-06-12 Thread David Steele
On 5/28/23 08:48, Thomas Munro wrote: Alright, since I had some time to kill in an airport, here is a starter patch for initdb --rel-segsize. I've gone through this patch and it looks pretty good to me. A few things: +* rel_setment_size, we will truncate the K+1st se

Re: Large files for relations

2023-05-30 Thread Peter Eisentraut
On 28.05.23 02:48, Thomas Munro wrote: Another potential option name would be --segsize, if we think we're going to use this for temp files too eventually. Maybe it's not so beautiful to have that global variable rel_segment_size (which replaces REL_SEGSIZE everywhere). Another idea would be to

Re: Large files for relations

2023-05-28 Thread Thomas Munro
On Sun, May 28, 2023 at 2:48 AM Thomas Munro wrote: > (you'd need over 2 billion > directories ... directory *entries* (segment files), I meant to write there.

Re: Large files for relations

2023-05-27 Thread Thomas Munro
On Thu, May 25, 2023 at 1:08 PM Stephen Frost wrote: > * Peter Eisentraut (peter.eisentr...@enterprisedb.com) wrote: > > On 24.05.23 02:34, Thomas Munro wrote: > > > * pg_upgrade would convert if source and target don't match > > > > This would be good, but it could also be an optional or later fe

Re: Large files for relations

2023-05-25 Thread Stephen Frost
Greetings, * Peter Eisentraut (peter.eisentr...@enterprisedb.com) wrote: > On 24.05.23 02:34, Thomas Munro wrote: > > Thanks all for the feedback. It was a nice idea and it *almost* > > works, but it seems like we just can't drop segmented mode. And the > > automatic transition schemes I showed

Re: Large files for relations

2023-05-24 Thread Robert Haas
On Wed, May 24, 2023 at 2:18 AM Peter Eisentraut wrote: > > What I'm hearing is that something simple like this might be more > > acceptable: > > > > * initdb --rel-segsize (cf --wal-segsize), default unchanged > > makes sense +1. > > * pg_upgrade would convert if source and target don't match

Re: Large files for relations

2023-05-23 Thread Peter Eisentraut
On 24.05.23 02:34, Thomas Munro wrote: Thanks all for the feedback. It was a nice idea and it *almost* works, but it seems like we just can't drop segmented mode. And the automatic transition schemes I showed don't make much sense without that goal. What I'm hearing is that something simple li

Re: Large files for relations

2023-05-23 Thread Thomas Munro
Thanks all for the feedback. It was a nice idea and it *almost* works, but it seems like we just can't drop segmented mode. And the automatic transition schemes I showed don't make much sense without that goal. What I'm hearing is that something simple like this might be more acceptable: * init

Re: Large files for relations

2023-05-15 Thread Robert Haas
On Fri, May 12, 2023 at 9:53 AM Stephen Frost wrote: > While I tend to agree that 1GB is too small, 1TB seems like it's > possibly going to end up on the too big side of things, or at least, > if we aren't getting rid of the segment code then it's possibly throwing > away the benefits we have from

Re: Large files for relations

2023-05-15 Thread MARK CALLAGHAN
On Fri, May 12, 2023 at 4:02 PM Thomas Munro wrote: > On Sat, May 13, 2023 at 4:41 AM MARK CALLAGHAN wrote: > > Repeating what was mentioned on Twitter, because I had some experience > with the topic. With fewer files per table there will be more contention on > the per-inode mutex (which might

Re: Large files for relations

2023-05-12 Thread Thomas Munro
On Sat, May 13, 2023 at 11:01 AM Thomas Munro wrote: > On Sat, May 13, 2023 at 4:41 AM MARK CALLAGHAN wrote: > > use XFS and O_DIRECT As for direct I/O, we're only just getting started on that. We currently can't produce more than one concurrent WAL write, and then for relation data, we just go

Re: Large files for relations

2023-05-12 Thread Thomas Munro
On Sat, May 13, 2023 at 4:41 AM MARK CALLAGHAN wrote: > Repeating what was mentioned on Twitter, because I had some experience with > the topic. With fewer files per table there will be more contention on the > per-inode mutex (which might now be the per-inode rwsem). I haven't read > filesyste

Re: Large files for relations

2023-05-12 Thread MARK CALLAGHAN
Repeating what was mentioned on Twitter, because I had some experience with the topic. With fewer files per table there will be more contention on the per-inode mutex (which might now be the per-inode rwsem). I haven't read filesystem source in a long time. Back in the day, and perhaps today, it wa

Re: Large files for relations

2023-05-12 Thread Stephen Frost
Greetings, * Dagfinn Ilmari Mannsåker (ilm...@ilmari.org) wrote: > Thomas Munro writes: > > On Fri, May 12, 2023 at 8:16 AM Jim Mlodgenski wrote: > >> On Mon, May 1, 2023 at 9:29 PM Thomas Munro wrote: > >>> I am not aware of any modern/non-historic filesystem[2] that can't do > >>> large files

Re: Large files for relations

2023-05-12 Thread Jim Mlodgenski
On Thu, May 11, 2023 at 7:38 PM Thomas Munro wrote: > On Fri, May 12, 2023 at 8:16 AM Jim Mlodgenski wrote: > > On Mon, May 1, 2023 at 9:29 PM Thomas Munro > wrote: > >> I am not aware of any modern/non-historic filesystem[2] that can't do > >> large files with ease. Anyone know of anything to

Re: Large files for relations

2023-05-12 Thread Dagfinn Ilmari Mannsåker
Thomas Munro writes: > On Fri, May 12, 2023 at 8:16 AM Jim Mlodgenski wrote: >> On Mon, May 1, 2023 at 9:29 PM Thomas Munro wrote: >>> I am not aware of any modern/non-historic filesystem[2] that can't do >>> large files with ease. Anyone know of anything to worry about on that >>> front? >> >

Re: Large files for relations

2023-05-11 Thread Thomas Munro
On Fri, May 12, 2023 at 8:16 AM Jim Mlodgenski wrote: > On Mon, May 1, 2023 at 9:29 PM Thomas Munro wrote: >> I am not aware of any modern/non-historic filesystem[2] that can't do >> large files with ease. Anyone know of anything to worry about on that >> front? > > There is some trouble in the

Re: Large files for relations

2023-05-11 Thread Jim Mlodgenski
On Mon, May 1, 2023 at 9:29 PM Thomas Munro wrote: > > I am not aware of any modern/non-historic filesystem[2] that can't do > large files with ease. Anyone know of anything to worry about on that > front? There is some trouble in the ambiguity of what we mean by "modern" and "large files". Th

Re: Large files for relations

2023-05-09 Thread Stephen Frost
Greetings, * Corey Huinker (corey.huin...@gmail.com) wrote: > On Wed, May 3, 2023 at 1:37 AM Thomas Munro wrote: > > On Wed, May 3, 2023 at 5:21 PM Thomas Munro > > wrote: > > > rsync --link-dest ... rsync isn't really a safe tool to use for PG backups by itself unless you're using it with arch

Re: Large files for relations

2023-05-09 Thread Corey Huinker
On Wed, May 3, 2023 at 1:37 AM Thomas Munro wrote: > On Wed, May 3, 2023 at 5:21 PM Thomas Munro > wrote: > > rsync --link-dest > > I wonder if rsync will grow a mode that can use copy_file_range() to > share blocks with a reference file (= previous backup). Something > like --copy-range-dest.

Re: Large files for relations

2023-05-02 Thread Thomas Munro
On Wed, May 3, 2023 at 5:21 PM Thomas Munro wrote: > rsync --link-dest I wonder if rsync will grow a mode that can use copy_file_range() to share blocks with a reference file (= previous backup). Something like --copy-range-dest. That'd work for large-file relations (assuming a file system that

Re: Large files for relations

2023-05-02 Thread Thomas Munro
On Tue, May 2, 2023 at 3:28 PM Pavel Stehule wrote: > I like this patch - it can save some system sources - I am not sure how much, > because bigger tables usually use partitioning usually. Yeah, if you only use partitions of < 1GB it won't make a difference. Larger partitions are not uncommon,

Re: Large files for relations

2023-05-01 Thread Pavel Stehule
Hi I like this patch - it can save some system sources - I am not sure how much, because bigger tables usually use partitioning usually. Important note - this feature breaks sharing files on the backup side - so before disabling 1GB sized files, this issue should be solved. Regards Pavel

Large files for relations

2023-05-01 Thread Thomas Munro
Big PostgreSQL databases use and regularly open/close huge numbers of file descriptors and directory entries for various anachronistic reasons, one of which is the 1GB RELSEG_SIZE thing. The segment management code is trickier than you might think and also still harbours known bugs. A nearby anal
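As background for the 1GB RELSEG_SIZE point, the arithmetic behind segmented relations can be sketched as follows. This is an illustrative sketch only, assuming the default 8kB block size and 1GB segments. The names locate_block, segments_needed and SegmentLocation are hypothetical and do not come from the PostgreSQL source.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE      UINT64_C(8192)              /* assumed default block size */
#define SEGMENT_BYTES   (UINT64_C(1) << 30)         /* assumed 1GB segment size */
#define SEGMENT_BLOCKS  (SEGMENT_BYTES / BLOCK_SIZE)    /* 131072 blocks */

static_assert(SEGMENT_BYTES % BLOCK_SIZE == 0,
              "segment size must be a whole number of blocks");

/* Where a block lives: which segment file, and the byte offset inside it. */
typedef struct
{
    uint32_t    segno;
    uint64_t    offset;
} SegmentLocation;

/* Map a relation block number to its segment file and offset. */
SegmentLocation
locate_block(uint32_t blkno)
{
    SegmentLocation loc;

    loc.segno = (uint32_t) (blkno / SEGMENT_BLOCKS);
    loc.offset = (blkno % SEGMENT_BLOCKS) * BLOCK_SIZE;
    return loc;
}

/* Number of segment files (directory entries) a relation of this size needs. */
uint64_t
segments_needed(uint64_t relation_bytes)
{
    return relation_bytes / SEGMENT_BYTES +
        (relation_bytes % SEGMENT_BYTES != 0 ? 1 : 0);
}
```

Under these assumptions a 1TB relation is spread over 1024 segment files, each of which may need its own file descriptor and directory entry.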