Re: Enabling B-Tree deduplication by default

2020-07-02 Thread Bruce Momjian
On Thu, Jul 2, 2020 at 02:59:47PM -0700, Peter Geoghegan wrote: > On Thu, Jun 25, 2020 at 4:28 PM Peter Geoghegan wrote: > > It's now time to make a final decision on this. Does anyone have any > > reason to believe that leaving deduplication enabled by default is the > > wrong way to go? > > I

Re: Enabling B-Tree deduplication by default

2020-07-02 Thread Peter Geoghegan
On Thu, Jun 25, 2020 at 4:28 PM Peter Geoghegan wrote: > It's now time to make a final decision on this. Does anyone have any > reason to believe that leaving deduplication enabled by default is the > wrong way to go? I marked the open item resolved just now -- B-Tree deduplication will remain

Re: Enabling B-Tree deduplication by default

2020-06-25 Thread Peter Geoghegan
On Thu, Jan 30, 2020 at 11:40 AM Peter Geoghegan wrote: > I think that I should commit the patch without the GUC tentatively. > Just have the storage parameter, so that everyone gets the > optimization without asking for it. We can then review the decision to > enable deduplication generally

Re: Enabling B-Tree deduplication by default

2020-01-30 Thread Peter Geoghegan
On Thu, Jan 30, 2020 at 2:13 PM Peter Geoghegan wrote: > My approach to showing the downsides of the patch wasn't particularly > obvious, or easy to come up with. I could have contrived a case like > the insert benchmark, but with more low cardinality non-unique > indexes. Sorry. I meant with

Re: Enabling B-Tree deduplication by default

2020-01-30 Thread Peter Geoghegan
On Thu, Jan 30, 2020 at 12:57 PM Robert Haas wrote: > That seems reasonable. My approach to showing the downsides of the patch wasn't particularly obvious, or easy to come up with. I could have contrived a case like the insert benchmark, but with more low cardinality non-unique indexes. That

Re: Enabling B-Tree deduplication by default

2020-01-30 Thread Robert Haas
On Thu, Jan 30, 2020 at 2:40 PM Peter Geoghegan wrote: > On Thu, Jan 30, 2020 at 11:16 AM Peter Geoghegan wrote: > > I prefer to think of the patch as being about improving the stability > > and predictability of Postgres with certain workloads, rather than > > being about overall throughput.

Re: Enabling B-Tree deduplication by default

2020-01-30 Thread Peter Geoghegan
On Thu, Jan 30, 2020 at 11:16 AM Peter Geoghegan wrote: > I prefer to think of the patch as being about improving the stability > and predictability of Postgres with certain workloads, rather than > being about overall throughput. Postgres has an ongoing need to VACUUM > indexes, so making

Re: Enabling B-Tree deduplication by default

2020-01-30 Thread Peter Geoghegan
On Thu, Jan 30, 2020 at 9:36 AM Robert Haas wrote: > How do things look in a more sympathetic case? I prefer to think of the patch as being about improving the stability and predictability of Postgres with certain workloads, rather than being about overall throughput. Postgres has an ongoing

Re: Enabling B-Tree deduplication by default

2020-01-30 Thread Robert Haas
On Thu, Jan 30, 2020 at 1:45 AM Peter Geoghegan wrote: > There is a regression that is just shy of 2% here, as measured in > insert benchmark "rows/sec" -- this metric goes from "62190.0" > rows/sec on master to "60986.2 rows/sec" with the patch. I think that > this is an acceptable price to pay
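
For reference, the "just shy of 2%" figure can be checked directly from the two rows/sec numbers quoted above. This is only a sketch of the arithmetic; the function and variable names are illustrative and do not come from the thread or the benchmark tooling.

```python
# Throughput figures quoted in the message above (insert benchmark, rows/sec).
master_rows_per_sec = 62190.0
patched_rows_per_sec = 60986.2


def relative_regression(baseline: float, candidate: float) -> float:
    """Return the fractional slowdown of candidate relative to baseline."""
    return (baseline - candidate) / baseline


regression = relative_regression(master_rows_per_sec, patched_rows_per_sec)

# Show the slowdown as a percentage with two decimal places.
print(f"{regression:.2%}")
```

The result is a little over 1.9%, which is consistent with the description in the quoted message.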

Re: Enabling B-Tree deduplication by default

2020-01-29 Thread Peter Geoghegan
On Wed, Jan 29, 2020 at 11:50 AM Peter Geoghegan wrote: > I should stop talking about it for now, and go back to reassessing the > extent of the regression in highly unsympathetic cases. The patch has > become faster in a couple of different ways since I last looked at > this question, and it's

Re: Enabling B-Tree deduplication by default

2020-01-29 Thread Robert Haas
On Wed, Jan 29, 2020 at 2:50 PM Peter Geoghegan wrote: > It's tempting to try to reason about the state of an index over time > like this, but I don't think that it's ever going to work well. > Imagine a unique index where 50% of all values are NULLs, on an > append-only table. Actually, let's

Re: Enabling B-Tree deduplication by default

2020-01-29 Thread Peter Geoghegan
On Wed, Jan 29, 2020 at 10:41 AM Robert Haas wrote: > Yeah, maybe. I'm tempted to advocate for dropping the GUC and keeping > the reloption. If the worst case is a 3% regression and you expect > that to be rare, I don't think a GUC is really worth it, especially > given that the proposed

Re: Enabling B-Tree deduplication by default

2020-01-29 Thread Robert Haas
On Wed, Jan 29, 2020 at 1:15 PM Peter Geoghegan wrote: > The good news is that these extra cycles aren't very noticeable even > with a workload where deduplication doesn't help at all (e.g. with > several indexes on an append-only table, and few or no duplicates). The > cycles are generally a fixed

Re: Enabling B-Tree deduplication by default

2020-01-29 Thread Peter Geoghegan
On Wed, Jan 29, 2020 at 6:56 AM Robert Haas wrote: > This (and the rest of the explanation) doesn't really address my > concern. I understand that deduplicating in lieu of splitting a page > in a unique index is highly likely to be a win. What I don't > understand is why it shouldn't just be a win,

Re: Enabling B-Tree deduplication by default

2020-01-29 Thread Robert Haas
On Thu, Jan 16, 2020 at 3:05 PM Peter Geoghegan wrote: > The main reason that I am confident about unique indexes is that we > only do a deduplication pass in a unique index when we observe that > the incoming tuple (the one that might end up splitting the page) is a > duplicate of some existing
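
The rule described in the quoted text (in a unique index, attempt a deduplication pass only when the incoming tuple duplicates something already on the page) can be pictured with a toy model. This is a sketch under loud assumptions: the page is modelled as a plain list of (key, tid) pairs, and `has_duplicate`, `deduplicate`, and `on_full_unique_page` are hypothetical names, not nbtree functions or data structures.

```python
from collections import defaultdict


def has_duplicate(page, incoming_key):
    """True if some item already on the toy page carries the incoming key."""
    return any(key == incoming_key for key, _tid in page)


def deduplicate(page):
    """Merge items with equal keys into one (key, [tids]) entry per key."""
    grouped = defaultdict(list)
    for key, tid in page:
        grouped[key].append(tid)
    return [(key, sorted(tids)) for key, tids in sorted(grouped.items())]


def on_full_unique_page(page, incoming_key):
    """Toy decision for a full page: deduplicate only if a duplicate exists."""
    if has_duplicate(page, incoming_key):
        return "deduplicate", deduplicate(page)
    # No duplicate observed, so the toy model falls back to a page split.
    return "split", page


# Hypothetical page: key 7 appears twice, key 9 once; tids are (block, offset).
full_page = [(7, (1, 1)), (7, (1, 2)), (9, (2, 1))]
action, result = on_full_unique_page(full_page, incoming_key=7)
print(action, result)
```

With an incoming key that is not already on the page (for example 8), the same function returns "split" and leaves the items untouched, which is the case where the quoted text says no deduplication pass is attempted.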

Re: Enabling B-Tree deduplication by default

2020-01-28 Thread Peter Geoghegan
On Thu, Jan 16, 2020 at 12:05 PM Peter Geoghegan wrote: > > It does seem odd to me to treat them differently, but it's possible > > that this is a reflection of my own lack of understanding. What do > > other database systems do? > > Other database systems treat unique indexes very differently,

Re: Enabling B-Tree deduplication by default

2020-01-16 Thread Peter Geoghegan
On Thu, Jan 16, 2020 at 10:55 AM Robert Haas wrote: > On Wed, Jan 15, 2020 at 6:38 PM Peter Geoghegan wrote: > > There are some outstanding questions about how B-Tree deduplication > > [1] should be configured, and whether or not it should be enabled by > > default. I'm starting this new thread

Re: Enabling B-Tree deduplication by default

2020-01-16 Thread Robert Haas
On Wed, Jan 15, 2020 at 6:38 PM Peter Geoghegan wrote: > There are some outstanding questions about how B-Tree deduplication > [1] should be configured, and whether or not it should be enabled by > default. I'm starting this new thread in the hopes of generating > discussion on these high level

Enabling B-Tree deduplication by default

2020-01-15 Thread Peter Geoghegan
There are some outstanding questions about how B-Tree deduplication [1] should be configured, and whether or not it should be enabled by default. I'm starting this new thread in the hopes of generating discussion on these high level questions. The commit message of the latest version of the patch