Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-04 Thread Peter Geoghegan
...@mail.gmail.com [2] http://www.informatics.jax.org/software.shtml -- Peter Geoghegan

Re: Adversarial case for "many duplicates" nbtree split strategy in v12

2019-07-10 Thread Peter Geoghegan
On Tue, Jul 2, 2019 at 3:51 PM Peter Geoghegan wrote: > I've already written a rough patch that fixes the issue by taking this > second view of the problem. The patch makes nbtsplitloc.c more > skeptical about finishing with the "many duplicates" strategy, > avoiding the

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-10 Thread Peter Geoghegan
On Sat, Jul 6, 2019 at 4:08 PM Peter Geoghegan wrote: > I took a closer look at this patch, and have some general thoughts on > its design, and specific feedback on the implementation. I have some high level concerns about how the patch might increase contention, which could make queries

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Peter Geoghegan
ould definitely have an open mind about unique indexes, even with non-NULL values. If we can prevent a page split by deduplicating the contents of a unique index page, then we'll probably win. Why not try? This will need to be tested. -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Peter Geoghegan
t. > Regarding bitmap indexes itself, I think our BRIN could provide them. > However, it would be useful to have opclass parameters to make them > tunable. I thought that we might implement them in nbtree myself. But we don't need to decide now. -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Peter Geoghegan
the version number isn't changed. I think that we may be able to get away with not increasing the B-Tree version from 4 to 5, actually. Deduplication is performed lazily when it looks like we might have to split the page, so there isn't any expectation that tuples will either be compressed or uncompressed in any context. -- Peter Geoghegan
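A toy model of the "lazy" behaviour described in the snippet above may help readers skimming the archive. It is only a sketch under stated assumptions, not the patch's code: the names `PAGE_CAPACITY`, `dedup` and `insert` are hypothetical, a page is modelled as a sorted list of `(key, [tids])` tuples, and the capacity is counted in tuples rather than bytes.

```python
from collections import defaultdict

PAGE_CAPACITY = 6  # hypothetical limit: max physical tuples on one leaf page


def dedup(page):
    """Merge tuples with equal keys into a single (key, [tids]) tuple."""
    merged = defaultdict(list)
    for key, tids in page:
        merged[key].extend(tids)
    return [(key, sorted(tids)) for key, tids in sorted(merged.items())]


def insert(page, key, tid):
    """Insert (key, tid), deduplicating only when the page is already full.

    Returns (new_page, split_needed). Nothing is merged while the page
    still has room, so a page may hold merged and unmerged tuples at once.
    """
    if len(page) >= PAGE_CAPACITY:
        page = dedup(page)       # last resort before splitting
    if len(page) >= PAGE_CAPACITY:
        return page, True        # dedup freed nothing; caller must split
    return sorted(page + [(key, [tid])]), False
```

With six duplicates of one key on the page, a seventh insert merges the six into one tuple and avoids the split; with six distinct keys the merge frees nothing and a split is still reported. Readers of such a page therefore cannot assume tuples are either all merged or all unmerged, which is the point the snippet makes about not needing a version bump.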

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Peter Geoghegan
Database System" provides additional background information (I should have suggested reading both 6.6 and 6.7 together). -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-11 Thread Peter Geoghegan
On Thu, Jul 11, 2019 at 10:42 AM Peter Geoghegan wrote: > > I think unique indexes may benefit from deduplication not only because > > of NULL values. Non-HOT updates produce duplicates of non-NULL values > > in unique indexes. And those duplicates can take significant

Re: Improve search for missing parent downlinks in amcheck

2019-07-08 Thread Peter Geoghegan
On Sun, Jul 7, 2019 at 7:53 PM Thomas Munro wrote: > On Wed, May 1, 2019 at 12:58 PM Peter Geoghegan wrote: > > I will think about a simple fix, but after the upcoming point release. > > There is no hurry. > > A bureaucratic question: What should the status be for this CF

Adversarial case for "many duplicates" nbtree split strategy in v12

2019-07-02 Thread Peter Geoghegan
y of my test cases -- the fix barely affects the splits chosen for my real-world test data, and TPC test data. As far as I know, I already have a comprehensive fix. I will need to think about it much more carefully before proceeding, though. Thoughts? -- Peter Geoghegan

Re: GiST VACUUM

2019-07-03 Thread Peter Geoghegan
derstand old deleted pages, where the deletion XID is > stored in the page opaque field. What Postgres versions will the B-Tree fix end up targeting? Sounds like you plan to backpatch all the way? -- Peter Geoghegan

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-06-28 Thread Peter Geoghegan
unnecessary impediments in the way of making that > happen, at least IMHO. +1. pg_stat_statements will already lose all the statistics that it aggregated in the event of a hard crash. The trade-off that the query jumbling logic makes is not a bad one, all things considered. -- Peter Geoghegan

Re: [PATCH] Implement uuid_version()

2019-06-28 Thread Peter Geoghegan
gine doing quite a lot better still. Application developers love UUIDs. We should try to meet them where they are. [1] https://www.2ndquadrant.com/en/blog/sequential-uuid-generators/ -- Peter Geoghegan

Re: Calling PrepareTempTablespaces in BufFileCreateTemp

2019-04-22 Thread Peter Geoghegan
layering is confusing in a number of ways IMV. -- Peter Geoghegan

Thoughts on nbtree with logical/varwidth table identifiers, v12 on-disk representation

2019-04-21 Thread Peter Geoghegan
actually care very much about these kinds of space savings, but at the same time it feels more elegant to me. The heap TID may not have a pg_attribute entry, but ISTM that the on-disk representation should not have padding "in the wrong place", on general principle. Thoughts? -- Peter Geoghegan

Re: TRACE_SORT defined by default

2019-04-24 Thread Peter Geoghegan
o disable the trace_sort instrumentation by commenting out the TRACE_SORT entry in pg_config_manual.h. I recall being opposed on this point by Robert Haas. Possibly because he just didn't want to deal with it at the time. -- Peter Geoghegan

Re: TRACE_SORT defined by default

2019-04-24 Thread Peter Geoghegan
uite meeting the traditional definition of a "developer option". -- Peter Geoghegan

Re: TRACE_SORT defined by default

2019-04-24 Thread Peter Geoghegan
seem to be saying that it is, I > think we should just remove the symbol and be done with it. Sounds like a plan. Do you want to take care of it, Joe? -- Peter Geoghegan

Re: TRACE_SORT defined by default

2019-04-24 Thread Peter Geoghegan
le by the new pg_stat_progress_create_index view, but with getrusage() stats. -- Peter Geoghegan

Re: Pathological performance when inserting many NULLs into a unique index

2019-04-23 Thread Peter Geoghegan
On Fri, Apr 19, 2019 at 6:34 PM Peter Geoghegan wrote: > Attached revision does it that way, specifically by adding a new field > to the insertion scankey struct (BTScanInsertData). Pushed. -- Peter Geoghegan

Re: Thoughts on nbtree with logical/varwidth table identifiers, v12 on-disk representation

2019-04-24 Thread Peter Geoghegan
orthwhile to keep the heap TID in the tuple header; it seems inherently necessary to have a MAXALIGN()'d tuple header, so finding a way to consistently put the first MAXALIGN() quantum to good use seems wise. -- Peter Geoghegan

Re: Thoughts on nbtree with logical/varwidth table identifiers, v12 on-disk representation

2019-04-24 Thread Peter Geoghegan
On Wed, Apr 24, 2019 at 10:43 AM Peter Geoghegan wrote: > The hard part is how to do varwidth encoding for space-efficient > partition numbers while continuing to use IndexTuple fields for heap > TID on the leaf level, *and* also having a > BTreeTupleGetHeapTID()-style macro to g

Re: Calling PrepareTempTablespaces in BufFileCreateTemp

2019-04-24 Thread Peter Geoghegan
er the other, though I don't really know how to assess this layering business. I'm glad that either approach will prevent oversights, though. -- Peter Geoghegan

"Routine Reindexing" docs should be updated to reference REINDEX CONCURRENTLY

2019-04-25 Thread Peter Geoghegan
The documentation has a section called "Routine Reindexing", which explains how to simulate REINDEX CONCURRENTLY with a sequence of creation and replacement steps. This should be updated to reference the REINDEX CONCURRENTLY command. -- Peter Geoghegan

Re: TRACE_SORT defined by default

2019-04-25 Thread Peter Geoghegan
hink about or define developer options frames this discussion. -- Peter Geoghegan

Re: TRACE_SORT defined by default

2019-04-25 Thread Peter Geoghegan
On Thu, Apr 25, 2019 at 1:56 PM Tom Lane wrote: > Well, I was suggesting that we ought to consider the alternative of > making it *not* always compiled, and Jeff was pushing back on that. Right. Sorry. -- Peter Geoghegan

Re: Improve search for missing parent downlinks in amcheck

2019-04-25 Thread Peter Geoghegan
On Tue, Apr 16, 2019 at 12:00 PM Peter Geoghegan wrote: > On Mon, Apr 15, 2019 at 7:30 PM Alexander Korotkov > wrote: > > Currently we amcheck supports lossy checking for missing parent > > downlinks. It collects bitmap of downlink hashes and use it to check > > subse

Re: TM format can mix encodings in to_char()

2019-04-22 Thread Peter Geoghegan
nds of bugs in quite a variety of contexts. -- Peter Geoghegan

Re: Thoughts on nbtree with logical/varwidth table identifiers, v12 on-disk representation

2019-04-22 Thread Peter Geoghegan
scheme, or this new one. Having the "real" tuple length available will make it easier to implement "true" suffix truncation, where we truncate *within* a text attribute (i.e. generate a new, shorter value using new opclass infrastructure). -- Peter Geoghegan

Re: Thoughts on nbtree with logical/varwidth table identifiers, v12 on-disk representation

2019-04-22 Thread Peter Geoghegan
ength), plus the usual t_info stuff. We'd almost invariably waste 4 or 5 bytes, which seems like a problem to me. -- Peter Geoghegan

Re: Thoughts on nbtree with logical/varwidth table identifiers, v12 on-disk representation

2019-04-22 Thread Peter Geoghegan
use it will actively try to preserve the "real" tuple size). It's convenient to me that no caller seems to rely on the index_form_tuple() MAXALIGN() that I want to remove. -- Peter Geoghegan

Re: Reducing the runtime of the core regression tests

2019-04-25 Thread Peter Geoghegan
continue; I would expect the "break" statement to have a line count that is no greater than that of the first two lines that immediately precede, and yet it's far far greater (1292 is greater than 4). It looks like there has been some kind of loop transformation. -- Peter Geoghegan

Re: Reducing the runtime of the core regression tests

2019-04-25 Thread Peter Geoghegan
http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf Search the PDF for "-O0" to see numerous references to this. It seems to be impossible to turn off all GCC optimizations. -- Peter Geoghegan

Re: POC: converting Lists into arrays

2019-07-16 Thread Peter Geoghegan
On Tue, Jul 16, 2019 at 9:01 AM Robert Haas wrote: > I cast my vote in the other direction i.e. for sticking with qsort. I do too. -- Peter Geoghegan

Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-01 Thread Peter Geoghegan
as painless as possible. Note that ICU does at least provide a standard way to use multiple versions at once; the symbol names have the ICU version baked in. You're actually calling the functions using the versioned symbol names without realizing it, because there is macro trickery involved

Re: Code comment change

2019-07-01 Thread Peter Geoghegan
urs that I mentioned. > > I think that the whole sentence about "the standard class of race > > conditions" should go. There is no more dance. Nothing in > > _bt_getroot() is surprising to me. The other comments explain things > > comprehensively. > > +1 I'll take care of it soon. -- Peter Geoghegan

Re: Code comment change

2019-07-01 Thread Peter Geoghegan
h sounds like a seriously bad approach to me. I think that the whole sentence about "the standard class of race conditions" should go. There is no more dance. Nothing in _bt_getroot() is surprising to me. The other comments explain things comprehensively. -- Peter Geoghegan

Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-06-28 Thread Peter Geoghegan
t only when its value is non-zero. -- Peter Geoghegan

Re: Do not check unlogged indexes on standby

2019-08-13 Thread Peter Geoghegan
lever about ignorable/half-dead/deleted pages, to be conservative.) -- Peter Geoghegan

Re: Use PageIndexTupleOverwrite() within nbtsort.c

2019-08-13 Thread Peter Geoghegan
ically different page (even after masking within btree_mask()). However, I eventually decided that you had it right. Your _bt_mark_page_halfdead() change is clearer overall and doesn't break WAL consistency checking in practice, for reasons that are no less obvious than before. Thanks! -- Peter Geoghegan

Re: Improve search for missing parent downlinks in amcheck

2019-08-13 Thread Peter Geoghegan
iary (e.g., * they are current target's child pages). Conceptually, problems are only * ever found in the current target page (or for a particular heap tuple during * heapallindexed verification). Each page found by verification's left/right, * top/bottom scan becomes the target exactly once. */ -- Peter Geoghegan

Re: should there be a hard-limit on the number of transactions pending undo?

2019-07-29 Thread Peter Geoghegan
"Row forwarding" across heap pages is the traditional way of ensuring that TIDs in indexes are stable even in the worst case, apparently, but other approaches also seem possible. [1] http://www.vldb.org/pvldb/vol10/p781-Wu.pdf -- Peter Geoghegan

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-02 Thread Peter Geoghegan
random unused OID: 9099 I would like to push this patch shortly. How do people feel about this wording? (It's based on the documentation added by commit a6417078.) -- Peter Geoghegan v2-0001-unused_oids-suggestion.patch Description: Binary data

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-02 Thread Peter Geoghegan
On Fri, Aug 2, 2019 at 3:52 PM Tom Lane wrote: > Better ... but I'm the world's second worst Perl programmer, > so I have little to say about whether it's idiomatic. Perhaps Michael can weigh in here? I'd rather hear a second opinion on v4 of the patch before proceeding. -- Peter Geoghegan

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-02 Thread Peter Geoghegan
implements your suggestion, generating output like the above. I haven't written a line of Perl in my life prior to today, so basic code review would be helpful. -- Peter Geoghegan v3-0001-unused_oids-suggestion.patch Description: Binary data

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-02 Thread Peter Geoghegan
u have to be fairly unlucky to have that happen under the system introduced by commit a6417078.) It's probably the case that most patches that create a new pg_proc entry only create one. The question of consecutive OIDs only comes up with a fairly small number of patches. -- Peter Geoghegan

Re: Optimize single tuple fetch from nbtree index

2019-08-02 Thread Peter Geoghegan
tuples (say based on the "k=:val" constant) seems like it might generalize well enough. I suggest Floris look into that possibility. This paper might be worth a read: https://dl.acm.org/citation.cfm?id=582278 (Though it also might not be worth a read -- I haven't actually read it myself.) -- Peter Geoghegan

Re: Optimize single tuple fetch from nbtree index

2019-08-02 Thread Peter Geoghegan
On Fri, Aug 2, 2019 at 5:34 PM Peter Geoghegan wrote: > I wonder if some variety of block nested loop join would be helpful > here. I'm not aware of any specific design that would help with > Floris' case, but the idea of reducing the number of scans required on > the inner side

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-02 Thread Peter Geoghegan
l programmer is no excuse.) How about the attached? I've simply removed the "if ($oid > $prev_oid + 2)" test. -- Peter Geoghegan v4-0001-unused_oids-suggestion.patch Description: Binary data

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-03 Thread Peter Geoghegan
sed_oids would *maximize* the number of OID collisions. > We could > recommend the range if there are at least 10 OIDs available in the > range from the lowest position, and there are few patches eating more > than 5-10 OIDs at once. That sounds like an over-engineered solution to a pr

Re: Shrinking tuplesort.c's SortTuple struct (Was: More ideas for speeding up sorting)

2019-08-10 Thread Peter Geoghegan
s all of the same tricks as our existing Bentley & McIlroy implementation, but is more cache efficient. It's considered the successor to B, and had input from Bentley himself. It is provably faster than B for a wide variety of inputs, at least on modern hardware. [1] http://www.vldb.org/journal/VLDBJ4/P603.pdf [2] https://codeblab.com/wp-content/uploads/2009/09/DualPivotQuicksort.pdf -- Peter Geoghegan

Re: Do not check unlogged indexes on standby

2019-08-12 Thread Peter Geoghegan
uppose that bt_right_page_check_scankey() helps with transposed pages, but doesn't help so much when you have WAL-level inconsistencies. -- Peter Geoghegan

Re: Do not check unlogged indexes on standby

2019-08-12 Thread Peter Geoghegan
ther than letting an ambiguous "can't happen" error get raised by low-level code. This might be possible with system catalog corruption, for example. Finally, I thought that the WARNING was a bit strong -- a NOTICE is more appropriate. Thanks! -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-19 Thread Peter Geoghegan
ready for review again. I'm looking at it now. I'm going to spend a significant amount of time on this tomorrow. I think that we should start to think about efficient WAL-logging now. > In the meantime, I'll run more stress-tests. As you probably realize, wal_consistency_checking is a good thing to use with your tests here. -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-19 Thread Peter Geoghegan
ve that you came up with anyway. > > How do you feel about officially calling this deduplication, not > > compression? I think that it's a more accurate name for the technique. > I agree. > Should I rename all related names of functions and variables in the patch? Please rename them when convenient. -- Peter Geoghegan

Re: Removing unneeded downlink field from nbtree stack struct

2019-08-14 Thread Peter Geoghegan
d). This seemed like something that was really up to the callers. Pushed a version with that change. Thanks for the review! -- Peter Geoghegan

Re: pg11.5: ExecHashJoinNewBatch: glibc detected...double free or corruption (!prev)

2019-08-24 Thread Peter Geoghegan
hmee19@news-spur.riddles.org.uk That was a BufFile that was under the control of a tuplestore, so it was similar to but different from your case. I suspect it's related. -- Peter Geoghegan

Re: IoT/sensor data and B-Tree page splits

2019-08-26 Thread Peter Geoghegan
which would improve matters further with low cardinality indexes.) -- Peter Geoghegan

IoT/sensor data and B-Tree page splits

2019-08-26 Thread Peter Geoghegan
ke? Perhaps this is a problem that isn't worth solving right now, but it is definitely a real problem. [1] https://www.postgresql.org/message-id/66ce997fb523c04e9749452273184c6c137cb88...@exch-mbx-113.vmware.com -- Peter Geoghegan

Re: IoT/sensor data and B-Tree page splits

2019-08-26 Thread Peter Geoghegan
isadvantages, including the fact that you have to know that your data is amenable to BRIN indexing in order to use a BRIN index. -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-27 Thread Peter Geoghegan
, because it doesn't care about the actual content of posting lists. And, we can fix the "fake new item is not actually real new item" issue at one point within _bt_split(), just as we're about to WAL log. What do you think of that approach? -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-31 Thread Peter Geoghegan
On Thu, Aug 29, 2019 at 10:10 PM Peter Geoghegan wrote: > I see some Valgrind errors on v9, all of which look like the following > two sample errors I go into below. I've found a fix for these Valgrind issues. It's a matter of making sure that _bt_truncate() sizes new pivot tuples pr

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-30 Thread Peter Geoghegan
On Thu, Aug 29, 2019 at 5:07 PM Peter Geoghegan wrote: > I agree that v9 might be ever so slightly more space efficient than v5 > was, on balance. I see some Valgrind errors on v9, all of which look like the following two sample errors I go into below. First one: ==11193== VALGRINDERROR

Re: Re: Email to hackers for test coverage

2019-08-28 Thread Peter Geoghegan
d is correct -- the NULL handling within ApplySortAbbrevFullComparator() cannot actually be used currently. I wouldn't change anything about the code, though, since it's useful to defensively handle NULLs. -- Peter Geoghegan

Re: Yet another fast GiST build

2019-08-29 Thread Peter Geoghegan
would have a lot of advantages in the long term. It is certainly theoretically appealing. Could this make it easier to use merge join with containment operators? I'm thinking of things like geospatial joins, which can generally only be performed as nested loop joins at the moment. This is often wildly inefficient. -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-29 Thread Peter Geoghegan
eel about this CREATE INDEX index-size-is-larger business? -- Peter Geoghegan

Re: Yet another fast GiST build

2019-08-29 Thread Peter Geoghegan
lues. We've prototyped that, see [1]. I'm pretty sure that spatial joins generally need two spatial indexes (usually R-Trees). There seems to have been quite a lot of research in it in the 1990s. -- Peter Geoghegan

Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

2019-08-25 Thread Peter Geoghegan
On Sun, Aug 25, 2019 at 2:55 PM Peter Geoghegan wrote: > I suppose that we'd add something new to CREATE OPERATOR CLASS to make > this work? My instinct is to avoid adding things that are only > meaningful for a single AM to interfaces like CREATE OPERATOR CLASS, > but the system

Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

2019-08-25 Thread Peter Geoghegan
On Sun, Aug 25, 2019 at 2:18 PM Peter Geoghegan wrote: > > Indeed, we run up against this sort of thing all the time in, eg, planner > > optimizations. I think some sort of "equality is precise" indicator > > would be really useful for a lot of things. > >

Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

2019-08-25 Thread Peter Geoghegan
letely, because we're not directly concerned with the physical representation used within an index. In fact, a major goal for this new infrastructure is that nbtree gets to fully own the representation (it just needs to know about the high level or logical requirements). -- Peter Geoghegan

Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

2019-08-25 Thread Peter Geoghegan
" collation isn't otherwise usable. Perhaps there are far more compelling planner optimizations that I haven't considered, though. This idea probably has problems with interesting sort orders that aren't actually that interesting. -- Peter Geoghegan

Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence

2019-08-25 Thread Peter Geoghegan
the btree/numeric display scale problem are simply not worth solving directly. That would add a huge amount of complexity for very little benefit. [1] https://commitfest.postgresql.org/24/2202/ -- Peter Geoghegan

"Classic" nbtree suffix truncation prototype

2019-08-25 Thread Peter Geoghegan
5x+ reduction), along with a very small reduction in the number of leaf pages. Users that happen to have a lot of indexes that look like this are likely to find classic suffix truncation compelling, but that doesn't seem like a good enough reason to push ahead with the patch. -- Peter Geoghegan

Re: Optimize single tuple fetch from nbtree index

2019-08-23 Thread Peter Geoghegan
on. This code is a few years old, but I still wouldn't be surprised if it turned out to be slightly wrong in a way that was important. We still have no way of detecting if a buffer is accessed without a pin. There have been numerous bugs like that before. (We have talked about teaching Valgrind to detect the case, but that never actually happened.) -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-08-22 Thread Peter Geoghegan
THRESHOLD stuff really helped with those indexes. Want me to send this data and the associated tests script over to you? -- Peter Geoghegan

Re: when the IndexScan reset to the next ScanKey for in operator

2019-08-22 Thread Peter Geoghegan
m/cah2-wzmrt_0ybhf05axqb2oituqiqakr0lznntj8x3kadkz...@mail.gmail.com -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-11 Thread Peter Geoghegan
er or not incrementally doing all the work (not just the WAL logging) makes sense. It's still too early to be sure about whether or not that's a good idea. -- Peter Geoghegan nbtree_wal_test.sql Description: Binary data

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-11 Thread Peter Geoghegan
l debug this myself in a few days, though you may prefer to do it before then. -- Peter Geoghegan

Re: amcheck verification for GiST

2019-09-06 Thread Peter Geoghegan
On Fri, Sep 6, 2019 at 7:02 AM Alvaro Herrera from 2ndQuadrant wrote: > Peter, Heikki, are you going to do [at least] one more round of > design/functional review? I didn't plan on it, but somebody probably should. Are you offering to commit the patch? If not, I can take care of it. --

Re: amcheck verification for GiST

2019-09-06 Thread Peter Geoghegan
On Fri, Sep 6, 2019 at 2:35 PM Alvaro Herrera from 2ndQuadrant wrote: > I'd welcome it more if you did it; thanks. I'll take care of it, then. -- Peter Geoghegan

Re: amcheck verification for GiST

2019-09-11 Thread Peter Geoghegan
, and hopefully the next VACUUM will clean it up. """ Why is this not a problem for the new amcheck checks? Maybe this is a very naive question. I don't claim to be a GiST expert. -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-11 Thread Peter Geoghegan
On Wed, Sep 11, 2019 at 3:09 PM Peter Geoghegan wrote: > Hmm. So v12 seems to have some problems with the WAL logging for > posting list splits. With wal_debug = on and > wal_consistency_checking='all', I can get a replica to fail > consistency checking very quickly when "

Re: Do not check unlogged indexes on standby

2019-09-11 Thread Peter Geoghegan
The patch has been committed already. Peter Geoghegan (Sent from my phone)

Re: Do not check unlogged indexes on standby

2019-09-11 Thread Peter Geoghegan
On Wed, Sep 11, 2019 at 7:10 PM Peter Geoghegan wrote: > The patch has been committed already. Oh, wait. It hasn't. Andrey didn't create a new thread for his largely independent patch, so I incorrectly assumed he created a CF entry for his original bugfix. -- Peter Geoghegan

Re: Create collation reporting the ICU locale display name

2019-09-12 Thread Peter Geoghegan
On Thu, Sep 12, 2019 at 11:30 AM Peter Geoghegan wrote: > I wonder if it's possible to display a localized version of the > display string in the NOTICE message? Does that work, or could it? For > example, do you see the message in French? BTW, I already know for sure that ICU supports

Re: Create collation reporting the ICU locale display name

2019-09-12 Thread Peter Geoghegan
For example, do you see the message in French? -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-16 Thread Peter Geoghegan
indirectly triggered more FPIs, which contributed to triggering a checkpoint even earlier...and so on. Synthetic test cases can avoid this. A useful synthetic test should have no checkpoints at all, so that we can see the broken down costs, without any second order effects that add more cost in weird ways. -- Peter Geoghegan

Re: Create collation reporting the ICU locale display name

2019-09-14 Thread Peter Geoghegan
eleted_in_Newsletter_I-8 -- Peter Geoghegan

Re: amcheck verification for GiST

2019-09-06 Thread Peter Geoghegan
On Fri, Sep 6, 2019 at 3:22 PM Peter Geoghegan wrote: > I'll take care of it, then. Attached is v10, which has some comment and style fix-ups, including the stuff Alvaro mentioned. It also adds line pointer sanitization to match what I added to verify_nbtree.c in commit a9ce839a (we use a cus

Re: [HACKERS] CLUSTER command progress monitor

2019-09-09 Thread Peter Geoghegan
th progress reporting infrastructure. I think that it's okay to redefine how progress reporting works with CLUSTER now, in order to fix the REINDEX/CLUSTER state clobbering bug. -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-18 Thread Peter Geoghegan
As I went into at the start of this e-mail, unnecessarily doing expensive things like copying large posting lists around is a real concern. Even if it isn't truly useful for _bt_dedup_one_page() to operate in a very incremental fashion, incrementalism is probably still a good thing to aim for -- it seems to make deduplication faster in all cases. -- Peter Geoghegan

Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-09-18 Thread Peter Geoghegan
On Wed, Sep 18, 2019 at 10:43 AM Peter Geoghegan wrote: > This also suggests that making _bt_dedup_one_page() do raw page adds > and page deletes to the page in shared_buffers (i.e. don't use a temp > buffer page) could pay off. As I went into at the start of this > e-mail, unneces

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-07-30 Thread Peter Geoghegan
have very small BufFileWrite() size arguments. tuplestore.c, for one. -- Peter Geoghegan

Re: pgbench - implement strict TPC-B benchmark

2019-07-30 Thread Peter Geoghegan
nal script > is not really TPC-B. That's treading on being false advertising. IANAL, but it may not even be permissible to claim that we have implemented "standard TPC-B". -- Peter Geoghegan

Re: should there be a hard-limit on the number of transactions pending undo?

2019-07-29 Thread Peter Geoghegan
does the bit mean? It could mean "please check the undo > log," in which case it'd have to be set on insert, eventually cleared, > and then reset on delete, but I think that's likely to suck. I think > therefore that the bit should mean > is-deleted-but-not-necessarily-all-visible-yet, which avoids that > problem. That sounds about right to me. -- Peter Geoghegan

Re: pgbench - implement strict TPC-B benchmark

2019-07-31 Thread Peter Geoghegan
e. Not sure where that leaves this patch. What problem is it actually trying to solve? [1] http://www.tpc.org/tpcb/ -- Peter Geoghegan

Re: Patch for SortSupport implementation on inet/cdir

2019-07-31 Thread Peter Geoghegan
On Fri, Jul 26, 2019 at 7:25 PM Peter Geoghegan wrote: > I guess that the idea here was to prevent masking on ipv6 addresses, > though not on ipv4 addresses. Obviously we're only dealing with a > prefix with ipv6 addresses, whereas we usually have the whole raw > ipaddr with ipv4. Not

The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-01 Thread Peter Geoghegan
the reserved range? It seems preferable for everybody to consistently use the reserved OID range. -- Peter Geoghegan

Re: The unused_oids script should have a reminder to use the 8000-8999 OID range

2019-08-01 Thread Peter Geoghegan
ct same mail at > CAH2-WzmCzNMebiN4-8p=ON92m0Rz0ybxNEKrO_2J+9DqWfWP=a...@mail.gmail.com :) Seems like I should propose a patch this time around. I don't do Perl, but I suppose I could manage something as trivial as this. -- Peter Geoghegan

Re: Patch for SortSupport implementation on inet/cdir

2019-07-26 Thread Peter Geoghegan
On Fri, Jul 26, 2019 at 6:58 PM Peter Geoghegan wrote: > I found this part of your approach confusing: > > > + /* > > +* Number of bits in subnet. e.g. An IPv4 that's /24 is 32 - 24 = 8. > > +* > > +* However, only some of the bits may hav
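The arithmetic in the quoted comment (32 - 24 = 8 for an IPv4 /24) can be checked with a few lines. This is a minimal sketch, assuming the comment's "bits in subnet" means the bits left over after the network prefix; `host_bits` and its parameters are hypothetical names, not taken from the patch.

```python
def host_bits(prefix_len, family_bits=32):
    """Bits remaining after the network prefix (32 total for IPv4, 128 for IPv6)."""
    if not 0 <= prefix_len <= family_bits:
        raise ValueError("prefix length out of range for address family")
    return family_bits - prefix_len
```

For example, `host_bits(24)` gives 8 for the IPv4 case in the comment, and `host_bits(64, 128)` gives 64 for an IPv6 /64.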
