On Thu, Mar 1, 2018 at 11:15 AM, David Steele <da...@pgmasters.net> wrote:
> On 12/15/17 5:31 PM, Peter Geoghegan wrote:
>> Commit d70cf811, from 2014, promoted an Assert() within
>> IndexBuildHeapScan() to a "can't happen" elog() error, in order to
>> detec
ine and we may actually even invoke WHEN NOT
> MATCHED action in that case.
Again, I have to ask: is such an UPDATE actually meaningfully
different from a concurrent DELETE + INSERT? If so, why is a special
error better than a dup violation, or maybe even having the INSERT
(and whole MERGE statement) succeed?
--
Peter Geoghegan
anging fruit is outside of pgbench
itself. None of the more recent pgbench enhancements seem to make it
easier to use.
--
Peter Geoghegan
on fixes/improvements). The only
> thing pending is the decision to accept or change the currently implemented
> concurrency semantics.
I need to go look at that. I'll try to take a firmer position on this.
I know that I've been saying that for a quite a while now, but my
failure to take a firmer position for so long is not because I didn't
try. It's because there is no really good answer.
--
Peter Geoghegan
On Thu, Feb 22, 2018 at 6:32 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> Attached patch does this. I cannot recreate this issue locally, but
>> this should still fix it on skink.
>
> Committed.
Looks like it worked.
--
Peter Geoghegan
ueness
> check is necessary, so that's why I propose shortcircuiting the search
> only, and not the actual insertion on the page. I believe IMO it's
> more important to try and maximize the conditions under which the
> optimization can be applied.
I didn't get what the point of checking the first item on the page
(your proposal) was.
--
Peter Geoghegan
rch to find the
> proper insertion point, etc.
The whole way that this patch opts out of using an exclusive buffer
lock on the "first page the value could be on" (the _bt_check_unique()
+ _bt_findinsertloc() stuff) makes me uneasy. I know that that isn't
terribly concrete.
--
Peter Geoghegan
I believe PKs are a prime candidate for this optimization, and
> expecting it to apply only when no concurrency is involved is severely
> dumbing down the optimization.
Pavan justified the patch using a benchmark that only involved a
single client -- hardly typical for a patch that changes the B-Tree
code. If the benefits with many clients can be shown to matter, that
will make this much more interesting to me.
--
Peter Geoghegan
tils.c, so
that MERGE Query structs can be deparsed -- see get_query_def().
Yeah, we've decided we're not going to support user-visible rules, but
I continue to think that deparsing support is necessary on general
principle, and for the benefit of extensions like Citus that use
deparsing in a fairly broad way. At the very least, if we're not going
to support deparsing, there needs to be a better reason than "we're
not supporting user-visible rules".
--
Peter Geoghegan
have to worry about concurrent page recycling because
it is already the only thing that performs page deletion. It's already
the process that has the exclusive right to give index pages back to
the FSM.
--
Peter Geoghegan
On Wed, Apr 25, 2018 at 12:06 PM, Teodor Sigaev <teo...@sigaev.ru> wrote:
> Oops, I missed, I don't know how... Thank you very much for your quick eye!
Thanks Teodor.
--
Peter Geoghegan
On Wed, Apr 25, 2018 at 1:45 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Teodor, are you caught up to the point where it'd be okay to run pgindent,
> or are there still patches waiting?
I can't speak for Teodor, but fwiw I am not aware of any more patches waiting.
--
Peter Geoghegan
ty here is choosing an
on-disk representation that lets us do everything that we'll want to
do in the future. That requires strong cooperation.
--
Peter Geoghegan
about, too. Andrey is probably correct in his suspicion that
we'll end up prototyping a number of approaches.
I'm glad that you're thinking about this, in any case. Seems like a
promising area to work on.
--
Peter Geoghegan
before this specific problem with
cost_sort() can be addressed.
--
Peter Geoghegan
ot;gitcommit" syntax.
I find it useful to have a routine checklist for things like
committing, because it frees up a little space in my head for other
things. I do the same thing when preparing for a trip.
--
Peter Geoghegan
believe in
the patch's strategic importance" benchmark. TPC-C is clearly the most
influential database benchmark ever, so I think that that's a fair
request. (See the TPC-C commentary at
https://www.hammerdb.com/docs/ch03s02.html, for example.)
--
Peter Geoghegan
ta" field instead of a
text "data" field.)
--
Peter Geoghegan
On Wed, Oct 3, 2018 at 4:39 PM Peter Geoghegan wrote:
> I did find a pretty clear regression, though only with writes to
> unique indexes. Attached is v6, which fixes the issue. More on that
> below.
I've been benchmarking my patch using oltpbench's TPC-C benchmark
these past few wee
h that logs when evicting a
> buffer while already holding another lwlock. That shouldn't be too hard.
I'll look into this.
Thanks
--
Peter Geoghegan
Shared_buffers is 10gb iirc. The server has 32gb of memory. Yes, 'public'
is the patch case. Sorry for not mentioning it initially.
--
Peter Geoghegan
(Sent from my phone)
tation to influence
the outcome.
--
Peter Geoghegan
y patch. I can
find a way of only executing the read TPC-C queries, to see where they
are on their own. TPC-C is particularly write-heavy, especially
compared to the much more recent though less influential TPC-E
benchmark.
--
Peter Geoghegan
the process a bit easier. rr is supposed to be particularly helpful
with non-deterministic or difficult to recreate bugs; hopefully I'll
be able to apply it in that way when I gain some more experience with
it.
--
Peter Geoghegan
to last for a relatively long time. I'll look into it
again, though.
> Due to the optimization the _bt_binsrch() size has grown twice. May be
> you move this to some service routine?
Maybe. There are some tricky details that seem to work against it.
I'll see if it's possible to polish that some more, though.
--
Peter Geoghegan
e of the more general problem --
what you call the 'drop cascades to 62 other objects' problem is a
more specific subproblem, or, if you prefer, a more specific symptom
of the same problem.
Since I'm going to have to fix the problem head-on, I'll have to study
it in detail anyway.
--
Peter Geoghegan
On Fri, Nov 2, 2018 at 5:00 PM Peter Geoghegan wrote:
> I had the opportunity to discuss this patch at length with Heikki
> during pgConf.EU.
> The DESC heap TID sort order thing probably needs to go. I'll probably
> have to go fix the regression test failures that occur when AS
n't happen" elog errors in places like scanPendingInsert() and
ginInsertCleanup() would be a good start. Note that we already did
similar Assert-elog(ERROR) promotion this for posting tree code, when
similar bugs appeared way back in 2013.
--
Peter Geoghegan
ginInsertCleanup() is generally supposed to be a rare occurrence. It's
not surprising that it's hard for you to hit this issue.
BTW, I strongly doubt that disabling hugepages has fixed anything,
though it may have accidentally masked the problem. This is an issue
around the basic correctness of the locking protocols used by GIN.
--
Peter Geoghegan
, because it seems to be what breaks the subtree
design (we don't delete in two phases or anything in GIN).
I think that you have to be doing a multi-level delete for a deadlock
to take place, which isn't particularly likely to coincide with a
concurrent insertion in general. That may explain why it's taken a
year to get a high-quality bug report.
--
Peter Geoghegan
On Wed, Nov 7, 2018 at 5:46 PM Peter Geoghegan wrote:
> I think that you have to be doing a multi-level delete for a deadlock
> to take place, which isn't particularly likely to coincide with a
> concurrent insertion in general. That may explain why it's taken a
> year to get a high
On Wed, Nov 7, 2018 at 5:46 PM Peter Geoghegan wrote:
> Teodor: Do you think that the issue is fixable? It looks like there
> are serious issues with the design of 218f51584d5 to me. I don't think
> the general "there can't be any inserters at this subtree" thing works
&
end on using some mechanism or other. Though not
necessarily the one I've sketched out.
[1] https://postgr.es/m/24279.1525401...@sss.pgh.pa.us
--
Peter Geoghegan
ines, since of course I'm only doing
this with the pg_depend indexes, and not for every system catalog
index.
--
Peter Geoghegan
From 7158fa1f93447d1f9bb296605a65b0fecfd54210 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan
Date: Tue, 13 Nov 2018 18:14:23 -0800
Subject: [PATCH 01/18] Add pg
e
+DETAIL: extension earthdistance depends on function cube_out(cube)
Can anyone think of a workable, scalable approach to fixing the
processing order of this findDependentObjects() pg_depend scan so that
we reliably get the user-visible behavior we already tacitly expect?
--
Peter Geoghegan
F
On Tue, Nov 13, 2018 at 1:29 PM Peter Geoghegan wrote:
> A solution does occur to me that I'm kind of embarrassed to suggest,
> but that would nonetheless probably do the job:
> Why not vary the objsubid value among entries that don't use it
> anyway, so that they have a nega
ul behavior.
> May be you know any another problems of the patch?
Just the lack of pg_upgrade support. That is progressing nicely,
though. I'll probably have that part in the next revision of the
patch. I've found what looks like a workable approach, though I need
to work on a testing strategy for pg_upgrade.
--
Peter Geoghegan
e one point of contention for each of the really
hot values at the far left of the leaf level.
[1] https://commitfest.postgresql.org/20/1787/
--
Peter Geoghegan
On Fri, Sep 28, 2018 at 8:00 AM Peter Eisentraut
wrote:
> On 21/09/2018 01:18, Peter Geoghegan wrote:
> > * This means that there is a compatibility issue for anyone that is
> > already right on the threshold -- we *really* don't want to see a
> > REINDEX fail, but that see
n iterative
approach, with lots of prototyping. Your patch needs attention to
advance, and IMV the CF is the best way to get that attention. So, I
think that it would be fine to go submit it now.
I must admit that I didn't even notice that your patch lacked a CF
entry. Everyone has a different process, perhaps.
--
Peter Geoghegan
eb8a7c477156d0ad9d83e7297912cfe79#l612
[3]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/include/access/itup.h;h=bd3a702380954bc69c9536876094249430cb7c72;hb=df8b5f3eb8a7c477156d0ad9d83e7297912cfe79#l139
--
Peter Geoghegan
On Wed, Sep 19, 2018 at 11:23 AM Peter Geoghegan wrote:
> 3 modes
> ---
>
> My new approach is to teach _bt_findsplitloc() 3 distinct modes of
> operation: Regular/default mode, many duplicates mode, and single
> value mode.
I think that I'll have to add a fourth mod
ues with issues of concern to
natural language experts, so we as an ICU client can limit ourselves
to worrying about one of the two at any given time.
--
Peter Geoghegan
; And of course we need to think about how to handle upgrades, but you
> have already started a separate discussion about that.
Right.
[1]
https://postgr.es/m/CAH2-Wz=wAKwhv0PqEBFuK2_s8E60kZRMzDdyLi=-mvcm_ph...@mail.gmail.com
--
Peter Geoghegan
tree.
There is no consensus on exactly what the "b" actually stands for, but
it's definitely not "binary". I suppose that the original author meant
that a B-Tree is a generalization of a binary tree, which is basically
true -- though that's a very academic point.
--
Peter Geoghegan
or
this alignment padding? These two indexes are typically the largest
system catalog indexes by far, so the opportunity cost matters. I
don't think that the direct cost (more cycles) is worth worrying
about, though. Nobody has added a pg_depend column since it was first
introduced back in 2002.
--
Peter Geoghegan
can be only one "owning object".
+1. This is certainly a necessary requirement. Absurd error messages
are not okay.
--
Peter Geoghegan
es it seem necessary to fix the bug in the backbranches? I agree
that it's broken, but it's not obvious to me that it'll cause serious
problems in the real world that outweigh the potential downsides.
Perhaps I've missed some obvious downside.
--
Peter Geoghegan
I thought that your ALTER OBJECT DEPENDS ON EXTENSION example made the
case for fixing that directly inarguable. I'm slightly surprised that
you're not fully convinced of this already. Have I missed some
subtlety?
Thanks
--
Peter Geoghegan
debugging work. A "REINDEX index
bug_title_idx" makes amcheck happy, since the index tuple that points
to heap tuple '(579,4)' ends up being compressed in exactly the same
way as it is in the heap. The initial "INSERT ... SELECT" clearly
makes the executor produce compressed values for heap_insert(), though
not for btinsert() in this one instance. I've been able to confirm
this from gdb.
--
Peter Geoghegan
On Mon, Jan 14, 2019 at 1:46 PM Peter Geoghegan wrote:
> I would have said that the assumption is that a fixed source tuple
> will generate identical index entries. The problem with that is that
> my idea of what constitutes a fixed input now seems to have been
> faulty. I
On Mon, Jan 14, 2019 at 1:31 PM Tom Lane wrote:
> Peter Geoghegan writes:
> > The heapallindexed enhancement that made it into Postgres 11 assumes
> > that the representation of index tuples produced by index_form_tuple()
> > (or all relevant index_form_tuple() call
the first edition, so it isn't a hard purchase for me to
justify.
--
Peter Geoghegan
/1787/
--
Peter Geoghegan
would be harder than it now appears.
Thanks
--
Peter Geoghegan
On Fri, Jan 18, 2019 at 4:06 PM Peter Geoghegan wrote:
> > * Objects-to-drop output from DROP ROLE is still unstable. I suppose
> > this would be fixed by also doing sorting in that code path, but
> > I've not looked into it.
>
> The nbtree patch is not dependent on
On Mon, Jan 14, 2019 at 2:37 PM Peter Geoghegan wrote:
> The fix here must be to normalize index tuples that are compressed
> within amcheck, both during initial fingerprinting, and during
> subsequent probes of the Bloom filter in bt_tuple_present_callback().
I happened to talk to And
On Thu, Jan 17, 2019 at 5:09 PM Peter Geoghegan wrote:
> In the kludgey patch that I posted, the 4-byte value is manufactured
> artificially within a backend in descending order. That may have a
> slight advantage over object oid, even after the pg_depend correctness
> issues a
ct oid a perfectly
stable tie-breaker in practice?) on a similar thread from 2017.
--
Peter Geoghegan
esn't matter.
It might be better to mimic B-Tree and hash on the master branch.
--
Peter Geoghegan
patch needs opinion of an another reviewer.
Was I unclear on why the patch fixes the problem? Sorry, but I thought
it was obvious.
--
Peter Geoghegan
e
+DETAIL: extension earthdistance depends on function cube_out(cube)
This is a further example of "wrong, not just annoying". Technically
this is a broader problem than DEPENDENCY_INTERNAL_AUTO, I think,
though perhaps not too much broader.
--
Peter Geoghegan
suppressing semi-regularly. In practice, adding a
trailing attribute to each of the two pg_depend indexes almost
entirely eliminates the need to play whack-a-mole. That has to be an
advantage for the kind of approach that I've suggested.
--
Peter Geoghegan
t just be
superuser-only?
--
Peter Geoghegan
rd, but it's something to consider. It seems
generally useful to be able to force deterministic-ish behavior in a
single session. I don't expect that the test case is even a bit
portable, but the technique was quite effective.
--
Peter Geoghegan
rd way to think about this kind of debugging, and it would be
unreasonable to insist on providing any kind of standard framework.
It's a low-level tool, useful only to backend hackers.
--
Peter Geoghegan
efine'ing random() myself.
We're already making fairly broad assumptions about our having control
of the backend's PRNG state within InitProcessGlobals(). How should
this affect the new drandom()/setseed() private state, if at all?
--
Peter Geoghegan
s that totally depend on truncation for reasonable
performance, but even that doesn't necessarily imply that it consumes
too many cycles. I'm okay with imposing costs on a minority workload
provided the benefit is there, and the penalty isn't that noticeable
in realistic scenarios, to real users.
--
Peter Geoghegan
t a great design for
microvacuuming. It would be better if it happened on the primary,
where we generally have a lot more control/context. Of course, the
devil is in the details.
> Talking with Peter Geoghegan we were pondering a few ideas to address
> that:
> - we could stop doing page le
On Tue, Dec 18, 2018 at 1:20 PM Alvaro Herrera wrote:
>
> On 2018-Dec-18, Peter Geoghegan wrote:
> Hmm, interesting. I wonder if this is just a case of never testing this
> code under "postgres --ignore-system-indexes".
I suppose that you could say that. The regression t
tput when scanning pg_depend
indexes occur reliably. For better or for worse.
--
Peter Geoghegan
of. (BTW, dtrace probes can already
give the user much the same information -- I think that more people
should use those, since tracing technology on Linux has improved
drastically in the last few years.)
--
Peter Geoghegan
On Tue, Dec 18, 2018 at 2:26 PM Tom Lane wrote:
> Hm, that definitely leads me to feel that we've got bug(s) in
> dependency.c. I'll take a look sometime soon.
Thanks!
I'm greatly relieved that I probably won't have to become an expert on
dependency.c after all. :-)
--
Peter Geoghegan
used PG_INT32_MAX.
--
Peter Geoghegan
bt_index_check(), the
SQL-callable verification function that already only needs an
AccessShareLock.
[1]
https://postgr.es/m/CAH2-Wz=apbKyaFhEfRN3UK_yXZ8DSE4Ybr0A3D87=4jwyy1...@mail.gmail.com
[2] https://wiki.postgresql.org/wiki/Sample_Databases
[3] https://archive.org/stream/symmetricconcurr00la
ture" thinking seems particularly valuable when working on the
B-Tree code; I don't want anybody to miss a possible future
opportunity.
[1]
https://postgr.es/m/cah2-wzn_nayk4pr0hrwo0stwhmxjp5qyu+x8vppt030xpqr...@mail.gmail.com
--
Peter Geoghegan
On Mon, Dec 3, 2018 at 7:10 PM Peter Geoghegan wrote:
> Attached is v9, which does things that way. There are no interesting
> changes, though I have set things up so that a later patch in the
> series can add "dynamic prefix truncation" -- I do not include any
> such pa
about here isn't really "EXPLAIN, but for DDL" so
much as a way of making EXPLAIN predict relation lock acquisitions,
including for DDL statements. I'm fine with calling that EXPLAIN, but
it's still not quite EXPLAIN as we know it today.
--
Peter Geoghegan
ral solution of the messages ordering problem.
I have no idea what you mean here. I'm proposing a patch that stops
it being a game of chance, while preserving the existing
slightly-random behavior to the greatest extent possible. I think that
my patch would have stopped that problem altogether. Are you
suggesting that it wouldn't have?
--
Peter Geoghegan
tability issue keeps coming up, which makes a comprehensive
approach seem attractive to me. At least 95% of the test instability
comes from pg_depend.
--
Peter Geoghegan
this feature. We're all very busy, but I
don't want to see it die.
--
Peter Geoghegan
age() doesn't do
this for you.
* Should we be concerned about the memory used by pushStackIfSplited()?
* How about a cross-check between IndexTupleSize() and
ItemIdGetLength(), like the B-Tree code? It's a bit unfortunate that
we have this redundancy, which wastes space, but we do, so we might as
well get some small benefit from it.
--
Peter Geoghegan
ave in
mind in detail before commenting.
Thanks
--
Peter Geoghegan
k that reverting commit
218f51584d5 is the only option on the table for the back branches. Its
design is based on ideas on locking protocols that are fundamentally
incorrect and unworkable. I don't have a lot of faith in our ability
to retrofit a design that fixes the issue without causing problems
elsewhere.
--
Peter Geoghegan
t;1/3 of a page means 2704 bytes" in the docs,
since the definition was always a bit fuzzy. There will need to be a
compatibility note in the release notes, though.
--
Peter Geoghegan
y pointing out that Pavan is in an
unenviable position with this patch, through no fault of his own, and
despite a worthy effort.
I hope that he sticks it out, because that seems to be what it takes
to see something like this through. I don't know how to do better at
that.
--
Peter Geoghegan
ce it in the
wider landscape of ideas that are like this. Placing it in that wider
landscape, and figuring out next steps at a high level seem to be the
problem right now.
--
Peter Geoghegan
ulk loading far exceeds what is
truly required. BenchmarkSQL also makes it easy to generate useful
html reports, complete with graphs.
--
Peter Geoghegan
of nbtinsert.c. Perhaps it
> would be a good idea to move it to a whole new nbtsplitloc.c file? It's
> a very isolated piece of code.
Good idea. I'll give that a go.
> In the comment on _bt_leave_natts_fast():
> That's an interesting tidbit, but I'd suggest just removing this comment
> altogether. It's not really helping to understand the current
> implementation.
Will do.
> v9-0005-Add-high-key-continuescan-optimization.patch commit message:
>
> > Note that even pre-pg_upgrade'd v3 indexes make use of this
> > optimization.
>
> .. but we're missing the other optimizations that make it more
> effective, so it probably won't do much for v3 indexes. Does it make
> them slower? It's probably acceptable, even if there's a tiny
> regression, but I'm curious.
But v3 indexes get the same _bt_findsplitloc() treatment as v4 indexes
-- the new-item-split stuff works almost as well for v3 indexes, and
the other _bt_findsplitloc() stuff doesn't seem to make much
difference. I'm not sure if that's the right thing to do (probably
doesn't matter very much). Now, to answer your question about v3
indexes + the continuescan optimization: I think that it probably will
help a bit, with or without the _bt_findsplitloc() changes. Much
harder to be sure whether it's worth it on balance, since that's
workload dependent. My sense is that it's a much smaller benefit much
of the time, but the cost is still pretty low. So why not just make it
version-generic, and keep things relatively uncluttered?
Once again, I greatly appreciate your excellent review!
--
Peter Geoghegan
e status quo, or a platitude about inclusivity.
--
Peter Geoghegan
On Fri, Sep 14, 2018 at 3:32 PM, Tom Lane wrote:
> I'd go with #2, personally. It does seem that the costing needs work,
> but it's not clear to me that we know what to change, so it's kinda
> late to propose #3 for v11.
+1. I also favor option #2.
--
Peter Geoghegan
that particular naming.
FWIW, I don't think that your IEEE analogy quite works, because you're
talking about a property of a datatype. A collation is not intrinsic
to any collatable datatype. Fortunately, we're not required to agree
on what feels natural.
--
Peter Geoghegan
. That's
going to be the vast, vast majority of cases we care about.
--
Peter Geoghegan
to make it actually painful you need a
> workload where the implied randomness of accesses isn't already a major
> bottleneck - and that's hard.
There is also the fact that in many cases you'll just have accessed
the same buffers from within _bt_check_unique() anyway.
--
Peter Geoghegan
line to make gist logically
> broken. This triggers regression test of amcheck.
Any thoughts on this, Heikki?
It would be nice to be able to squeeze this into Postgres 12,
especially given that GiST has been enhanced for 12 already.
--
Peter Geoghegan
haps that needs to be
taken into account, without being indulged.
--
Peter Geoghegan
On Thu, Mar 21, 2019 at 10:28 AM Peter Geoghegan wrote:
> I've committed the first 4 patches. Many thanks to Heikki for his very
> valuable help! Thanks also to the other reviewers.
>
> I'll likely push the remaining two patches on Sunday or Monday.
I noticed that if I initidb a
ity, it depends entirely
on the situation at hand.
--
Peter Geoghegan
with same problem twice:
> v3 of the patch used AccessShareLock and many locks with incorrect order.
> We could use one of possible solutions: either use ShareLock, or rewrite scan
> to correct locking order.
> But patches v4-v7 use both.
It definitely has to be one or the other. The combination makes no sense.
--
Peter Geoghegan
he
syntax diagrams in psql if we go this way, though. Not sure what to do
about that.
--
Peter Geoghegan
401 - 500 of 3113 matches
Mail list logo