Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-09-06 Thread Peter Geoghegan
e -- it couldn't possibly be worth much of any risk. I can see the appeal of consistency, but I also see the appeal of sticking to how things work there: continually and explicitly inserting into and compacting the heap seems like a good enough way of framing what a top-n heap does,
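
For readers unfamiliar with the term, the "insert, then compact" behaviour of a top-n heap mentioned above can be sketched roughly as follows. This is an illustrative Python sketch using the standard heapq module, not code from tuplesort.c; the function name top_n_smallest is hypothetical, and the negation trick assumes numeric values.

```python
import heapq

def top_n_smallest(values, n):
    """Return the n smallest values in ascending order using a bounded heap."""
    if n <= 0:
        return []
    # heapq is a min-heap, so values are stored negated to get max-heap behaviour:
    # the root is always the largest of the n smallest values seen so far.
    heap = []
    for v in values:
        if len(heap) < n:
            heapq.heappush(heap, -v)       # still filling the heap: plain insert
        elif v < -heap[0]:
            heapq.heapreplace(heap, -v)    # evict the current largest; size stays at n
    return sorted(-x for x in heap)
```

The heap never holds more than n entries, so memory use is bounded regardless of how many input values are scanned.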

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-09-06 Thread Peter Geoghegan
ng, LGTM. Thanks. I suggest spending at least as much time on unsympathetic cases (e.g., only 2 or 3 tapes must be merged). At the same time, I suggest focusing on a type that has relatively expensive comparisons, such as collated text, to make differences clearer. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-09-06 Thread Peter Geoghegan
On Tue, Sep 6, 2016 at 2:46 PM, Peter Geoghegan wrote: > Feel free to make a counter-proposal for a cap. I'm not attached to > 500. I'm mostly worried about blatant waste with very large workMem > sizings. Tens of thousands of tapes is just crazy. The amount of data > th

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-09-06 Thread Peter Geoghegan
thing that I've put so much work into. Robert has been involved with 100% of all sorting patches I've written, generally with far less input from anyone else, and at this point, that's really rather a lot of complex patches. > Let's begin with patch 1: > > On 08/02/2016 01

Re: [HACKERS] Bug in 9.6 tuplesort batch memory growth logic

2016-09-06 Thread Peter Geoghegan
On Tue, Sep 6, 2016 at 12:51 PM, Tom Lane wrote: > I rewrote the comment and pushed it. Thank you. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-09-06 Thread Peter Geoghegan
On Tue, Sep 6, 2016 at 12:39 PM, Peter Geoghegan wrote: > On Tue, Sep 6, 2016 at 12:08 AM, Heikki Linnakangas wrote: >>> I attach a patch that changes how we maintain the heap invariant >>> during tuplesort merging. > >> Nice! > > Thanks! BTW, the way that

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-09-06 Thread Peter Geoghegan
-way merge heap gets one element smaller). I can write a patch to do this renaming, if you're interested. Someone should fix it, because independent of all this, it's just wrong. [1] https://www.postgresql.org/message-id/CAM3SWZQKM=Pzc=cahzrixkjp2eo5q0jg1sofqqexfq647ji...@mail.gmail.com

Re: [HACKERS] Tuplesort merge pre-reading

2016-09-06 Thread Peter Geoghegan
On Tue, Sep 6, 2016 at 12:08 PM, Peter Geoghegan wrote: > Offhand, I would think that taken together this is very important. I'd > certainly want to see cases in the hundreds of megabytes or gigabytes > of work_mem alongside your 4MB case, even just to be able to talk > informal

Re: [HACKERS] Tuplesort merge pre-reading

2016-09-06 Thread Peter Geoghegan
r this is very important. I'd certainly want to see cases in the hundreds of megabytes or gigabytes of work_mem alongside your 4MB case, even just to be able to talk informally about this. As you know, the default work_mem value is very conservative. -- Peter Geoghegan

Re: [HACKERS] Bug in 9.6 tuplesort batch memory growth logic

2016-09-06 Thread Peter Geoghegan
ed. I also knew any choice of constant would be criticized (e.g., "that's voodoo"), so pointed out specifically that it was non-critical. What threshold would you use? -- Peter Geoghegan

[HACKERS] Bug in 9.6 tuplesort batch memory growth logic

2016-09-05 Thread Peter Geoghegan
6, but that seems unnecessary. I can reproduce it with my parallel CREATE INDEX patch applied, with just the right test case and right number of workers (it's rather delicate). After careful consideration, I can think of no reason why 9.6 would be unaffected. -- Peter Geoghegan From 850af5f9f7b1b

Re: [HACKERS] System load consideration before spawning parallel workers

2016-09-02 Thread Peter Geoghegan
, priority, and possibly other considerations. I see the 9.6 work on external sort as a building piece for that, as it removed the one thing that was sensitive to work_mem in a surprising, unpredictable way. -- Peter Geoghegan

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2016-09-02 Thread Peter Geoghegan
On Fri, Sep 2, 2016 at 11:16 AM, Peter Geoghegan wrote: > There are only tiny differences, which in any case you can see in the > commit log on Github. There is no reason why code review needs to > block on this V3, IMV. Also, I don't think that we need to include V2's test

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2016-09-02 Thread Peter Geoghegan
g at.) There are only tiny differences, which in any case you can see in the commit log on Github. There is no reason why code review needs to block on this V3, IMV. -- Peter Geoghegan

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2016-09-02 Thread Peter Geoghegan
On Thu, Aug 18, 2016 at 5:14 PM, Peter Geoghegan wrote: > I'd certainly welcome that. There are Debian packages available from > the Github version of amcheck, which is otherwise practically > identical to the most recent version of the patch posted here: > > https://githu

Re: [HACKERS] less expensive pg_buffercache on big shmem

2016-09-01 Thread Peter Geoghegan
d on the off chance that it mattered, without actually being justified. I would like to be able to run pg_buffercache in production from time to time. -- Peter Geoghegan

Re: [HACKERS] WAL consistency check facility

2016-09-01 Thread Peter Geoghegan
o use it on all production databases. It wouldn't have mattered that the verification was no less effective, since the bugs it found would simply never have been observed in practice. -- Peter Geoghegan

Re: [HACKERS] WAL consistency check facility

2016-08-31 Thread Peter Geoghegan
rmation about a problem. And, ideally, we'd also have some indication of how big a difference that would make, in terms of measurable performance impact. -- Peter Geoghegan

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-31 Thread Peter Geoghegan
On Wed, Aug 31, 2016 at 3:08 PM, Andres Freund wrote: > On August 31, 2016 3:06:23 PM PDT, Peter Geoghegan wrote: > >>In other painfully pedantic news, I should point out that >>sizeof(size_t) isn't necessarily word size (the most generic >>definition of word size fo

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-31 Thread Peter Geoghegan
), contrary to my reading of the 0002-* patch comments. I'm mostly thinking about x86_64 here, of course. -- Peter Geoghegan

Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP

2016-08-31 Thread Peter Geoghegan
On Sun, Nov 22, 2015 at 7:29 PM, Andreas Karlsson wrote: > Sorry for dropping this patch, but now I have started looking at it again. Any chance of picking this up again soon, Andreas? I think it's an important project. I would like to review it. -- Peter Geoghegan

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-31 Thread Peter Geoghegan
On Wed, Aug 31, 2016 at 2:09 PM, Peter Geoghegan wrote: > The only thing that stuck out to any degree is that we don't grow the > "reln->md_seg_fds[forknum]" array within the new _fdvec_resize() > function geometrically. That new function looks like this: >

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-31 Thread Peter Geoghegan
ly doesn't matter, given what that array tracks. I'm just pointing out that that aspect did give me pause. The struct MdfdVec is small enough that that *might* be worthwhile. -- Peter Geoghegan
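
The geometric-growth point raised in this thread can be illustrated by counting reallocations. The sketch below is a Python model only; it is not the _fdvec_resize() code, and the function name count_resizes is hypothetical. It compares growing an array to exactly the needed size against doubling its capacity.

```python
def count_resizes(final_len, geometric):
    """Count reallocations while an array grows one element at a time to final_len."""
    capacity = 0
    resizes = 0
    for needed in range(1, final_len + 1):
        if needed > capacity:
            # geometric: double the capacity; exact: grow to just what is needed
            capacity = max(needed, capacity * 2) if geometric else needed
            resizes += 1
    return resizes
```

Under this model, growing to 1000 elements takes 1000 reallocations with exact sizing and 11 with doubling. Whether that matters depends on how large the array realistically gets, which is the caveat the message itself makes.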

Re: [HACKERS] ICU integration

2016-08-31 Thread Peter Geoghegan
iguration > option perhaps in initdb to change the default so that, say, "fr_FR" > uses ICU and "fr_FR%posix" uses the old stuff. I suspect that we'd be better off adding a mechanism for adding a new collation after initdb runs, on a live production instance. Maybe t

Re: [HACKERS] ICU integration

2016-08-31 Thread Peter Geoghegan
dual OS packagers of ICU deciding what exact CLDR data to use, which may or may not be of any significant consequence in practice. [1] http://unicode.org/reports/tr10 [2] http://site.icu-project.org/design/size/collation [3] http://userguide.icu-project.org/icudata -- Peter Geoghegan

Re: [HACKERS] WAL consistency check facility

2016-08-27 Thread Peter Geoghegan
ery or on standby after WAL replay. Right you are -- while BTP_INCOMPLETE_SPLIT is set during recovery, BTP_SPLIT_END is not. Still, most of the btpo_flags flags that are masked in the patch shouldn't be. -- Peter Geoghegan

Re: [HACKERS] WAL consistency check facility

2016-08-27 Thread Peter Geoghegan
st, because the other flags have clear-cut roles in various atomic operations that we WAL-log. -- Peter Geoghegan

Re: [HACKERS] WAL consistency check facility

2016-08-27 Thread Peter Geoghegan
s special case for speculative insertion (to WAL-log the virtually useless speculative insertion token value)? I'm certain that the answer must be "no": This tool ought to deal with speculative insertion as a special case, and not vice-versa. -- Peter Geoghegan

Re: [HACKERS] increasing the default WAL segment size

2016-08-25 Thread Peter Geoghegan
k, because WAL segments consist of zeroes at the end when archive_timeout is applied (at least from 9.4 on). We compress the WAL segments, and many zeroes compress very well. I admit that I haven't looked at it in much detail, but that is my current understanding. -- Peter Geoghegan
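
The claim that zero-filled segment tails compress very well is easy to check with a rough experiment. The Python sketch below assumes the default 16MB segment size and uses zlib purely as a stand-in compressor; the function name compressed_size and the choice of random bytes to model the used portion are illustrative assumptions, not anything taken from the thread.

```python
import os
import zlib

SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default WAL segment size (16MB)

def compressed_size(used_bytes):
    """Compress a segment whose first used_bytes are random and the rest zeroes."""
    payload = os.urandom(used_bytes)                # stand-in for real WAL records
    padding = bytes(SEGMENT_SIZE - used_bytes)      # zero-filled tail of the segment
    return len(zlib.compress(payload + padding))
```

With only a small used portion, the compressed result is dominated by that portion; the zero tail adds comparatively little.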

Re: [HACKERS] UPSERT strange behavior

2016-08-25 Thread Peter Geoghegan
On Thu, Aug 25, 2016 at 12:59 PM, Peter Geoghegan wrote: > Maybe we should change the ordering of those IndexInfo structs to > something more suitable, but it must be immutable (it cannot hinge > upon the details of one particular DML statement). I meant that it must be stable (not

Re: [HACKERS] UPSERT strange behavior

2016-08-25 Thread Peter Geoghegan
lly also wrote a patch to prefer insertion into the primary key first, which also went nowhere (I gave up on that one, to be fair). -- Peter Geoghegan

Re: [HACKERS] UPSERT strange behavior

2016-08-25 Thread Peter Geoghegan
ct is initially found, and so no guarantees here.) Anyway, I don't have a lot of sympathy for this point of view, because the scenario is completely contrived. You have to draw the line somewhere. -- Peter Geoghegan

Re: [HACKERS] UPSERT strange behavior

2016-08-25 Thread Peter Geoghegan
't upsert while using more than one index as an arbiter index. This is true unless they're more or less equivalent, in which case multiple arbiter indexes can be inferred, but that clearly doesn't apply here. -- Peter Geoghegan

Re: [HACKERS] [RFC] Change the default of update_process_title to off

2016-08-23 Thread Peter Geoghegan
concern here: https://www.postgresql.org/message-id/30619.1428157...@sss.pgh.pa.us ISTM that we don't even care about Windows performance to a minimal degree. Hopefully, the ICU stuff Peter Eisentraut is working on will level the playing field here a little bit, if only as an accidental side

Re: [HACKERS] Forbid use of LF and CR characters in database and role names

2016-08-22 Thread Peter Geoghegan
is an interesting discussion of the matter here: http://www.unicode.org/reports/tr36/#Bidirectional_Text_Spoofing -- Peter Geoghegan

Re: [HACKERS] Forbid use of LF and CR characters in database and role names

2016-08-22 Thread Peter Geoghegan
I haven't looked at the patch, but offhand I wonder if it's worth considering control characters added by unicode, if you haven't already. -- Peter Geoghegan

[HACKERS] Re: [BUGS] Re: Missing rows with index scan when collation is not "C" (PostgreSQL 9.5)

2016-08-22 Thread Peter Geoghegan
thousands of servers. For some reason, both cases involved strings with code points from the Arabic alphabet, even though each case was from a totally unrelated customer database. I'll go update the Wiki page for this [1] now. [1] https://wiki.postgresql.org/wiki/Abbreviated_keys_glibc_issue

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-08-22 Thread Peter Geoghegan
On Mon, Aug 1, 2016 at 3:18 PM, Peter Geoghegan wrote: > Attached WIP patch series: This has bitrot, since commit da1c9163 changed the interface for checking parallel safety. I'll have to fix that, and will probably take the opportunity to change how workers have maintenance_work_mem app

Re: [HACKERS] Bug in abbreviated keys abort handling (found with amcheck)

2016-08-22 Thread Peter Geoghegan
yet, even though any instances of corruption of text indexes I've seen originated before the point release in which strxfrm() became distrusted. I guess that not that many Heroku users use the "C" locale, which would still be affected with the latest point release. -- Peter Geogheg

[HACKERS] Bug in abbreviated keys abort handling (found with amcheck)

2016-08-19 Thread Peter Geoghegan
fraction of all databases tested, so I don't think it's very common in the wild. I'd be surprised if amcheck does not bring more bugs like this to my attention before too long. We should work on improving it, so that we have greater visibility into problems that occur in the f

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2016-08-18 Thread Peter Geoghegan
's much more likely that they'd be "false positive" bugs. In any case, I haven't seen any issue with the tool itself yet, having now run the tool on thousands of servers. I think I'll have a lot more information in about a week, when I've had time to work through more data

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-18 Thread Peter Geoghegan
On Thu, Aug 18, 2016 at 5:42 PM, Andres Freund wrote: > How large was the index & table in question? I mean this really only > comes into effect at 100+ segments. Not that big, but I see no reason to take the chance, I suppose. -- Peter Geoghegan

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-18 Thread Peter Geoghegan
t that would be a waste of time. -- Peter Geoghegan

Re: [HACKERS] _mdfd_getseg can be expensive

2016-08-18 Thread Peter Geoghegan
On Thu, Aug 18, 2016 at 5:26 PM, Andres Freund wrote: > Rebased version attached. A review would be welcome. Plan to push this > forward otherwise in the not too far away future. I can review this next week. -- Peter Geoghegan

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2016-08-18 Thread Peter Geoghegan
y, I would like to make amcheck verify the heap through a B-Tree index as a next step. There is also a good case for the tool directly verifying heap relations, without involving any index, but that can come later. -- Peter Geoghegan

Re: [HACKERS] [WIP] [B-Tree] Keep indexes sorted by heap physical location

2016-08-18 Thread Peter Geoghegan
s necessary to be very ambitious in order to solve a problem. The understandable and usually well-reasoned approach of making progress as incremental as possible occasionally works against contributors. It's worth considering if this is such a case. -- Peter Geoghegan

Re: [HACKERS] [WIP] [B-Tree] Keep indexes sorted by heap physical location

2016-08-18 Thread Peter Geoghegan
address this problem is with a duplicate list and/or prefix compression in leaf pages. -- Peter Geoghegan

Re: [HACKERS] amcheck (B-Tree integrity checking tool)

2016-08-18 Thread Peter Geoghegan
On Sat, Mar 12, 2016 at 12:38 PM, Peter Geoghegan wrote: > Only insofar as it helps diagnose the underlying issue, when it is a > more subtle issue. Actually fixing the index is almost certainly a > REINDEX. Once you're into the messy business of diagnosing a > problematic opclas

Re: [HACKERS] CLUSTER, reform_and_rewrite_tuple(), and parallelism

2016-08-17 Thread Peter Geoghegan
tuff in the frame pointer at that point. You either need to use > dwarf or lbr to get accurate ones. Is it worth doing that here, and redoing the test, so that the glibc attributions are correct? -- Peter Geoghegan

Re: [HACKERS] CLUSTER, reform_and_rewrite_tuple(), and parallelism

2016-08-17 Thread Peter Geoghegan
ated that the tuplesort CLUSTER takes just under 3 minutes (this includes writing out the new heap, of course). -- Peter Geoghegan cycles-cluster-presorted-flamegraph.svg.gz Description: GNU Zip compressed data

Re: [HACKERS] CLUSTER, reform_and_rewrite_tuple(), and parallelism

2016-08-17 Thread Peter Geoghegan
tuples, the hash lookups in > rewrite_heap_tuple(), ...? Perhaps the attached svg "flamegraph" will give you some idea. This is based on the "perf cache-misses" event type. -- Peter Geoghegan cache-misses-cluster-presorted-flamegraph.svg.gz Description: GNU Zip compressed da

[HACKERS] CLUSTER, reform_and_rewrite_tuple(), and parallelism

2016-08-17 Thread Peter Geoghegan
aightforward to have reform_and_rewrite_tuple() work occur in parallel workers instead, which buys us a lot. -- Peter Geoghegan

Re: [HACKERS] WIP: Barriers

2016-08-16 Thread Peter Geoghegan
The ordering dependencies happen to be quite naturally across one leader process and one or more worker processes. I do see value in this, though. -- Peter Geoghegan

Re: [HACKERS] WIP: Barriers

2016-08-16 Thread Peter Geoghegan
workers that arrive after batch 0 is complete. Is that really so bad? In general, I don't tend to think of workers as the cost to worry about. Rather, we should be concerned about the active use of CPU cores as our major cost. -- Peter Geoghegan

Re: [HACKERS] WIP: Barriers

2016-08-16 Thread Peter Geoghegan
the old hash table goes away. Of course, there are > some tricky issues with reading tapes that were originally created by > other backends, but if I understand correctly, Peter Geoghegan has > already done some work on that problem, and it seems like something we > can eventually solve,

Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Peter Geoghegan
This makes a C++ port seem less compelling to me than the idea first appears. Note, for example, that ICU is implemented in C++, but still has C stub functions, not necessarily for the exclusive benefit of C client code. -- Peter Geoghegan

Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Peter Geoghegan
On Tue, Aug 16, 2016 at 1:29 PM, Peter Geoghegan wrote: > IMV, it would be useful to use C++ classes (and even template classes) > for a small number of data structures, while still largely adhering to > earlier practices (this is what GCC did). Specifically, a few modules > such a

Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Peter Geoghegan
ngInfo, could be made to follow the RAII/scope bound resource management usefully, which doesn't seem incompatible with memory contexts. However, this doesn't seem terribly exciting to me. [1] https://lwn.net/Articles/542457/ -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-08-15 Thread Peter Geoghegan
On Wed, Aug 3, 2016 at 2:13 PM, Peter Geoghegan wrote: > Since merging is a big bottleneck with this, we should probably also > work to address that indirectly. I attach a patch that changes how we maintain the heap invariant during tuplesort merging. I already mentioned this over

[HACKERS] Is tuplesort_heap_siftup() a misnomer?

2016-08-12 Thread Peter Geoghegan
ween these two functions add clarity. tuplesort_heap_siftup() comments will also need to be updated, since "sift up" is mentioned there. Should I write a patch? [1] https://en.wikipedia.org/wiki/Heap_(data_structure) -- Peter Geoghegan
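
For reference, the conventional meaning of the two terms, as used in the Wikipedia article cited above, can be sketched like this. The code is a generic Python min-heap written for illustration; it is not the tuplesort.c implementation, and all function names here are hypothetical.

```python
def sift_up(heap, i):
    """Move heap[i] toward the root until its parent is no larger (min-heap)."""
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] <= heap[i]:
            break
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent

def sift_down(heap, i):
    """Move heap[i] toward the leaves until both children are no smaller."""
    n = len(heap)
    while True:
        child = 2 * i + 1
        if child >= n:
            break
        if child + 1 < n and heap[child + 1] < heap[child]:
            child += 1                      # pick the smaller of the two children
        if heap[i] <= heap[child]:
            break
        heap[i], heap[child] = heap[child], heap[i]
        i = child

def insert(heap, value):
    """Insertion appends at the end and sifts the new element up."""
    heap.append(value)
    sift_up(heap, len(heap) - 1)

def delete_min(heap):
    """Root removal moves the last element to the root and sifts it down."""
    top = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        sift_down(heap, 0)
    return top
```

Under this terminology, insertion is the operation that sifts up, while removing the root is a sift-down.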

Re: [HACKERS] condition variables

2016-08-11 Thread Peter Geoghegan
h any effort to consolidate the number of spinlock acquisitions? In other words, maybe the most common idioms should be baked into the ConditionVariable interface, which could save callers from having to use their own mutex variable. -- Peter Geoghegan

Re: Improved ICU patch - WAS: [HACKERS] Implementing full UTF-8 support (aka supporting 0x00)

2016-08-11 Thread Peter Geoghegan
it works well with this. You really need to create a scenario with a real sort, and all the conditions I describe. -- Peter Geoghegan

Re: Improved ICU patch - WAS: [HACKERS] Implementing full UTF-8 support (aka supporting 0x00)

2016-08-10 Thread Peter Geoghegan
I would think, which seems like a bug that is not dodged by simply not defining TRUST_STRXFRM. Doesn't its assumption about matching the ordering used elsewhere fail to hold on FreeBSD builds? [1] https://wiki.postgresql.org/wiki/Abbreviated_keys_glibc_issue -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort, partitioning, merging, and the future

2016-08-10 Thread Peter Geoghegan
e accepted. It's not practical to go to the trouble of preventing it entirely. So, the comparison with quicksort works on a couple of levels. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort, partitioning, merging, and the future

2016-08-10 Thread Peter Geoghegan
er, you cannot perform the final merge on-the-fly; you must produce a serialized tape as output, which is used subsequently to support random seeks. There is no penalty when you manage to do the sort in memory, though (not that that has anything to do with parallel sort). -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort, partitioning, merging, and the future

2016-08-10 Thread Peter Geoghegan
initially input. I need to do some more research before posting a patch, but right now I can see that it makes merging presorted numeric values more than 2x faster. And that's with 8 tapes, on my very I/O bound laptop. I bet that the benefits would also be large for text (temporal loc

[HACKERS] Parallel tuplesort, partitioning, merging, and the future

2016-08-08 Thread Peter Geoghegan
postgresql.org/message-id/CA%2BTgmoY5JYs4R1g_ZJ-P6SkULSb19xx4zUh7S8LJiXonCgVTuQ%40mail.gmail.com [3] http://pages.cs.wisc.edu/~dewitt/includes/paralleldb/parsort.pdf -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-08-06 Thread Peter Geoghegan
runs. All of these factors are why I believe I'm able to compete well with other systems with this relatively straightforward, evolutionary approach. I have a completely open mind about partitioning, but my approach makes sense in this context. -- Peter Geoghegan

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-08-05 Thread Peter Geoghegan
other things. Sadly, I >>> don't have time to look at it right now. >> >> I would be happy to look at generalizing that further, to help >> parallel hash join. As you know, Thomas Munro and I have discussed >> this privately. > > Right. By the way, the patch is in better shape from that perspective, as compared to the early version Thomas (CC'd) had access to. The BufFile stuff is now credible as a general-purpose abstraction. -- Peter Geoghegan

Re: [HACKERS] Possible duplicate release of buffer lock.

2016-08-04 Thread Peter Geoghegan
ow this, and write code defensively because of this). The bug is that there is any ERROR for VACUUM that isn't absolutely unavoidable (the damage has been done anyway -- the index is already corrupt). I'll need to think about this some more, when I have more time. Perhaps tomorrow. -- Pet

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-08-03 Thread Peter Geoghegan
ndirectly. > The work on making the logtape infrastructure parallel-aware seems > very interesting and potentially useful for other things. Sadly, I > don't have time to look at it right now. I would be happy to look at generalizing that further, to help parallel hash join. As you know, Thomas Munro and I have discussed this privately. -- Peter Geoghegan

Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2016-08-02 Thread Peter Geoghegan
-recommend-it things. +1. This also has value in the context of automatically surfacing situations where "can't happen" errors do in fact happen at scale. -- Peter Geoghegan

[HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2016-08-01 Thread Peter Geoghegan
the utility statement equivalent of max_parallel_workers_per_gather. This is clearly necessary, since we're using up to maintenance_work_mem per worker, which is of course typically much higher than work_mem. I didn't feel the need to create a new maintenance-wise variant GUC for th

Re: [HACKERS] Hash indexes and effective_cache_size in CREATE INDEX documentation

2016-07-31 Thread Peter Geoghegan
d given our awful track record about maintaining this > documentation, I'm not sure that going into more detail is really > a good idea. Comments? +1 -- Peter Geoghegan

[HACKERS] Hash indexes and effective_cache_size in CREATE INDEX documentation

2016-07-30 Thread Peter Geoghegan
now not so sure that that's actually the case. -- Peter Geoghegan

Re: [HACKERS] fixes for the Danish locale

2016-07-21 Thread Peter Geoghegan
8, for example. There were other locales that were affected less severely, and I think the majority were not shown to be affected at all. That being said, it probably wouldn't have caught that particular issue if we had broad coverage. It probably would catch a broken test, though. -- Pete

Re: [HACKERS] fixes for the Danish locale

2016-07-21 Thread Peter Geoghegan
> on a buildfarm animal should work" anyway. I'm much more interested in > supporting locales that someone cares enough about to configure a > buildfarm animal for. That seems like a high standard to me. Locale rules are known to change, and are explicitly versioned by glibc, for e

Re: [HACKERS] fixes for the Danish locale

2016-07-21 Thread Peter Geoghegan
for. > > Nah, we have a hard enough time with reproducibility of buildfarm results > without deliberately injecting transient failures. It could be pseudo-random, and so deterministic per buildfarm animal. That's what I did. -- Peter Geoghegan
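
One way to get a choice that is "pseudo-random, and so deterministic per buildfarm animal" is to seed a generator from the animal's name. The Python sketch below is only an illustration of that idea; the function name pick_locale, the use of SHA-256, and the sample names are all assumptions, and nothing here is claimed to match what the amcheck regression tests actually do.

```python
import hashlib
import random

def pick_locale(animal_name, locales):
    """Pick a locale that varies across animals but is stable for any one animal."""
    digest = hashlib.sha256(animal_name.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")    # stable seed derived from the name
    rng = random.Random(seed)                   # private generator, no global state
    return rng.choice(sorted(locales))          # sort so input order does not matter
```

Because the seed depends only on the name, a failure seen on one animal reproduces on every later run of that animal, which addresses the reproducibility objection quoted above.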

Re: [HACKERS] fixes for the Danish locale

2016-07-21 Thread Peter Geoghegan
's more or less what I did with the amcheck regression tests. -- Peter Geoghegan

Re: [HACKERS] Improving executor performance

2016-07-18 Thread Peter Geoghegan
rt test cases [1] take as much as 15% less time to execute overall. That's a big difference. I looked at the disassembly, and the number of instructions for varstrfastcmp_c() was reduced from 113 to 29. That's the kind of difference that could add up to a lot. [1] https://github.com/peter

Re: [HACKERS] rethinking dense_alloc (HashJoin) as a memory context

2016-07-18 Thread Peter Geoghegan
e called batch memory) for just a few modules, and what remains isn't several broad swathes that can be delineated easily. I can see a "palloc a lot and don't worry too much about pfrees" allocator having some value, but I suspect that that isn't going to move the needle

Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Peter Geoghegan
greed. Also, for what it's worth, I should point out to Jun that GetCurrentTimestamp() should definitely not be called when a spinlock is held like that. -- Peter Geoghegan
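
The general pattern behind that advice is to do any potentially slow call before entering the critical section, so the lock is held only for a few assignments. The Python sketch below models this with threading.Lock standing in for a spinlock; the Entry class and record_call function are hypothetical and are not pg_stat_statements code.

```python
import threading
import time

class Entry:
    """Hypothetical per-statement counters protected by a short-lived lock."""
    def __init__(self):
        self.mutex = threading.Lock()   # stands in for the spinlock
        self.calls = 0
        self.last_seen = None

def record_call(entry):
    now = time.time()                   # clock read happens before the lock is taken
    with entry.mutex:                   # critical section only copies values
        entry.calls += 1
        entry.last_seen = now
```

Keeping the clock read outside the lock means waiters are never stalled behind it.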

Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Peter Geoghegan
. It's unfortunate that there isn't a good third-party tool that does that, but there is nothing that prevents it. -- Peter Geoghegan

Re: [HACKERS] Obsolete comment within fmgr.c

2016-07-17 Thread Peter Geoghegan
On Thu, Apr 14, 2016 at 4:03 PM, Peter Geoghegan wrote: > Attached patch removes obsolete comment from fmgr.c. This patch seems to have been overlooked. It's a pretty straightforward case. -- Peter Geoghegan

Re: [HACKERS] Improving executor performance - tidbitmap

2016-07-14 Thread Peter Geoghegan
correlation performs compared to a bitmap index scan of a B-Tree (due to a less selective bitmap index scan qual, as in your example), but offhand I guess that could be faster in general, making the bottleneck you're addressing relatively greater there. -- Peter Geoghegan

Re: [HACKERS] Improving executor performance - tidbitmap

2016-07-14 Thread Peter Geoghegan
rom 2461.622 > to 952.161. That's pretty great. I wonder what this would look like with a BRIN index, since l_shipdate looks like a good candidate for BRIN indexing. -- Peter Geoghegan

Re: [HACKERS] Bug in batch tuplesort memory CLUSTER case (9.6 only)

2016-07-13 Thread Peter Geoghegan
On Wed, Jul 13, 2016 at 12:24 PM, Tom Lane wrote: >> I am happy with the adjustment. Please commit the adjusted patch. > > Done with minor adjustments. Thanks. I'm pleased that we found a way forward that addressed every concern. -- Peter Geoghegan

Re: [HACKERS] Bug in batch tuplesort memory CLUSTER case (9.6 only)

2016-07-13 Thread Peter Geoghegan
and | lthousand | tenthous | ltenthous > -+--+--+---+--+--- > (0 rows) It independently occurred to me that I should have done something like this afterwards. I agree. > If you're good with that adjustment, I'm happy to commit this. I am happy with the adjustmen

Re: [HACKERS] rethinking dense_alloc (HashJoin) as a memory context

2016-07-13 Thread Peter Geoghegan
structures, leaving a cache-friendly layout). I suspect that there are not that many places where it is worth it to even contemplate batch or dense allocators, so I doubt that we will see all that many more instances of "local allocators". -- Peter Geoghegan

Re: [HACKERS] UPSERT/RETURNING -> ON CONFLICT SELECT?

2016-07-13 Thread Peter Geoghegan
t select returning id > ) insert into bar(foo_id,i) > select id,2 from _foo; I gather that the point of this pseudo SQL is to show how you might be able to project and select the values not successfully inserted. Can't you just pipeline together some CTEs instead? -- Peter Geoghe

Re: [HACKERS] remove checkpoint_warning

2016-07-11 Thread Peter Geoghegan
th checkpoints starting > because of XLOG, but there's no indication of that being a bad thing. I agree. checkpoint_warning exists for the benefit of novice DBAs. I've seen those warnings in customer logs on several occasions, at least back when I was a consultant. -- Peter Geoghegan

Re: [HACKERS] \timing interval

2016-07-09 Thread Peter Geoghegan
On Sat, Jul 9, 2016 at 1:48 PM, Alvaro Herrera wrote: >> How about >> >> Time: 1234567.666 ms (20m 34.6s) >> >> ? > > +1 LGTM +1 -- Peter Geoghegan
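The proposed output format can be reproduced with a few lines of arithmetic. The sketch below is an illustration of the format quoted above, not psql's implementation; the function name `format_timing` is hypothetical, and handling of hours or days is deliberately left out.

```python
def format_timing(elapsed_ms):
    """Render milliseconds as 'Time: <ms> ms (<M>m <S.s>s)'."""
    # Round to tenths of a second first, so a value such as 59.96 s
    # carries into the minutes instead of printing as "0m 60.0s".
    tenths = round(elapsed_ms / 100)
    minutes, rem = divmod(tenths, 600)  # 600 tenths per minute
    return f"Time: {elapsed_ms:.3f} ms ({minutes}m {rem / 10:.1f}s)"


if __name__ == "__main__":
    print(format_timing(1234567.666))  # Time: 1234567.666 ms (20m 34.6s)
```

Here 1234567.666 ms is 1234.567666 s, which is 20 minutes plus 34.567666 s, matching the "20m 34.6s" in the quoted proposal.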

Re: [HACKERS] Bug in batch tuplesort memory CLUSTER case (9.6 only)

2016-07-08 Thread Peter Geoghegan
Testing replacement selection in the second CLUSTER is made very convenient by the fact that we just ran CLUSTER, so input should be presorted. -- Peter Geoghegan 0001-Add-test-coverage-for-CLUSTER-external-sorts.patch.gz Description: GNU Zip compressed data

Re: [HACKERS] MVCC overheads

2016-07-08 Thread Peter Geoghegan
orrectly)? Apparently that's something that's been discussed a few times among senior community members, and I think it has promise. -- Peter Geoghegan

Re: [HACKERS] \timing interval

2016-07-07 Thread Peter Geoghegan
On Thu, Jul 7, 2016 at 2:52 PM, Corey Huinker wrote: > Wouldn't it be great if we had a way of printing timing in more human > friendly formats? Yes, it would. I've thought about doing this myself. So, +1 to the idea from me. -- Peter Geoghegan

Re: [HACKERS] Bug in batch tuplesort memory CLUSTER case (9.6 only)

2016-07-07 Thread Peter Geoghegan
On Thu, Jul 7, 2016 at 10:51 AM, Robert Haas wrote: > Thanks for testing. I've committed this patch, breaking off one > unrelated bit of into a separate commit. Thank you. To be clear, I still intend to follow up with a CLUSTER external sort test case, as outlined to Noah. -- Pete

Re: [HACKERS] Bug in batch tuplesort memory CLUSTER case (9.6 only)

2016-07-07 Thread Peter Geoghegan
ately described steps to reproduce? > > I can confirm that (after 62 minutes) your test procedure reached SIGSEGV > today and then completed successfully with your patch. Thanks for going to the trouble of confirming that the test procedure causes a segmentation fault, and that my patch app

Re: [HACKERS] Parallel query and temp_file_limit

2016-07-05 Thread Peter Geoghegan
On Tue, Jul 5, 2016 at 12:58 PM, Tom Lane wrote: > Perhaps we could change the wording of temp_file_limit's description > from "space that a session can use" to "space that a process can use" > to help clarify this? That's all that I was looking for, real

Re: [HACKERS] Parallel query and temp_file_limit

2016-07-05 Thread Peter Geoghegan
use temp_buffers directly in practice. max_files_per_process is already clearly per process, so no change needed there either. I don't see a case other than temp_file_limit that appears to be even marginally in need of a specific note. -- Peter Geoghegan

Re: [HACKERS] Parallel query and temp_file_limit

2016-07-05 Thread Peter Geoghegan
n limit, whereas users are quite used to the idea that work_mem might be doled out multiple times for multiple executor nodes. -- Peter Geoghegan
