Re: error message when subscription target is a partitioned table

2019-01-12 Thread Michael Paquier
On Fri, Jan 11, 2019 at 04:04:41PM +0900, Michael Paquier wrote:
> Thanks.  I can commit this version if there are no objections then.

And done.
--
Michael




Re: PostgreSQL vs SQL/XML Standards

2019-01-12 Thread Pavel Stehule
On Sun, 13 Jan 2019 at 0:39, Chapman Flack wrote:

> On 12/30/18 03:23, Pavel Stehule wrote:
> > Unfortunately, there are different releases of libxml2 with different
> > error reporting, and it is hard (impossible) to prepare for all variants.
> :-/
> >
> > I prepared xml.out for my FC29 (fresh libxml2) and for builds without xml support.
> > The others I prepared by patching - and this error (in context) is expected.
>
> It turns out that the variant was already accounted for in the xml_2.out
> variant result file, it just needed to have the new results added.
>
> Done in xmltable-xpath-result-processing-bugfix-4.patch attached.
>
>
> On 12/31/18 01:03, Pavel Stehule wrote:
> > On Mon, 31 Dec 2018 at 3:15, Chapman Flack wrote:
> >> But the PostgreSQL situation is a little more strange. PG uses BY VALUE
> >> semantics as the default when no passing method is specified. PG also uses
> >> BY VALUE semantics when BY REF is explicitly requested, which is rude,
> >> just like Oracle. But why should an explicit specification of BY VALUE
> >> (which is, after all, the semantics we're going to use anyway!) produce
> >> this?
> >>
> >> ERROR:  syntax error at or near "value"
> >>
> >> To me, that doesn't seem like least astonishment.
> >>
> >> I am not seeing what would be complicated about removing that astonishment
> >> by simply allowing the grammar productions to also consume BY VALUE and
> >> ignore it.
> >
> > OK - I am not against an implementation of ignored BY VALUE, but I don't like
> > the idea of disabling BY REF.
>
> Done in attached xmltable-xmlexists-passing-mechanisms-1.patch along with
> some corresponding documentation adjustments.
>
> I am still working on more extensive documentation, but it seemed best
> to include the changes related to BY REF / BY VALUE in the same patch
> with the grammar change.
>

Looks good, thank you for the patch.

Pavel


> -Chap
>


Re: PATCH: Include all columns in default names for foreign key constraints.

2019-01-12 Thread Vik Fearing
On 13/01/2019 01:55, Paul Martinez wrote:
> This is my first submission to Postgres, so I'm not entirely sure what the
> protocol is here to get this merged; should I add this patch to the 2019-03
> Commitfest?

I haven't looked at the patch yet, but I think it's a good idea and
anyway yes, please add it to the next commitfest.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's

2019-01-12 Thread David Rowley
On Sun, 13 Jan 2019 at 14:49, David Rowley  wrote:
> I've not looked in detail, but perhaps the place to put the tests are
> in src/test/modules/test_predtest.  This module adds a function named
> test_predtest() that you can likely use more easily than trying to
> EXPLAIN plans and hoping the planner's behaviour helps to exhibit the
> behaviour of the changed code.

I just noticed that I accidentally reviewed v1 instead of v2.  I see
you found that module.  I'll keep this as waiting on author until the
other bug is fixed.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



PATCH: Include all columns in default names for foreign key constraints.

2019-01-12 Thread Paul Martinez
Hello!

I have written a small patch to modify the default names for foreign key
constraints. Currently, if the foreign key is composed of multiple columns, we
only use the first one in the constraint name. This can lead to similar
constraint names when two foreign keys start with the same column. This sort
of situation may commonly occur in a multi-tenant environment.

> CREATE TABLE users (tenant_id int, id int, PRIMARY KEY (tenant_id, id));
> CREATE TABLE posts (tenant_id int, id int, PRIMARY KEY (tenant_id, id));
> CREATE TABLE comments (
tenant_id int,
id int,
post_id int,
commenter_id int,
FOREIGN KEY (tenant_id, post_id) REFERENCES posts,
FOREIGN KEY (tenant_id, commenter_id) REFERENCES users
  )
> \d comments
 Table "public.comments"

 Foreign-key constraints:
  "comments_tenant_id_fkey" FOREIGN KEY (tenant_id, commenter_id)
REFERENCES users(tenant_id, id)
  "comments_tenant_id_fkey1" FOREIGN KEY (tenant_id, post_id)
REFERENCES posts(tenant_id, id)

The two constraints have nearly identical names. With my patch the default names
will include both column names, so we will instead have this output:

 Foreign-key constraints:
  "comments_tenant_id_commenter_id_fkey" FOREIGN KEY (tenant_id,
commenter_id) REFERENCES users(tenant_id, id)
  "comments_tenant_id_post_id_fkey" FOREIGN KEY (tenant_id, post_id)
REFERENCES posts(tenant_id, id)

This brings the default names for foreign keys in line with the default names
for indexes. Hopefully an uncontroversial change!

The logic for creating index names is in the function ChooseIndexNameAddition
in src/backend/commands/indexcmds.c. There is also similar logic for creating
names for statistics in ChooseExtendedStatisticNameAddition in
src/backend/commands/statscmds.c.

I pretty much just copied and pasted the implementation from
ChooseIndexNameAddition and placed it in src/backend/commands/tablecmds.c.
The new function is called ChooseForeignKeyConstraintNameAddition. I updated
the comments in indexcmds.c and statscmds.c to also reference this new function.
Each of the three versions takes in the columns in slightly different forms, so
I don't think creating a single implementation of this small bit of logic is
desirable, and I have no idea where such a util function would go.

Regression tests are in src/test/regress/sql/foreign_key.sql. I create two
composite foreign keys on a table, one via the CREATE TABLE statement, and the
other in an ALTER TABLE statement. The generated names of the constraints are
then queried from the pg_constraint table.
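
For example, something along these lines (a sketch only; the query in the patch
may differ, and the table name here is made up):

SELECT conname
  FROM pg_constraint
 WHERE conrelid = 'fk_test_table'::regclass AND contype = 'f'
 ORDER BY conname;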


This is my first submission to Postgres, so I'm not entirely sure what the
protocol is here to get this merged; should I add this patch to the 2019-03
Commitfest?

Happy to hear any feedback!

- Paul Martinez


foreign-key-constraint-names-v1.patch
Description: Binary data


Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-01-12 Thread Tomas Vondra
On 1/10/19 6:09 PM, Dean Rasheed wrote:
> On Wed, 26 Dec 2018 at 22:09, Tomas Vondra  
> wrote:
>>
>> Attached is an updated version of the patch - rebased and fixing the
>> warnings reported by Thomas Munro.
>>
> 
> Here are a few random review comments based on what I've read so far:
> 
> 
> On the CREATE STATISTICS doc page, the syntax in the new examples
> added to the bottom of the page is incorrect. E.g., instead of
> 
> CREATE STATISTICS s2 WITH (mcv) ON (a, b) FROM t2;
> 
> it should read
> 
> CREATE STATISTICS s2 (mcv) ON a, b FROM t2;
> 

Fixed.

> I think perhaps there should be also be a short explanatory sentence
> after each example (as in the previous one) just to explain what the
> example is intended to demonstrate. E.g., for the new MCV example,
> perhaps say
> 
>These statistics give the planner more detailed information about the
>specific values that commonly appear in the table, as well as an upper
>bound on the selectivities of combinations of values that do not appear in
>the table, allowing it to generate better estimates in both cases.
> 
> I don't think there's a need for too much detail there, since it's
> explained more fully elsewhere, but it feels like it needs a little
> more just to explain the purpose of the example.
> 

I agree, this part of the docs can be quite terse. I've adopted the wording
you proposed, and I've done something similar for the histogram patch,
which needs to add something too. It's a bit repetitive, though.

> 
> There is additional documentation in perform.sgml that needs updating
> -- about what kinds of stats the planner keeps. Those docs are
> actually quite similar to the ones on planstats.sgml. It seems the
> former focuses more on what stats the planner stores, while the latter
> focuses on how the planner uses those stats.
> 

OK, I've expanded this part a bit too.

> 
> In func.sgml, the docs for pg_mcv_list_items need extending to include
> the base frequency column. Similarly for the example query in
> planstats.sgml.
> 

Fixed.

> 
> Tab-completion for the CREATE STATISTICS statement should be extended
> for the new kinds.
> 

Fixed.

> 
> Looking at mcv_update_match_bitmap(), it's called 3 times (twice
> recursively from within itself), and I think the pattern for calling
> it is a bit messy. E.g.,
> 
> /* by default none of the MCV items matches the clauses */
> bool_matches = palloc0(sizeof(char) * mcvlist->nitems);
> 
> if (or_clause(clause))
> {
> /* OR clauses assume nothing matches, initially */
> memset(bool_matches, STATS_MATCH_NONE, sizeof(char) *
> mcvlist->nitems);
> }
> else
> {
> /* AND clauses assume everything matches, initially */
> memset(bool_matches, STATS_MATCH_FULL, sizeof(char) *
> mcvlist->nitems);
> }
> 
> /* build the match bitmap for the OR-clauses */
> mcv_update_match_bitmap(root, bool_clauses, keys,
> mcvlist, bool_matches,
> or_clause(clause));
> 
> the comment for the AND case directly contradicts the initial comment,
> and the final comment is wrong because it could be an AND clause. For
> a NOT clause it does:
> 
> /* by default none of the MCV items matches the clauses */
> not_matches = palloc0(sizeof(char) * mcvlist->nitems);
> 
> /* NOT clauses assume nothing matches, initially */
> memset(not_matches, STATS_MATCH_FULL, sizeof(char) *
> mcvlist->nitems);
> 
> /* build the match bitmap for the NOT-clause */
> mcv_update_match_bitmap(root, not_args, keys,
> mcvlist, not_matches, false);
> 
> so the second comment is wrong. I understand the evolution that led
> to this function existing in this form, but I think that it can now be
> refactored into a "getter" function rather than an "update" function.
> I.e., something like mcv_get_match_bitmap() which first allocates the
> array to be returned and initialises it based on the passed-in value
> of is_or. That way, all the calling sites can be simplified to
> one-liners like
> 
> /* get the match bitmap for the AND/OR clause */
> bool_matches = mcv_get_match_bitmap(root, bool_clauses, keys,
> mcvlist, or_clause(clause));
> 

Yes, I agree. I've reworked the function per your proposal, and I've
done the same for the histogram too.
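
The reworked function now has roughly this shape (a simplified sketch, not the
exact code from the patch):

static char *
mcv_get_match_bitmap(PlannerInfo *root, List *clauses,
                     Bitmapset *keys, MCVList *mcvlist, bool is_or)
{
    char   *matches = palloc(sizeof(char) * mcvlist->nitems);

    /* OR starts with nothing matching, AND with everything matching */
    memset(matches, is_or ? STATS_MATCH_NONE : STATS_MATCH_FULL,
           sizeof(char) * mcvlist->nitems);

    /* ... walk the clauses and update the per-item match flags ... */

    return matches;
}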

> 
> In the previous discussion around UpdateStatisticsForTypeChange(), the
> consensus appeared to be that we should just unconditionally drop all
> extended statistics when ALTER TABLE changes the type of an included
> column (just as we do for per-column stats), since such a type change
> can rewrite the data in arbitrary ways, so there's no reason to assume
> that the old stats are still valid. I 

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-01-12 Thread Tomas Vondra



On 1/12/19 8:49 AM, Dean Rasheed wrote:
> On Fri, 11 Jan 2019, 21:18 Tomas Vondra   wrote:
> 
> 
> On 1/10/19 4:20 PM, Dean Rasheed wrote:
> > ...
> >
> > So perhaps what we should do for multivariate stats is simply use the
> > relative standard error approach (i.e., reuse the patch in [2] with a
> > 20% RSE cutoff). That had a lot of testing at the time, against a wide
> > range of data distributions, and proved to be very good, not to
> > mention being very simple.
> >
> > That approach would encompass both groups more and less common than
> > the base frequency, because it relies entirely on the group appearing
> > enough times in the sample to infer that any errors on the resulting
> > estimates will be reasonably well controlled. It wouldn't actually
> > look at the base frequency at all in deciding which items to keep.
> >
> 
> I've been looking at this approach today, and I'm a bit puzzled. That
> patch essentially uses SRE to compute mincount like this:
> 
>     mincount = n*(N-n) / (N-n+0.04*n*(N-1))
> 
> and then includes all items more common than this threshold.
> 
> 
> Right.
> 
> How could
> that handle items significantly less common than the base frequency?
> 
> 
> Well what I meant was that it will *allow* items significantly less
> common than the base frequency, because it's not even looking at the
> base frequency. For example, if the table size were N=100,000 and we
> sampled n=10,000 rows from that, mincount would work out as 22. So it's
> easy to construct allowed items more common than that and still
> significantly less common than their base frequency.
> 

OK, understood. I agree that's a sensible yet simple approach, so I've
adopted it in the next version of the patch.
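
For reference, the cutoff from that formula can be expressed as (a sketch, not
code from the patch):

/*
 * Relative-standard-error cutoff for keeping an item in the MCV list.
 * N is the table row count, n the number of sampled rows; items that
 * appear in the sample more often than this are kept.  With N = 100000
 * and n = 10000 this evaluates to roughly 22.4, matching the "22" above.
 */
static double
rse_mincount(double n, double N)
{
    return n * (N - n) / (N - n + 0.04 * n * (N - 1));
}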

> A possible refinement would be to say that if there are more than
> stats_target items more common than this mincount threshold, rather than
> excluding the least common ones to get the target number of items,
> exclude the ones closest to their base frequencies, on the grounds that
> those are the ones for which the MCV stats will make the least
> difference. That might complicate the code somewhat though -- I don't
> have it in front of me, so I can't remember if it even tracks more than
> stats_target items.
> 

Yes, the patch does limit the number of items to stats_target (the maximum
of the per-attribute stattarget values, to be precise). IIRC that's a piece
you added sometime last year ;-)

I've been experimenting with removing items closest to base frequencies
today, and I came to the conclusion that it's rather tricky for a couple
of reasons.

1) How exactly do you measure "closeness" to base frequency? I've tried
computing the error in different ways, including:

  * Max(freq/base, base/freq)
  * abs(freq - base)

but this does not seem to affect the behavior very much, TBH.

2) This necessarily reduces mcv_totalsel, i.e. it increases the part not
covered by MCV. And estimates on this part are rather crude.

3) It does nothing for "impossible" items, i.e. combinations that do not
exist at all. Clearly, those won't be part of the sample, and so can't
be included in the MCV no matter which error definition we pick. And for
very rare combinations it might lead to sudden changes, depending on
whether the group gets sampled or not.

So IMHO it's better to stick to the simple SRE approach for now.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2019-01-12 Thread David Rowley
On Thu, 3 Jan 2019 at 08:01, Tomas Vondra  wrote:
> AFAICS the patch essentially does two things: (a) identifies Append
> paths with a single member and (b) ensures the Vars are properly mapped
> when the Append node is skipped when creating the plan.

Yes, but traditional Append and MergeAppend paths with a single member
are never actually generated.  I say "traditional" as we do happen to
use a single-subpath Append as the "proxy" path to represent a path
that needs to not have a plan node generated during createplan.
Originally I had a new Path type named "ProxyPath" to do this, but per
recommendation from Tom, I overloaded Append to mean that. (Append
already has its meaning overloaded in regards to dummy paths.)

> I agree doing (a) in setrefs.c is too late, as mentioned earlier. But
> perhaps we could do at least (b) to setrefs.c?

The problem I found with doing var translations in setrefs.c was that
the plan tree is traversed there breadth-first and in createplan.c we
build the plan depth-first.  The problem I didn't manage to overcome
with that is that when we replace a "ProxyPath" (actually an Append
path marked as is_proxy=true) it only alters targetlists, etc. of nodes
above that in the plan tree. The nodes below it and their targetlists
don't care, as these don't fetch Vars from nodes above them.  If we do
the Var translation in setrefs then we've not yet looked at the nodes
that don't care and we've already done the nodes that do care. So the
tree traversal is backwards.  I sort of borrowed the traversal in
createplan.c since it was in the correct depth-first order.   Equally,
I could have invented an entirely new stage that traverses the path
tree depth-first, but I imagined that would have added more overhead
and raised even more questions. Piggy-backing on createplan seemed
like the best option at the time.

> I'd say mapping the Vars is the most fragile and error-prone piece of
> this patch. It's a bit annoying that it's inventing a new infrastructure
> to do that, translates expressions in various places, establishes new
> rules for tlists (having to call finalize_plan_tlist when calling
> build_path_tlist before create_plan_recurse), etc.

I entirely agree. It's by far the worst part of the patch.  Getting
the code to a working state to do this was hard. I kept finding places
that I'd missed the translation.  I'd rather it worked some other way.
I just don't yet know what that way is.  I changed the core regression
table "tenk" to become a partitioned table with a single partition in
the hope to catch all of the places that need the translations to be
performed.  I'm not entirely confident that I've caught them all by
doing this.

I've attached an updated patch that's really just a rebased version of
v8.  The old version conflicted with changes made in 1db5667b. The
only real change was to the commit message. I'd previously managed to
miss out the part about how not generating needless Append/MergeAppend
paths can allow plan shapes that were previously not possible.  I'd
only mentioned the executor not having to pull tuples through these
nodes, which increases performance. Not having this perhaps caused
some of the confusion earlier in this thread.
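
To make that concrete, with the patch a scan of a partitioned table that has
only one unpruned partition should no longer have an Append above it at all,
e.g. (an illustrative sketch, not taken from the regression tests):

CREATE TABLE parted (a int) PARTITION BY LIST (a);
CREATE TABLE parted_1 PARTITION OF parted FOR VALUES IN (1);
EXPLAIN (COSTS OFF) SELECT * FROM parted;
-- with the patch: just a Seq Scan on parted_1, with no Append node above it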

One drawback to this patch, which I'm unsure if I previously mentioned,
is that run-time pruning only works for Append and MergeAppend.  If we
remove the Append/MergeAppend node then the single Append/MergeAppend
sub-plan has no hope of being pruned.  Going from 1 to 0 sub-plans may
not be that common, but no longer supporting that is something that
needs to be considered.   I had imagined that the gained flexibility of
plan shapes plus lower executor overhead was better than that, but
there likely is a small use-case for keeping that ability.  It seems
impossible to have both advantages of this patch without that
disadvantage.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


v9-0001-Forgo-generating-single-subpath-Append-and-MergeA.patch
Description: Binary data


Re: O_DIRECT for relations and SLRUs (Prototype)

2019-01-12 Thread Thomas Munro
On Sun, Jan 13, 2019 at 5:13 AM Andrey Borodin  wrote:
>
> Hi!
>
> > On 12 Jan 2019, at 9:46, Michael Paquier wrote:
> >
> > Attached is a toy patch that I have begun using for tests in this
> > area.  That's nothing really serious at this stage, but you can use
> > that if you would like to see the impact of O_DIRECT.  Of course,
> > things get significantly slower.
>
> Cool!
> I've just gathered a group of students to task them with experimenting with 
> shared buffer eviction algorithms during their February internship at 
> Yandex-Sirius edu project. Your patch seems very handy for benchmarks in this 
> area.

+1, thanks for sharing the patch.  Even though just turning on
O_DIRECT is the trivial part of this project, it's good to encourage
discussion.  We may indeed become more sensitive to the quality of
buffer eviction algorithms, but it seems like the main work to regain
lost performance will be the background IO scheduling piece:

1.  We need a new "bgreader" process to do read-ahead.  I think you'd
want a way to tell it with explicit hints (for example, perhaps
sequential scans would advertise that they're reading sequentially so
that it starts to slurp future blocks into the buffer pool, and
streaming replicas might look ahead in the WAL and tell it what's
coming).  In theory this might be better than the heuristics OSes use
to guess our access pattern and pre-fetch into the page cache, since
we have better information (and of course we're skipping a buffer
layer).

2.  We need a new kind of bgwriter/syncer that aggressively creates
clean pages so that foreground processes rarely have to evict (since
that is now super slow), but also efficiently finds ranges of dirty
blocks that it can write in big sequential chunks.

3.  We probably want SLRUs to use the main buffer pool, instead of
their own mini-pools, so they can benefit from the above.

Whether we need multiple bgreader and bgwriter processes or perhaps a
general IO scheduler process may depend on whether we also want to
switch to async (multiplexing from a single process).  Starting simple
with a traditional sync IO and N processes seems OK to me.
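
(As an aside on the "trivial part": the gist is just adding O_DIRECT to the
open flags, though the buffer, offset and length then generally have to be
block-aligned.  A rough standalone Linux-only sketch, not Michael's patch, and
with a made-up file name:)

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
    void   *buf;
    int     fd = open("somefile", O_RDWR | O_DIRECT);

    if (fd < 0 || posix_memalign(&buf, 4096, 8192) != 0)
        return 1;
    /* read one 8kB block; buffer, offset and size are all block-aligned */
    if (pread(fd, buf, 8192, 0) < 0)
        return 1;
    free(buf);
    close(fd);
    return 0;
}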

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Alternative to \copy in psql modelled after \g

2019-01-12 Thread Tom Lane
I took a quick look at this patch.  Some thoughts:

* I think the right way to proceed is to commit (and back-patch)
this without any regression test case, and maybe later look into
adding a TAP test that covers "\g file".  We have no test cases
for any variant of "\g file", and because of platform variability
and suchlike considerations, adding that seems like a project of
its own --- one I'd not want to back-patch.

* I do agree with documenting this by adding some small note to the
discussion of \copy.

* I think you've made the wrong choice about how this interacts with
the pset.copyStream option.  If copyStream is set, that should
continue to take priority, IMO, as anything else would break do_copy's
assumptions about what will happen.  One example of why it would be
problematic is that both levels of code would think they control what
to do with the sigpipe trap.  Now, it's likely that \copy's syntax
makes it impossible for both copyStream and gfname to be set at the
same time anyway, because \copy extends to the end of the line.  But
if it were to happen, we don't want it to be something that do_copy
has to take into account; better therefore just to leave gfname alone.
(Note also the comment just above where you patched, which you failed
to update.)

* You need to be more careful about error cases.  I note that
openQueryOutputFile is capable of returning failure with is_pipe
set true, which would lead this patch to disable the sigpipe trap
and never re-enable it.
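
* For reference, the usage this enables is, as I understand it, along these
lines (illustrative only; the table and file names are made up):

COPY (SELECT * FROM some_table) TO STDOUT \g /tmp/some_table.txt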

regards, tom lane



Re: Three animals fail test-decoding-check on REL_10_STABLE

2019-01-12 Thread Tom Lane
Andrew Dunstan  writes:
> On 1/11/19 6:33 PM, Tom Lane wrote:
>> While I think I've fixed this bug, I'm still quite confused about why
>> only some buildfarm animals showed the problem.

> ... Is there something weird about naming of library files on HP-UX?

Doh!  I looked right at this code last night, but it failed to click:

# these files should be present if we've temp_installed everything,
# and not if we haven't. The represent core, contrib and test_modules.
return ( (-d $tmp_loc)
  && (-f "$bindir/postgres"   || -f "$bindir/postgres.exe")
  && (-f "$libdir/hstore.so"  || -f "$libdir/hstore.dll")
  && (-f "$libdir/test_parser.so" || -f "$libdir/test_parser.dll"));

On HPUX (at least the version gaur is running), the extension for
shared libraries is ".sl" not ".so".

That doesn't explain the failures on damselfly and koreaceratops,
but they're both running very old buildfarm clients, which most
likely just don't have the optimization to share a temp-install.

I wonder if it's practical to scrape DLSUFFIX out of src/Makefile.port
instead of listing all the possibilities here.  But I'm not sure how
you'd deal with this bit in Makefile.hpux:

ifeq ($(host_cpu), ia64)
   DLSUFFIX = .so
else
   DLSUFFIX = .sl
endif

Anyway, the bigger picture here is that the shared-temp-install
optimization is masking bugs in local "make check" rules.  Not
sure how much we care about that, though.  Any such bug is only
of interest to developers, and it only matters if someone actually
stumbles over it.

regards, tom lane



Re: Three animals fail test-decoding-check on REL_10_STABLE

2019-01-12 Thread Andrew Dunstan


On 1/11/19 6:33 PM, Tom Lane wrote:
> I wrote:
>> So it appears that in v10,
>>  ./configure ... --enable-tap-tests ...
>>  make
>>  make install
>>  cd contrib/test_decoding
>>  make check
>> fails due to failure to install test_decoding into the tmp_install
>> tree, while it works in v11.  Moreover, that's not specific to
>> gaur: it happens on my Linux box too.  I'm not very sure why only
>> three buildfarm animals are unhappy --- maybe in the buildfarm
>> context it requires a specific combination of options to show the
>> problem.
> While I think I've fixed this bug, I'm still quite confused about why
> only some buildfarm animals showed the problem.  Comparing log files,
> it seems that the ones that were working were relying on having
> done a complete temp-install at a higher level, while the ones that
> were failing were trying to make a temp install from scratch in
> contrib/test_decoding and hence seeing the bug.  For example,
> longfin's test-decoding-check log starts out
>
> Snapshot: 2019-01-11 21:12:17
>
> /Applications/Xcode.app/Contents/Developer/usr/bin/make -C 
> ../../src/test/regress all
> /Applications/Xcode.app/Contents/Developer/usr/bin/make -C ../../../src/port 
> all
> /Applications/Xcode.app/Contents/Developer/usr/bin/make -C ../backend 
> submake-errcodes
> make[3]: Nothing to be done for `submake-errcodes'.
>
> while gaur's starts out
>
> Snapshot: 2019-01-11 07:30:45
>
> rm -rf '/home/bfarm/bf-data/REL_10_STABLE/pgsql.build'/tmp_install
> /bin/sh ../../config/install-sh -c -d 
> '/home/bfarm/bf-data/REL_10_STABLE/pgsql.build'/tmp_install/log
> make -C '../..' 
> DESTDIR='/home/bfarm/bf-data/REL_10_STABLE/pgsql.build'/tmp_install install 
> >'/home/bfarm/bf-data/REL_10_STABLE/pgsql.build'/tmp_install/log/install.log 
> 2>&1
> make -j1  checkprep 
> >>'/home/bfarm/bf-data/REL_10_STABLE/pgsql.build'/tmp_install/log/install.log 
> 2>&1
> make -C ../../src/test/regress all
> make[1]: Entering directory 
> `/home/bfarm/bf-data/REL_10_STABLE/pgsql.build/src/test/regress'
> make -C ../../../src/port all
> make[2]: Entering directory 
> `/home/bfarm/bf-data/REL_10_STABLE/pgsql.build/src/port'
> make -C ../backend submake-errcodes
> make[3]: Entering directory 
> `/home/bfarm/bf-data/REL_10_STABLE/pgsql.build/src/backend'
> make[3]: Nothing to be done for `submake-errcodes'.
>
> These two animals are running the same buildfarm client version,
> and I don't see any relevant difference in their configurations,
> so why are they behaving differently?  Andrew, any ideas?
>
>   



Possibly an error in 
https://github.com/PGBuildFarm/client-code/commit/3026438dcefebcc6fe2d44eb7b60812e257a0614


It looks like longfin detects that it has all it needs to proceed, and
so calls make with "NO_INSTALL=yes", but gaur doesn't.  Not sure why
that would be - if anything I'd expect the test to fail on OSX rather
than HP-UX. Is there something weird about naming of library files on HP-UX?


cheers


andrew





Re: O_DIRECT for relations and SLRUs (Prototype)

2019-01-12 Thread Andrey Borodin
Hi!

> On 12 Jan 2019, at 9:46, Michael Paquier wrote:
> 
> Attached is a toy patch that I have begun using for tests in this
> area.  That's nothing really serious at this stage, but you can use
> that if you would like to see the impact of O_DIRECT.  Of course,
> things get significantly slower.

Cool!
I've just gathered a group of students to task them with experimenting with 
shared buffer eviction algorithms during their February internship at 
Yandex-Sirius edu project. Your patch seems very handy for benchmarks in this 
area.

Thanks!

Best regards, Andrey Borodin.



Re: log bind parameter values on error

2019-01-12 Thread Alexey Bashtanov



> please see attached

sorry, some unintended changes sneaked in, please see the corrected patch
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f64402a..924a76c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6270,6 +6270,23 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
  
 
+ 
+  log_parameters (boolean)
+  
+   log_parameters configuration parameter
+  
+  
+  
+   
+Controls whether the statement is logged with bind parameter values. 
+It adds some overhead, as postgres will cache textual
+representations of parameter values in memory for all statements,
+even if they eventually do not get logged. The default is
+off. Only superusers can change this setting.
+   
+  
+ 
+
  
   log_statement (enum)
   
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index a98c836..9ee3954 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -404,6 +404,7 @@ EvaluateParams(PreparedStatement *pstmt, List *params,
 	paramLI->parserSetup = NULL;
 	paramLI->parserSetupArg = NULL;
 	paramLI->numParams = num_params;
+	paramLI->hasTextValues = false;
 
 	i = 0;
 	foreach(l, exprstates)
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index de41588..195e3c5 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -921,6 +921,7 @@ postquel_sub_params(SQLFunctionCachePtr fcache,
 			paramLI->parserSetup = NULL;
 			paramLI->parserSetupArg = NULL;
 			paramLI->numParams = nargs;
+			paramLI->hasTextValues = false;
 			fcache->paramLI = paramLI;
 		}
 		else
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 94a53e0..398f0e3 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2399,6 +2399,7 @@ _SPI_convert_params(int nargs, Oid *argtypes,
 		paramLI->parserSetup = NULL;
 		paramLI->parserSetupArg = NULL;
 		paramLI->numParams = nargs;
+		paramLI->hasTextValues = false;
 
 		for (i = 0; i < nargs; i++)
 		{
diff --git a/src/backend/nodes/params.c b/src/backend/nodes/params.c
index a89a25e..5aebd428 100644
--- a/src/backend/nodes/params.c
+++ b/src/backend/nodes/params.c
@@ -31,6 +31,8 @@
  * set of parameter values.  If dynamic parameter hooks are present, we
  * intentionally do not copy them into the result.  Rather, we forcibly
  * instantiate all available parameter values and copy the datum values.
+ *
+ * We don't copy textual representations here.
  */
 ParamListInfo
 copyParamList(ParamListInfo from)
@@ -53,6 +55,7 @@ copyParamList(ParamListInfo from)
 	retval->parserSetup = NULL;
 	retval->parserSetupArg = NULL;
 	retval->numParams = from->numParams;
+	retval->hasTextValues = false;
 
 	for (i = 0; i < from->numParams; i++)
 	{
@@ -229,6 +232,7 @@ RestoreParamList(char **start_address)
 	paramLI->parserSetup = NULL;
 	paramLI->parserSetupArg = NULL;
 	paramLI->numParams = nparams;
+	paramLI->hasTextValues = false;
 
 	for (i = 0; i < nparams; i++)
 	{
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 0c0891b..1f81fba 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -86,6 +86,12 @@
  */
 const char *debug_query_string; /* client-supplied query string */
 
+/*
+ * The top-level portal that the client is immediately working with:
+ * creating, binding, executing, or all at one using simple protocol
+ */
+Portal current_top_portal = NULL;
+
 /* Note: whereToSendOutput is initialized for the bootstrap/standalone case */
 CommandDest whereToSendOutput = DestDebug;
 
@@ -183,7 +189,7 @@ static void forbidden_in_wal_sender(char firstchar);
 static List *pg_rewrite_query(Query *query);
 static bool check_log_statement(List *stmt_list);
 static int	errdetail_execute(List *raw_parsetree_list);
-static int	errdetail_params(ParamListInfo params);
+static int	errdetail_params(Portal portal);
 static int	errdetail_abort(void);
 static int	errdetail_recovery_conflict(void);
 static void start_xact_command(void);
@@ -1694,6 +1700,9 @@ exec_bind_message(StringInfo input_message)
 	else
 		portal = CreatePortal(portal_name, false, false);
 
+	Assert(current_top_portal == NULL);
+	current_top_portal = portal;
+
 	/*
 	 * Prepare to copy stuff into the portal's memory context.  We do all this
 	 * copying first, because it could possibly fail (out-of-memory) and we
@@ -1731,6 +1740,9 @@ exec_bind_message(StringInfo input_message)
 	 */
 	if (numParams > 0)
 	{
+		/* GUC value can change, so we remember its state to be consistent */
+		bool need_text_values = log_parameters;
+
 		params = (ParamListInfo) palloc(offsetof(ParamListInfoData, params) +
 		numParams * sizeof(ParamExternData));
 		/* we have static list of params, so no hooks needed */
@@ -1741,6 +1753,8 @@ exec_bind_message(StringInfo input_message)
 		params->parserSetup = NULL;
 		

Re: log bind parameter values on error

2019-01-12 Thread Alexey Bashtanov

Hello Peter,


Unlike the SQL text, parameters may consume much more memory, so I'd keep them
in the portal memory context to make sure the memory is released earlier
rather than later.

Having them in the portal structure is different from having it in the
portal memory context.  Although there is perhaps value in keeping them
together.

yeah, to avoid pointers to deallocated areas


Let's see how it looks.


please see attached

Best,
  Alex
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f64402a..924a76c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6270,6 +6270,23 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
  
 
+ 
+  log_parameters (boolean)
+  
+   log_parameters configuration parameter
+  
+  
+  
+   
+Controls whether the statement is logged with bind parameter values. 
+It adds some overhead, as postgres will cache textual
+representations of parameter values in memory for all statements,
+even if they eventually do not get logged. The default is
+off. Only superusers can change this setting.
+   
+  
+ 
+
  
   log_statement (enum)
   
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index a98c836..9ee3954 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -404,6 +404,7 @@ EvaluateParams(PreparedStatement *pstmt, List *params,
 	paramLI->parserSetup = NULL;
 	paramLI->parserSetupArg = NULL;
 	paramLI->numParams = num_params;
+	paramLI->hasTextValues = false;
 
 	i = 0;
 	foreach(l, exprstates)
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index de41588..195e3c5 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -921,6 +921,7 @@ postquel_sub_params(SQLFunctionCachePtr fcache,
 			paramLI->parserSetup = NULL;
 			paramLI->parserSetupArg = NULL;
 			paramLI->numParams = nargs;
+			paramLI->hasTextValues = false;
 			fcache->paramLI = paramLI;
 		}
 		else
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 94a53e0..398f0e3 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2399,6 +2399,7 @@ _SPI_convert_params(int nargs, Oid *argtypes,
 		paramLI->parserSetup = NULL;
 		paramLI->parserSetupArg = NULL;
 		paramLI->numParams = nargs;
+		paramLI->hasTextValues = false;
 
 		for (i = 0; i < nargs; i++)
 		{
diff --git a/src/backend/nodes/params.c b/src/backend/nodes/params.c
index a89a25e..5aebd428 100644
--- a/src/backend/nodes/params.c
+++ b/src/backend/nodes/params.c
@@ -31,6 +31,8 @@
  * set of parameter values.  If dynamic parameter hooks are present, we
  * intentionally do not copy them into the result.  Rather, we forcibly
  * instantiate all available parameter values and copy the datum values.
+ *
+ * We don't copy textual representations here.
  */
 ParamListInfo
 copyParamList(ParamListInfo from)
@@ -53,6 +55,7 @@ copyParamList(ParamListInfo from)
 	retval->parserSetup = NULL;
 	retval->parserSetupArg = NULL;
 	retval->numParams = from->numParams;
+	retval->hasTextValues = false;
 
 	for (i = 0; i < from->numParams; i++)
 	{
@@ -229,6 +232,7 @@ RestoreParamList(char **start_address)
 	paramLI->parserSetup = NULL;
 	paramLI->parserSetupArg = NULL;
 	paramLI->numParams = nparams;
+	paramLI->hasTextValues = false;
 
 	for (i = 0; i < nparams; i++)
 	{
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 0c0891b..1f81fba 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -86,6 +86,12 @@
  */
 const char *debug_query_string; /* client-supplied query string */
 
+/*
+ * The top-level portal that the client is immediately working with:
+ * creating, binding, executing, or all at one using simple protocol
+ */
+Portal current_top_portal = NULL;
+
 /* Note: whereToSendOutput is initialized for the bootstrap/standalone case */
 CommandDest whereToSendOutput = DestDebug;
 
@@ -183,7 +189,7 @@ static void forbidden_in_wal_sender(char firstchar);
 static List *pg_rewrite_query(Query *query);
 static bool check_log_statement(List *stmt_list);
 static int	errdetail_execute(List *raw_parsetree_list);
-static int	errdetail_params(ParamListInfo params);
+static int	errdetail_params(Portal portal);
 static int	errdetail_abort(void);
 static int	errdetail_recovery_conflict(void);
 static void start_xact_command(void);
@@ -1694,6 +1700,9 @@ exec_bind_message(StringInfo input_message)
 	else
 		portal = CreatePortal(portal_name, false, false);
 
+	Assert(current_top_portal == NULL);
+	current_top_portal = portal;
+
 	/*
 	 * Prepare to copy stuff into the portal's memory context.  We do all this
 	 * copying first, because it could possibly fail (out-of-memory) and we
@@ -1731,6 +1740,9 @@ exec_bind_message(StringInfo input_message)
 	 */
 	if (numParams > 0)
 	{
+		/* GUC value can change, so we remember its state to be consistent */
+		bool 

Re: Pluggable Storage - Andres's take

2019-01-12 Thread Dmitry Dolgov
> On Sat, Jan 12, 2019 at 1:44 AM Andres Freund  wrote:
>
> > + appendPQExpBuffer(cmd, "SET default_table_access_method = %s;", 
> > tableam);
>
> This needs escaping, at the very least with "", but better with proper
> routines for dealing with identifiers.

Thanks for noticing, fixed.
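
Roughly along these lines (a sketch; the exact change in the attached patch may
differ slightly):

    /* fmtId() from fe_utils/string_utils.h quotes the identifier if needed */
    appendPQExpBuffer(cmd, "SET default_table_access_method = %s;",
                      fmtId(tableam));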

> > @@ -5914,7 +5922,7 @@ getTables(Archive *fout, int *numTables)
> > "tc.relfrozenxid AS 
> > tfrozenxid, "
> > "tc.relminmxid AS tminmxid, 
> > "
> > "c.relpersistence, 
> > c.relispopulated, "
> > -   "c.relreplident, 
> > c.relpages, "
> > +   "c.relreplident, 
> > c.relpages, am.amname AS amname, "
>
> That AS doesn't do anything, does it?

Right, I've renamed it a few times and forgot to get rid of it. Removed.

>
> >   /* other fields were zeroed above */
> >
> > @@ -9355,7 +9370,7 @@ dumpComment(Archive *fout, const char *type, const 
> > char *name,
> >* post-data.
> >*/
> >   ArchiveEntry(fout, nilCatalogId, createDumpId(),
> > -  tag->data, namespace, NULL, owner,
> > +  tag->data, namespace, NULL, owner, 
> > NULL,
> >"COMMENT", SECTION_NONE,
> >query->data, "", NULL,
> >&(dumpId), 1,
>
> We really ought to move the arguments to a struct, so we don't generate
> quite as much useless diffs whenever we do a change around one of
> these...

That's what I thought too. Maybe then I'll suggest a mini-patch to master to
refactor these arguments out into a separate struct, so we can leverage it here.


pg_dump_access_method_v2.patch
Description: Binary data


Re: Logical decoding for operations on zheap tables

2019-01-12 Thread Amit Kapila
On Fri, Jan 4, 2019 at 9:01 AM Andres Freund  wrote:
>
> On 2019-01-04 08:54:34 +0530, Amit Kapila wrote:
> > The only point for exposing a different tuple format via plugin was a
> > performance which I think can be addressed if we expose via slots.  I
> > don't want to take up exposing slots instead of tuples for plugins as
> > part of this project and I think if we want to go with that, it is
> > better done as part of pluggable API?
>
> No, I don't think it makes sense to address this is as part of
> pluggable storage. That patchset is already way too invasive and
> large.
>

Fair enough.  I think that for now (and maybe for the first version
that can be committed) we might want to use heap tuple format.  There
will be some overhead but I think code-wise, things will be simpler.
I have prototyped it for Insert and Delete operations of zheap, and the
only things required are new decode functions; see the attached
patch.  I have done very minimal testing of this patch as this is just
to show you and others the direction we are taking (w.r.t tuple
format) to support logical decoding in zheap.

Thanks for the feedback, further thoughts are welcome!

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


decode_zops_3.patch
Description: Binary data


Re: Delay locking partitions during query execution

2019-01-12 Thread David Rowley
On Tue, 4 Dec 2018 at 00:42, David Rowley  wrote:
> Over here and along similar lines to the above, but this time I'd like
> to take this even further and change things so we don't lock *any*
> partitions during AcquireExecutorLocks() and instead just lock them
> when we first access them with ExecGetRangeTableRelation().

I've attached a rebase version of this. The previous version
conflicted with some changes made in b60c397599.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


v2-0001-Allow-lock-acquisitions-for-partitions-to-be-dela.patch
Description: Binary data


Re: speeding up planning with partitions

2019-01-12 Thread David Rowley
On Sat, 12 Jan 2019 at 02:00, Amit Langote
 wrote:
> That's an oversight.  Fixed by making add_child_rel_equivalences do this:
>
> +cur_ec->ec_relids = bms_difference(cur_ec->ec_relids,
> +   parent_rel->relids);
> +cur_ec->ec_relids = bms_add_members(cur_ec->ec_relids,
> +child_rel->relids);
>
> >
> > UPDATE: I see you're likely leaving this alone since you're only doing
> > a shallow copy of the eclasses in
> > adjust_inherited_target_child_root(). It seems like a pretty bad idea
> > to do a shallow copy there.
>
> So, you're talking about this code:
>
> /*
>  * Child root should get its own copy of ECs, because they'll be modified
>  * to replace parent EC expressions by child expressions in
>  * add_child_rel_equivalences.
>  */
> subroot->eq_classes = NIL;
> foreach(lc, root->eq_classes)
> {
> EquivalenceClass *ec = lfirst(lc);
> EquivalenceClass *new_ec = makeNode(EquivalenceClass);
>
> memcpy(new_ec, ec, sizeof(EquivalenceClass));
> new_ec->ec_members = list_copy(ec->ec_members);
> subroot->eq_classes = lappend(subroot->eq_classes, new_ec);
> }
>
> Can you say what you think is wrong with this way of making a copy of the ECs?

If you shallow copy with memcpy(new_ec, ec,
sizeof(EquivalenceClass));, then fields such as ec_relids will just
point to the same memory as the parent PlannerInfo's
EquivalenceClasses, so when you do your adjustment (as above) on the
child eclasses, you'll modify memory belonging to the parent. I'd
assume you'd not want to do that since you need to keep the parent
intact at least to make copies for other child rels.  You'd have
gotten away with it before since you performed a list_copy() and only
were deleting the parent's EMs with list_delete_cell() which was
modifying the copy, not the original list. Since you were missing the
alteration to ec_relids, then you might not have seen issues with the
shallow copy, but now that you are changing that field, are you not
modifying the ec_relids field that is still set in the parent's
PlannerInfo?  In this instance, you've sidestepped that issue by using
bms_difference() which creates a copy of the Bitmapset and modifies
that. I think you'd probably see issues if you tried to use
bms_del_members().  I think not doing the deep copy is going to give
other people making changes in this area a hard time in the future.
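
To be concrete, something along these lines (an untested sketch) would at least
stop the child ECs from sharing the parent's Bitmapset:

/* copy the fields the child root will go on to modify */
EquivalenceClass *new_ec = makeNode(EquivalenceClass);

memcpy(new_ec, ec, sizeof(EquivalenceClass));
new_ec->ec_members = list_copy(ec->ec_members);
new_ec->ec_relids = bms_copy(ec->ec_relids);
subroot->eq_classes = lappend(subroot->eq_classes, new_ec);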

> > 12. The following comment makes less sense now that you've modified
> > the previous paragraph:
> >
> > + * Fortunately, the UPDATE/DELETE target can never be the nullable side of 
> > an
> > + * outer join, so it's OK to generate the plan this way.
> >
> > This text used to refer to:
> >
> > but target inheritance has to be expanded at
> >  * the top.  The reason is that for UPDATE, each target relation needs a
> >  * different targetlist matching its own column set.  Fortunately,
> >  * the UPDATE/DELETE target can never be the nullable side of an outer join,
> >  * so it's OK to generate the plan this way.
> >
> > you no longer describe plan as being expanded from the top rather than
> > at the bottom, which IMO is what "this way" refers to.
>
> To be honest, I never quite understood what that line really means, which
> is why I kept it around.  What I do know though is that, even with the
> patch, we still generate subpaths for each child whose targetlist
> matches the child, so I think the part about "this way" that prompted
> someone to write that line still remains.  Does that make sense to you?

Not quite.  I thought that "OK to generate the plan this way" is
talking about expanding the children at the top of the plan, e.g.

explain (costs off) update listp set b = b + 1 from t where listp.a = t.a;
  QUERY PLAN
--
 Update on listp
   Update on listp1
   Update on listp2
   ->  Merge Join
 Merge Cond: (listp1.a = t.a)
 ->  Sort
   Sort Key: listp1.a
   ->  Seq Scan on listp1
 ->  Sort
   Sort Key: t.a
   ->  Seq Scan on t
   ->  Merge Join
 Merge Cond: (listp2.a = t.a)
 ->  Sort
   Sort Key: listp2.a
   ->  Seq Scan on listp2
 ->  Sort
   Sort Key: t.a
   ->  Seq Scan on t

i.e. each non-pruned partition gets joined to the other tables
(expanded from the top of the plan tree). The other tables appear once
for each child target relation.

instead of:

postgres=# explain (costs off) select * from listp inner join t on
listp.a = t.a;
  QUERY PLAN
--
 Merge Join
   Merge Cond: (t.a = listp1.a)
   ->  Sort
 Sort Key: t.a
 ->  Seq Scan on t
   ->  Sort
 Sort Key: listp1.a
 ->  Append
   ->  Seq Scan on listp1
   ->  Seq Scan on listp2

where