[HACKERS] Confusing comment in tidbitmap.c

2014-12-14 Thread David Rowley
The following comment above #define PAGES_PER_CHUNK in tidbitmap.c appears
to be incorrect:

"But we
 * also want PAGES_PER_CHUNK to be a power of 2 to avoid expensive integer
 * remainder operations.  So, define it like this:"

I don't quite follow this as it does nothing of the kind.
Check tbm_page_is_lossy() where we do: bitno = pageno % PAGES_PER_CHUNK;

Or am I missing something about the compiler optimizing that to: bitno =
pageno & 255; ?

Regards

David Rowley


Re: [HACKERS] speedup tidbitmap patch: cache page

2014-12-14 Thread David Rowley
On 23 October 2014 at 00:52, Teodor Sigaev  wrote:
>
> In specific workload postgres could spend a lot of time in
> tbm_add_tuples,  up to 50% of query time. hash_search call is expensive and
> called twice for each ItemPointer to insert. Suggested patch tries to cache
> last PagetableEntry pointer in hope that next ItemPointer will be on the
> same relation page.
>
> Patch is rather trivial and I don't believe that it could cause
> performance degradation. But it could increase performance on some workload.
>
> An example:
> # create extension btree_gin;
> # select (v / 10)::int4 as i into t from generate_series(1, 500) as v;
> # create index idx on t using gin (i);
> # set enable_seqscan  = off;
>
> # explain analyze select * from t where i >= 0;
> without patch: Execution time: 2427.692 ms
> with patch: Execution time: 2058.718 ms
>
> # explain analyze select * from t where i >= 100 and i <= 100;
> without patch: Execution time: 524.441 ms
> with patch: Execution time: 195.381 ms
>
>
This seems like quite a promising performance boost.

I've been having a look at this and I'm wondering about a certain scenario:

In tbm_add_tuples, if tbm_page_is_lossy() returns true for a given block,
and on the next iteration of the loop we have the same block again, have
you benchmarked any caching code to store whether tbm_page_is_lossy()
returned true for that block on the previous iteration of the loop? This
would save having to call tbm_page_is_lossy() again for the same block.
Or are you just expecting that tbm_page_is_lossy() returns true so rarely
that you'll end up caching the page most of the time, and gain by skipping
both hash lookups on the next loop, since page will be set in that case?

It would be nice to see a comment to explain why it might be a good idea to
cache the page lookup. Perhaps something like:

+   /*
+    * We'll cache this page as it's quite likely that on the next loop
+    * we'll be seeing the same page again.  This will save us from
+    * having to look the page up in the hashtable again.
+    */
+   page = tbm_get_pageentry(tbm, blk);


I also wondered if there might be a small slowdown in the case where the
index only finds 1 matching tuple. So I tried the following:

select v::int4 as i into t1 from generate_series(1, 500) as v;
create index t1_idx on t1 using gin (i);

select count(*) from t1 where i >= 0;

In this case tbm_add_tuples is called with ntids == 1, so there's only 1
loop performed per call. I wondered if the page == NULL check would add
much overhead to the function.

Times are in milliseconds:

Patch Master
2413.183 2339.979
2352.564 2428.025
2369.118 2359.498
2338.442 2350.974
2373.330 2400.942
2383.883 2342.461
2393.804 2359.49
2334.998 2359.884
2428.156 2520.842
2336.978 2356.995
avg. 2372.4456 2381.909 99.6%
med. 2371.224 2359.494 100.5%

It appears that if it does, then it's not very much.

Regards

David Rowley


Re: [HACKERS] Commitfest problems

2014-12-14 Thread Mark Cave-Ayland
On 13/12/14 09:37, Craig Ringer wrote:

>> Speaking as the originator of commitfests, they were *always* intended
>> to be a temporary measure, a step on the way to something else like
>> continuous integration.
> 
> I'd really like to see the project revisit some of the underlying
> assumptions that're being made in this discussion:
> 
> - Patches must be email attachments to a mailing list
> 
> - Changes must be committed by applying a diff
> 
> ... and take a look at some of the options a git-based workflow might
> offer, especially in combination with some of the tools out there that
> help track working branches, run CI, etc.
> 
> Having grown used to push/pull workflows with CI integration I find the
> PostgreSQL patch workflow very frustrating, especially for larger
> patches. It's particularly annoying to see a patch series squashed into
> a monster patch whenever it changes hands or gets rebased, because it's
> being handed around as a great honking diff not a git working branch.
> 
> Is it time to stop using git like CVS?

While not so active these days with PostgreSQL, I believe there is still
great scope for improving the workflow of reviewing and applying
patches. And while the commitfests were a good idea when they started,
it's obvious that in their current form they do not scale to meet the
current rate of development.

If I could name just one thing that I think would improve things it
would be submission of patches to the list in git format-patch format.
Why? Because it enables two things: 1) by definition patches are
"small-chunked" into individually reviewable pieces and 2) subject lines
allow list members to filter things that aren't relevant to them.

Here's an example of how the workflow works for the QEMU project which
I'm involved in, which also has similar issues in terms of numbers of
developers and supported platforms:


1) Developer submits a patch to the -devel list with git-format-patch

- For subsystems with listed maintainers in the MAINTAINERS file, the
maintainer must be CCd on the patchset (this is so that even if
developers decide to filter PATCH emails to the list, they still get the
CC copy)

- Patchsets must be iterative and fully bisectable, with clear comments
supplied in the commit message

- Any patchset with > 1 patch MUST have a cover letter

- All patches have a "Signed-off-by" indicating that the developer has
accepted the terms of the project license

- Each subject line should be in the format "[PATCH] subsystem: one line
comment" to enable people to determine its relevance


2) Other developers start reviewing patches

There are several mini-workflows that tend to occur here so I'll try and
give some popular examples.

- If a patch is a build fix, one of the core committers will verify and
apply directly to the master repository

- If a maintainer is happy with the whole patchset, they reply to the
cover letter with a "Reviewed-by" and a line stating which subtree the
patchset has been applied to. Maintainers add a "Signed-off-by" to all
patches accepted via their subtree.

- A maintainer may indicate that the patch itself is fine, but the
commit message is not clear/incorrect and should be changed before being
resubmitted

- Any member of the mailing list may reply/review an individual patch by
hitting "reply" in their mail client. Patch comments are included
inline. If a reviewer is happy with an individual patch, they reply to
the patch and add a "Reviewed-by" line; anyone can provide a review,
even if they are not a maintainer

- For complex patches, a maintainer may request that the patchset be
split into further patchsets. In general patchsets of > 20-30 patches
tend to be frowned upon, and will often immediately result in a reply
saying "please break this down into smaller chunks"

- A maintainer may reply and say that while the patchset in its final
form needs more work, some clean-ups introduced by the patch are ready
to be applied, e.g. please resubmit patches 1-4, 7 and 8 as a separate
patchset which can then be applied directly to a subtree


3) Patchset tidy-up and submission

After the initial review, new versions of the patchset are generally
reposted to the list within a few days.

- The patch version is included in the header with the same subject
line, e.g. "[PATCHv2] Add new feature"

- The cover letter contains a mini-changelog giving the changes between
v1 and v2 e.g.

v2:
  Fix spelling mistake pointed out by Peter
  Change lock ordering requested by Andreas

- "Reviewed-by" tags for individual patches sent to the list are
appended to the commit message for that patch before resubmission


Once a maintainer has accepted a patch into their subtree, they send a
pull request to the core maintainers to apply their trees to the main
repository. I appreciate that currently merge commits aren't used in the
PostgreSQL git repository so a pull request effectively becomes "rebase
this patchset onto master and push".

So why does this help 

Re: [HACKERS] Commitfest problems

2014-12-14 Thread Greg Sabino Mullane


> #2 is solved by my previous comments about giving the CFM/C the 
> authority. -Core could do that, they are in charge of release.

I don't think authority is the solution, or certainly not one that 
would work with an open source project like ours. What *would* 
work is to identify and fix the friction points that prevent 
people from joining, make the work harder than it needs to be, 
and make people stop reviewing. I could quickly identify a handful 
of things, chief among them the awful process of following 
archive links to gather up all the patches. We have git, let's use it 
as it was intended.

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201412141011
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Ali Akbar
>
> I ran a test using postgres-US.fo built in the PostgreSQL source tree,
> which is 38 MB, and ran
>
> select unnest(xpath('//fo:bookmark-title', b, array[array['fo',
> 'http://www.w3.org/1999/XSL/Format']])) from data;
>
> (Table contains one row only.)
>
> The timings were basically indistinguishable between the three code
> versions.
>
> I'll try to reproduce your test.  How big is your file?  Do you have a
> link to the actual file?  Could you share your load script?
>

I use this xml sample:
http://w3.energistics.org/schema/prodml_v1.2.0_data/XML_Examples/productVolume_no_xsl.xml

Basically I loaded the XML into table u 100 times. Load script attached.

Regards,
-- 
Ali Akbar


load_test.sql
Description: application/sql



Re: [HACKERS] pg_basebackup vs. Windows and tablespaces

2014-12-14 Thread Amit Kapila
On Sun, Dec 14, 2014 at 11:54 AM, Amit Kapila 
wrote:
>
> On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane  wrote:
> >
> > Andrew Dunstan  writes:
> > > On 11/20/2014 02:27 AM, Amit Kapila wrote:
> > >> Now the part where I would like to receive feedback before revising the
> > >> patch is on the coding style.  It seems to me from Tom's comments that
> > >> he is not happy with the code, now I am not sure which part of the patch
> > >> he thinks needs change.  Tom if possible, could you be slightly more
> > >> specific about your concern w.r.t code?
> >
> > > In view of the request above for comments from Tom, I have moved this
> > > back to "Needs Review".
> >
> > Sorry, I was not paying very close attention to this thread and missed
> > the request for comments.  A few such:
> >
> > 1. The patch is completely naive about what might be in the symlink
> > path string; eg embedded spaces in the path would break it.  On at
> > least some platforms, newlines could be in the path as well.  I'm not
> > sure about how to guard against this while maintaining human readability
> > of the file.
>
> I will look into this and see what best can be done.
>

One way to deal with this could be to append a delimiter (one that is
not allowed in a tablespace path, such as a quote (\')) at the end of
the tablespace path while writing it to the symlink label file, and
then use that as an end marker while reading it back from the file.
I think that might defeat the human-readable aspect of this file to
an extent, but I am not sure it is too important to keep it human
readable.  I think even if we want to provide some information to the
user from internal files, it is always better to provide it via some
utility/tool.

Do we support newline in tablespace path?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Commitfest problems

2014-12-14 Thread Craig Ringer
On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote:
> Compare this to say, for example, huge patches such as RLS.

I specifically objected to that being flattened into a single monster
patch when I saw that'd been done. If you look at my part in the work on
the row security patch, while I was ultimately unsuccessful in getting
the patch mergeable I spent quite a bit of time splitting it up into a
logical patch-series for sane review and development. I am quite annoyed
that it was simply flattened back into an unreviewable, hard-to-follow
blob and committed in that form.

It's not like development on a patch series is difficult. You commit
small fixes and changes, then you 'git rebase -i' and reorder them to
apply to the appropriate changesets. Or you can do a 'rebase -i' and in
'e'dit mode make amendments to individual commits. Or you can commit
'fixup!'s that get auto-squashed.

This is part of my grumbling about the use of git like it's still CVS.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Commitfest problems

2014-12-14 Thread Craig Ringer
On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote:

> If I could name just one thing that I think would improve things it
> would be submission of patches to the list in git format-patch format.
> Why? Because it enables two things: 1) by definition patches are
> "small-chunked" into individually reviewable pieces and 2) subject lines
> allow list members to filter things that aren't relevant to them.

Personally I do just that. Even though the official docs say context
diff is preferred, I think most people now post unified diffs in
practice, straight from 'git diff'.

I'm surprised most people don't use git format-patch.

Note that we can't just "git am" a patch from git format-patch at the
moment, because policy is that the committer should be the Author field.
The project would have to define a transition point where we started
using the git Author field for the author and the separate Committer
field, like git-am does by default. I'm not sure that'll ever actually
happen, or that it's even desirable.

> - Patchsets must be iterative and fully bisectable, with clear comments
> supplied in the commit message

Fully support this one.

Also in the smallest reasonable size divisions.

> - If a maintainer is happy with the whole patchset, they reply to the
> cover letter with a "Reviewed-by" and a line stating which subtree the
> patchset has been applied to. Maintainers add a "Signed-off-by" to all
> patches accepted via their subtree.

Sounds a lot like the kernel workflow (though in practice it seems like
the kernel folks tend to bypass all this mailing list guff and just use
direct pulls/pushes for lots of things).

> - Any member of the mailing list may reply/review an individual patch by
> hitting "reply" in their mail client. Patch comments are included
> inline. If a reviewer is happy with an individual patch, they reply to
> the patch and add a "Reviewed-by" line; anyone can provide a review,
> even if they are not a maintainer

This is "fun" with many mail clients that like to linewrap text bodies
and don't permit inline replies to attached text.

> 6) Smaller patches are applied more often
> 
> By breaking large patches down into smaller chunks, even if the main
> feature itself isn't ready to be applied to the main repository then a
> lot of the preceding patches laying down the groundwork can.

I think PostgreSQL does OK on smaller patches - at least sometimes.
There can be a great deal of bikeshedding and endless arguments,
back-and-forth about utility, in-tree users, etc, but no formal process
is ever going to change that. The process of getting an uncontroversial
patch into Pg is generally painless, if often rather slow unless a
committer sees it and commits it immediately upon it being posted.

> The benefits here are that people with large out-of-tree patches 

I'm not sure we have all that many people in that category - though I'm
a part of a team that most certainly does fit the bill.

Even the "small" patches for BDR have been challenging and often
contentious though.

> Since as large features get
> implemented as a series of smaller patches

A big barrier here is the "we must have in-tree users" thing, which
makes it hard to do iterative work on a finer grained scale.

Some C-level unit tests that let us exercise small units of
functionality might ease those complaints. Right now the closest we have
to that is contrib/test_[blah] which is only useful for public API and
even then quite limited.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Custom timestamp format in logs

2014-12-14 Thread Magnus Hagander
On Sat, Dec 13, 2014 at 3:50 PM, Michael Paquier 
wrote:
>
> Hi all,
>
> This week, we heard about a user willing to use a custom timestamp
> format across a set of services to improve the debugability of the
> whole set, Postgres being one of them. Unfortunately datestyle does
> not take into account the logs. Would it be worth adding a new GUC
> able to control the timestamp format in the logs?
>
> We could still redirect the logs with syslog and have a custom
> timestamp format there, but in the case of this particular user syslog
> was a no-go. Looking at the code, the timestamp format is now hardcoded in
> setup_formatted_log_time and setup_formatted_start_time when calling
> pg_strftime @ elog.c, so it doesn't seem to be too complicated to do.
>
> Opinions? This thread is here for that.
>

A separate GUC seems kind of weird. Wouldn't it be better with something
like %(format)t or such in the log_line_prefix itself in that case? That
could also be expanded to other parameters, should we need them?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Custom timestamp format in logs

2014-12-14 Thread Tom Lane
Magnus Hagander  writes:
> On Sat, Dec 13, 2014 at 3:50 PM, Michael Paquier 
> wrote:
>> This week, we heard about a user willing to use a custom timestamp
>> format across a set of services to improve the debugability of the
>> whole set, Postgres being one of them. Unfortunately datestyle does
>> not take into account the logs. Would it be worth adding a new GUC
>> able to control the timestamp format in the logs?

> A separate GUC seems kind of weird. Wouldn't it be better with something
> like %(format)t or such in the log_line_prefix itself in that case? That
> could also be expanded to other parameters, should we need them?

TBH, my answer to the rhetorical question is "no".  There is nothing
weird about the timestamps %t emits now, and no reason why they should
need to be configurable, except that somebody thinks it's easier to
lobby us to complicate our software than to fix whatever they have that
can't consume standard timestamp format.

regards, tom lane




Re: [HACKERS] Commitfest problems

2014-12-14 Thread Mark Cave-Ayland
On 14/12/14 15:51, Craig Ringer wrote:

> On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote:
>> Compare this to say, for example, huge patches such as RLS.
> 
> I specifically objected to that being flattened into a single monster
> patch when I saw that'd been done. If you look at my part in the work on
> the row security patch, while I was ultimately unsuccessful in getting
> the patch mergeable I spent quite a bit of time splitting it up into a
> logical patch-series for sane review and development. I am quite annoyed
> that it was simply flattened back into an unreviewable, hard-to-follow
> blob and committed in that form.
> 
> It's not like development on a patch series is difficult. You commit
> small fixes and changes, then you 'git rebase -i' and reorder them to
> apply to the appropriate changesets. Or you can do a 'rebase -i' and in
> 'e'dit mode make amendments to individual commits. Or you can commit
> 'fixup!'s that get auto-squashed.
> 
> This is part of my grumbling about the use of git like it's still CVS.

I just want to add that I'm always grateful for the time that yourself
and all committers put into PostgreSQL, and my example of RLS was really
to signify "big feature X" rather than to pick on a particular aspect of
that patch. I apologise if you thought I was in any way criticising the
work that you, or anyone else, put into this feature.

The argument I wanted to make is that if someone starts seeing a problem
with a current build of PostgreSQL and git bisects it down to a
particular commit, then smaller, iterative commits are extremely helpful
in this regard.

When searching for a needle in a haystack, there is a very big difference
between an "add big feature X" haystack and a "change struct alignment
for Y" haystack, the latter being implicit in having smaller,
iterative patchsets.


ATB,

Mark.





Re: [HACKERS] Commitfest problems

2014-12-14 Thread Tom Lane
Craig Ringer  writes:
> On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote:
>> Compare this to say, for example, huge patches such as RLS.

> I specifically objected to that being flattened into a single monster
> patch when I saw that'd been done. If you look at my part in the work on
> the row security patch, while I was ultimately unsuccessful in getting
> the patch mergeable I spent quite a bit of time splitting it up into a
> logical patch-series for sane review and development. I am quite annoyed
> that it was simply flattened back into an unreviewable, hard-to-follow
> blob and committed in that form.

TBH, I'm not really on board with this line of argument.  I don't find
broken-down patches to be particularly useful for review purposes.  An
example I was just fooling with this week is the GROUPING SETS patch,
which was broken into three sections for no good reason at all.  (The
fourth and fifth subpatches, being alternative solutions to one problem,
are in a different category of course.)  Too often, decisions made in
one subpatch don't make any sense until you see the larger picture.

Also, speaking of the larger picture: the current Postgres revision
history amounts to 37578 commits (as of sometime yesterday) --- and that's
just in the HEAD branch.  If we'd made an effort to break feature patches
into bite-size chunks like you're recommending here, we'd probably have
easily half a million commits in the mainline history.  That would not be
convenient to work with, and I really doubt that it would be more useful
for "git bisect" purposes, and I'll bet a large amount of money that most
of them would not have had commit messages composed with any care at all.

regards, tom lane




Re: [HACKERS] Confusing comment in tidbitmap.c

2014-12-14 Thread Tom Lane
David Rowley  writes:
> The following comment above #define PAGES_PER_CHUNK in tidbitmap.c appears
> to be incorrect:

> "But we
>  * also want PAGES_PER_CHUNK to be a power of 2 to avoid expensive integer
>  * remainder operations.  So, define it like this:"

> I don't quite follow this as it does nothing of the kind.
> Check tbm_page_is_lossy() where we do: bitno = pageno % PAGES_PER_CHUNK;

> Or am I missing something about the compiler optimizing that to: bitno =
> pageno & 255; ?

Exactly.  Any C compiler ever written will do that.

regards, tom lane




Re: [HACKERS] proposal: plpgsql - Assert statement

2014-12-14 Thread Pavel Stehule
Hi

any comments to last proposal and patch?

Pavel

2014-11-26 19:52 GMT+01:00 Pavel Stehule :
>
> Hi
>
> 2014-11-26 16:46 GMT+01:00 Pavel Stehule :
>
>>
>>
>> 2014-11-26 13:31 GMT+01:00 Marko Tiikkaja :
>>
>>> On 11/26/14 8:55 AM, Pavel Stehule wrote:
>>>
 * should be assertions globally enabled/disabled? - I have no personal
 preference in this question.

>>>
>>> I think so.  The way I would use this function is to put expensive
>>> checks into strategic locations which would only run when developing
>>> locally (and additionally perhaps in one of the test environments.)  And in
>>> that case I'd like to globally disable them for the live environment.
>>>
>>
>> ok
>>
>>
>>>
>>>  * can be ASSERT exception handled ? - I prefer this be unhandled
 exception
 - like query_canceled because It should not be masked by plpgsql
 EXCEPTION
 WHEN OTHERS ...

>>>
>>> I don't care much either way, as long as we get good information about
>>> what went wrong.  A stack trace and hopefully something like
>>> print_strict_params for parameters to the "expr".
>>>
>>
>> There is more ways, I can live with both
>>
>
> here is proof concept
>
> what do you think about it?
>
> Regards
>
> Pavel
>
>
>>
>> Pavel
>>
>>
>>
>>>
>>>
>>> .marko
>>>
>>
>>
>


Re: [HACKERS] Commitfest problems

2014-12-14 Thread Mark Cave-Ayland
On 14/12/14 15:57, Craig Ringer wrote:

> On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote:
> 
>> If I could name just one thing that I think would improve things it
>> would be submission of patches to the list in git format-patch format.
>> Why? Because it enables two things: 1) by definition patches are
>> "small-chunked" into individually reviewable pieces and 2) subject lines
>> allow list members to filter things that aren't relevant to them.
> 
> Personally I do just that. Even though the official docs say context
> diff is preferred, I think most people now post unified diffs in
> practice, straight from 'git diff'.
> 
> I'm surprised most people don't use git format-patch.
> 
> Note that we can't just "git am" a patch from git format-patch at the
> moment, because policy is that the committer should be the Author field.
> The project would have to define a transition point where we started
> using the git Author field for the author and the separate Committer
> field, like git-am does by default. I'm not sure that'll ever actually
> happen, or that it's even desirable.

Sure, I appreciate that. The part I particularly wanted to emphasise
here was the use of a system identifier as part of the subject line,
e.g. say if there was a hypothetical patchset which sped up PostgreSQL
by 20% then you could see subjects like this:

[PATCH 1/x] lmgr: reduce lock strength for LWLock
[PATCH 2/x] wal: change lock ordering for WAL writes
[PATCH 3/x] optimiser: exit early if lock unobtainable

While these are obviously unrealistic, what I hope here is that people
with interest in these areas would take note of individual patches even
if they aren't interested in the entire patchset.

So maybe since Andreas did some recent locking work, he spots "lmgr" in
the subject and then makes a note to review that part of the patch.
Similarly Heikki could spot "wal" and make a note to look at that, whilst
Tom would zero in on the optimiser changes.

The aim here is that no one person needs to sit and initially review a
complete patch before returning feedback to the developer.

>> - Patchsets must be iterative and fully bisectable, with clear comments
>> supplied in the commit message
> 
> Fully support this one.
> 
> Also in the smallest reasonable size divisions.

I should add here that the QEMU folk do tend to go to great lengths to
preserve bisectability; often intermediate compatibility APIs are
introduced early in the patchset and then removed at the very end when
the final feature is implemented.

>> - If a maintainer is happy with the whole patchset, they reply to the
>> cover letter with a "Reviewed-by" and a line stating which subtree the
>> patchset has been applied to. Maintainers add a "Signed-off-by" to all
>> patches accepted via their subtree.
> 
> Sounds a lot like the kernel workflow (though in practice it seems like
> the kernel folks tend to bypass all this mailing list guff and just use
> direct pulls/pushes for lots of things).

Yes indeed - mainly from the people on the KVM module side. Again I'm
not saying that this workflow is correct for PostgreSQL, I was trying to
use this an example to explain *why* the workflow is done in this manner
and how it helps developers such as myself.

>> - Any member of the mailing list may reply/review an individual patch by
>> hitting "reply" in their mail client. Patch comments are included
>> inline. If a reviewer is happy with an individual patch, they reply to
>> the patch and add a "Reviewed-by" line; anyone can provide a review,
>> even if they are not a maintainer
> 
> This is "fun" with many mail clients that like to linewrap text bodies
> and don't permit inline replies to attached text.

Wow really? I can't say I've seen this in practice for a long time but I
defer to your experience here.

>> 6) Smaller patches are applied more often
>>
>> By breaking large patches down into smaller chunks, even if the main
>> feature itself isn't ready to be applied to the main repository then a
>> lot of the preceding patches laying down the groundwork can.
> 
> I think PostgreSQL does OK on smaller patches - at least sometimes.
> There can be a great deal of bikeshedding and endless arguments,
> back-and-forth about utility, in-tree users, etc, but no formal process
> is ever going to change that. The process of getting an uncontroversial
> patch into Pg is generally painless, if often rather slow unless a
> committer sees it and commits it immediately upon it being posted.

I agree that trivial patches do tend to get picked up reasonably
quickly. It strikes me that it might prevent patches from slipping
through the cracks if someone were officially nominated as a trivial
patch monitor, i.e. someone for developers to "ping" if their patch has
been ignored, but it sounds like this is not such an issue in practice.

>> The benefits here are that people with large out-of-tree patches 
> 
> I'm not sure we have all that many people in that category - though I'm
> a part of a team that most certainly d

Re: [HACKERS] Commitfest problems

2014-12-14 Thread Andrew Dunstan


On 12/14/2014 12:05 PM, Tom Lane wrote:

> Craig Ringer  writes:
>> On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote:
>>> Compare this to say, for example, huge patches such as RLS.
>>
>> I specifically objected to that being flattened into a single monster
>> patch when I saw that'd been done. If you look at my part in the work on
>> the row security patch, while I was ultimately unsuccessful in getting
>> the patch mergeable I spent quite a bit of time splitting it up into a
>> logical patch-series for sane review and development. I am quite annoyed
>> that it was simply flattened back into an unreviewable, hard-to-follow
>> blob and committed in that form.
>
> TBH, I'm not really on board with this line of argument.  I don't find
> broken-down patches to be particularly useful for review purposes.  An
> example I was just fooling with this week is the GROUPING SETS patch,
> which was broken into three sections for no good reason at all.  (The
> fourth and fifth subpatches, being alternative solutions to one problem,
> are in a different category of course.)  Too often, decisions made in
> one subpatch don't make any sense until you see the larger picture.
>
> Also, speaking of the larger picture: the current Postgres revision
> history amounts to 37578 commits (as of sometime yesterday) --- and that's
> just in the HEAD branch.  If we'd made an effort to break feature patches
> into bite-size chunks like you're recommending here, we'd probably have
> easily half a million commits in the mainline history.  That would not be
> convenient to work with, and I really doubt that it would be more useful
> for "git bisect" purposes, and I'll bet a large amount of money that most
> of them would not have had commit messages composed with any care at all.


I have tried to stay away from this thread, but ...

I'm also quite dubious about this suggested workflow, partly for the 
reasons Tom gives, and partly because it would constrain the way I work. 
I tend to commit with little notes to myself in the commit logs, notes 
that are never intended to become part of the public project history. I 
should be quite sad to lose that.


As for using git bisect, usually when I do this each iteration is quite 
expensive. Multiplying the number of commits by a factor between 10 and 
100, which is what I think this would involve, would just make git 
bisect have to do between 3 and 7 more iterations, ISTM. That's not a win.
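The arithmetic behind that estimate: git bisect needs roughly log2(N) test runs, so multiplying the commit count by a factor k adds about log2(k) runs regardless of N. A quick sketch (function name is just illustrative):

```python
import math

def extra_bisect_steps(multiplier: int) -> float:
    # bisect takes ~log2(N) iterations; scaling the history by
    # `multiplier` adds log2(multiplier) iterations, independent of N.
    return math.log2(multiplier)

print(round(extra_bisect_steps(10), 1))   # ~3.3 extra iterations
print(round(extra_bisect_steps(100), 1))  # ~6.6 extra iterations
```

which is where the "between 3 and 7 more iterations" figure comes from.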


On the larger issue, let me just note that I don't believe we have what 
is fundamentally a technological problem, and while technological 
changes can of course sometimes make things easier, they can also blind 
us to the more basic problems we are facing.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup vs. Windows and tablespaces

2014-12-14 Thread Alvaro Herrera
Amit Kapila wrote:

> One way to deal with this could be to append a delimiter(which is not
> allowed
> in tablespace path like quote (\')) at the end of tablespace path while
> writing the same to symlink label file and then use that as end marker while
> reading it from file.

Some GNU tools such as xargs and find use a null char as item delimiter;
see find -print0 and xargs -0.  IIRC one of our tools also allows that
(psql?).  Doing the same here would make human reading a bit more
difficult, but not completely impossible.
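The -print0/-0 convention amounts to using a NUL byte as the record separator, which is safe precisely because NUL can never appear inside a Unix path. A minimal sketch of consuming such a stream (the example paths are made up):

```python
# Parse a NUL-delimited stream, as produced by `find -print0`.
# Any other byte -- spaces, quotes, even newlines -- can appear
# inside an item without ambiguity.
def split_nul(stream: bytes) -> list[bytes]:
    # A trailing NUL terminates the last item rather than starting
    # an empty one, so strip it before splitting.
    if stream.endswith(b"\0"):
        stream = stream[:-1]
    return stream.split(b"\0") if stream else []

items = split_nul(b"/tmp/a b\0/tmp/it's\0/tmp/line\nbreak\0")
print(items)  # three paths, with the newline and quote intact
```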

-- 
Álvaro Herrera         http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Custom timestamp format in logs

2014-12-14 Thread Alvaro Herrera
Tom Lane wrote:
> Magnus Hagander  writes:
> > On Sat, Dec 13, 2014 at 3:50 PM, Michael Paquier 
> > wrote:
> >> This week, we heard about a user willing to use a custom timestamp
> >> format across a set of services to improve the debugability of the
> >> whole set, Postgres being one of them. Unfortunately datestyle does
> >> not take into account the logs. Would it be worth adding a new GUC
> >> able to control the timestamp format in the logs?
> 
> > A separate GUC seems kind of weird. Wouldn't it be better with something
> > like %(format)t or such in the log_line_prefix itself in that case? That
> > could also be expanded to other parameters, should we need them?
> 
> TBH, my answer to the rhetorical question is "no".  There is nothing
> weird about the timestamps %t emits now, and no reason why they should
> need to be configurable, except that somebody thinks it's easier to
> lobby us to complicate our software than to fix whatever they have that
> can't consume standard timestamp format.

I imagine pgBadger/pgFouine wouldn't be happy with the timestamp being
infinitely configurable.

-- 
Álvaro Herrera         http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] proposal: plpgsql - Assert statement

2014-12-14 Thread Alvaro Herrera
Pavel Stehule wrote:
> Hi
> 
> any comments to last proposal and patch?

WTH is that pstrdup("hint is null") thing?  Debugging leftover?

-- 
Álvaro Herrera         http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Commitfest problems

2014-12-14 Thread Mark Cave-Ayland
On 14/12/14 17:05, Tom Lane wrote:

> Craig Ringer  writes:
>> On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote:
>>> Compare this to say, for example, huge patches such as RLS.
> 
>> I specifically objected to that being flattened into a single monster
>> patch when I saw that'd been done. If you look at my part in the work on
>> the row security patch, while I was ultimately unsuccessful in getting
>> the patch mergeable I spent quite a bit of time splitting it up into a
>> logical patch-series for sane review and development. I am quite annoyed
>> that it was simply flattened back into an unreviewable, hard-to-follow
>> blob and committed in that form.
> 
> TBH, I'm not really on board with this line of argument.  I don't find
> broken-down patches to be particularly useful for review purposes.  An
> example I was just fooling with this week is the GROUPING SETS patch,
> which was broken into three sections for no good reason at all.  (The
> fourth and fifth subpatches, being alternative solutions to one problem,
> are in a different category of course.)  Too often, decisions made in
> one subpatch don't make any sense until you see the larger picture.

The way in which this is normally handled is via the cover letter, which
is going to be nicely captured in the mail archives; typically a cover
letter for a patchset consists of several paragraphs along the lines of
"patches 1-3 do some re-org work, patch 4 reworks the Y API for new
feature X implemented in patches 9-12".

As an example take a look at the cover letter for this patch submitted
over the past few days:
http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg01990.html. It
seems to me that this would give the level of detail which you are
looking for.

I agree that definitely in some cases patches could be broken into "too
fine" pieces, but then I think it's perfectly okay for committers to
turn around and request the series be re-submitted in more sensible
chunks if they feel things have gone too far the other way.

> Also, speaking of the larger picture: the current Postgres revision
> history amounts to 37578 commits (as of sometime yesterday) --- and that's
> just in the HEAD branch.  If we'd made an effort to break feature patches
> into bite-size chunks like you're recommending here, we'd probably have
> easily half a million commits in the mainline history.  That would not be
> convenient to work with, and I really doubt that it would be more useful
> for "git bisect" purposes, and I'll bet a large amount of money that most
> of them would not have had commit messages composed with any care at all.

Having worked on a few kernel patches myself, git is surprisingly
effective on huge repositories once the cache is warmed up, so I don't
feel this would be an issue with a PostgreSQL git repository, which will
be many orders of magnitude smaller.

In terms of commit messages, I don't know if you missed that part of my
original response to Craig but it's considered normal for a maintainer
to reject a patch because of an incorrect/irrelevant commit message.

So if I submitted a patch that fixed a particular bug but you as a
committer weren't happy with the commit message, then I'm perfectly okay
with you asking me to resubmit with updated/corrected patch message wording.


ATB,

Mark.





Re: [HACKERS] Custom timestamp format in logs

2014-12-14 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> TBH, my answer to the rhetorical question is "no".  There is nothing
>> weird about the timestamps %t emits now, and no reason why they should
>> need to be configurable, except that somebody thinks it's easier to
>> lobby us to complicate our software than to fix whatever they have that
>> can't consume standard timestamp format.

> I imagine pgBadger/pgFouine wouldn't be happy with the timestamp being
> infinitely configurable.

Yeah, if the repercussions were confined to the server code that would be
one thing, but there are a number of external tools that would be affected
as well.  The "complication" cost has to be thought about that way rather
than evaluated in isolation.

You could argue that pgBadger et al could just document that they don't
support nonstandard timestamp formats ... but then it's really unclear why
we're shifting the complexity burden in this direction rather than asking
why the one proprietary application that wants the other thing can't cope
with the existing format choice.

regards, tom lane




Re: [HACKERS] Commitfest problems

2014-12-14 Thread Peter Geoghegan
On Sun, Dec 14, 2014 at 9:05 AM, Tom Lane  wrote:
> TBH, I'm not really on board with this line of argument.  I don't find
> broken-down patches to be particularly useful for review purposes.  An
> example I was just fooling with this week is the GROUPING SETS patch,
> which was broken into three sections for no good reason at all.  (The
> fourth and fifth subpatches, being alternative solutions to one problem,
> are in a different category of course.)  Too often, decisions made in
> one subpatch don't make any sense until you see the larger picture.

It sounds like they didn't use the technique effectively, then. I
think it will be useful to reviewers that I've broken out the
mechanism through which the ON CONFLICT UPDATE patch accepts the
EXCLUDED.* pseudo-alias into a separate commit (plus docs in a
separate commit, as well as tests) - it's a non-trivial additional
piece of code that it wouldn't be outrageous to omit in a release, and
isn't particularly anticipated by earlier cumulative commits. Maybe
people don't have a good enough sense of what sort of subdivision is
appropriate yet. I think that better style will emerge over time.

Of course, if you still prefer a monolithic commit, it's pretty easy
to produce one from a patch series. It's not easy to go in the other
direction, though.
-- 
Peter Geoghegan




Re: [HACKERS] Commitfest problems

2014-12-14 Thread Mark Cave-Ayland
On 14/12/14 17:30, Andrew Dunstan wrote:

> On 12/14/2014 12:05 PM, Tom Lane wrote:
>> Craig Ringer  writes:
>>> On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote:
 Compare this to say, for example, huge patches such as RLS.
>>> I specifically objected to that being flattened into a single monster
>>> patch when I saw that'd been done. If you look at my part in the work on
>>> the row security patch, while I was ultimately unsuccessful in getting
>>> the patch mergeable I spent quite a bit of time splitting it up into a
>>> logical patch-series for sane review and development. I am quite annoyed
>>> that it was simply flattened back into an unreviewable, hard-to-follow
>>> blob and committed in that form.
>> TBH, I'm not really on board with this line of argument.  I don't find
>> broken-down patches to be particularly useful for review purposes.  An
>> example I was just fooling with this week is the GROUPING SETS patch,
>> which was broken into three sections for no good reason at all.  (The
>> fourth and fifth subpatches, being alternative solutions to one problem,
>> are in a different category of course.)  Too often, decisions made in
>> one subpatch don't make any sense until you see the larger picture.
>>
>> Also, speaking of the larger picture: the current Postgres revision
>> history amounts to 37578 commits (as of sometime yesterday) --- and
>> that's
>> just in the HEAD branch.  If we'd made an effort to break feature patches
>> into bite-size chunks like you're recommending here, we'd probably have
>> easily half a million commits in the mainline history.  That would not be
>> convenient to work with, and I really doubt that it would be more useful
>> for "git bisect" purposes, and I'll bet a large amount of money that most
>> of them would not have had commit messages composed with any care at all.
> 
> I have tried to stay away from this thread, but ...
> 
> I'm also quite dubious about this suggested workflow, partly for the
> reasons Tom gives, and partly because it would constrain the way I work.
> I tend to commit with little notes to myself in the commit logs, notes
> that are never intended to become part of the public project history. I
> should be quite sad to lose that.

Interestingly enough, I tend to work in a very similar way to this. When
submitting patches upstream, I tend to rebase on a new branch and then
squash/rework as required.

One thing I do tend to find is that once I start rebasing the patches
for upstream, I find that many of my personal notes can be rewritten as
part of the patch comment (I would like to think that comments useful to
myself will be useful to other developers some day). Once the rebase is
complete, often I find that I have no non-public comments left so that
problem becomes almost non-existent.

> As for using git bisect, usually when I do this each iteration is quite
> expensive. Multiplying the number of commits by a factor between 10 and
> 100, which is what I think this would involve, would just make git
> bisect have to do between 3 and 7 more iterations, ISTM. That's not a win.

I find that this isn't too bad in practice. If you think about
monolithic patches during a commitfest, you can imagine that most of
them will touch one of the core .h files which will require most things
to be rebuilt once applied during bisection.

With smaller iterative patchsets, each patch tends to only touch a
handful of files and so "make install" generally only needs to rebuild
and link a very small number of files which is reasonably quick. Since
all of the changes for global header files are contained in 1-2 patches
then the number of complete rebuilds isn't as many as you might expect.

> On the larger issue, let me just note that I don't believe we have what
> is fundamentally a technological problem, and while technological
> changes can of course sometimes make things easier, they can also blind
> us to the more basic problems we are facing.

Absolutely. I firmly believe that the right tools for the job are
available, it's really trying to find a workflow solution that works
well for everyone involved in the project.


ATB,

Mark.





Re: [HACKERS] proposal: plpgsql - Assert statement

2014-12-14 Thread Pavel Stehule
2014-12-14 18:57 GMT+01:00 Alvaro Herrera :
>
> Pavel Stehule wrote:
> > Hi
> >
> > any comments to last proposal and patch?
>
> WTH is that pstrdup("hint is null") thing?  Debugging leftover?
>

No, it is just a poorly worded error message. I propose the syntax ASSERT
boolean_expression [, text expression ] -- the text expression is a >>hint<<
for assertion debugging.

This text expression should not be null when it is used. I am not sure
what the right behaviour is - so when the assertion fails and the text
expression is null, I use the message "hint is null" as the hint.

Regards

Pavel


>
> --
> Álvaro Herrera         http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] Commitfest problems

2014-12-14 Thread Mark Cave-Ayland
On 14/12/14 18:24, Peter Geoghegan wrote:

> On Sun, Dec 14, 2014 at 9:05 AM, Tom Lane  wrote:
>> TBH, I'm not really on board with this line of argument.  I don't find
>> broken-down patches to be particularly useful for review purposes.  An
>> example I was just fooling with this week is the GROUPING SETS patch,
>> which was broken into three sections for no good reason at all.  (The
>> fourth and fifth subpatches, being alternative solutions to one problem,
>> are in a different category of course.)  Too often, decisions made in
>> one subpatch don't make any sense until you see the larger picture.
> 
> It sounds like they didn't use the technique effectively, then. I
> think it will be useful to reviewers that I've broken out the
> mechanism through which the ON CONFLICT UPDATE patch accepts the
> EXCLUDED.* pseudo-alias into a separate commit (plus docs in a
> separate commit, as well as tests) - it's a non-trivial additional
> piece of code that it wouldn't be outrageous to omit in a release, and
> isn't particularly anticipated by earlier cumulative commits. Maybe
> people don't have a good enough sense of what sort of subdivision is
> appropriate yet. I think that better style will emerge over time.

For me this is a great example of how to get more developers involved in
patch review. Imagine that I'm an experienced developer with little
previous exposure to PostgreSQL, but with a really strong flex/bison
background.

If someone posts a patch to the list as a single grouping sets patch,
then I'm going to look at that and think "I have no way of understanding
this" and do nothing with it.

However if it were posted as part of patchset with a subject of "[PATCH]
gram.y: add EXCLUDED pseudo-alias to bison grammar" then this may pique
my interest enough to review the changes to the grammar rules, with the
barrier for entry reduced to understanding just the PostgreSQL-specific
parts.

What should happen over time is that developers like this would earn the
trust of the committers, so committers can spend more time reviewing the
remaining parts of the patchset. And of course the project has now
engaged a new developer into the community who otherwise may not have
participated.

> Of course, if you still prefer a monolithic commit, it's pretty easy
> to produce one from a patch series. It's not easy to go in the other
> direction, though.

Agreed.


ATB,

Mark.





Re: [HACKERS] BRIN range operator class

2014-12-14 Thread Emre Hasegeli
> I thought we can do better than minmax for the inet data type,
> and ended up with a generalized opclass supporting both inet and range
> types.  Patch based on minmax-v20 attached.  It works well except
> a few small problems.  I will improve the patch and add into
> a commitfest after BRIN framework is committed.

I wanted to send a new version before the commitfest to get some
feedback, but it is still work in progress.  Patch attached, rebased
to the current HEAD.  This version supports more operators and the
box geometric data type.  Opclasses are renamed to inclusion_ops
to be more generic.  The problems I mentioned remain because I
couldn't solve them without touching the BRIN framework.

> To support more operators I needed to change amstrategies and
> amsupport on the catalog.  It would be nice if amsupport can be set
> to 0 like am strategies.

I think it would be nicer to get the functions from the operators
by using the strategy numbers instead of adding them directly as
support functions.  I looked around a bit but couldn't find
a sensible way to support it.  Is it possible without adding them
to the RelationData struct?

> Inet data types accept IP version 4 and version 6.  It isn't possible
> to represent union of addresses from different versions with a valid
> inet type.  So, I made the union function return NULL in this case.
> Then, I tried to store if returned value is NULL or not, in
> column->values[] as boolean, but it failed on the pfree() inside
> brin_dtuple_initilize().  It doesn't seem right to free the values
> based on attr->attbyval.

This problem remains.  There is also a similar problem with the
range types, namely empty ranges.  There should be special cases
for them in some of the strategies.  I tried to solve the problems
in several different ways, but got a segfault at one line or another.
This makes me think that the BRIN framework doesn't support storing
types other than the indexed column's type in the values array.
For example, brin_deform_tuple() iterates over the values array and
copies them using the length of the attr on the index, not the length
of the type defined by the OpcInfo function.  If storing other types
isn't supported, why is it required to return OIDs from the OpcInfo
function?  I am confused.

I didn't try to support geometric types other than box, as I couldn't
manage to store a different type in the values array, but it would
be nice to get some feedback about the overall design.  I was
thinking of adding a STORAGE parameter to the index to support other
geometric types.  I am not sure that adding a STORAGE parameter
to be used by the opclass implementation is the right way.  It
wouldn't be the actual thing that is stored by the index; it would be
an element in the values array.  Maybe data type specific opclasses
are the way to go, not a generic one as I am trying.


brin-inclusion-v02.patch
Description: Binary data



Re: [HACKERS] Commitfest problems

2014-12-14 Thread Craig Ringer
On 12/15/2014 02:46 AM, Mark Cave-Ayland wrote:
> Interestingly enough, I tend to work in a very similar way to this. When
> submitting patches upstream, I tend to rebase on a new branch and then
> squash/rework as required.

Same here, and I find it works really well ... when I do it properly.

I usually have a private development branch that's full of "fixup! "
commits, WIPs, awful commit messages, etc.

Then I have the tree that's been rebased, squashed, and tidied up.

I periodically tidy my latest work and replace the old clean tree with
it, then start my new development branch on top of the new clean tree.

This gives me a somewhat nice looking, well ordered patch series to work
on top of, while retaining the flexibility to do lots of quick fixes etc.

Admittedly, sometimes the development tree gets a bit large and it takes
a while before I give it a proper cleanup. My current project being very
much in that category. Still - it works well in general.

> I find that this isn't too bad in practice. If you think about
> monolithic patches during a commitfest, you can imagine that most of
> them will touch one of the core .h files which will require most things
> to be rebuilt once applied during bisection.

It's not build time, it's test-run time. Especially if you can't
automate the test, or it isn't practical to.

That's a legitimate concern - but I don't think we'd see a blowout of
patch counts to quite the same degree.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Commitfest problems

2014-12-14 Thread Mark Cave-Ayland
On 14/12/14 20:07, Craig Ringer wrote:

> On 12/15/2014 02:46 AM, Mark Cave-Ayland wrote:
>> Interestingly enough, I tend to work in a very similar way to this. When
>> submitting patches upstream, I tend to rebase on a new branch and then
>> squash/rework as required.
> 
> Same here, and I find it works really well ... when I do it properly.
> 
> I usually have a private development branch that's full of "fixup! "
> commits, WIPs, awful commit messages, etc.
> 
> Then I have the tree that's been rebased, squashed, and tided up.
> 
> I periodically tidy my latest work and replace the old clean tree with
> it, then start my new development branch on top of the new clean tree.
> 
> This gives me a somewhat nice looking, well ordered patch series to work
> on top of, while retaining the flexibility to do lots of quick fixes etc.
> 
> Admittedly, sometimes the development tree gets a bit large and it takes
> a while before I give it a proper cleanup. My current project being very
> much in that category. Still - it works well in general.

That describes my workflow fairly well too.

>> I find that this isn't too bad in practice. If you think about
>> monolithic patches during a commitfest, you can imagine that most of
>> them will touch one of the core .h files which will require most things
>> to be rebuilt once applied during bisection.
> 
> It's not build time, it's test-run time. Especially if you can't
> automate the test, or it isn't practical to.
> 
> That's a legitimate concern - but I don't think we'd see a blowout of
> patch counts to quite the same degree.

At the end of the day, each project finds its own level as to how
complex individual patches should be and what should comprise a
patchset. Over the past couple of years the QEMU process has evolved
into its current form as maintainers and developers have figured out
what works best for them, and I don't see that PostgreSQL would be any
different in this respect - it takes time to adapt to a new workflow and
get it right.

Having worked on various parts for PostGIS for around 10 years, I've had
my head stuck into various parts of the GiST code and have got to know a
few parts of it really well.

What I find frustrating is that I've come back from a workflow where
I've been reviewing/testing patches within months of joining a project
because the barrier for entry has been so low, and yet even with much
longer experience of the PostgreSQL codebase I feel I can't do the same
for patches submitted to the commitfest.

And why is this? It's because while I know very specific areas of the
code well, many patches span different areas of the source tree of which
I have little or no experience, which when supplied as a single
monolithic patch makes it impossible for me to give meaningful review to
all but a tiny proportion of them.

I believe that this is one of the main reasons why people struggle to
find patch reviewers, with the consequence being that the majority of
work falls back onto the CF manager and the committers which is why we
have the current situation.


ATB,

Mark.





Re: [HACKERS] Custom timestamp format in logs

2014-12-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 1:36 AM, Magnus Hagander  wrote:
> On Sat, Dec 13, 2014 at 3:50 PM, Michael Paquier 
> wrote:
>>
>> Hi all,
>>
>> This week, we heard about a user willing to use a custom timestamp
>> format across a set of services to improve the debugability of the
>> whole set, Postgres being one of them. Unfortunately datestyle does
>> not take into account the logs. Would it be worth adding a new GUC
>> able to control the timestamp format in the logs?
>>
>> We could still redirect the logs with syslog and have a custom
>> timestamp format there, but in the case of this particular user syslog
>> was a no-go. Looking at the code, timestamp format is now hardcoded in
>> setup_formatted_log_time and setup_formatted_start_time when calling
>> pg_strftime @ elog.c, so it doesn't seem to be much complicated to do.
>>
>> Opinions? This thread is here for that.
>
>
> A separate GUC seems kind of weird.
Check.

> Wouldn't it be better with something like %(format)t or such in the 
> log_line_prefix itself in that case?
> That could also be expanded to other parameters, should we need them?
Possible. I am not sure we will be able to make a new log_line_prefix
parameter as flexible for formatting as timestamps are, though.
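Whatever shape the escape took, it would presumably boil down to handing a user-supplied strftime-style pattern to the timestamp formatter, much as setup_formatted_log_time does with its hardcoded pattern today. A sketch of the idea (the %(format)t syntax is only Magnus's suggestion, and the names below are hypothetical):

```python
import time

# The stock %t escape emits something like "2014-12-14 18:30:00 UTC"
# (no milliseconds; %m adds them).  An equivalent strftime pattern:
STOCK_PATTERN = "%Y-%m-%d %H:%M:%S %Z"

def format_log_time(pattern: str, epoch: float) -> str:
    # A hypothetical %(format)t escape would presumably pass the
    # user-supplied pattern through to strftime like this.
    return time.strftime(pattern, time.gmtime(epoch))

print(format_log_time(STOCK_PATTERN, 1418581800.0))
```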
-- 
Michael




Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Ali Akbar
2014-12-14 22:18 GMT+07:00 Ali Akbar :
>
>
> I ran a test using postgres-US.fo built in the PostgreSQL source tree,
>> which is 38 MB, and ran
>>
>> select unnest(xpath('//fo:bookmark-title', b, array[array['fo',
>> 'http://www.w3.org/1999/XSL/Format']])) from data;
>>
>> (Table contains one row only.)
>>
>> The timings were basically indistinguishable between the three code
>> versions.
>>
>> I'll try to reproduce your test.  How big is your file?  Do you have a
>> link to the actual file?  Could you share your load script?
>>
>
> I use this xml sample:
> http://w3.energistics.org/schema/prodml_v1.2.0_data/XML_Examples/productVolume_no_xsl.xml
>
> Basically I loaded the XML into table u 100 times. Load script attached.
>

Peter, while reviewing the better-performing patch myself, I now think the
patch needs more work before it can be committed. The structuring of the
method will be confusing in the long term. I think I'll restructure the
patch in the next commitfest.

So i propose to break the patch:
1. We apply the current patch which uses xmlNodeCopy, so that the
long-standing bug will be fixed in postgres.
2. I'll work with the performance enhancement in the next commitfest.

Maybe for (2), the current better-performing patch can be viewed as PoC of
the expected performance.

Regards,
-- 
Ali Akbar


Re: [HACKERS] Custom timestamp format in logs

2014-12-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 3:16 AM, Tom Lane  wrote:
> You could argue that pgBadger et al could just document that they don't
> support nonstandard timestamp formats ... but then it's really unclear why
> we're shifting the complexity burden in this direction rather than asking
> why the one proprietary application that wants the other thing can't cope
> with the existing format choice.
Well, the opposite side can argue exactly the contrary with the user
hat on: why doesn't Postgres allow this kind of customization, knowing
that the other things running on my server can do it?
-- 
Michael




Re: [HACKERS] pg_basebackup vs. Windows and tablespaces

2014-12-14 Thread Tom Lane
Amit Kapila  writes:
>> On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane  wrote:
>>> 1. The patch is completely naive about what might be in the symlink
>>> path string; eg embedded spaces in the path would break it.  On at
>>> least some platforms, newlines could be in the path as well.  I'm not
>>> sure about how to guard against this while maintaining human readability
>>> of the file.

> One way to deal with this could be to append a delimiter (a character
> which is not allowed in a tablespace path, like a quote (\')) at the end
> of the tablespace path while writing it to the symlink label file, and
> then use that as an end marker while reading it back from the file.

What makes you think quote isn't allowed in tablespace paths?  Even if we
were to disallow it at the SQL level, there'd be nothing stopping a DBA
from changing the path after the fact by redefining the symlink outside
SQL --- something I believe we specifically meant to allow, considering
we went to the trouble of getting rid of the pg_tablespace.spclocation
column.

Pretty much the only character we can be entirely certain is not in a
symlink's value is \0.  As Alvaro mentioned, using that in the file
is a possible alternative, although it could easily confuse some users
and/or text editors.  The only other alternatives I can see are:

* Go over to a byte-count-then-value format.  Also possible, also rather
  unfriendly from a user's standpoint.

* Establish an escaping convention, eg backslash before any funny
  characters.  Unfortunately backslash wouldn't be too nice from the
  viewpoint of Windows users.

* Make pg_basebackup check for and fail on symlinks containing characters
  it can't handle.  Pretty icky, though I suppose there's some argument
  that things like newlines wouldn't be in any rational tablespace path.
  But I doubt you can make that argument for spaces, quotes, or backslashes.

regards, tom lane




Re: [HACKERS] pg_basebackup vs. Windows and tablespaces

2014-12-14 Thread Andrew Dunstan


On 12/14/2014 07:09 PM, Tom Lane wrote:

Amit Kapila  writes:

On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane  wrote:

1. The patch is completely naive about what might be in the symlink
path string; eg embedded spaces in the path would break it.  On at
least some platforms, newlines could be in the path as well.  I'm not
sure about how to guard against this while maintaining human readability
of the file.

One way to deal with this could be to append a delimiter (a character
which is not allowed in a tablespace path, like a quote (\')) at the end
of the tablespace path while writing it to the symlink label file, and
then use that as an end marker while reading it back from the file.

What makes you think quote isn't allowed in tablespace paths?  Even if we
were to disallow it at the SQL level, there'd be nothing stopping a DBA
from changing the path after the fact by redefining the symlink outside
SQL --- something I believe we specifically meant to allow, considering
we went to the trouble of getting rid of the pg_tablespace.spclocation
column.

Pretty much the only character we can be entirely certain is not in a
symlink's value is \0.  As Alvaro mentioned, using that in the file
is a possible alternative, although it could easily confuse some users
and/or text editors.  The only other alternatives I can see are:

* Go over to a byte-count-then-value format.  Also possible, also rather
   unfriendly from a user's standpoint.

* Establish an escaping convention, eg backslash before any funny
   characters.  Unfortunately backslash wouldn't be too nice from the
   viewpoint of Windows users.

* Make pg_basebackup check for and fail on symlinks containing characters
   it can't handle.  Pretty icky, though I suppose there's some argument
   that things like newlines wouldn't be in any rational tablespace path.
   But I doubt you can make that argument for spaces, quotes, or backslashes.





Using an escaping convention makes by far the most sense to me. It's 
what occurred to me earlier today even before I read the above. We could 
adopt the URL convention of %xx for escapable characters - that would 
avoid \ nastiness.
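
For illustration, a minimal sketch of such %xx escaping is below. This is
hypothetical code, not a proposed implementation; the function name and the
exact set of escaped characters are assumptions for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: percent-encode the characters that could confuse a
 * line-oriented label file -- space, newline, quote, backslash, and '%'
 * itself -- in the style of URL escaping.
 */
static void
encode_link_path(const char *src, char *dst, size_t dstlen)
{
	size_t		j = 0;

	for (; *src != '\0' && j + 4 < dstlen; src++)
	{
		unsigned char c = (unsigned char) *src;

		if (c == '%' || c == ' ' || c == '\n' || c == '\'' || c == '\\')
			j += (size_t) snprintf(dst + j, dstlen - j, "%%%02X", c);
		else
			dst[j++] = (char) c;
	}
	dst[j] = '\0';
}
```

Decoding is symmetric, and since '%' is escaped too, the mapping is
unambiguous in both directions.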


cheers

andrew





[HACKERS] DROP PRIVILEGES OWNED BY

2014-12-14 Thread Marko Tiikkaja

Hi,

This week I had a problem where I wanted to drop only the privileges a 
certain role had in the system, while keeping all the objects.  I 
couldn't figure out a reasonable way to do that, so I've attached a 
patch for this to this email.  Please consider it for inclusion into 
9.5.  The syntax is:


  DROP PRIVILEGES OWNED BY role [, ...]

I at some point decided to implement it as a new command instead of 
changing DropOwnedStmt, and I think that might have been a mistake.  It 
might have made more sense to instead teach DROP OWNED to accept a 
specification of which things to drop.  But the proposal is more 
important than such details, I think.



.marko
*** a/doc/src/sgml/ref/drop_owned.sgml
--- b/doc/src/sgml/ref/drop_owned.sgml
***
*** 111,116  DROP OWNED BY name [, ...] [ CASCAD
--- 111,117 
See Also
  

+
 
 

*** /dev/null
--- b/doc/src/sgml/ref/drop_privileges_owned.sgml
***
*** 0 
--- 1,72 
+ 
+ 
+ 
+  
+   DROP PRIVILEGES OWNED
+  
+ 
+  
+   DROP PRIVILEGES OWNED
+   7
+   SQL - Language Statements
+  
+ 
+  
+   DROP PRIVILEGES OWNED
+   remove privileges granted to a database role
+  
+ 
+  
+ 
+ DROP PRIVILEGES OWNED BY name [, 
...]
+ 
+  
+ 
+  
+   Description
+ 
+   
+DROP PRIVILEGES OWNED revokes all privileges granted to
+the given roles on objects in the current database and on shared objects
+(databases, tablespaces).
+   
+  
+ 
+  
+   Parameters
+ 
+   
+
+ name
+ 
+  
+   The name of a role whose privileges will be revoked.
+  
+ 
+
+   
+  
+ 
+  
+   Compatibility
+ 
+   
+The DROP PRIVILEGES OWNED statement is a
+PostgreSQL extension.
+   
+  
+ 
+  
+   See Also
+ 
+   
+
+
+
+   
+  
+ 
+ 
*** a/src/backend/catalog/pg_shdepend.c
--- b/src/backend/catalog/pg_shdepend.c
***
*** 1162,1174  isSharedObjectPinned(Oid classId, Oid objectId, Relation 
sdepRel)
   * interdependent objects in the wrong order.
   */
  void
! shdepDropOwned(List *roleids, DropBehavior behavior)
  {
RelationsdepRel;
ListCell   *cell;
!   ObjectAddresses *deleteobjs;
  
!   deleteobjs = new_object_addresses();
  
/*
 * We don't need this strong a lock here, but we'll call routines that
--- 1162,1175 
   * interdependent objects in the wrong order.
   */
  void
! shdepDropOwned(List *roleids, DropBehavior behavior, bool privilegesOnly)
  {
RelationsdepRel;
ListCell   *cell;
!   ObjectAddresses *deleteobjs = NULL;
  
!   if (!privilegesOnly)
!   deleteobjs = new_object_addresses();
  
/*
 * We don't need this strong a lock here, but we'll call routines that
***
*** 1243,1249  shdepDropOwned(List *roleids, DropBehavior behavior)
break;
case SHARED_DEPENDENCY_OWNER:
/* If a local object, save it for 
deletion below */
!   if (sdepForm->dbid == MyDatabaseId)
{
obj.classId = sdepForm->classid;
obj.objectId = sdepForm->objid;
--- 1244,1250 
break;
case SHARED_DEPENDENCY_OWNER:
/* If a local object, save it for 
deletion below */
!   if (!privilegesOnly && sdepForm->dbid 
== MyDatabaseId)
{
obj.classId = sdepForm->classid;
obj.objectId = sdepForm->objid;
***
*** 1257,1268  shdepDropOwned(List *roleids, DropBehavior behavior)
systable_endscan(scan);
}
  
!   /* the dependency mechanism does the actual work */
!   performMultipleDeletions(deleteobjs, behavior, 0);
  
heap_close(sdepRel, RowExclusiveLock);
  
!   free_object_addresses(deleteobjs);
  }
  
  /*
--- 1258,1274 
systable_endscan(scan);
}
  
!   /*
!* Unless we were asked not to drop objects, now is the time to let the
!* dependency mechanism do the actual work of dropping them.
!*/
!   if (!privilegesOnly)
!   performMultipleDeletions(deleteobjs, behavior, 0);
  
heap_close(sdepRel, RowExclusiveLock);
  
!   if (deleteobjs)
!   free_object_addresses(deleteobjs);
  }
  
  /*
*** a/src/backend/commands/event_trigger.c
--- b/src/backend/commands/event_trigger.c
***
*** 264,269  check_ddl_tag(const char *tag)
--- 264,270 
pg_strcasecmp(tag, "ALTER DEFAULT PRIVILEGES") == 0 ||
pg_strcasecmp(tag, "ALTER LARGE OBJECT") == 

Re: [HACKERS] DROP PRIVILEGES OWNED BY

2014-12-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 9:43 AM, Marko Tiikkaja  wrote:
> Hi,
>
> This week I had a problem where I wanted to drop only the privileges a
> certain role had in the system, while keeping all the objects.  I couldn't
> figure out a reasonable way to do that, so I've attached a patch for this to
> this email.  Please consider it for inclusion into 9.5.  The syntax is:
>
>   DROP PRIVILEGES OWNED BY role [, ...]
>
> I at some point decided to implement it as a new command instead of changing
> DropOwnedStmt, and I think that might have been a mistake.  It might have
> made more sense to instead teach DROP OWNED to accept a specification of
> which things to drop.  But the proposal is more important than such details,
> I think.
You should consider adding it to the upcoming CF:
https://commitfest.postgresql.org/action/commitfest_view?id=25
Regards,
-- 
Michael




Re: [HACKERS] pg_basebackup vs. Windows and tablespaces

2014-12-14 Thread Amit Kapila
On Mon, Dec 15, 2014 at 5:39 AM, Tom Lane  wrote:
>
> Amit Kapila  writes:
> >> On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane  wrote:
> >>> 1. The patch is completely naive about what might be in the symlink
> >>> path string; eg embedded spaces in the path would break it.  On at
> >>> least some platforms, newlines could be in the path as well.  I'm not
> >>> sure about how to guard against this while maintaining human
> >>> readability of the file.
>
> > One way to deal with this could be to append a delimiter (a character
> > which is not allowed in a tablespace path, like a quote (\')) at the
> > end of the tablespace path while writing it to the symlink label file,
> > and then use that as an end marker while reading it back from the file.
>
> What makes you think quote isn't allowed in tablespace paths?

The below part of the code makes me think that quotes are not allowed.

Oid
CreateTableSpace(CreateTableSpaceStmt *stmt)
{
..
/* disallow quotes, else CREATE DATABASE would be at risk */
if (strchr(location, '\''))
ereport(ERROR,
(errcode(ERRCODE_INVALID_NAME),
errmsg("tablespace location cannot contain single quotes")));
}


> Even if we
> were to disallow it at the SQL level, there'd be nothing stopping a DBA
> from changing the path after the fact by redefining the symlink outside
> SQL --- something I believe we specifically meant to allow, considering
> we went to the trouble of getting rid of the pg_tablespace.spclocation
> column.
>
> Pretty much the only character we can be entirely certain is not in a
> symlink's value is \0.  As Alvaro mentioned, using that in the file
> is a possible alternative, although it could easily confuse some users
> and/or text editors.  The only other alternatives I can see are:
>
> * Go over to a byte-count-then-value format.  Also possible, also rather
>   unfriendly from a user's standpoint.
>
> * Establish an escaping convention, eg backslash before any funny
>   characters.  Unfortunately backslash wouldn't be too nice from the
>   viewpoint of Windows users.
>
> * Make pg_basebackup check for and fail on symlinks containing characters
>   it can't handle.  Pretty icky, though I suppose there's some argument
>   that things like newlines wouldn't be in any rational tablespace path.

Yeah, another thing is that during tablespace creation we use the below
code to form the tablespace path, which prompted me to ask whether we
allow newlines in a tablespace path at all.

create_tablespace_directories()
{
..
location_with_version_dir = psprintf("%s/%s", location,
TABLESPACE_VERSION_DIRECTORY);
..
}

Now, if the above code handles newlines in the path, then can't we make
some arrangement while reading the file?

>   But I doubt you can make that argument for spaces, quotes, or
>   backslashes.
>

If we disallow newlines in the symlink path on the pg_basebackup side, then
we might be able to use fscanf's negated-scanset format specifier
("%[^\n]") to handle the other characters.
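
A minimal sketch of reading a path that way follows; note the conversion is
just "%[^\n]" (a trailing 's' would be matched as a literal character, not
treated as part of the specifier), and a field width guards the buffer. The
function name is invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch only: read everything up to (but not including) a newline with a
 * negated scanset.  The field width (255 here) keeps scanf from
 * overrunning the 256-byte buffer.
 */
static int
read_label_path(const char *line, char path[256])
{
	return sscanf(line, "%255[^\n]", path);
}
```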


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Status of CF 2014-10 and upcoming 2014-12

2014-12-14 Thread Michael Paquier
On Sun, Dec 14, 2014 at 12:29 AM, Andrew Dunstan  wrote:
>> Opinions?
> I will look at clearing out a few of the Ready For Committer patches today
> and tomorrow.
Thank you.
-- 
Michael




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-12-14 Thread Michael Paquier
On Tue, Dec 9, 2014 at 2:52 AM, Peter Geoghegan  wrote:
> On Mon, Dec 8, 2014 at 9:43 AM, Peter Geoghegan  wrote:
>> I think it's very possible that the wrong alias may be provided by the
>> user, and that we should consider that when providing a hint.
>
> Note that the existing mechanism (the mechanism that I'm trying to
> improve) only ever shows this error message:
>
> "There is a column named \"%s\" in table \"%s\", but it cannot be
> referenced from this part of the query."
>
> I think it's pretty clear that this general class of user error is common.
Moving this patch to CF 2014-12 as work is still going on. Note that it is
currently marked with Robert as reviewer and that its current status is
"Needs review".
-- 
Michael




Re: [HACKERS] Final Patch for GROUPING SETS

2014-12-14 Thread Michael Paquier
On Thu, Dec 11, 2014 at 3:36 AM, Tom Lane  wrote:
> I don't really have any comments on the algorithms yet, having spent too
> much time trying to figure out underdocumented data structures to get to
> the algorithms.  However, noting the addition of list_intersection_int()
> made me wonder whether you'd not be better off reducing the integer lists
> to bitmapsets a lot sooner, perhaps even at parse analysis.
> list_intersection_int() is going to be O(N^2) by nature.  Maybe N can't
> get large enough to matter in this context, but I do see places that
> seem to be concerned about performance.
>
> I've not spent any real effort looking at gsp2.patch yet, but it seems
> even worse off comment-wise: if there's any explanation in there at all
> of what a "chained aggregate" is, I didn't find it.  I'd also counsel you
> to find some other way to do it than putting bool chain_head fields in
> Aggref nodes; that looks like a mess, eg, it will break equal() tests
> for expression nodes that probably should still be seen as equal.
>
> I took a quick look at gsp-u.patch.  It seems like that approach should
> work, with of course the caveat that using CUBE/ROLLUP as function names
> in a GROUP BY list would be problematic.  I'm not convinced by the
> commentary in ruleutils.c suggesting that extra parentheses would help
> disambiguate: aren't extra parentheses still going to contain grouping
> specs according to the standard?  Forcibly schema-qualifying such function
> names seems like a less fragile answer on that end.
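
As an aside on the bitmapset suggestion above, here is a toy single-word
illustration of why set intersection becomes cheap with that representation
(a real bitmapset such as PostgreSQL's generalizes this to an array of
words; the helper names are invented for the example):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Membership sets packed into one machine word make intersection a single
 * AND, versus the O(N^2) pairwise scan an integer-list intersection
 * needs.  Assumes all members are < 64.
 */
static uint64_t
make_set(const int *vals, int n)
{
	uint64_t	s = 0;

	while (n-- > 0)
		s |= UINT64_C(1) << vals[n];
	return s;
}

static uint64_t
set_intersect(uint64_t a, uint64_t b)
{
	return a & b;
}
```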
Based on those comments, I am marking this patch as "Returned with
Feedback" on the CF app for 2014-10. Andrew, feel free to move this
entry to CF 2014-12 if you are planning to continue working on it, so
that it would get additional review. (Note that this patch's status was
"Waiting on Author" when writing this text.)
Regards,
-- 
Michael




Re: [HACKERS] inherit support for foreign tables

2014-12-14 Thread Michael Paquier
On Thu, Dec 11, 2014 at 2:54 PM, Ashutosh Bapat
 wrote:
> On Thu, Dec 11, 2014 at 8:39 AM, Etsuro Fujita 
> wrote:
>>
>> Hi Ashutosh,
>>
>> Thanks for the review!
>>
>> (2014/12/10 14:47), Ashutosh Bapat wrote:
>>>
>>> We haven't heard anything from Horiguchi-san and Hanada-san for almost a
>>> week. So, I am fine marking it as "ready for committer". What do you say?
Moving this patch to CF 2014-12 with the same status. Let's get a
committer to have a look at it.
-- 
Michael




Re: [HACKERS] GiST kNN search queue (Re: KNN-GiST with recheck)

2014-12-14 Thread Michael Paquier
On Thu, Dec 11, 2014 at 12:50 AM, Heikki Linnakangas
 wrote:
> On 01/28/2014 04:12 PM, Alexander Korotkov wrote:
>>>
>>> >3. A binary heap would be a better data structure to buffer the
>>> > rechecked
>>> >values. A Red-Black tree allows random insertions and deletions, but in
>>> >this case you need to insert arbitrary values but only remove the
>>> > minimum
>>> >item. That's exactly what a binary heap excels at. We have a nice binary
>>> >heap implementation in the backend that you can use, see
>>> >src/backend/lib/binaryheap.c.
>>> >
>>
>> Hmm. For me binary heap would be a better data structure for KNN-GiST at
>> all :-)
>
>
> I decided to give this a shot, replacing the red-black tree in GiST with the
> binary heap we have in lib/binaryheap.c. It made the GiST code somewhat
> simpler, as the binaryheap interface is simpler than the red-black tree one.
> Unfortunately, performance was somewhat worse. That was quite surprising, as
> insertions and deletions are both O(log N) in both data structures, but the
> red-black tree implementation is more complicated.
>
> I implemented another data structure called a Pairing Heap. It's also a
> fairly simple data structure, but insertions are O(1) instead of O(log N).
> It also performs fairly well in practice.
>
> With that, I got a small but measurable improvement. To test, I created a
> table like this:
>
> create table gisttest (id integer, p point);
> insert into gisttest select id, point(random(), random()) from
> generate_series(1, 100) id;
> create index i_gisttest on gisttest using gist (p);
>
> And I ran this query with pgbench:
>
> select id from gisttest order by p <-> '(0,0)' limit 1000;
>
> With unpatched master, I got about 650 TPS, and with the patch 720 TPS.
> That's a nice little improvement, but perhaps more importantly, the pairing
> heap implementation consumes less memory. To measure that, I put a
> MemoryContextStats(so->queueCtx) call into gistendscan. With the above
> query, but without the "limit" clause, on master I got:
>
> GiST scan context: 2109752 total in 10 blocks; 2088456 free (24998 chunks);
> 21296 used
>
> And with the patch:
>
> GiST scan context: 1061160 total in 9 blocks; 1040088 free (12502 chunks);
> 21072 used
>
> That's 2MB vs 1MB. While that's not much in absolute terms, it'd be nice to
> reduce that memory consumption, as there is no hard upper bound on how much
> might be needed. If the GiST tree is really disorganized for some reason, a
> query might need a lot more.
>
>
> So all in all, I quite like this patch, even though it doesn't do anything
> too phenomenal. It adds some code, in the form of the new pairing heap
> implementation, but it makes the GiST code a little bit simpler. And it
> gives a small performance gain, and reduces memory usage a bit.
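
For readers unfamiliar with the structure Heikki describes, a minimal
pairing-heap sketch over ints is below. It is illustrative only, not the
implementation from the patch: insert is O(1) (just a merge of two roots),
and this delete-min folds the children back together front-to-back for
brevity, whereas the classic two-pass variant merges them pairwise first
for better amortized bounds.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct PHNode
{
	int			key;
	struct PHNode *child;		/* leftmost child */
	struct PHNode *sibling;		/* next sibling */
} PHNode;

static PHNode *
ph_merge(PHNode *a, PHNode *b)
{
	if (a == NULL)
		return b;
	if (b == NULL)
		return a;
	if (b->key < a->key)
	{
		PHNode	   *tmp = a;

		a = b;
		b = tmp;
	}
	/* the larger root becomes the leftmost child of the smaller */
	b->sibling = a->child;
	a->child = b;
	return a;
}

static PHNode *
ph_insert(PHNode *root, int key)
{
	PHNode	   *n = malloc(sizeof(PHNode));

	n->key = key;
	n->child = NULL;
	n->sibling = NULL;
	return ph_merge(root, n);	/* O(1): one root comparison */
}

static PHNode *
ph_delete_min(PHNode *root, int *min)
{
	PHNode	   *c = root->child;
	PHNode	   *merged = NULL;

	*min = root->key;
	free(root);
	while (c != NULL)
	{
		PHNode	   *next = c->sibling;

		c->sibling = NULL;
		merged = ph_merge(merged, c);
		c = next;
	}
	return merged;
}
```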
Hmm. It looks like this patch using a binary heap is intended as a
replacement for the red-black tree method. Any reason why it isn't added
to the CF to track it?
-- 
Michael




Re: [HACKERS] KNN-GiST with recheck

2014-12-14 Thread Michael Paquier
On Tue, Jan 28, 2014 at 10:54 PM, Heikki Linnakangas
 wrote:
> 1. This patch introduces a new "polygon <-> point" operator. That seems
> useful on its own, with or without this patch.
This patch is tracked with this entry in the commit fest app and is
marked as "Ready for committer". Hence I am moving this specific part
to 2014-12 to keep track of it:

> 3. A binary heap would be a better data structure to buffer the rechecked
> values. A Red-Black tree allows random insertions and deletions, but in this
> case you need to insert arbitrary values but only remove the minimum item.
> That's exactly what a binary heap excels at. We have a nice binary heap
> implementation in the backend that you can use, see
> src/backend/lib/binaryheap.c.
Based on those comments, I am marking this entry as returned with feedback:
https://commitfest.postgresql.org/action/patch_view?id=1367
Heikki has sent as well a new patch to use a binary heap method
instead of the red-black tree here:
http://www.postgresql.org/message-id/54886bb8.9040...@vmware.com
IMO this last patch should be added to the CF app; that's not the case now.
Regards,
-- 
Michael




Re: [HACKERS] moving from contrib to bin

2014-12-14 Thread Peter Eisentraut
Here is a patch series to move

pg_archivecleanup
pg_standby
pg_test_fsync
pg_test_timing
pg_upgrade
pg_xlogdump
pgbench

from contrib/ to src/bin/.

Most people didn't like moving oid2name and vacuumlo, which I
understand.  So I'll leave those for now.

I have also included a patch to move pg_upgrade_support into the
backend, as discussed.

Open issues/discussion points:

- no Windows build system support yet

- I didn't move the source of the man pages to .../sgml/ref.  This could
be done.

- I didn't rename the SGML ids of the man pages to fit the pattern of
the other applications, because that would change the names of the HTML
pages.

- We have traditionally kept an individual author acknowledgement in the
man pages for contrib items.  Should those be removed now?

From 0a4b9a4eaf930b8e14b4b3e2077ca7706a65ab6a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sun, 14 Dec 2014 20:41:58 -0500
Subject: [PATCH 1/9] Sort SUBDIRS variable in src/bin/Makefile

The previous order appears to have been historically grown randomness.
---
 src/bin/Makefile | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/bin/Makefile b/src/bin/Makefile
index f03cc42..f0589f3 100644
--- a/src/bin/Makefile
+++ b/src/bin/Makefile
@@ -13,8 +13,16 @@ subdir = src/bin
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = initdb pg_ctl pg_dump \
-	psql scripts pg_config pg_controldata pg_resetxlog pg_basebackup
+SUBDIRS = \
+	initdb \
+	pg_basebackup \
+	pg_config \
+	pg_controldata \
+	pg_ctl \
+	pg_dump \
+	pg_resetxlog \
+	psql \
+	scripts
 
 ifeq ($(PORTNAME), win32)
 SUBDIRS += pgevent
-- 
2.2.0

From 403955c76136960d7d444e946ba2c20110d7e263 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sun, 14 Dec 2014 20:41:58 -0500
Subject: [PATCH 2/9] Move pg_archivecleanup from contrib/ to src/bin/

---
 contrib/Makefile   |  1 -
 contrib/pg_archivecleanup/Makefile | 18 --
 doc/src/sgml/contrib.sgml  |  1 -
 doc/src/sgml/reference.sgml|  1 +
 src/bin/Makefile   |  1 +
 {contrib => src/bin}/pg_archivecleanup/.gitignore  |  0
 src/bin/pg_archivecleanup/Makefile | 14 ++
 .../bin}/pg_archivecleanup/pg_archivecleanup.c |  2 +-
 8 files changed, 17 insertions(+), 21 deletions(-)
 delete mode 100644 contrib/pg_archivecleanup/Makefile
 rename {contrib => src/bin}/pg_archivecleanup/.gitignore (100%)
 create mode 100644 src/bin/pg_archivecleanup/Makefile
 rename {contrib => src/bin}/pg_archivecleanup/pg_archivecleanup.c (99%)

diff --git a/contrib/Makefile b/contrib/Makefile
index 195d447..c56050e 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -28,7 +28,6 @@ SUBDIRS = \
 		oid2name	\
 		pageinspect	\
 		passwordcheck	\
-		pg_archivecleanup \
 		pg_buffercache	\
 		pg_freespacemap \
 		pg_prewarm	\
diff --git a/contrib/pg_archivecleanup/Makefile b/contrib/pg_archivecleanup/Makefile
deleted file mode 100644
index ab52390..000
--- a/contrib/pg_archivecleanup/Makefile
+++ /dev/null
@@ -1,18 +0,0 @@
-# contrib/pg_archivecleanup/Makefile
-
-PGFILEDESC = "pg_archivecleanup - cleans archive when used with streaming replication"
-PGAPPICON = win32
-
-PROGRAM = pg_archivecleanup
-OBJS	= pg_archivecleanup.o $(WIN32RES)
-
-ifdef USE_PGXS
-PG_CONFIG = pg_config
-PGXS := $(shell $(PG_CONFIG) --pgxs)
-include $(PGXS)
-else
-subdir = contrib/pg_archivecleanup
-top_builddir = ../..
-include $(top_builddir)/src/Makefile.global
-include $(top_srcdir)/contrib/contrib-global.mk
-endif
diff --git a/doc/src/sgml/contrib.sgml b/doc/src/sgml/contrib.sgml
index a698d0f..f21fa14 100644
--- a/doc/src/sgml/contrib.sgml
+++ b/doc/src/sgml/contrib.sgml
@@ -202,7 +202,6 @@ Server Applications
part of the core PostgreSQL distribution.
   
 
- &pgarchivecleanup;
  &pgstandby;
  &pgtestfsync;
  &pgtesttiming;
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 10c9a6d..62267db 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -257,6 +257,7 @@ PostgreSQL Server Applications
   
 
&initdb;
+   &pgarchivecleanup;
&pgControldata;
&pgCtl;
&pgResetxlog;
diff --git a/src/bin/Makefile b/src/bin/Makefile
index f0589f3..0e7b183 100644
--- a/src/bin/Makefile
+++ b/src/bin/Makefile
@@ -15,6 +15,7 @@ include $(top_builddir)/src/Makefile.global
 
 SUBDIRS = \
 	initdb \
+	pg_archivecleanup \
 	pg_basebackup \
 	pg_config \
 	pg_controldata \
diff --git a/contrib/pg_archivecleanup/.gitignore b/src/bin/pg_archivecleanup/.gitignore
similarity index 100%
rename from contrib/pg_archivecleanup/.gitignore
rename to src/bin/pg_archivecleanup/.gitignore
diff --git a/src/bin/pg_archivecleanup/Makefile b/src/bin/pg_archivecleanup/Makefile
new file mode 100644
index 000..5df86eb
--- /dev/null
+++ b/src/bin/pg_archivecleanup/Makefile
@@ -0,0 +1,

Re: [HACKERS] pgcrypto: PGP signatures

2014-12-14 Thread Michael Paquier
On Wed, Nov 12, 2014 at 7:05 AM, Jeff Janes  wrote:
> On Sat, Nov 1, 2014 at 7:52 AM, Marko Tiikkaja  wrote:
>>
>> Hi,
>>
>> I discovered a problem with the lack of MDC handling in the signature info
>> extraction code, so I've fixed that and added a test message.  v9 here.
>>
>>
>>
>
> Hi Marko,
>
> I get a segfault when the length of the message is exactly 16308 bytes, see
> attached perl script.
>
> I can't get a backtrace, for some reason it acts as if there were no debug
> symbols despite that I built with them.  I've not seen that before.
>
> I get this whether or not the bug 11905 patch is applied, so the problem
> seems to be related but different.
This patch's status was "Ready for committer", but it visibly still has
some bugs and has not been updated in a while, as pointed out by Jeff.
So I am switching it to "Returned with feedback".
Regards,
-- 
Michael




Re: [HACKERS] moving from contrib to bin

2014-12-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 10:59 AM, Peter Eisentraut  wrote:
> - no Windows build system support yet
Do you need some help here? I am getting worried about potential
breakage with the version information built with MSVC and MinGW...

> - We have traditionally kept an individual author acknowledgement in the
> man pages for contrib items.  Should those be removed now?
+1.
-- 
Michael




Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-14 Thread Michael Paquier
On Fri, Dec 12, 2014 at 6:39 AM, Robert Haas  wrote:
> It's probably something we should add, but there's enough to do
> getting the basic feature working first.
Moving this patch to CF 2014-12 as work is still going on.
-- 
Michael




Re: [HACKERS] On partitioning

2014-12-14 Thread Amit Langote

Alvaro wrote:
> Claudio Freire wrote:
> 
> > Fair enough, but that's not the same as not requiring easy proofs. The
> > planner might not be the one doing the proofs, but you still need proofs.
> >
> > Even if the proving method is hardcoded into the partitioning method,
> > as in the case of list or range partitioning, it's still a proof. With
> > arbitrary functions (which is what prompted me to mention proofs) you
> > can't do that. A function works very well for inserting, but not for
> > selecting.
> >
> > I could be wrong though. Maybe there's a way to turn SQL functions
> > into analyzable things? But it would still be very easy to shoot
> > yourself in the foot by writing one that is too complex.
> 
> Arbitrary SQL expressions (including functions) are not the thing to use
> for partitioning -- at least that's how I understand this whole
> discussion.  I don't think you want to do "proofs" as such -- they are
> expensive.
> 

This means if a user puts arbitrary expressions in a partition definition, say,

... FOR VALUES  extract(month from current_date) TO extract(month from 
current_date + interval '3 months'),

we make sure that those expressions are pre-computed to literal values. The 
exact time when that happens is open for discussion, I guess. It could be 
either DDL time or, if feasible, during relation cache building, when we 
compute the value from the pg_node_tree of this expression, which we may 
choose to store in the partition definition catalog. The former entails the 
obvious challenge of figuring out how we store the computed value in the 
catalog (the pg_node_tree of a Const?).

> To make this discussion a bit clearer, there are two things to
> distinguish: one is routing tuples, when an INSERT or COPY command
> references the partitioned table, into the individual partitions
> (ingress); the other is deciding which partitions to read when a SELECT
> query wants to read tuples from the partitioned table (egress).
> 
> On ingress, what you want is something like being able to do something
> on the tuple that tells you which partition it belongs into.  Ideally
> this is something much lighter than running an expression; if you can
> just apply an operator to the partitioning column values, that should be
> plenty fast.  This requires no proof.
> 
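
A sketch of that ingress step -- mapping a key to a partition by binary
search over sorted bounds, with no expression evaluation at all -- might
look like the following. The names and layout are invented for the
example, not a proposed catalog format.

```c
#include <assert.h>

/*
 * Hypothetical ingress routing for range partitioning: partition i
 * accepts values below upper_bounds[i] (and at or above
 * upper_bounds[i-1], when i > 0).  Bounds are assumed sorted.
 */
static int
route_to_partition(const int *upper_bounds, int npart, int value)
{
	int			lo = 0;
	int			hi = npart;

	while (lo < hi)
	{
		int			mid = (lo + hi) / 2;

		if (value < upper_bounds[mid])
			hi = mid;
		else
			lo = mid + 1;
	}
	return lo < npart ? lo : -1;	/* -1: no partition accepts the value */
}
```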

And I am thinking this is all executor stuff.

> On egress you need some direct way to compare the scan quals with the
> partitioning values.  I would imagine this to be similar to how scan
> quals are compared to the values stored in a BRIN index: each scan qual
> has a corresponding operator strategy and a scan key, and you can say
> "aye" or "nay" based on a small set of operations that can be run
> cheaply, again without any proof or running arbitrary expressions.
> 

My knowledge of this is far from perfect, but to clear up any confusion:

As far as planning is concerned, I could not imagine how the index access 
method way of pruning partitions could be made to work. Of course, I may be 
missing something.

When you say "a scan qual has a corresponding operator strategy", I'd think 
that is part of the scan key in the executor, no?

Thanks,
Amit






Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-12-14 Thread Michael Paquier
On Wed, Dec 3, 2014 at 10:43 AM, Peter Geoghegan  wrote:
> On Tue, Dec 2, 2014 at 5:28 PM, Peter Geoghegan  wrote:
>> Attached, revised patchset makes these updates.
>
> Whoops. Missed some obsolete comments. Here is a third commit that
> makes a further small modification to one comment.
Moving this patch to CF 2014-12 as more review seems to be needed.
-- 
Michael




Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg

2014-12-14 Thread Michael Paquier
On Mon, Dec 8, 2014 at 9:39 AM, Michael Paquier
 wrote:
> On Tue, Dec 2, 2014 at 2:14 PM, Jeff Davis  wrote:
>> On Sun, 2014-11-30 at 17:49 -0800, Peter Geoghegan wrote:
>>> On Mon, Nov 17, 2014 at 11:39 PM, Jeff Davis  wrote:
>>> > I can also just move isReset there, and keep mem_allocated as a uint64.
>>> > That way, if I find later that I want to track the aggregated value for
>>> > the child contexts as well, I can split it into two uint32s. I'll hold
>>> > off any any such optimizations until I see some numbers from HashAgg
>>> > though.
>>>
>>> I took a quick look at memory-accounting-v8.patch.
>>>
>>> Is there some reason why mem_allocated is a uint64? All other things
>>> being equal, I'd follow the example of tuplesort.c's
>>> MemoryContextAllocHuge() API, which (following bugfix commit
>>> 79e0f87a1) uses int64 variables to track available memory and so on.
>>
>> No reason. New version attached; that's the only change.
> Note that I am marking this patch back to "Needs Review" state. I
Moving to CF 2014-12.
-- 
Michael




Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.

2014-12-14 Thread Etsuro Fujita
(2014/12/13 1:17), Tom Lane wrote:
> Etsuro Fujita  writes:
>>> (2014/12/12 10:37), Tom Lane wrote:
 Yeah, this is clearly a thinko: really, nothing in the planner should
 be using get_parse_rowmark().  I looked around for other errors of the
 same type and found that postgresGetForeignPlan() is also using
 get_parse_rowmark().  While that's harmless at the moment because we
 don't support foreign tables as children, it's still wrong.

>> In order
>> to get the locking strength, I think we need to see the RowMarkClauses
>> and thus still need to use get_parse_rowmark() in
>> postgresGetForeignPlan(), though I agree with you that that is ugly.

> I think this needs more thought; I'm still convinced that having the FDW
> look at the parse rowmarks is the Wrong Thing.  However, we don't need
> to solve it in existing branches.  With 9.4 release so close, the right
> thing is to revert that change for now and consider a HEAD-only patch
> later.

OK

> (One idea is to go ahead and make a ROW_MARK_COPY item, but
> add a field to PlanRowMark to record the original value.

+1

> We should
> probably also think about allowing FDWs to change these settings if
> they want to.

This is not clear to me.  Maybe I'm missing something, but I think that
the FDW only needs to look at the original locking strength in
GetForeignPlan().  Please explain that in a little more detail.

Thanks,

Best regards,
Etsuro Fujita




Re: [HACKERS] replicating DROP commands across servers

2014-12-14 Thread Michael Paquier
On Thu, Oct 16, 2014 at 7:01 AM, Alvaro Herrera
 wrote:
> Alvaro Herrera wrote:
>> Andres Freund wrote:
>>
>> > Having reread the patch just now I basically see two things to
>> > criticize:
>> > a) why isn't this accessible at SQL level? That seems easy to address.
>> > b) Arguably some of this could well be done in separate commits.
>>
>> Fair comments.  I will split it up.
>
> Here's a split version.  The last part is still missing some polish --
> in particular handling for OBJECT_POLICY, and the SQL interface which
> would also allow us to get something in the regression tests.
>
>
> Note: in this patch series you can find the ObjectTypeMap thing that you
> thought was obsolete in the DDL deparse patch ...

This patch has had no activity for the last two months, is in "Needs
Review" state, and has Dimitri marked as reviewer. As there is no
activity from the reviewer, I am moving it to CF 2014-12 and
removing Dimitri as reviewer. If someone wants to have a look at this
patch, feel free to do so. Dimitri, if you are still planning to look
at it, please re-add your name.
-- 
Michael




Re: [HACKERS] moving from contrib to bin

2014-12-14 Thread Peter Eisentraut
On 12/14/14 9:08 PM, Michael Paquier wrote:
>> - no Windows build system support yet
> Do you need some help here?

Sure.

> I am getting worried about potential
> breakage with the version information built with MSVC and MinGW..

I don't know what that is.







Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-12-14 Thread Michael Paquier
On Fri, Oct 17, 2014 at 7:44 AM, Simon Riggs  wrote:
> Thanks for the review.
>
> On 16 October 2014 23:23, MauMau  wrote:
>
>> (3)
>> SELECT against a view generated two audit log lines, one for the view
>> itself, and the other for the underlying table.  Is this intended?  I'm not
>> saying that's wrong but just asking.
>
> Intended
>
>> (4)
>> I'm afraid audit-logging DML statements on temporary tables will annoy
>> users, because temporary tables are not interesting.
>
> Agreed.
>
>> (5)
>> This is related to (4).  As somebody mentioned, I think the ability to
>> select target objects of audit logging is definitely necessary.  Without
>> that, huge amount of audit logs would be generated for uninteresting
>> objects.  That would also impact performance.
>
> Discussed and agreed already
>
>> (6)
>> What's the performance impact of audit logging?  I bet many users will ask
>> "about what percentage would the throughput decrease by?"  I'd like to see
>> concrete examples, say, with pgbench and DBT-2.
>
> Don't know, but it's not hugely relevant. If you need it, you need it.

Do you have real numbers for the performance impact this module has
when plugged into the server? If so, it would be good to get an idea
of how much auditing costs, and if it has a clear performance impact
this should be documented to warn the user. Some users may really need
audit features, as you mention, but the performance drop could be an
obstacle for others, so they may prefer to keep betting on performance
instead of pgaudit.

>> (8)
>> The code looks good.  However, I'm worried about the maintenance.  How can
>> developers notice that pgaudit.c needs modification when they add a new SQL
>> statement?  What keyword can they use to grep the source code to find
>> pgaudit.c?
>
> Suggestions welcome.
Where are we on this? This patch has had no activity for the last two
months... so I am marking it as returned with feedback. It would be
good to see actual numbers for the performance impact this involves.
Regards,
-- 
Michael




Re: [HACKERS] moving from contrib to bin

2014-12-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 11:45 AM, Peter Eisentraut  wrote:
> On 12/14/14 9:08 PM, Michael Paquier wrote:
>>> - no Windows build system support yet
>> Do you need some help here?
>
> Sure.
>
>> I am getting worried about potential
>> breakage with the version information built with MSVC and MinGW..
>
> I don't know what that is.
File version information for all the binaries and libraries produced
by compilation; see for example commit ee9569e.
-- 
Michael




Re: [HACKERS] WIP: multivariate statistics / proof of concept

2014-12-14 Thread Michael Paquier
On Wed, Dec 10, 2014 at 5:15 AM, Tomas Vondra  wrote:
> I agree with moving the patch to the next CF - I'm working on the patch,
> but I will take a bit more time to submit a new version and I can do
> that in the next CF.
OK, cool. I just moved it myself; I didn't see it registered in 2014-12 yet.
Thanks,
-- 
Michael




Re: [HACKERS] [PATCH] add ssl_protocols configuration option

2014-12-14 Thread Michael Paquier
On Thu, Nov 27, 2014 at 8:51 PM, Alex Shulgin  wrote:
>
> Dag-Erling Smørgrav  writes:
>
>> Alex Shulgin  writes:
>>> OK, looks like I've come up with something workable: I've added
>>> sslprotocol connection string keyword similar to pre-existing
>>> sslcompression, etc.  Please see attached v2 of the original patch.
>>> I'm having doubts about the name of openssl.h header though,
>>> libpq-openssl.h?
>>
>> Perhaps ssloptions.[ch], unless you plan to add non-option-related code
>> there later?
>
> I don't think anything else than common options-related code fits in
> there, so ssloptions.c makes sense to me.
>
>> BTW, there is no Regent code in your openssl.c, so the copyright
>> statement is incorrect.
>
> Good catch, I just blindly copied that from some other file.
There have been arguments in favor and against this patch... But
marking it as returned with feedback because of a lack of activity in
the last couple of weeks. Feel free to update if you think that's not
correct, and please move this patch to commit fest 2014-12.
--
Michael




Re: [HACKERS] WIP: dynahash replacement for buffer table

2014-12-14 Thread Michael Paquier
There hasn't been much activity on this patch these days, and Andres
has provided some feedback, hence switching to Returned with feedback.
Regards,
-- 
Michael




Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar  wrote:
> Peter, while reviewing the better performing patch myself, now i think the
> patch needs more work to be committed. The structuring of the method will be
> confusing in the long term. I think I'll restructure the patch in the next
> commitfest.
> So i propose to break the patch:
> 1. We apply the current patch which uses xmlNodeCopy, so that the
> long-standing bug will be fixed in postgres.
> 2. I'll work with the performance enhancement in the next commitfest.
>
> Maybe for (2), the current better-performing patch can be viewed as PoC of
> the expected performance.

Ali, are you currently working on that? Would you mind re-creating new
entries in the commit fest app for the new set of patches that you are
planning to do?
For now I am switching this patch as returned with feedback.
Thanks,
-- 
Michael




Re: [HACKERS] Add generate_series(numeric, numeric)

2014-12-14 Thread Andrew Gierth
> "Fujii" == Fujii Masao  writes:

 Fujii> Pushed.

Bug found:

regression=# select count(*) from generate_series(1::numeric,10) v, 
generate_series(1,v) w;
 count 
-------
 0
(1 row)

regression=# select count(*) from generate_series(1::numeric,10) v, 
generate_series(1,v+0) w;
 count 
-------
55
(1 row)

The error is in the use of PG_GETARG_NUMERIC and init_var_from_num
when setting up the multi-call state; init_var_from_num points at the
original num's digits rather than copying them, but start_num and
stop_num have just been (potentially) detoasted in the per-call
context, in which case the storage will have been freed by the next
call.

Obviously this could also be fixed by not detoasting the input until
after switching to the multi-call context, but it looks to me like
that would be unnecessarily complex.

Suggested patch attached.

(Is it also worth putting an explicit warning about this kind of thing
in the SRF docs?)

-- 
Andrew (irc:RhodiumToad)

diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index c73f9bc..d841b6f 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -1325,11 +1325,16 @@ generate_series_step_numeric(PG_FUNCTION_ARGS)
 
 		/*
 		 * Use fctx to keep state from call to call. Seed current with the
-		 * original start value.
+		 * original start value. We must copy the start_num and stop_num
+		 * values rather than pointing to them, since we may have detoasted
+		 * them in the per-call context.
 		 */
-		init_var_from_num(start_num, &fctx->current);
-		init_var_from_num(stop_num, &fctx->stop);
+		init_var(&fctx->current);
+		init_var(&fctx->stop);
 		init_var(&fctx->step);
+
+		set_var_from_num(start_num, &fctx->current);
+		set_var_from_num(stop_num, &fctx->stop);
 		set_var_from_var(&steploc, &fctx->step);
 
 		funcctx->user_fctx = fctx;
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
index ee6cb50..9d68145 100644
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -1461,3 +1461,41 @@ select (i / (10::numeric ^ 131071))::numeric(1,0)
9
 (4 rows)
 
+-- Check usage with variables
+select * from generate_series(1::numeric, 3::numeric) i, generate_series(i,3) j;
+ i | j 
+---+---
+ 1 | 1
+ 1 | 2
+ 1 | 3
+ 2 | 2
+ 2 | 3
+ 3 | 3
+(6 rows)
+
+select * from generate_series(1::numeric, 3::numeric) i, generate_series(1,i) j;
+ i | j 
+---+---
+ 1 | 1
+ 2 | 1
+ 2 | 2
+ 3 | 1
+ 3 | 2
+ 3 | 3
+(6 rows)
+
+select * from generate_series(1::numeric, 3::numeric) i, generate_series(1,5,i) j;
+ i | j 
+---+---
+ 1 | 1
+ 1 | 2
+ 1 | 3
+ 1 | 4
+ 1 | 5
+ 2 | 1
+ 2 | 3
+ 2 | 5
+ 3 | 1
+ 3 | 4
+(10 rows)
+
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
index a7e92ac..1633e4c 100644
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -854,3 +854,7 @@ select (i / (10::numeric ^ 131071))::numeric(1,0)
 	from generate_series(6 * (10::numeric ^ 131071),
 			 9 * (10::numeric ^ 131071),
 			 10::numeric ^ 131071) as a(i);
+-- Check usage with variables
+select * from generate_series(1::numeric, 3::numeric) i, generate_series(i,3) j;
+select * from generate_series(1::numeric, 3::numeric) i, generate_series(1,i) j;
+select * from generate_series(1::numeric, 3::numeric) i, generate_series(1,5,i) j;



Re: [HACKERS] alter user/role CURRENT_USER

2014-12-14 Thread Michael Paquier
On Fri, Nov 14, 2014 at 5:39 PM, Kyotaro HORIGUCHI
 wrote:
> Hi, this is revised version.
>
>> Kyotaro HORIGUCHI wrote:
>>
>> > - Storage for new information
>> >
>> > The new struct NameId stores an identifier which telling what it
>> > logically is using the new enum NameIdTypes.
>>
>> I think NameId is a bad name for this.  My point is that NameId, as it
>> stands, might be a name for anything, not just a role; and the object it
>> identifies is not an Id either.  Maybe RoleSpec?
>
> Yeah! I felt it was no good even as a generic type for various
> "name of something or its OID" cases. RoleSpec sounds much better.
>
>> Do we need a public_ok
>> argument to get_nameid_oid() (get a better name for this function too)
>
> Maybe get_rolespec_oid() as a name after its parameter type?
>
>> so that callers don't have to check for InvalidOid argument?  I think
>> the arrangement you propose is not very convenient; it'd be best to
>> avoid duplicating the check for InvalidOid in all callers of the new
>> function, particularly where there was no check before.
>
> I agree that it'd be better to avoid duplicated InvalidOid
> checks, but public_ok seems a bit myopic. Since there's no
> reasonable border between functions accepting 'public' and
> others, that kind of solution would not be reasonable.
>
> What about checking whether it is PUBLIC *before* calling
> get_rolespec_oid()?
>
> The attached patch modified in the following points.
>
>  - rename NameId to RoleSpec and NameIdType to RoleSpecTypes.
>  - rename get_nameid_oid() to get_rolespec_oid().
>  - rename roleNamesToIds() to roleSpecsToIds().
>  - some struct members are changed such as authname to authrole.
>  - check if rolespec is "public" or not before calling get_rolespec_oid()
>  - ExecAlterDefaultPrivilegesStmt and ExecuteGrantStmt does
>slightly different things about ACL_ID_PUBLIC but I unified it
>to the latter.
>  - rebased to the current master
A new patch has been sent here but no review occurred, hence moving
this item to CF 2014-12.
-- 
Michael




Re: [HACKERS] Final Patch for GROUPING SETS

2014-12-14 Thread Andrew Gierth
> "Michael" == Michael Paquier  writes:

 Michael> Based on those comments, I am marking this patch as
 Michael> "Returned with Feedback" on the CF app for 2014-10. Andrew,
 Michael> feel free to move this entry to CF 2014-12 if you are
 Michael> planning to continue working on it so as it would get
 Michael> additional review. (Note that this patch status was "Waiting
 Michael> on Author" when writing this text).

Moved it to 2014-12 and set it back to "waiting on author". We expect to
submit a revised version, though I have no timescale yet.

-- 
Andrew (irc:RhodiumToad)




Re: [HACKERS] pg_basebackup vs. Windows and tablespaces

2014-12-14 Thread Tom Lane
Amit Kapila  writes:
> On Mon, Dec 15, 2014 at 5:39 AM, Tom Lane  wrote:
>> What makes you think quote isn't allowed in tablespace paths?

> The part of the code below makes me think that quotes are not allowed.

> Oid
> CreateTableSpace(CreateTableSpaceStmt *stmt)
> {
>     ..
>     /* disallow quotes, else CREATE DATABASE would be at risk */
>     if (strchr(location, '\''))
>         ereport(ERROR,
>                 (errcode(ERRCODE_INVALID_NAME),
>                  errmsg("tablespace location cannot contain single quotes")));
> }

Hm, I think that's left over from defending a *very* ancient version
of CREATE DATABASE.  In any case, as I mentioned, any limitations
we might be putting on tablespace paths during SQL-level operation
are pretty much a dead letter.

regards, tom lane




Re: [HACKERS] Final Patch for GROUPING SETS

2014-12-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 12:28 PM, Andrew Gierth
 wrote:
>> "Michael" == Michael Paquier  writes:
>
>  Michael> Based on those comments, I am marking this patch as
>  Michael> "Returned with Feedback" on the CF app for 2014-10. Andrew,
>  Michael> feel free to move this entry to CF 2014-12 if you are
>  Michael> planning to continue working on it so as it would get
>  Michael> additional review. (Note that this patch status was "Waiting
>  Michael> on Author" when writing this text).
>
> Moved it to 2014-12 and set it back to "waiting on author". We expect to
> submit a revised version, though I have no timescale yet.
OK thanks for the update.
-- 
Michael




Re: [HACKERS] Escaping from blocked send() reprised.

2014-12-14 Thread Michael Paquier
On Thu, Oct 30, 2014 at 10:27 PM, Heikki Linnakangas
 wrote:
> On 10/03/2014 06:29 PM, Heikki Linnakangas wrote:
>> [blah]
> About patch 3:
> Looking closer, this design still looks OK to me. As you said yourself, the
> comments need some work (e.g. the step 5. in the top comment in async.c
> needs updating). And then there are a couple of FIXME and XXX comments that
> need to be addressed.
Those patches have not been updated in a while, and there has been some
feedback from several people, hence marking this as returned with
feedback. Horiguchi-san, feel free to add new entries in the CF app for
2014-12, or move this entry if you feel otherwise.
Regards,
-- 
Michael




Re: [HACKERS] alter user set local_preload_libraries.

2014-12-14 Thread Michael Paquier
On Tue, Dec 9, 2014 at 11:56 PM, Robert Haas  wrote:
> On Mon, Dec 8, 2014 at 9:18 PM, Tom Lane  wrote:
>> Barring someone committing to spend the time to improve that situation
>> (time that would be poorly invested IMO), I don't think that we want to
>> open up ignore_system_indexes as USERSET, or do anything else to encourage
>> its use.
>>
>> If we're intent on removing PGC_BACKEND then I'd be okay with
>> reclassifying ignore_system_indexes as SUSET; but I'm not exactly
>> convinced that we should be trying to get rid of PGC_BACKEND.
>
> Well, if you want to discourage its use, I think there's an argument
> that marking it as SUSET would be more restrictive than what we have
> at present, since it would altogether prohibit non-superuser use.
>
> I'm not wedded to the idea of getting rid of PGC_BACKEND, but I do
> like it.  Peter's survey of the landscape seems to show that there's
> very little left in that category and the stuff that is there is
> somewhat uninspiring.  And simplifying things is always nice.

Documentation fixes for the use of local_preload_libraries have been
pushed, and there has now been some wider discussion about changing the
mode of a couple of parameters since PGC_SU_BACKEND was introduced.
Any objection to switching this patch to "Returned with feedback"?
The discussion here is wider than the simple use of
local_preload_libraries in any case.
-- 
Michael




Re: [HACKERS] RLS with check option - surprised design

2014-12-14 Thread Michael Paquier
On Sat, Nov 22, 2014 at 5:59 AM, Stephen Frost  wrote:
> * Peter Geoghegan (p...@heroku.com) wrote:
>> On Fri, Nov 21, 2014 at 6:57 AM, Stephen Frost  wrote:
>> > [blah]
>> [re-blah]
> [re-re-blah]
This patch has already been committed, but there are still many
concerns shared, so moving it to CF 2014-12 as it needs more review.
-- 
Michael




Re: [HACKERS] Status of CF 2014-10 and upcoming 2014-12

2014-12-14 Thread Michael Paquier
On Sun, Dec 14, 2014 at 12:00 AM, Michael Paquier
 wrote:
> In any case, I will be able to do that on Monday.
> Opinions?

So I have been through all the remaining 20~25 patches of CF 2014-10,
and roughly moved them to the next CF 2014-12 or returned them with
feedback depending on their state and their stale period. If you feel
that your patch has not been treated correctly or has faced an
injustice, feel free to complain on this thread or on the thread of
the related patch. In any case, let's now move on to CF 2014-12, and
please be sure to check the status of any patch you submitted
previously if it is still in the works.

PS: Could someone close CF 2014-10, btw?
Regards,
-- 
Michael




Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Ali Akbar
2014-12-15 10:19 GMT+07:00 Michael Paquier :
>
> On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar  wrote:
> > Peter, while reviewing the better performing patch myself, now i think
> the
> > patch needs more work to be committed. The structuring of the method
> will be
> > confusing in the long term. I think I'll restructure the patch in the
> next
> > commitfest.
> > So i propose to break the patch:
> > 1. We apply the current patch which uses xmlNodeCopy, so that the
> > long-standing bug will be fixed in postgres.
> > 2. I'll work with the performance enhancement in the next commitfest.
> >
> > Maybe for (2), the current better-performing patch can be viewed as PoC
> of
> > the expected performance.
>
> Ali, are you currently working on that? Would you mind re-creating new
> entries in the commit fest app for the new set of patches that you are
> planning to do?
> For now I am switching this patch as returned with feedback.
> Thanks,
>

What I mean is that the last patch (the v7 patch), as it is, is enough to
fix the bug (the nested xpath namespace problem). I think the performance
regression is still acceptable (in my case ~20%), because the bug is
severe. Currently, xpath can return invalid xml because the namespace
isn't included in the output!

What I'll be working on is the v4 patch, because it turns out the v4 patch
has better performance (~10%, though Peter's test shows that isn't the
case). But the problem is that the v4 patch is organized wrongly, and
hacks around libxml's xml node structure (duplicating the namespace on the
existing structure). I'll work on that, but it will affect the performance
benefit.

So what I propose is that we close the long-standing bug in this
commitfest with the v7 patch. I'll work on improving the performance
without compromising good code structure. If the result is good, I'll
submit the patch.

Thanks

Regards,
-- 
Ali Akbar


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Ali Akbar
2014-12-15 11:02 GMT+07:00 Ali Akbar :
>
> 2014-12-15 10:19 GMT+07:00 Michael Paquier :
>>
>> On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar  wrote:
>> > Peter, while reviewing the better performing patch myself, now i think
>> the
>> > patch needs more work to be committed. The structuring of the method
>> will be
>> > confusing in the long term. I think I'll restructure the patch in the
>> next
>> > commitfest.
>> > So i propose to break the patch:
>> > 1. We apply the current patch which uses xmlNodeCopy, so that the
>> > long-standing bug will be fixed in postgres.
>> > 2. I'll work with the performance enhancement in the next commitfest.
>> >
>> > Maybe for (2), the current better-performing patch can be viewed as PoC
>> of
>> > the expected performance.
>>
>> Ali, are you currently working on that? Would you mind re-creating new
>> entries in the commit fest app for the new set of patches that you are
>> planning to do?
>> For now I am switching this patch as returned with feedback.
>> Thanks,
>>
>
> What i mean, the last patch (v7 patch) as it is is enough to fix the bug
> (nested xpath namespace problem). I think the performance regression is
> still acceptable (in my case it's ~20%), because the bug is severe.
> Currently, xpath can return invalid xml because the namespace isn't
> included in the output!
>

Sorry, typo here: what I meant isn't "invalid xml" but "incomplete xml".
Hence a subsequent call to xpath with the previous result as its input
will be a problem, because the namespace will not match.


>
> What i'll be working is the v4 patch, because it turns out the v4 patch
> has better performance (~10%, but Peter's test shows it isn't the case).
> But, the problem is the v4 patch is organized wrongly, and hacks around the
> libxml's xml node structure (duplicating the namespace on the existing
> structure). I'll work on that, but it will affects the performance benefit.
>
> So what i propose is, we close the longstanding bug in this comitfest with
> the v7 patch. I'll work on improving the performance, without compromising
> good code structure. If the result is good, i'll submit the patch.
>
> Thanks
>

Regards,
-- 
Ali Akbar


Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 1:02 PM, Ali Akbar  wrote:
> 2014-12-15 10:19 GMT+07:00 Michael Paquier :
>>
>> On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar  wrote:
>> > Peter, while reviewing the better performing patch myself, now i think
>> > the
>> > patch needs more work to be committed. The structuring of the method
>> > will be
>> > confusing in the long term. I think I'll restructure the patch in the
>> > next
>> > commitfest.
>> > So i propose to break the patch:
>> > 1. We apply the current patch which uses xmlNodeCopy, so that the
>> > long-standing bug will be fixed in postgres.
>> > 2. I'll work with the performance enhancement in the next commitfest.
>> >
>> > Maybe for (2), the current better-performing patch can be viewed as PoC
>> > of
>> > the expected performance.
>>
>> Ali, are you currently working on that? Would you mind re-creating new
>> entries in the commit fest app for the new set of patches that you are
>> planning to do?
>> For now I am switching this patch as returned with feedback.
>> Thanks,
>
>
> What i mean, the last patch (v7 patch) as it is is enough to fix the bug
> (nested xpath namespace problem). I think the performance regression is
> still acceptable (in my case it's ~20%), because the bug is severe.
> Currently, xpath can return invalid xml because the namespace isn't included
> in the output!
>
> What i'll be working is the v4 patch, because it turns out the v4 patch has
> better performance (~10%, but Peter's test shows it isn't the case). But,
> the problem is the v4 patch is organized wrongly, and hacks around the
> libxml's xml node structure (duplicating the namespace on the existing
> structure). I'll work on that, but it will affects the performance benefit.
>
> So what i propose is, we close the longstanding bug in this comitfest with
> the v7 patch. I'll work on improving the performance, without compromising
> good code structure. If the result is good, i'll submit the patch.
OK. Could you then move this patch to the new CF as "Needs Review",
with Peter as reviewer? He seems to be looking at it anyway, judging
by the update from 12/11.
-- 
Michael




Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-12-14 Thread Ali Akbar
2014-12-15 11:06 GMT+07:00 Michael Paquier :
>
> On Mon, Dec 15, 2014 at 1:02 PM, Ali Akbar  wrote:
> > 2014-12-15 10:19 GMT+07:00 Michael Paquier :
> >>
> >> On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar  wrote:
> >> > Peter, while reviewing the better performing patch myself, now i think
> >> > the
> >> > patch needs more work to be committed. The structuring of the method
> >> > will be
> >> > confusing in the long term. I think I'll restructure the patch in the
> >> > next
> >> > commitfest.
> >> > So i propose to break the patch:
> >> > 1. We apply the current patch which uses xmlNodeCopy, so that the
> >> > long-standing bug will be fixed in postgres.
> >> > 2. I'll work with the performance enhancement in the next commitfest.
> >> >
> >> > Maybe for (2), the current better-performing patch can be viewed as
> PoC
> >> > of
> >> > the expected performance.
> >>
> >> Ali, are you currently working on that? Would you mind re-creating new
> >> entries in the commit fest app for the new set of patches that you are
> >> planning to do?
> >> For now I am switching this patch as returned with feedback.
> >> Thanks,
> >
> >
> > What i mean, the last patch (v7 patch), as it is, is enough to fix the bug
> > (nested xpath namespace problem). I think the performance regression is
> > still acceptable (in my case it's ~20%), because the bug is severe.
> > Currently, xpath can return invalid xml because the namespace isn't
> > included in the output!
> >
> > What i'll be working on is the v4 patch, because it turns out the v4
> > patch has better performance (~10%, but Peter's test shows it isn't the
> > case). But, the problem is the v4 patch is organized wrongly, and hacks
> > around the libxml's xml node structure (duplicating the namespace on the
> > existing structure). I'll work on that, but it will affect the
> > performance benefit.
> >
> > So what i propose is, we close the longstanding bug in this commitfest
> > with the v7 patch. I'll work on improving the performance, without
> > compromising good code structure. If the result is good, i'll submit the
> > patch.
> OK. Could you then move this patch to the new CF with Needs Review
> with Peter as reviewer then? He seems to be looking at it anyway
> seeing the update from 12/11.

OK. Moved to the new CF.

Regards,
-- 
Ali Akbar


Re: [HACKERS] pg_basebackup vs. Windows and tablespaces

2014-12-14 Thread Amit Kapila
On Sat, Dec 13, 2014 at 9:41 PM, Andrew Dunstan  wrote:
> On 11/20/2014 02:27 AM, Amit Kapila wrote:
>>
>> On Wed, Nov 19, 2014 at 11:46 PM, Robert Haas wrote:
>> > On Tue, Nov 18, 2014 at 9:19 AM, Alvaro Herrera wrote:
>> > >> Right, but they provide same functionality as symlinks and now we
>> > >> are even planing to provide this feature for both linux and windows
>> > >> as both Tom and Robert seems to feel, it's better that way.  Anyhow,
>> > >> I think naming any entity generally differs based on individual's
>> > >> perspective, so we can go with the name which appeals to more people.
>> > >> In case, nobody else has any preference, I will change it to what
>> > >> both of us can agree upon (either 'tablespace catalog',
>> > >> 'tablespace_info' ...).
>> > >
>> > > Well, I have made my argument.  Since you're the submitter, feel
>> > > free to select what you think is the best name.
>> >
>> > For what it's worth, I, too, dislike having symlink in the name.
>> > Maybe "tablespace_map"?
>>
>> Sounds good to me as well.
>>
>> To summarize the situation of this patch, I have received below comments
>> on which I am planning to work:
>>
>> 1. Change the name of file containing tablespace path information.
>> 2. Store tablespace name as well along with oid and path to make the
>> information Human readable.
>> 3. Make the code generic (Remove #ifdef Win32 macro's and change
>> comments referring this functionality for windows and see if any more
>> changes are required to make it work on linux.)
>>
>> Now the part where I would like to receive feedback before revising the
>> patch is on the coding style.  It seems to me from Tom's comments that
>> he is not happy with the code, now I am not sure which part of the patch
>> he thinks needs change.  Tom if possible, could you be slightly more
>> specific about your concern w.r.t code?
>>
>> I have attached a rebased (on top of commit-8d7af8f) patch, just incase
>> some one wants to apply and check it.
>>
>
>
> In view of the request above for comments from Tom, I have moved this
> back to "Needs Review".
>

I am working on fixing the review comments, but I think I won't be
able to handle all of them, as there is still discussion going on for
some of them. I intend to work on it for the CF starting today,
so I have moved this patch to CF 2014-12.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Add generate_series(numeric, numeric)

2014-12-14 Thread Ali Akbar
2014-12-15 10:25 GMT+07:00 Andrew Gierth :
>
> > "Fujii" == Fujii Masao  writes:
>
>  Fujii> Pushed.
>
> Bug found:
>
> regression=# select count(*) from generate_series(1::numeric,10) v,
> generate_series(1,v) w;
>  count
> ---
>  0
> (1 row)
>
> regression=# select count(*) from generate_series(1::numeric,10) v,
> generate_series(1,v+0) w;
>  count
> ---
> 55
> (1 row)
>
> The error is in the use of PG_GETARG_NUMERIC and init_var_from_num
> when setting up the multi-call state; init_var_from_num points at the
> original num's digits rather than copying them, but start_num and
> stop_num have just been (potentially) detoasted in the per-call
> context, in which case the storage will have been freed by the next
> call.
>

Oops.

> Obviously this could also be fixed by not detoasting the input until
> after switching to the multi-call context, but it looks to me like
> that would be unnecessarily complex.
>
> Suggested patch attached.
>

Thanks


> (Is it also worth putting an explicit warning about this kind of thing
> in the SRF docs?)
>

I think yes, it would be good. The alternative is restructuring this
paragraph in the SRF docs:

> The memory context that is current when the SRF is called is a transient
> context that will be cleared between calls. This means that you do not need
> to call pfree on everything you allocated using palloc; it will go away
> anyway. However, if you want to allocate any data structures to live across
> calls, you need to put them somewhere else. The memory context referenced
> by multi_call_memory_ctx is a suitable location for any data that needs
> to survive until the SRF is finished running. In most cases, this means
> that you should switch into multi_call_memory_ctx while doing the
> first-call setup.
>

The important part "However, if you want to allocate any data structures to
live across calls, you need to put them somewhere else." is buried in the
docs.

But I think an explicit warning is more noticeable for a new developer like me.
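For illustration, a minimal sketch of the hazard Andrew describes, using the standard SRF machinery (the function name is invented, and this is not the committed generate_series code):

```c
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"
#include "utils/numeric.h"

/*
 * Hypothetical SRF sketch.  PG_GETARG_NUMERIC may detoast into the
 * transient per-call context, which is cleared between calls, so any
 * pointer kept in user_fctx must reference a copy that was made in
 * multi_call_memory_ctx.
 */
Datum
my_numeric_series(PG_FUNCTION_ARGS)
{
	FuncCallContext *funcctx;

	if (SRF_IS_FIRSTCALL())
	{
		/* This may create a detoasted copy in the per-call context... */
		Numeric		start = PG_GETARG_NUMERIC(0);
		MemoryContext oldcxt;

		funcctx = SRF_FIRSTCALL_INIT();
		oldcxt = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

		/*
		 * ...so take a durable copy here; keeping a pointer at start's
		 * digits directly is exactly the dangling-pointer bug above.
		 */
		funcctx->user_fctx = DatumGetNumericCopy(NumericGetDatum(start));

		MemoryContextSwitchTo(oldcxt);
	}

	funcctx = SRF_PERCALL_SETUP();
	/* ...generate rows from funcctx->user_fctx, then finish... */
	SRF_RETURN_DONE(funcctx);
}
```

The alternative fix is to fetch (and thus detoast) the arguments only after switching into multi_call_memory_ctx, so no extra copy is needed.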

Regards,
-- 
Ali Akbar


[HACKERS] Commit fest 2014-12, let's begin!

2014-12-14 Thread Michael Paquier
Hi all,

Let's begin the commit fest of December:
https://commitfest.postgresql.org/action/commitfest_view?id=25

Looking at the stats at the top of the above page, there are currently
56 patches to be dealt with:
- Needs Review: 48
- Waiting on Author: 5
- Ready for Committer: 3
In short we are in cruel need of reviewers to make things move on.

To reviewers:
- This commit fest may be a good occasion for you to study new parts
of the code, there are many simple patches in need of love.
- If you are marked as a reviewer for a given patch, but you know that
you won't have time to look at it, it is advised that you remove your
name to give room to someone else.
- Be sure to keep the information of the patch you are reviewing fresh
on the CF app. This needs at most 2 minutes.
- Feel free to ping the patch submitter if what you reviewed does not
get updated. People have their daily jobs and are busy, so try to not
be too pushy. A delay of one week may give a good base, but that's
holiday season in many places.

And to patch submitters:
- Please review one patch of difficulty equivalent to your own, to help
keep the review load balanced across patches
- Be sure to keep the information about your patch up-to-date
- Feel free to ping your reviewer if you have no news and need more feedback

To committers: there are 2 patches marked as ready for committer from
previous commit fest:
- Foreign table inheritance
- Point to polygon distance operator

This is going to be a period of vacations for many people here, so
let's do our best.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add generate_series(numeric, numeric)

2014-12-14 Thread Andrew Gierth
> "Ali" == Ali Akbar  writes:

 Ali>  I think yes, it will be good. The alternative is restructuring
 Ali> this paragraph in the SRF docs:

 >> The memory context that is current when the SRF is called is a
 >> transient context that will be cleared between calls. This means
 >> that you do not need to call pfree on everything you allocated
 >> using palloc; it will go away anyway. However, if you want to
 >> allocate any data structures to live across calls, you need to put
 >> them somewhere else. The memory context referenced by
 >> multi_call_memory_ctx is a suitable location for any data that
 >> needs to survive until the SRF is finished running. In most cases,
 >> this means that you should switch into multi_call_memory_ctx while
 >> doing the first-call setup.

 Ali> The important part "However, if you want to allocate any data
 Ali> structures to live across calls, you need to put them somewhere
 Ali> else." is buried in the docs.

 Ali> But i think explicit warning is more noticeable for new
 Ali> developer like me.

I was thinking something like this, added just after that para:


 
  While the actual arguments to the function remain unchanged between
  calls, if you detoast the argument values (which is normally done
  transparently by the
  PG_GETARG_xxx macro)
  in the transient context then the detoasted copies will be freed on
  each cycle. Accordingly, if you keep references to such values in
  your user_fctx, you must either copy them into the
  multi_call_memory_ctx after detoasting, or ensure
  that you detoast the values only in that context.
 


-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Logical Replication Helpers WIP for discussion

2014-12-14 Thread Petr Jelinek

Hello,

we've made a few helper functions for making logical replication easier. I
bundled them into a contrib module, as this is mainly for discussion at
this time (I don't expect this to get committed any time soon, but it is a
good way to iron out the protocol, etc.).


I created a sample logical decoding plugin that uses those functions and
which can be used for passing DML changes in a (hopefully)
platform/version independent format.


I will also post a sample apply bgworker once I get some initial feedback
about this.


It's hard to write tests for this, as the binary changes contain
transaction ids and timestamps, so the data changes constantly.


This is of course based on the BDR work Andres, Craig and myself have 
been doing.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/contrib/lrep/Makefile b/contrib/lrep/Makefile
new file mode 100644
index 000..d3f7ba3
--- /dev/null
+++ b/contrib/lrep/Makefile
@@ -0,0 +1,20 @@
+# contrib/lrep/Makefile
+
+MODULE_big = lrep
+OBJS = lrep_utils.o lrep_output.o $(WIN32RES)
+PG_CPPFLAGS = -I$(libpq_srcdir)
+
+#EXTENSION = lrep
+#REGRESS = lrep
+#REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/lrep/logical.conf
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/lrep
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/lrep/lrep.h b/contrib/lrep/lrep.h
new file mode 100644
index 000..e2910c4
--- /dev/null
+++ b/contrib/lrep/lrep.h
@@ -0,0 +1,108 @@
+/*-
+ *
+ * lrep.h
+ *		LREP public interfaces
+ *
+ * Copyright (c) 2012-2014, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		contrib/lrep/lrep.h
+ *
+ *-
+ */
+#ifndef LREP_H
+#define LREP_H
+
+#include "libpq-fe.h"
+
+#include "nodes/parsenodes.h"
+
+#include "replication/logical.h"
+#include "replication/output_plugin.h"
+
+#include "storage/lock.h"
+
+#define LREP_PROTO_VERSION_NUM 1
+#define LREP_PROTO_MIN_REMOTE_VERSION_NUM 1
+
+typedef struct
+{
+	MemoryContext context;
+
+	bool allow_binary_protocol;
+	bool allow_sendrecv_protocol;
+	bool int_datetime_mismatch;
+	bool forward_changesets;
+
+	uint32 client_pg_version;
+	uint32 client_pg_catversion;
+	uint32 client_proto_version;
+	uint32 client_min_proto_version;
+	size_t client_sizeof_int;
+	size_t client_sizeof_long;
+	size_t client_sizeof_datum;
+	size_t client_maxalign;
+	bool client_bigendian;
+	bool client_float4_byval;
+	bool client_float8_byval;
+	bool client_int_datetime;
+	char *client_db_encoding;
+
+	void *extra;		/* Additional data */
+} LREPOutputData;
+
+
+typedef struct LREPTupleData
+{
+	Datum	values[MaxTupleAttributeNumber];
+	bool	nulls[MaxTupleAttributeNumber];
+	bool	changed[MaxTupleAttributeNumber];
+} LREPTupleData;
+
+
+extern void lrep_write_begin(StringInfo out, ReorderBufferTXN *txn, int flags,
+			 StringInfo extradata);
+extern void lrep_write_commit(StringInfo out, ReorderBufferTXN *txn,
+			  XLogRecPtr commit_lsn, int flags,
+			  StringInfo extradata);
+
+extern void lrep_write_insert(LREPOutputData *data, StringInfo out,
+			  Relation rel, HeapTuple newtuple);
+extern void lrep_write_update(LREPOutputData *data, StringInfo out,
+			  Relation rel, HeapTuple oldtuple,
+			  HeapTuple newtuple);
+extern void lrep_write_delete(LREPOutputData *data, StringInfo out,
+			  Relation rel, HeapTuple oldtuple);
+
+extern void lrep_write_rel(StringInfo out, Relation rel);
+extern void lrep_write_tuple(LREPOutputData *data, StringInfo out, Relation rel,
+		HeapTuple tuple);
+
+extern int lrep_read_begin(StringInfo in, XLogRecPtr *origlsn,
+		   TimestampTz *committime, TransactionId *remote_xid);
+extern int lrep_read_commit(StringInfo in, XLogRecPtr *commit_lsn,
+			XLogRecPtr *end_lsn, TimestampTz *committime);
+extern Relation lrep_read_insert(StringInfo in, LOCKMODE lockmode,
+ LREPTupleData *tuple);
+extern Relation lrep_read_update(StringInfo in, LOCKMODE lockmode,
+ LREPTupleData *oldtuple,
+ LREPTupleData *newtuple, bool *pkeysent);
+extern Relation lrep_read_delete(StringInfo in, LOCKMODE lockmode,
+ LREPTupleData *tuple, bool *pkeysent);
+
+extern void lrep_read_tuple_parts(StringInfo s, TupleDesc desc,
+  LREPTupleData *tuple);
+extern RangeVar *lrep_read_rel(StringInfo s);
+
+extern bool lrep_send_feedback(PGconn *conn, XLogRecPtr recvpos,
+			   XLogRecPtr writepos, XLogRecPtr flushpos,
+			   int64 now, bool force);
+
+extern void lrep_opt_parse_notnull(DefElem *elem, const char *paramtype);
+extern void lrep_opt_parse_uint32(DefElem *elem, uint32 *res);
+extern void lrep_opt_parse_size_t(DefElem *elem, size_t *res);
+extern void lrep_opt_p

Re: [HACKERS] Commit fest 2014-12, let's begin!

2014-12-14 Thread Tom Lane
Michael Paquier  writes:
> Let's begin the commit fest of December:
> https://commitfest.postgresql.org/action/commitfest_view?id=25
> ...
> To committers: there are 2 patches marked as ready for committer from
> previous commit fest:
> - Foreign table inheritance

I should probably take this one.

> - Point to polygon distance operator

I looked at that briefly during the last fest, but was unsure whether it
was too entangled with the GiST patches that Heikki was looking at.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add transforms feature

2014-12-14 Thread Tom Lane
Peter Eisentraut  writes:
> On 4/4/14 6:21 PM, Andres Freund wrote:

> + /* dependency on transform used by return type, if any */
> + if ((trfid = get_transform_oid(returnType, languageObjectId, true)))

>> Should be compared to InvalidOid imo, rather than implicitly assuming
>> that InvalidOid evaluates to false.

> I think it's widely assumed that InvalidOid is false.

That's not the point; the point is that some nonzero number of compilers
(and probably Coverity too) will warn about this construct, on the not
unreasonable grounds that = might be a typo for ==.  (Those extra parens
might satisfy gcc, but not other tools.)  Please put in an explicit
comparison of the assignment result, as is done in approximately 100% of
the other places where this idiom appears in Postgres.
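As a hedged illustration of the idiom Tom asks for (`lookup_transform` is a stand-in for `get_transform_oid`, with made-up return values):

```c
#include <assert.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Stand-in for get_transform_oid(); returns InvalidOid when no
 * transform exists for the given type. */
static Oid
lookup_transform(Oid type_id)
{
    return (type_id == 23) ? 3000 : InvalidOid;  /* hypothetical OIDs */
}

/* Explicit comparison against InvalidOid: no compiler or Coverity
 * warning about '=' possibly being a typo for '==', and no implicit
 * assumption that InvalidOid evaluates to false. */
static int
has_transform(Oid type_id)
{
    Oid trfid;

    if ((trfid = lookup_transform(type_id)) != InvalidOid)
        return 1;
    return 0;
}
```

The extra parentheses around the assignment stay; only the `!= InvalidOid` comparison is added, matching how the idiom appears elsewhere in the tree.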

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commit fest 2014-12, let's begin!

2014-12-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 3:09 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> - Point to polygon distance operator
> I looked at that briefly during the last fest, but was unsure whether it
> was too entangled with the GiST patches that Heikki was looking at.
Recalling my memories of this morning, things are rather independent.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-14 Thread Michael Paquier
Note: this patch has been moved to CF 2014-12, and I marked myself as
an author, if that's fine... I've ended up being really involved in
it.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On partitioning

2014-12-14 Thread Claudio Freire
On Sun, Dec 14, 2014 at 11:12 PM, Amit Langote
 wrote:
>> On egress you need some direct way to compare the scan quals with the
>> partitioning values.  I would imagine this to be similar to how scan
>> quals are compared to the values stored in a BRIN index: each scan qual
>> has a corresponding operator strategy and a scan key, and you can say
>> "aye" or "nay" based on a small set of operations that can be run
>> cheaply, again without any proof or running arbitrary expressions.
>>
>
> My knowledge of this is far from being perfect, though to clear any 
> confusions -
>
> As far as planning is concerned, I could not imagine how index access method 
> way of pruning partitions could be made to work. Of course, I may be missing 
> something.

Let me be overly verbose; don't take it as patronizing, I'm just answering
in lots of detail why this could be a good idea to try.

Normal indexes store a pointer for each key value, of sorts. So B-Tree
gets you a set of tids for each key, and so do GIN and hash.

But BRIN is different in that it does the mapping differently. BRIN
stores a compact, approximate representation of the set of keys within
a page range. It can tell with some degree of (in)accuracy whether a
key or key range could be part of that page range or not. The way it
does this is abstracted out, but at its core, it stores a "compressed"
representation of the key set that takes a constant number of bits to
store, no matter how many elements it covers. What changes as the set
it represents grows is its accuracy.

Currently, BRIN only supports min-max representations. It will store,
for each page range, the minimum and maximum of some columns, and when
you query it, you can compare range vs range, and discard whole page
ranges.

Now, what are partitions, if not page ranges?

A BRIN tuple is a min-max pair. But BRIN is more general: it could use
other data structures to hold that "compressed representation", if
someone implemented them. Like bloom filters [0].

A BRIN index is a complex data structure because it has to account for
physically growing tables, but all the complexities vanish when you
fix a "block range" (the BR in BRIN) to a partition. Then, a mere
array of BRIN tuples would suffice.

BRIN already contains the machinery to turn quals into something that
filters out entire partitions, if you provide the BRIN tuples.

And you could even effectively maintain a BRIN index for the partitions
(just a BRIN tuple per partition, dynamically updated with every
insertion).

If you do that, you start with empty partitions, and each insert
updates the BRIN tuple. Avoiding concurrency loss in this case would
be tricky, but in theory this could allow very general partition
exclusion. In fact it could even work with constraint exclusion right
now: you'd have a single-tuple BRIN index for each partition and
benefit from it.

But you don't need to pay the price of updating BRIN indexes, as
min-max tuples for each partition can be produced while creating the
partitions if the syntax already provides the information. Then, it's
just a matter of querying this meta-data which just happens to have
the form of a BRIN tuple for each partition.

[0] http://en.wikipedia.org/wiki/Bloom_filter
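A toy sketch of the idea (plain C with invented names; nothing to do with the actual BRIN code): one min-max "BRIN tuple" per partition, and a qual check that discards whole partitions with two comparisons:

```c
#include <assert.h>

/* One min-max summary per partition -- the fixed-size "compressed
 * representation" described above, specialized to a single int key. */
typedef struct PartitionMinMax
{
    int min;
    int max;
} PartitionMinMax;

/* Could any row with a key in [qual_lo, qual_hi] live in this
 * partition?  If not, the whole partition is excluded cheaply,
 * without proofs or running arbitrary expressions. */
static int
partition_may_match(const PartitionMinMax *p, int qual_lo, int qual_hi)
{
    return !(qual_hi < p->min || qual_lo > p->max);
}
```

With a fixed partition per "block range", the planner-side check reduces to scanning a small array of these summaries and keeping only the partitions for which `partition_may_match` returns true.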


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Comment typo in typedef struct BrinTuple

2014-12-14 Thread Amit Langote

Hi,

Find attached that does:

-* mt_info is laid out in the following fashion:
+* bt_info is laid out in the following fashion:

uint8   bt_info;
} BrinTuple;

Thanks,
Amit


BrinTuple-typo-fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)

2014-12-14 Thread Kouhei Kaigai
This attached patch adds a code example of custom-scan interface.

This custom-scan provider ("ctidscan") performs almost the same as
the built-in SeqScan plan, but can produce the same results with
fewer page reads when the relation scan's qualifier has inequality
operators (>, >=, < or <=) on the "ctid" system column, so the
range to be scanned is smaller than in a full-table scan.

Below is an example:

postgres=# EXPLAIN SELECT * FROM t1 WHERE ctid > '(10,0)'::tid AND b like 
'%abc%';
 QUERY PLAN
-
 Seq Scan on t1  (cost=0.00..10.00 rows=1 width=37)
   Filter: ((ctid > '(10,0)'::tid) AND (b ~~ '%abc%'::text))
(2 rows)

Once ctidscan is loaded, it can provide a cheaper cost to scan
the "t1" table than SeqScan, so the planner chooses the custom logic.

postgres=# LOAD 'ctidscan';
LOAD
postgres=# EXPLAIN SELECT * FROM t1 WHERE ctid > '(10,0)'::tid AND b like 
'%abc%';
   QUERY PLAN
-
 Custom Scan (ctidscan) on t1  (cost=0.00..5.00 rows=1 width=37)
   Filter: ((ctid > '(10,0)'::tid) AND (b ~~ '%abc%'::text))
   ctid quals: (ctid > '(10,0)'::tid)
(3 rows)

Like other query execution logic, it also provides an "enable_ctidscan"
parameter to turn this feature on/off.
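For illustration only (TidBound and first_block_for_gt are invented names, not from the patch), the way a lower-bound qual like ctid > '(10,0)' narrows the heap range can be sketched as:

```c
#include <assert.h>

/* A tid is (block number, line offset); ctid-range pruning mostly
 * needs the block part to pick the first heap page to scan. */
typedef struct TidBound
{
    unsigned block;
    unsigned offset;
} TidBound;

/* First heap block worth scanning for "ctid > bound".  The bound's
 * own block still qualifies as long as offsets above bound.offset
 * can exist on that page. */
static unsigned
first_block_for_gt(TidBound bound, unsigned max_offsets_per_page)
{
    if (bound.offset >= max_offsets_per_page)
        return bound.block + 1;   /* nothing later on that page */
    return bound.block;
}
```

For 8 kB heap pages, the real per-page offset limit (MaxHeapTuplesPerPage) is around 291; the parameter here is just illustrative. An upper bound on ctid gives the last block symmetrically, and the scan walks only [first, last].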

I'm not certain whether we should have this functionality in contrib
from the perspective of the workloads it can help, but its main value
is as an example of the custom-scan interface.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei 


> -Original Message-
> From: Robert Haas [mailto:robertmh...@gmail.com]
> Sent: Thursday, December 11, 2014 11:58 AM
> To: Simon Riggs
> Cc: Kaigai Kouhei(海外 浩平); Thom Brown; Kohei KaiGai; Tom Lane; Alvaro
> Herrera; Shigeru Hanada; Stephen Frost; Andres Freund; PgHacker; Jim
> Mlodgenski; Peter Eisentraut
> Subject: ##freemail## Re: [HACKERS] [v9.5] Custom Plan API
> 
> On Tue, Dec 9, 2014 at 3:24 AM, Simon Riggs  wrote:
> > Feedback I am receiving is that the API is unusable. That could be
> > because it is impenetrable, or because it is unusable. I'm not sure it
> > matters which.
> 
> It would be nice to hear what someone is trying to use it for and what
> problems that person is encountering.  Without that, it's pretty much
> impossible
> for anyone to fix anything.
> 
> As for sample code, KaiGai had a working example, which of course got broken
> when Tom changed the API, but it didn't look to me like Tom's changes would
> have made anything impossible that was possible before.
> I'm frankly kind of astonished by the tenor of this entire conversation;
> there is certainly plenty of code in the backend that is less
> self-documenting than this is; and KaiGai did already put up a wiki page
> with documentation as you requested.  From his response, it sounds like
> he has updated the ctidscan code, too.
> 
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
> Company


pgsql-v9.5-contrib-ctidscan.v1.patch
Description: pgsql-v9.5-contrib-ctidscan.v1.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers