Re: [HACKERS] extend pgbench expressions with functions
> Thanks. Part 1 looks, on the whole, fine to me, although I think the
> changes to use less whitespace and removing decimal places in the
> documentation are going in the wrong direction. That is:
>
>   - About 67% of values are drawn from the middle 1.0 / threshold
>   + About 67% of values are drawn from the middle 1/param,
>
> I would say 1.0 / param, just as we used to say 1.0 / threshold. Any
> reason why not?

For the 1.0 -> 1, this is because in the example afterwards I set param to 2.0 and I wanted it clear where the one half was coming from, and ISTM that the 2.0 stands out more with "1 / 2.0" than with "1.0 / 2.0".

For the spaces, this is because with just "1/" the space seemed less necessary for clarity, but it seemed necessary with "1.0 /". Now it is easy to backtrack.

After looking at the generated html version, I find that the "1/param" and "2/param" formulas are very simple and pretty easy to read, and they would not really be enhanced with additional spacing. ISTM that adaptative spacing (no spacing for level 1 operations, some for higher levels) is a good approach for readability, ie:

    f(i) - f(i+1)
              ^ no spacing here
         ^ some spacing here

So I would suggest to keep the submitted version, unless this is a blocker.

-- Fabien.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Sat, Nov 7, 2015 at 12:07 AM, Robert Haas wrote:
> On Wed, Aug 12, 2015 at 6:25 AM, Ashutosh Bapat wrote:
> > The previous patch would not compile on the latest HEAD. Here's
> > updated patch.
>
> Perhaps unsurprisingly, this doesn't apply any more. But we have
> bigger things to worry about.
>
> The recent eXtensible Transaction Manager and the slides shared at the
> Vienna sharding summit, now posted at
> https://drive.google.com/file/d/0B8hhdhUVwRHyMXpRRHRSLWFXeXc/view
> make me think that some careful thought is needed here about what we
> want and how it should work. Slide 10 proposes a method for the
> extensible transaction manager API to interact with FDWs. The FDW
> would do this:
>
> select dtm_join_transaction(xid);
> begin transaction;
> update...;
> commit;
>
> I think the idea here is that the commit command doesn't really
> commit; it just escapes the distributed transaction while leaving it
> marked not-committed. When the transaction subsequently commits on
> the local server, the XID is marked committed and the effects of the
> transaction become visible on all nodes.

As per my reading of the slides you shared, the commit in the above context would send a message to the Arbiter indicating its vote for being ready to commit, and when the Arbiter gets the votes from all nodes participating in the transaction, it sends back an OK message (this is what I could understand from slides 12 and 13). I think on receiving the OK message each node will mark the transaction as committed.

> I think that this API is intended to provide not only consistent
> cross-node decisions about whether a particular transaction has
> committed, but also consistent visibility. If the API is sufficient
> for that and if it can be made sufficiently performant, that's a
> strictly stronger guarantee than what this proposal would provide.
> On the whole, I'm inclined to think that the XTM-based approach is
> probably more useful and more general, if we can work out the problems
> with it. I'm not sure that I'm right, though, nor am I sure how hard
> it will be.

If I understood correctly, the main difference between the 2PC idea used in this patch (assuming we find some way of sharing snapshots in this approach) and what is shared in the slides is that the XTM-based approach relies on an external entity, which it refers to as the Arbiter, for performing consistent transaction commit/abort and sharing snapshots across all the nodes, whereas in the approach in this patch the transaction originator (or we could call it the coordinator) is responsible for consistent transaction commit/abort.

I think the plus point of the XTM-based approach is that it provides a way of sharing snapshots, but I think we still need to evaluate the communication overhead of these methods. As far as I can see, in the Arbiter-based approach the Arbiter could become a single point of contention for coordinating messages for all the transactions in a system, whereas if we extend the approach in this patch such contention could be avoided. Now it is very well possible that the number of messages shared between nodes in the Arbiter-based approach is smaller, but contention could still play a major role.

Also, another important point which needs some more thought before concluding on any approach is detection of deadlocks between different nodes. In the slides you shared there is no discussion of deadlocks, so it is not clear whether it will work as-is without any modification, or whether we need modifications to the deadlock detection system, and if so, how that will be achieved.

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
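For what it's worth, the voting step described above (slides 12 and 13, as read in this email) can be modeled in a few lines. This is purely an illustrative sketch; the class and method names are mine, not anything from the XTM patch:

```python
class Arbiter:
    """Toy model of the Arbiter role: each participating node sends a
    'ready to commit' vote for a transaction id, and the Arbiter answers
    OK only once every participant has voted.  Names and structure are
    illustrative, not taken from the XTM code."""

    def __init__(self, participants):
        self.participants = frozenset(participants)
        self.votes = {}  # xid -> set of nodes that voted "ready"

    def vote_ready(self, xid, node):
        voted = self.votes.setdefault(xid, set())
        voted.add(node)
        # Once all votes are in, every node may mark the xid committed.
        return "ok" if voted == self.participants else "wait"

arbiter = Arbiter({"node1", "node2"})
print(arbiter.vote_ready(100, "node1"))  # wait
print(arbiter.vote_ready(100, "node2"))  # ok
```

Even in this toy form, the single shared `Arbiter` object makes the contention concern visible: every transaction on every node funnels through it.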
Re: [HACKERS] extend pgbench expressions with functions
> 1. It's really not appropriate to fold the documentation changes raised
> on the other thread into this patch. I'm not going to commit something
> where the commit message is a laundry list of unrelated changes. Please
> separate out the documentation changes as a separate patch. Let's do
> that first, and then make this patch apply on top of those changes. The
> related changes in getGaussianRand etc. should also be part of that
> patch, not this one.

Hmmm. Attached is a two-part v16.

> Thanks. Part 1 looks, on the whole, fine to me, although I think the
> changes to use less whitespace and removing decimal places in the
> documentation are going in the wrong direction. That is:
>
>   - About 67% of values are drawn from the middle 1.0 / threshold
>   + About 67% of values are drawn from the middle 1/param,
>
> I would say 1.0 / param, just as we used to say 1.0 / threshold. Any
> reason why not?

For the 1.0 -> 1, this is because in the example afterwards I set param to 2.0 and I wanted it clear where the one half was coming from, and ISTM that the 2.0 stands out more with "1 / 2.0" than with "1.0 / 2.0". For the spaces, this is because with just "1/" the space seemed less necessary for clarity, but it seemed necessary with "1.0 /". Now it is easy to backtrack.

> That's easier to read IMHO and makes it more clear that it's integer
> division.

It is not. One benefit of "1.0" is that it makes clear that this is a double division.

> I'm copying Tomas Vondra on this email since he was the one who kicked
> off the other thread where this was previously being discussed.

Fine.

-- Fabien.
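To make the arithmetic behind the "about 67%" wording concrete, here is a quick simulation. It assumes the variate is a standard normal truncated at +/- param standard deviations and rescaled onto the output interval, which is how the documented behaviour reads; this is an illustration, not pgbench's actual C code:

```python
import random

def middle_fraction(param, n=200_000, seed=1):
    """Estimate the fraction of variates landing in the middle
    1.0 / param of the output interval, for a gaussian truncated
    at +/- param standard deviations."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        while abs(z) >= param:           # reject values outside +/- param
            z = rng.gauss(0.0, 1.0)
        x = (z + param) / (2.0 * param)  # rescale to [0, 1)
        if abs(x - 0.5) < 0.5 / param:   # middle 1.0 / param of the range
            hits += 1
    return hits / n

# With param = 2.0, roughly 70% of values fall in the middle half of
# the interval, in line with the documentation's "about 67%".
print(middle_fraction(2.0))
```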
Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
On Sat, Nov 7, 2015 at 1:52 AM, Andres Freund wrote:
> On 2015-11-06 11:42:56 -0500, Robert Haas wrote:
>> On Fri, Nov 6, 2015 at 2:47 AM, Michael Paquier wrote:
>> > I have as well thought a bit about adding a space-related constraint
>> > on the standby snapshot generated by the bgwriter, so as to not rely
>> > entirely on the interval of 15s. I finished with the attached that
>> > uses a check based on CheckPointSegments / 8 to be sure that at
>> > least this number of segments has been generated since the last
>> > checkpoint before logging a new snapshot. I guess that's less
>> > brittle than the last patch. Thoughts?
>>
>> I can't see why that would be a good idea. My understanding is that
>> the logical decoding code needs to get those messages pretty
>> regularly, and I don't see why that need would be reduced on systems
>> where CheckPointSegments is large.
>
> Precisely.
>
> What I'm thinking of right now is a marker somewhere in shared memory,
> that tells whether anything worthwhile has happened since the last
> determination of the redo pointer, where standby snapshots don't
> count. That seems like it'd be easier to maintain going forward than
> doing precise size calculations like CreateCheckPoint() already does,
> and would additionally need to handle its own standby snapshot, not to
> speak of the background ones.

I thought about something like that at some point: save a minimum activity pointer in XLogCtl, updated each time a segment is forcibly switched or after inserting a checkpoint record. Then the bgwriter checks whether the current insert position matches this minimum activity pointer, skipping LogStandbySnapshot if both positions match. Does this match your line of thought?

> Seems like it'd be doable in ReserveXLogInsertLocation().
> Whether it's actually worthwhile I'm not all that sure tho.

I am not sure doing extra work there is a good idea.
There are already 5 instructions within the spinlock section, and this code path is already high in many profiles for busy systems.

-- Michael
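The "minimum activity pointer" bookkeeping discussed in this exchange can be modeled in miniature. All names here are hypothetical (the real thing would live in XLogCtl under its spinlock, which is exactly the cost being debated); this only sketches the skip condition, not real WAL insertion:

```python
class WalActivityModel:
    """Toy model of the proposed check: skip LogStandbySnapshot() when
    nothing has been written since the pointer last advanced."""

    def __init__(self):
        self.insert_pos = 0
        self.min_activity_ptr = 0

    def insert(self, nbytes):
        # Any WAL record advances the insert position.
        self.insert_pos += nbytes

    def note_checkpoint_or_segment_switch(self):
        # Per the scheme above, the pointer is advanced when a segment
        # is forcibly switched or a checkpoint record is inserted.
        self.min_activity_ptr = self.insert_pos

    def bgwriter_should_log_snapshot(self):
        # Positions match -> the system has been idle since the last
        # checkpoint, so another standby snapshot is pure WAL churn.
        return self.insert_pos != self.min_activity_ptr

wal = WalActivityModel()
wal.note_checkpoint_or_segment_switch()
print(wal.bgwriter_should_log_snapshot())  # False: idle since checkpoint
wal.insert(128)
print(wal.bgwriter_should_log_snapshot())  # True: real activity occurred
```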
Re: [HACKERS] ALTER INDEX...SET tab completion
On Tue, Sep 8, 2015 at 4:23 PM, Jeff Janes wrote:
> I can never remember the syntax for setting index storage parameters.
> Is it =, TO, or just a space between the parameter name and the
> setting?
>
> This makes the tab completion more helpful, by providing the
> (mandatory) equals sign.

Committed.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Minor regexp bug
On Fri, Nov 6, 2015 at 7:32 PM, Tom Lane wrote:
> What I'm wondering about is whether to back-patch this. It's possible
> that people have written patterns like this and not realized that they
> aren't doing quite what's expected. Getting a failure instead might not
> be desirable in a minor release. On the other hand, wrong answers are
> wrong answers.

I'd vote to back-patch this. The unscientific reason on my end is that I suspect very few patterns in the wild would be affected, and furthermore anyone using such patterns is likely to be in a position to change them to match the existing behavior by replacing the "(\1)" with the corresponding "(\w)" as you used in your example. We should probably suggest just that in the release notes.

It is not a strongly held position, and my first reaction was that introducing an error should be avoided. But regular expressions are tricky enough to get right when the engine does what you tell it...

David J.
Re: [HACKERS] Bitmap index scans use of filters on available columns
On Fri, Nov 6, 2015 at 10:28 AM, Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> Hello,
>
> At Fri, 6 Nov 2015 09:49:30 +0530, Amit Kapila wrote in
> > Apart from other problems discussed, I think it could also lead to
> > a performance penalty for the cases when the qual condition is
> > costly, as evaluating such a qual against a lot of dead rows would be
> > a time-consuming operation. I am not sure, but I think some of the
> > other well-known databases might not have any such problems, as they
> > store visibility info inside the index, due to which they don't need
> > to fetch the heap tuple for evaluating such quals.
>
> I was just thinking of the same thing. Can we estimate the
> degree of the expected penalty using heap statistics? Of course
> not so accurate though.

I think so. Information related to the number of tuples inserted, updated, hot-updated, deleted and so on is maintained and is shown by pg_stat_all_tables. By the way, what is your point here: do you mean to say that we can estimate the approximate penalty and then decide whether quals should be evaluated at the index level?

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
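One crude way to put a number on the penalty being discussed, interpolating from the counters mentioned above (this sketch is my own reading of the idea, not something either poster spelled out): estimate the share of qual evaluations that would be wasted on dead rows.

```python
def wasted_qual_fraction(n_live_tup, n_dead_tup):
    """Rough penalty estimate: the share of tuples a costly qual would
    be evaluated against for nothing, using counters of the kind
    pg_stat_all_tables exposes (n_live_tup / n_dead_tup)."""
    total = n_live_tup + n_dead_tup
    return n_dead_tup / total if total else 0.0

# A heavily updated table carrying 25% dead rows would waste a quarter
# of the (potentially expensive) qual evaluations on invisible tuples.
print(wasted_qual_fraction(75_000, 25_000))  # 0.25
```

A planner could in principle compare such an estimate against the qual's cost to decide whether index-level evaluation pays off, which is roughly the question Horiguchi-san raises.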
Re: [HACKERS] Bitmap index scans use of filters on available columns
Hi,

On 11/07/2015 02:18 AM, Robert Haas wrote:
> On Fri, Nov 6, 2015 at 7:11 PM, Tomas Vondra wrote:
>>> I think LEAKPROOF is probably fine for this. How would the new thing
>>> be different?
>>
>> I don't think so - AFAIK "leakproof" is about not revealing
>> information about arguments, nothing more and nothing less. It does
>> not say whether it's safe to evaluate on indexes, and I don't think we
>> should extend the meaning in this direction.
>
> That seems like a non-answer answer.

I'm not claiming I have an answer, really. My knowledge of leakproof stuff is a bit shallow. Also, I had a few beers earlier today, which does not really improve the depth of knowledge on any topic except politics and football (aka soccer). So you may be perfectly right.

> Clearly, if a function can leak information about its arguments, for
> example by throwing an error, we can't call it on tuples that might not
> even be visible, or the behavior of the query might change. So any
> function that is not leakproof is also not safe for this purpose. Now
> that doesn't rule out the possibility that the functions for which this
> optimization is safe are a subset of the leakproof functions - but
> offhand I don't see why that should be the case. The index tuple is
> just a tuple, and the values it contains are just datums, no more or
> less than in a heap tuple. There could be a reason for concern here,
> but saying that it might not be "safe to evaluate on indexes" just begs
> the question: WHY wouldn't that be safe?

Ah. For some reason I thought the "division by zero" example was one of the non-safe cases, but now I see int4div is not marked as leakproof, so we couldn't push it to an index anyway. I've however also noticed that all the 'like' procedures are marked as not leakproof, which is a bit unfortunate because that's the example from Jeff's e-mail that started this thread.
regards

-- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Some questions about the array.
On 11/5/15 10:55 PM, Craig Ringer wrote:
> Omitted bounds are common in other languages and would be handy. I
> don't think they'd cause any issues with multi-dimensional arrays or
> variable start-pos arrays.

+1

> I'd love negative indexes, but the variable-array-start (mis)feature
> means we can't have those. I wouldn't shed a tear if
> variable-start-position arrays were deprecated and removed, but that's
> a multi-year process, and I'm not convinced negative indexes justify it
> even though the moveable array start pos feature seems little-used.

I'm all for ditching variable start, full stop.

Since the start-pos is recorded in the array, I wonder if it's worth supporting negative indexing for arrays with the default 1-indexed element numbering, and just ERRORing for others. Does anyone really use anything else? I'd prefer that over using something like ~.

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
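The semantics floated above are easy to sketch. This is entirely hypothetical — nothing like it exists in PostgreSQL — and the function name and error wording are mine:

```python
def resolve_subscript(i, lower, upper):
    """Sketch of Jim's proposal: negative subscripts count from the end,
    but only for arrays with the default lower bound of 1; any other
    lower bound raises an error, mirroring the suggested ERROR."""
    if i >= 0:
        return i
    if lower != 1:
        raise ValueError("negative subscripts only defined for 1-based arrays")
    return upper + 1 + i  # -1 -> last element, -2 -> second to last, ...

print(resolve_subscript(-1, 1, 5))  # 5: last element of a 5-element array
print(resolve_subscript(-2, 1, 5))  # 4
```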
Re: [HACKERS] Request: pg_cancel_backend variant that handles 'idle in transaction' sessions
On 11/3/15 8:44 AM, Merlin Moncure wrote:
>> Actually, one other thing that would help is to have the ability to
>> turn this into an ERROR:
>>
>> begin;
>> WARNING: there is already a transaction in progress
>
> curious: does the SQL standard define this behavior? Anyways, we've
> pretty studiously avoided (minus a couple of anachronisms) .conf
> settings that control the behavior of SQL commands in a
> non-performance-related way.

If we had an event trigger on BEGIN and a way to tell whether we were already in a transaction this wouldn't need to be a config setting.

> IMO, this is yet another case for 'stored procedures' that can manage
> transaction state: you could rig up your own procedure:
>
> CALL begin_tx_safe();
>
> which would test transaction state and fail if already in one. This
> doesn't help you if you're not in direct control of application
> generated SQL but it's a start.

Even then it would be very easy to mess this up.

> Barring that, at least warnings tend to stand out in the database log.

That depends greatly on how much other stuff is in the log. Something else I wish we had was the ability to send different log output to different places.

-- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
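The guard Merlin describes amounts to very little code. Here is a toy model in plain Python — `begin_tx_safe` is hypothetical and nothing like it exists in PostgreSQL today; the point is only to show the WARNING-to-ERROR flip:

```python
class Session:
    """Minimal stand-in for a database session, just enough to show the
    proposed guard procedure."""

    def __init__(self):
        self.in_transaction = False

    def begin_tx_safe(self):
        # Turn today's WARNING into a hard failure.
        if self.in_transaction:
            raise RuntimeError("there is already a transaction in progress")
        self.in_transaction = True

s = Session()
s.begin_tx_safe()        # fine: opens a transaction
try:
    s.begin_tx_safe()    # a second BEGIN now errors instead of warning
except RuntimeError as e:
    print(e)
```

As noted above, this only helps applications that route their BEGINs through the wrapper, which is exactly why it is easy to mess up.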
[HACKERS] Minor regexp bug
Happened across this while investigating something else ...

The regexp documentation says:

    Lookahead and lookbehind constraints cannot contain back references
    (see ), and all parentheses within them are considered non-capturing.

This is true if you try a simple case, eg

regression=# select 'xyz' ~ 'x(\w)(?=\1)';
ERROR:  invalid regular expression: invalid backreference number

(That's not the greatest choice of error code, perhaps, but the point is it rejects the backref.) However, stick an extra set of parentheses in there, and the regexp parser forgets all about the restriction:

regression=# select 'xyz' ~ 'x(\w)(?=(\1))';
 ?column?
----------
 t
(1 row)

Since the execution machinery has no hope of executing such a backref properly, you silently get a wrong answer. (The actual behavior in a case like this is as though the backref had been replaced by a copy of the bit of regexp pattern it'd referred to, ie this pattern works like 'x(\w)(?=\w)', without any enforcement that the two \w's are matching identical text.)

The fix is a one-liner, as per the attached patch: when recursing to parse the innards of a parenthesized subexpression, we have to pass down the current "type", not necessarily PLAIN, so that any type-related restrictions still apply inside the subexpression.

What I'm wondering about is whether to back-patch this. It's possible that people have written patterns like this and not realized that they aren't doing quite what's expected. Getting a failure instead might not be desirable in a minor release. On the other hand, wrong answers are wrong answers.

Thoughts?

regards, tom lane

diff --git a/src/backend/regex/regcomp.c b/src/backend/regex/regcomp.c
index aa759c2..a165b3b 100644
*** a/src/backend/regex/regcomp.c
--- b/src/backend/regex/regcomp.c
*** parseqatom(struct vars * v,
*** 951,957 ****
  	EMPTYARC(lp, s);
  	EMPTYARC(s2, rp);
  	NOERR();
! 	atom = parse(v, ')', PLAIN, s, s2);
  	assert(SEE(')') || ISERR());
  	NEXT();
  	NOERR();
--- 951,957 ----
  	EMPTYARC(lp, s);
  	EMPTYARC(s2, rp);
  	NOERR();
! 	atom = parse(v, ')', type, s, s2);
  	assert(SEE(')') || ISERR());
  	NEXT();
  	NOERR();
diff --git a/src/test/regress/expected/regex.out b/src/test/regress/expected/regex.out
index f0e2fc9..07fb023 100644
*** a/src/test/regress/expected/regex.out
--- b/src/test/regress/expected/regex.out
*** select 'a' ~ '()+\1';
*** 490,492 ****
--- 490,497 ----
  t
 (1 row)
 
+ -- Error conditions
+ select 'xyz' ~ 'x(\w)(?=\1)';  -- no backrefs in LACONs
+ ERROR:  invalid regular expression: invalid backreference number
+ select 'xyz' ~ 'x(\w)(?=(\1))';
+ ERROR:  invalid regular expression: invalid backreference number
diff --git a/src/test/regress/sql/regex.sql b/src/test/regress/sql/regex.sql
index d3030af..c45bdc9 100644
*** a/src/test/regress/sql/regex.sql
--- b/src/test/regress/sql/regex.sql
*** select 'a' ~ '$()|^\1';
*** 117,119 ****
--- 117,123 ----
  select 'a' ~ '.. ()|\1';
  select 'a' ~ '()*\1';
  select 'a' ~ '()+\1';
+ 
+ -- Error conditions
+ select 'xyz' ~ 'x(\w)(?=\1)';  -- no backrefs in LACONs
+ select 'xyz' ~ 'x(\w)(?=(\1))';
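For comparison, in an engine that does support backreferences inside lookahead constraints (Python's re module, used here purely for illustration), the backreference really is enforced rather than silently degrading to its referent:

```python
import re

# 'x(\w)(?=\1)': an 'x', then a word character that is immediately
# followed by that same character again.
pattern = re.compile(r'x(\w)(?=\1)')

print(bool(pattern.search('xyy')))  # True: 'y' is followed by another 'y'
print(bool(pattern.search('xyz')))  # False -- unlike the buggy Postgres
                                    # behavior described above, where the
                                    # backref acted like a plain \w
```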
Re: [HACKERS] nodes/*funcs.c inconsistencies
On Mon, Aug 3, 2015 at 10:00 AM, Alvaro Herrera wrote:
>> Couldn't we adopt
>> AssertVariableIsOfType()/AssertVariableIsOfTypeMacro() to macros like
>> READ_UINT_FIELD()?
>>
>> I'm surprised that this stuff was only ever used for logical decoding
>> infrastructure so far.
>
> The reason it's only used there is that Andres is the one who
> introduced those macros precisely for that code. We've not yet had time
> to adjust the rest of the code to have more sanity checks.

Any chance of picking this up, Andres?

-- Peter Geoghegan
Re: [HACKERS] Bitmap index scans use of filters on available columns
On Fri, Nov 6, 2015 at 7:11 PM, Tomas Vondra wrote:
>> I think LEAKPROOF is probably fine for this. How would the new thing
>> be different?
>
> I don't think so - AFAIK "leakproof" is about not revealing information
> about arguments, nothing more and nothing less. It does not say whether
> it's safe to evaluate on indexes, and I don't think we should extend
> the meaning in this direction.

That seems like a non-answer answer. Clearly, if a function can leak information about its arguments, for example by throwing an error, we can't call it on tuples that might not even be visible, or the behavior of the query might change. So any function that is not leakproof is also not safe for this purpose. Now that doesn't rule out the possibility that the functions for which this optimization is safe are a subset of the leakproof functions - but offhand I don't see why that should be the case. The index tuple is just a tuple, and the values it contains are just datums, no more or less than in a heap tuple. There could be a reason for concern here, but saying that it might not be "safe to evaluate on indexes" just begs the question: WHY wouldn't that be safe?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
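The error-channel leak is easy to demonstrate outside the database. Here is a toy illustration in plain Python (all values hypothetical) of why a qual that can throw must not be evaluated against tuples the snapshot cannot see:

```python
def qual(x):
    """Stand-in for a non-leakproof function such as int4div: its error
    behavior depends on the argument value."""
    return 10 // x > 1  # raises ZeroDivisionError when x == 0

visible = [5, 2]
invisible = [0]  # e.g. a dead tuple still present in the index

# Naively evaluating the qual on every index entry, visible or not:
for x in visible + invisible:
    try:
        qual(x)
    except ZeroDivisionError:
        # The error itself discloses that some row -- one this snapshot
        # should never see -- contains a zero.
        print("error just revealed data from an invisible row")
```

This shows why leakproof-ness is at least a necessary condition for pushing quals down to index tuples; whether it is also sufficient is the open question in this thread.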
Re: [HACKERS] Using quicksort for every external sort run
On Wed, Aug 19, 2015 at 7:24 PM, Peter Geoghegan wrote:
> I'll start a new thread for this, since my external sorting patch has
> now evolved well past the original "quicksort with spillover"
> idea...although not quite how I anticipated it would. It seems like
> I've reached a good point to get some feedback.

Corey Huinker has once again assisted me with this work, by doing some benchmarking on an AWS instance of his:

32 cores (c3.8xlarge, I suppose)
MemTotal: 251902912 kB

I believe it had one EBS volume.

This testing included 2 data sets:

* A data set that he happens to have that is representative of his production use-case. Corey had some complaints about the sort performance of PostgreSQL, particularly prior to 9.5, and I like to link any particular performance optimization to an improvement in an actual production workload, if at all possible.

* A tool that I wrote, that works on top of sortbenchmark.org's "gensort" [1] data generation tool. It seems reasonable to me to drive this work in part with a benchmark devised by Jim Gray. He did, after all, receive a Turing award for his contribution to transaction processing. I'm certainly a fan of his work. A key practical advantage of that is that it has reasonable guarantees about determinism, making these results relatively easy to recreate independently.

The modified "gensort" is available from https://github.com/petergeoghegan/gensort

The python script postgres_load.py performs bulk-loading for Postgres using COPY FREEZE.
It ought to be fairly self-documenting:

$:~/gensort$ ./postgres_load.py --help
usage: postgres_load.py [-h] [-w WORKERS] [-m MILLION] [-s] [-l] [-c]

optional arguments:
  -h, --help            show this help message and exit
  -w WORKERS, --workers WORKERS
                        Number of gensort workers (default: 4)
  -m MILLION, --million MILLION
                        Generate n million tuples (default: 100)
  -s, --skew            Skew distribution of output keys (default: False)
  -l, --logged          Use logged PostgreSQL table (default: False)
  -c, --collate         Use default collation rather than C collation
                        (default: False)

For this initial report to the list, I'm going to focus on a case involving 16 billion non-skewed tuples generated using the gensort tool. I wanted to see how a sort of a ~1TB table (1017GB as reported by psql, actually) could be improved, as compared to relatively small volumes of data (in the multiple gigabyte range) that were so improved by sorts on my laptop, which has enough memory to avoid blocking on physical I/O much of the time. How the new approach deals with hundreds of runs that are actually reasonably sized is also of interest.

This server does have a lot of memory, and many CPU cores. It was kind of underpowered on I/O, though. The initial load of 16 billion tuples (with a sortkey that is "C" locale text) took about 10 hours. My tool supports parallel generation of COPY format files, but serial performance of that stage isn't especially fast. Further, in order to support COPY FREEZE, and in order to ensure perfect determinism, the COPY operations occur serially in a single transaction that creates the table that we performed a CREATE INDEX on.

Patch, with 3GB maintenance_work_mem:

...
LOG: performsort done (except 411-way final merge): CPU 1017.95s/17615.74u sec elapsed 23910.99 sec
STATEMENT: create index on sort_test (sortkey );
LOG: external sort ended, 54740802 disk blocks used: CPU 2001.81s/31395.96u sec elapsed 41648.05 sec
STATEMENT: create index on sort_test (sortkey );

So just over 11 hours (11:34:08), then. The initial sorting for 411 runs took 06:38:30.99, as you can see.

Master branch:

...
LOG: finished writing run 202 to tape 201: CPU 1224.68s/31060.15u sec elapsed 34409.16 sec
LOG: finished writing run 203 to tape 202: CPU 1230.48s/31213.55u sec elapsed 34580.41 sec
LOG: finished writing run 204 to tape 203: CPU 1236.74s/31366.63u sec elapsed 34750.28 sec
LOG: performsort starting: CPU 1241.70s/31501.61u sec elapsed 34898.63 sec
LOG: finished writing run 205 to tape 204: CPU 1242.19s/31516.52u sec elapsed 34914.17 sec
LOG: finished writing final run 206 to tape 205: CPU 1243.23s/31564.23u sec elapsed 34963.03 sec
LOG: performsort done (except 206-way final merge): CPU 1243.86s/31570.58u sec elapsed 34974.08 sec
LOG: external sort ended, 54740731 disk blocks used: CPU 2026.98s/48448.13u sec elapsed 55299.24 sec
CREATE INDEX
Time: 55299315.220 ms

So 15:21:39 for master -- it's much improved, but this was still disappointing given the huge improvements on relatively small cases.

Finished index was fairly large, which can be seen here by working back from "total relation size":

postgres=# select pg_size_pretty(pg_total_relation_size('sort_test'));
 pg_size_pretty
----------------
 1487 GB
(1 row)

I think that this is probably due to the relatively slow I/O on this server, and because the merge step is more o
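Converting the quoted elapsed times, just to double-check the arithmetic (the numbers are taken directly from the log excerpts above):

```python
def hms(seconds):
    """Render elapsed seconds as H:MM:SS."""
    s = int(seconds)
    return f"{s // 3600}:{s % 3600 // 60:02d}:{s % 60:02d}"

patched = 41648.05  # elapsed sec, patch with 3GB maintenance_work_mem
master = 55299.24   # elapsed sec, master branch

print(hms(patched))                # 11:34:08
print(hms(master))                 # 15:21:39
print(round(master / patched, 2))  # ~1.33x overall speedup
```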
Re: [HACKERS] Bitmap index scans use of filters on available columns
Hi,

On 11/05/2015 07:36 PM, Robert Haas wrote:
> On Thu, Nov 5, 2015 at 1:29 PM, Tomas Vondra wrote:
>> But then again, can we come up with a way to distinguish operators
>> that are safe to evaluate on indexes - either automatic or manual? We
>> already do that with the indexable operators by explicitly listing
>> them in the opclass, and we also explicitly use LEAKPROOF for a
>> similar thing. I don't think extending the opclass is a really good
>> solution, but why not invent another *PROOF flag?
>
> I think LEAKPROOF is probably fine for this. How would the new thing
> be different?

I don't think so - AFAIK "leakproof" is about not revealing information about arguments, nothing more and nothing less. It does not say whether it's safe to evaluate on indexes, and I don't think we should extend the meaning in this direction. I find it perfectly plausible that there will be leakproof functions that can't be pushed to indexes, and that would not be possible with a single marker.

Of course, "leakproof" may be a minimum requirement, but another marker is needed to actually enable pushing the function to the index. We may eventually realize that leakproof really is sufficient, and that we can push all leakproof functions to indexes. In that case we may deprecate the other marker and just use leakproof. But if we go just with leakproof and later find that we really need two markers because not all leakproof functions are index-pushable, it'll be much harder to fix, because it will cause performance regressions for users (some of the expressions won't be pushed to indexes anymore).

But I think the marker is the last thing we need to worry about.

regards

-- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
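The two-marker scheme argued for above amounts to a simple conjunction. Sketched with invented field names (`index_safe` is the hypothetical second marker; only `leakproof` corresponds to anything real, namely pg_proc.proleakproof):

```python
from collections import namedtuple

FuncInfo = namedtuple("FuncInfo", ["leakproof", "index_safe"])

def pushable_to_index(func):
    """Leakproof is a necessary condition (an erroring qual could leak
    values from invisible tuples), but under the two-marker scheme it is
    not sufficient on its own."""
    return func.leakproof and func.index_safe

print(pushable_to_index(FuncInfo(leakproof=True, index_safe=True)))   # True
print(pushable_to_index(FuncInfo(leakproof=True, index_safe=False)))  # False
print(pushable_to_index(FuncInfo(leakproof=False, index_safe=True)))  # False
```

The asymmetry Tomas points out falls out directly: collapsing to one marker later is easy, but splitting one marker into two after the fact regresses plans that relied on the broader set.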
Re: [HACKERS] Parallel Seq Scan
On Fri, Oct 23, 2015 at 9:22 PM, Amit Kapila wrote:
> The base rel's consider_parallel flag won't be percolated to childrels,
> so even if we mark the base rel as parallel capable, while generating
> the path it won't be considered. I think we need to find a way to pass
> on that information if we want to follow this way.

Fixed in the attached version. I added a max_parallel_degree check, too, per your suggestion.

> True, we can do it that way. What I was trying to convey by the above
> is that we anyway need checks during path creation, at least in some
> of the cases, so why not do all the checks at that time only, as I
> think all the information will be available at that time.
>
> I think if we store parallelism-related info in RelOptInfo, that can
> also be made to work, but the only worry I have with that approach is
> we need to have checks at two levels: one at RelOptInfo formation time
> and the other at Path formation time.

I don't really see that as a problem. What I'm thinking about doing (but it's not implemented in the attached patch) is additionally adding a ppi_consider_parallel flag to the ParamPathInfo. This would be meaningful only for baserels, and would indicate whether the ParamPathInfo's ppi_clauses are parallel-safe. If we're thinking about adding a parallel path to a baserel, we need the RelOptInfo to have consider_parallel set and, if there is a ParamPathInfo, we need the ParamPathInfo's ppi_consider_parallel flag to be set also. That shows that both the rel's baserestrictinfo and the parameterized join clauses are parallel-safe. For a joinrel, we can add a path if (1) the joinrel has consider_parallel set and (2) the paths being joined are parallel-safe. Testing condition (2) will require a per-Path flag, so we'll end up with one flag in the RelOptInfo, a second in the ParamPathInfo, and a third in the Path.
That doesn't seem like a problem, though: it's a sign that we're doing this in a way that fits into the existing infrastructure, and it should be pretty efficient. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company From e31d5f3f4c53d80e87a74925db88bcaf2e6fa564 Mon Sep 17 00:00:00 2001 From: Robert Haas Date: Fri, 2 Oct 2015 23:57:46 -0400 Subject: [PATCH 2/7] Strengthen planner infrastructure for parallelism. Add a new flag, consider_parallel, to each RelOptInfo, indicating whether a plan for that relation could conceivably be run inside of a parallel worker. Right now, we're pretty conservative: for example, it might be possible to defer applying a parallel-restricted qual in a worker, and later do it in the leader, but right now we just don't try to parallelize access to that relation. That's probably the right decision in most cases, anyway. --- src/backend/nodes/outfuncs.c | 1 + src/backend/optimizer/path/allpaths.c | 155 +++- src/backend/optimizer/plan/planmain.c | 12 +++ src/backend/optimizer/plan/planner.c | 9 +- src/backend/optimizer/util/clauses.c | 183 +++--- src/backend/optimizer/util/relnode.c | 21 src/backend/utils/cache/lsyscache.c | 22 src/include/nodes/relation.h | 1 + src/include/optimizer/clauses.h | 2 +- src/include/utils/lsyscache.h | 1 + 10 files changed, 364 insertions(+), 43 deletions(-) diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 3e75cd1..0030a9b 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -1868,6 +1868,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node) WRITE_INT_FIELD(width); WRITE_BOOL_FIELD(consider_startup); WRITE_BOOL_FIELD(consider_param_startup); + WRITE_BOOL_FIELD(consider_parallel); WRITE_NODE_FIELD(reltargetlist); WRITE_NODE_FIELD(pathlist); WRITE_NODE_FIELD(ppilist); diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 8fc1cfd..105e544 100644 --- 
a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -21,6 +21,7 @@ #include "access/tsmapi.h" #include "catalog/pg_class.h" #include "catalog/pg_operator.h" +#include "catalog/pg_proc.h" #include "foreign/fdwapi.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" @@ -71,6 +72,9 @@ static void set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, Index rti, RangeTblEntry *rte); static void set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte); +static void set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel, + RangeTblEntry *rte); +static bool function_rte_parallel_ok(RangeTblEntry *rte); static void set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte); static void set_tablesample_rel_size(PlannerInfo *root, RelOptInfo *rel, @@ -158,7 +162,8 @@ make_one_rel(PlannerInfo *root, List *joinlist) set_base_rel_consider_startup(root); /* - * Generate access paths for t
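The three-flag scheme described in the message above can be sketched in a few lines of Python (names mirror the proposal; the `ppi_consider_parallel` flag is, as the message says, not yet implemented in the attached patch, so this is a sketch of the intended design, not of the actual C code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RelOptInfo:
    consider_parallel: bool          # rel's own quals etc. are parallel-safe

@dataclass
class ParamPathInfo:
    ppi_consider_parallel: bool      # parameterized join clauses (ppi_clauses)
                                     # are parallel-safe

@dataclass
class Path:
    parallel_safe: bool              # per-Path flag

def can_add_parallel_base_path(rel: RelOptInfo,
                               ppi: Optional[ParamPathInfo]) -> bool:
    # A parallel path on a baserel needs the RelOptInfo flag set and,
    # if the path is parameterized, the ParamPathInfo flag as well.
    if not rel.consider_parallel:
        return False
    if ppi is not None and not ppi.ppi_consider_parallel:
        return False
    return True

def can_add_parallel_join_path(joinrel: RelOptInfo,
                               outer: Path, inner: Path) -> bool:
    # A parallel join path needs (1) the joinrel flag and
    # (2) both input paths to be parallel-safe.
    return (joinrel.consider_parallel
            and outer.parallel_safe
            and inner.parallel_safe)
```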
Re: [HACKERS] Better name for PQsslAttributes()
Heikki Linnakangas writes: > On 11/06/2015 11:31 PM, Lars Kanis wrote: >> As a co-maintainer of the PostgreSQL adapter for Ruby, I would like to >> bridge the new SSL related functions to Ruby methods. However I wonder >> whether PQsslAttributes() is the best name for the function. Based on >> this name, I would expect to get key+value pairs instead of only the >> keys. IMHO PQsslAttributeNames() would express better, what the function >> does. > Hmm, I think you're right. > The question is, do we want to still change it? It's a new function in > 9.5, and we're just about to enter beta, so I guess we could, although > there might already be applications out there using it. If we do want to > rename it, now is the last chance to do it. > Thoughts? I'm leaning towards changing it now. I agree that this is about the last possible chance to rename it, if indeed that chance is not already past. However, it seems somewhat unlikely that anyone would be depending on the thing already, so I think probably we could get away with renaming it. +0.5 or so to changing it. But if we do, it has to happen before Monday. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] a raft of parallelism-related bug fixes
On Mon, Nov 2, 2015 at 9:29 PM, Robert Haas wrote:
> On Wed, Oct 28, 2015 at 10:23 AM, Robert Haas wrote:
>> On Sun, Oct 18, 2015 at 12:17 AM, Robert Haas wrote:
>>>> So reviewing patch 13 isn't possible without prior knowledge.
>>>
>>> The basic question for patch 13 is whether ephemeral record types can
>>> occur in executor tuples in any contexts that I haven't identified. I
>>> know that a tuple table slot can contain a column that is of type
>>> record or record[], and those records can themselves contain
>>> attributes of type record or record[], and so on as far down as you
>>> like. I *think* that's the only case. For example, I don't believe
>>> that a TupleTableSlot can contain a *named* record type that has an
>>> anonymous record buried down inside of it somehow. But I'm not
>>> positive I'm right about that.
>>
>> I have done some more testing and investigation and determined that
>> this optimism was unwarranted. It turns out that the type information
>> for composite and record types gets stored in two different places.
>> First, the TupleTableSlot has a type OID, indicating the sort of the
>> value it expects to be stored for that slot attribute. Second, the
>> value itself contains a type OID and typmod. And these don't have to
>> match. For example, consider this query:
>>
>> select row_to_json(i) from int8_tbl i(x,y);
>>
>> Without i(x,y), the HeapTuple passed to row_to_json is labelled with
>> the pg_type OID of int8_tbl. But with the query as written, it's
>> labeled as an anonymous record type. If I jigger things by hacking
>> the code so that this is planned as Gather (single-copy) -> SeqScan,
>> with row_to_json evaluated at the Gather node, then the sequential
>> scan kicks out a tuple with a transient record type and stores it into
>> a slot whose type OID is still that of int8_tbl. My previous patch
>> failed to deal with that; the attached one does.
>>
>> The previous patch was also defective in a few other respects. The
>> most significant of those, maybe, is that it somehow thought it was OK
>> to assume that transient typmods from all workers could be treated
>> interchangeably rather than individually. To fix this, I've changed
>> the TupleQueueFunnel implemented by tqueue.c to be merely a
>> TupleQueueReader which handles reading from a single worker only.
>> nodeGather.c therefore creates one TupleQueueReader per worker instead
>> of a single TupleQueueFunnel for all workers; accordingly, the logic
>> for multiplexing multiple queues now lives in nodeGather.c. This is
>> probably how I should have done it originally - someone, I think Jeff
>> Davis - complained previously that tqueue.c had no business embedding
>> the round-robin policy decision, and he was right. So this addresses
>> that complaint as well.
>
> Here is an updated version. This is rebased over recent commits, and
> I added a missing check for attisdropped.

Committed.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
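The reorganized design quoted above - one reader per worker, per-worker remapping of transient record typmods, and the round-robin policy moved into the Gather logic - can be sketched as follows (invented Python names; the real code lives in tqueue.c and nodeGather.c):

```python
class LeaderTypmodRegistry:
    """Hands out leader-side typmods for transient record types."""
    def __init__(self):
        self.next_typmod = 0

    def assign(self):
        t = self.next_typmod
        self.next_typmod += 1
        return t

class TupleQueueReader:
    """Reads tuples from a single worker's queue. Each reader keeps its
    own worker-typmod -> leader-typmod map, so typmods from different
    workers are never treated as interchangeable."""
    def __init__(self, queue):
        self.queue = queue          # stand-in for the shared-memory queue
        self.typmod_map = {}

    def remap(self, worker_typmod, leader_registry):
        if worker_typmod not in self.typmod_map:
            self.typmod_map[worker_typmod] = leader_registry.assign()
        return self.typmod_map[worker_typmod]

def gather(readers):
    # Round-robin multiplexing over one reader per worker; this policy
    # now belongs with the Gather logic, not the queue machinery.
    while any(r.queue for r in readers):
        for r in readers:
            if r.queue:
                yield r, r.queue.pop(0)
```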
Re: [HACKERS] Better name for PQsslAttributes()
On Fri, Nov 6, 2015 at 1:38 PM, Heikki Linnakangas wrote: > Thoughts? I'm leaning towards changing it now. +1 to the idea of changing it. -- Peter Geoghegan
Re: [HACKERS] Better name for PQsslAttributes()
On Fri, Nov 6, 2015 at 10:38 PM, Heikki Linnakangas wrote: > On 11/06/2015 11:31 PM, Lars Kanis wrote: > >> As a co-maintainer of the PostgreSQL adapter for Ruby, I would like to >> bridge the new SSL related functions to Ruby methods. However I wonder >> whether PQsslAttributes() is the best name for the function. Based on >> this name, I would expect to get key+value pairs instead of only the >> keys. IMHO PQsslAttributeNames() would express better, what the function >> does. >> > > Hmm, I think you're right. > > The question is, do we want to still change it? It's a new function in > 9.5, and we're just about to enter beta, so I guess we could, although > there might already be applications out there using it. If we do want to > rename it, now is the last chance to do it. > > Thoughts? I'm leaning towards changing it now. Uh, just to be clear, we been in beta for a month now, beta1 was released Oct 8. We are not just about to enter it... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Better name for PQsslAttributes()
On 11/06/2015 11:31 PM, Lars Kanis wrote:
> As a co-maintainer of the PostgreSQL adapter for Ruby, I would like to
> bridge the new SSL related functions to Ruby methods. However I wonder
> whether PQsslAttributes() is the best name for the function. Based on
> this name, I would expect to get key+value pairs instead of only the
> keys. IMHO PQsslAttributeNames() would express better, what the function
> does.

Hmm, I think you're right.

The question is, do we want to still change it? It's a new function in 9.5, and we're just about to enter beta, so I guess we could, although there might already be applications out there using it. If we do want to rename it, now is the last chance to do it.

Thoughts? I'm leaning towards changing it now.

- Heikki
[HACKERS] Better name for PQsslAttributes()
As a co-maintainer of the PostgreSQL adapter for Ruby, I would like to bridge the new SSL related functions to Ruby methods. However I wonder whether PQsslAttributes() is the best name for the function. Based on this name, I would expect to get key+value pairs instead of only the keys. IMHO PQsslAttributeNames() would express better, what the function does. -- Kind Regards, Lars
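The naming concern can be made concrete with a stub (illustrative Python, not real libpq): a function returning only keys reads naturally as "...Names", while values are fetched one at a time by name, in the style of PQsslAttribute:

```python
# Stub SSL state standing in for a connection's SSL attributes.
_SSL_STATE = {
    "library": "OpenSSL",
    "protocol": "TLSv1.2",
    "cipher": "ECDHE-RSA-AES256-GCM-SHA384",
}

def ssl_attribute_names(conn=None):
    """What the libpq function actually returns: only the keys."""
    return list(_SSL_STATE)

def ssl_attribute(conn=None, name=""):
    """Values are fetched individually, by attribute name."""
    return _SSL_STATE.get(name)
```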
Re: [HACKERS] extend pgbench expressions with functions
On Fri, Nov 6, 2015 at 3:44 PM, Fabien COELHO wrote:
>> 1. It's really not appropriate to fold the documentation changes
>> raised on the other thread into this patch. I'm not going to commit
>> something where the commit message is a laundry list of unrelated
>> changes. Please separate out the documentation changes as a separate
>> patch. Let's do that first, and then make this patch apply on top of
>> those changes. The related changes in getGaussianRand etc. should
>> also be part of that patch, not this one.
>
> Hmmm. Attached is a two-part v16.

Thanks. Part 1 looks, on the whole, fine to me, although I think the changes to use less whitespace and removing decimal places in the documentation are going in the wrong direction. That is:

- About 67% of values are drawn from the middle 1.0 / threshold
+ About 67% of values are drawn from the middle 1/param,

I would say 1.0 / param, just as we used to say 1.0 / threshold. Any reason why not? That's easier to read IMHO and makes it more clear that it's not integer division.

I'm copying Tomas Vondra on this email since he was the one who kicked off the other thread where this was previously being discussed.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] patch for geqo tweaks
On 07/11/15 09:59, Nathan Wagner wrote:
> [...]
> My day to make a fool of myself in public I guess. You're right of
> course. I can only plead distraction by having too many projects in
> mind at once and not focusing properly. Sorry for taking up your time
> on things I should have checked better.
> [...]

There are two types of people: those that don't bother and those that try, & fail!

Cheers,
Gavin
Re: [HACKERS] patch for geqo tweaks
On Fri, Nov 06, 2015 at 02:16:41PM -0500, Tom Lane wrote:
> Uh, what? It's not by any means turned off by default.
>
> postgres=# select name,setting from pg_settings where name like '%geqo%';
>         name         | setting
> ---------------------+---------
>  geqo                | on
[snip]

My day to make a fool of myself in public I guess. You're right of course. I can only plead distraction by having too many projects in mind at once and not focusing properly. Sorry for taking up your time on things I should have checked better.

> I'm inclined to think that removing all the ifdefd-out-by-default logic
> would be a fine thing to do, though.

I'll work up a patch.

-- nw
Re: [HACKERS] Patch: Implement failover on libpq connect level.
On Fri, Nov 6, 2015 at 10:38 AM, Alvaro Herrera wrote: > Craig Ringer wrote: >> On 6 November 2015 at 13:34, Robert Haas wrote: >> >> >> But some options control how >> >> next host should be choosen (i.e. use random order for load-balancing >> >> or sequential order for high availability), so they should be specified >> >> only once per connect string. >> > >> > But this seems like a point worthy of consideration. >> >> This makes me think that trying to wedge this into the current API >> using a funky connection string format might be a mistake. >> >> Lots of these issues would go away if you could provide more than just >> a connstring. > > Yes, I agree. I wonder if the failover aspect couldn't be better > covered by something more powerful than a single URI, such as the > service file format. Maybe just allow the contents of a service file to > be passed as a "connection string", so that the application/environment > can continue to maintain the connection info as a string somewhere > instead of having to have an actual file. This gets pretty far from the concept of a connection string, which is supposed to be a sort of universal format for telling anything PostgreSQL how to find the database server. For example, psql accepts a connection string, but you wouldn't want to have to pass the entire contents of a file, newlines and all, as a command-line argument. Now, we could design some kind of new connection string format that would work, like: psql 'alternatives failovertime=1 (host=192.168.10.49 user=rhaas) (host=192.168.10.52 user=bob)' There are a couple of problems with this. One, it can't be parsed anything like a normal connection string. Lots of software will have to be rewritten to use whatever parser we come up with for the new format. Two, it's nothing like the existing JDBC syntax that already does more or less what people want here. Three, I think that implementing it is likely to be an extraordinarily large project. 
The whole data structure that libpq uses to store a parsed connection string probably needs to change in fairly major ways. So, I really wonder why we're not happy with the ability to substitute out just the host and IP. http://www.postgresql.org/message-id/4dfa5a86.9030...@acm.org https://jdbc.postgresql.org/documentation/94/connect.html (bottom of page) Yes, it's possible that some people might need to change some other parameter when the host or IP changes. But those people aren't prohibited from writing their own function that tries to connect to multiple PostgreSQL servers one after the other using any system they'd like. I don't think we really need to cater to every possible use case in core. The JDBC feature has been out for several years and people seem to like it OK - the fact that we can think of things it doesn't do doesn't make it a bad idea. I'd rather have a feature that can do 95% of what people want in the real world next month than 99% of what people want in 2019, and I'm starting to think we're setting the bar awfully high. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics
Hi,

On 11/06/2015 03:38 PM, Andres Freund wrote:
>> While I saw an improvement for the 'synchronous_commit = on' case -
>> there is a small regression for 'off', using -M prepared + Unix Domain
>> Socket. If that is something that should be considered right now.
>
> What tests were you running, in which order? I presume it's a
> read/write pgbench? What scale, shared buffers?

Scale is 3000, and shared buffer is 64Gb, effective is 160Gb. Order was master/off -> master/on -> pinunpin/off -> pinunpin/on.

> I right now can't see any reason sc on/off should be relevant for the
> patch. Could it be an artifact of the order you ran tests in?

I was puzzled too, hence the post.

> Did you initdb between tests? Pgbench -i? Restart the database?

I didn't initdb / pgbench -i between the tests, so that it is likely it. I'll redo.

Best regards,
Jesper
Re: [HACKERS] extend pgbench expressions with functions
Hello Robert,

> 1. It's really not appropriate to fold the documentation changes
> raised on the other thread into this patch. I'm not going to commit
> something where the commit message is a laundry list of unrelated
> changes. Please separate out the documentation changes as a separate
> patch. Let's do that first, and then make this patch apply on top of
> those changes. The related changes in getGaussianRand etc. should
> also be part of that patch, not this one.

Hmmm. Attached is a two-part v16.

> 2. Please reduce the churn in the pgbench output example. Most of the
> lines that you've changed don't actually need to be changed.

I did a real run to get consistent figures, especially as now more information is shown. I did some computations to try to generate something consistent without changing too many lines, but just taking a real output would make more sense.

> 3. I think we should not have ENODE_INTEGER_CONSTANT and
> ENODE_DOUBLE_CONSTANT. We should just have ENODE_CONSTANT, and it
> should store the same datatype we use to represent values in the
> evaluator.

Why not. This induces a two level tag structure, which I do not find that great, but it simplifies the evaluator code to have one less case.

> 4. The way you've defined the value type is not great. int64_t isn't
> used anywhere in our source base. Don't start here.

Indeed. I really meant "int64" which is used elsewhere in pgbench.

> I think the is_none, is_int, is_double naming is significantly
> inferior to what I suggested, and unlike what we do through the rest
> of our code.

Hmmm. I think that it is rather a matter of taste, and PGBT_DOUBLE does not strike me as intrinsically superior, although possibly more in style. I changed value_t and value_type_t to PgBenchValue and ~Type to blend in, even if I find it on the heavy side.

> Similarly, the coercion functions are not very descriptively named,

I do not understand. "INT(value)" and "DOUBLE(value)" both look pretty explicit to me...
The point a choosing short names is to avoid newlines to stay (or try to stay) under 80 columns, especially as pg indentations rules tend to move things quickly on the right in case constructs, which are used in the evaluation function. I use "explicitely named" functions, but used short macros in the evaluator. and I don't see why we'd want those to be macros rather than static functions. Can be functions, sure. Switched. 5. The declaration of PgBenchExprList has a cuddled opening brace, which is not PostgreSQL style. Indeed. There were other instances. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 0ac40f1..da3c792 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -788,7 +788,7 @@ pgbench options dbname - \setrandom varname min max [ uniform | { gaussian | exponential } threshold ] + \setrandom varname min max [ uniform | { gaussian | exponential } param ] @@ -804,54 +804,63 @@ pgbench options dbname By default, or when uniform is specified, all values in the range are drawn with equal probability. Specifying gaussian or exponential options modifies this behavior; each - requires a mandatory threshold which determines the precise shape of the + requires a mandatory parameter which determines the precise shape of the distribution. For a Gaussian distribution, the interval is mapped onto a standard normal distribution (the classical bell-shaped Gaussian curve) truncated - at -threshold on the left and +threshold + at -param on the left and +param on the right. + Values in the middle of the interval are more likely to be drawn. 
To be precise, if PHI(x) is the cumulative distribution function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, then value i - between min and max inclusive is drawn - with probability: - -(PHI(2.0 * threshold * (i - min - mu + 0.5) / (max - min + 1)) - - PHI(2.0 * threshold * (i - min - mu - 0.5) / (max - min + 1))) / - (2.0 * PHI(threshold) - 1.0). - Intuitively, the larger the threshold, the more + defined as (max+min)/2, with + +f(x) = PHI(2 * param * (x-mu) / (max-min+1)) / (2 * PHI(param) - 1) + + then value i between min and + max inclusive is drawn with probability: + f(i+0.5) - f(i-0.5). + Intuitively, the larger the param, the more frequently values close to the middle of the interval are drawn, and the less frequently values close to the min and max bounds. - About 67% of values are drawn from the middle 1.0 / threshold - and 95% in the middle 2.0 / threshold; for instance, if - threshold is 4.0, 67% of values are drawn from the middle - quarter and 95% from the middle half of the interval. - The minimum threshold is 2.0 for per
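As a sanity check on the patched formulas above, a short script (assuming PHI is the standard normal CDF) confirms that the per-value probabilities f(i+0.5) - f(i-0.5) sum to exactly 1 over [min, max] and that roughly 67% of the mass sits in the middle 1/param of the interval:

```python
import math

def phi(x):
    # Standard normal CDF.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def f(x, lo, hi, param):
    # f(x) = PHI(2 * param * (x - mu) / (max - min + 1)) / (2 * PHI(param) - 1)
    mu = (hi + lo) / 2.0
    return phi(2.0 * param * (x - mu) / (hi - lo + 1)) / (2.0 * phi(param) - 1.0)

def prob(i, lo, hi, param):
    # Probability of drawing the integer value i.
    return f(i + 0.5, lo, hi, param) - f(i - 0.5, lo, hi, param)

def total_and_middle_mass(lo, hi, param):
    total = sum(prob(i, lo, hi, param) for i in range(lo, hi + 1))
    mu = (hi + lo) / 2.0
    half_width = (hi - lo) / (2.0 * param)   # middle 1/param of the interval
    middle = sum(prob(i, lo, hi, param) for i in range(lo, hi + 1)
                 if abs(i - mu) <= half_width)
    return total, middle
```

The telescoping sum works out to (PHI(param) - PHI(-param)) / (2 PHI(param) - 1) = 1 exactly, which is why the probabilities need no further normalization.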
Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics
Hi, On November 6, 2015 9:31:37 PM GMT+01:00, Jesper Pedersen wrote: >I have been testing this on a smaller system than yours - 2 socket >Intel(R) Xeon(R) CPU E5-2683 v3 w/ 2 x RAID10 SSD disks (data + xlog), >so focused on a smaller number of clients. Thanks for running tests! >While I saw an improvement for the 'synchronous_commit = on' case - >there is a small regression for 'off', using -M prepared + Unix Domain >Socket. If that is something that should be considered right now. What tests were you running, in which order? I presume it's a read/write pgbench? What scale, shared buffers? I right now can't see any reason sc on/off should be relevant for the patch. Could it be an artifact of the order you ran tests in? Did you initdb between tests? Pgbench -i? Restart the database? Andres --- Please excuse brevity and formatting - I am writing this on my mobile phone.
Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics
On 10/29/2015 01:18 PM, Alexander Korotkov wrote:
> We got a consensus with Andres that we should commit the CAS version
> first and look to other optimizations. Refactored version of atomic
> state patch is attached. The changes are following:
> 1) Macros are used for access refcount and usagecount.
> 2) likely/unlikely were removed. I think introducing of likely/unlikely
> should be a separate patch since it touches portability. Also, I didn't
> see any performance effect of this.
> 3) LockBufHdr returns the state after taking lock. Without using atomic
> increments it still can save some loops on skip atomic value reading.

I have been testing this on a smaller system than yours - 2 socket Intel(R) Xeon(R) CPU E5-2683 v3 w/ 2 x RAID10 SSD disks (data + xlog), so focused on a smaller number of clients.

While I saw an improvement for the 'synchronous_commit = on' case - there is a small regression for 'off', using -M prepared + Unix Domain Socket. If that is something that should be considered right now.

Maybe it is worth to update the README to mention that the flags are maintained in an atomic uint32 now.

BTW, there are two CommitFest entries for this submission:

https://commitfest.postgresql.org/7/370/
https://commitfest.postgresql.org/7/408/

Best regards,
Jesper
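The single-atomic-state idea - refcount, usage count, and flags packed into one uint32, manipulated via accessor macros and CAS loops - can be sketched with the bit arithmetic reduced to Python (an illustrative layout only; the exact mask widths here are assumptions, and the real code does this with atomic compare-and-exchange, which plain Python cannot model):

```python
# Illustrative 32-bit layout: low 18 bits refcount, next 4 bits usage
# count, top 10 bits for flag bits.
REFCOUNT_MASK = (1 << 18) - 1
USAGE_SHIFT = 18
USAGE_MASK = ((1 << 4) - 1) << USAGE_SHIFT
MAX_USAGE = USAGE_MASK >> USAGE_SHIFT
FLAG_SHIFT = 22
BM_LOCKED = 1 << FLAG_SHIFT          # example flag: "buffer header locked"

def get_refcount(state):
    return state & REFCOUNT_MASK

def get_usagecount(state):
    return (state & USAGE_MASK) >> USAGE_SHIFT

def pin(state):
    # This computes the new value inside what would be a CAS retry loop:
    # bump the refcount, and bump the usage count unless it is saturated.
    new = state + 1
    if get_usagecount(state) < MAX_USAGE:
        new += 1 << USAGE_SHIFT
    return new
```

Because all three fields live in one word, a pin is a single atomic update rather than a spinlock acquire/release around two counters.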
Re: [HACKERS] SortSupport for UUID type
On Fri, Nov 6, 2015 at 12:35 AM, Kyotaro HORIGUCHI wrote: > Hello, I tried to look on this as far as I can referring to > numeric.c.. Thank you for the review, Horiguchi-san. -- Peter Geoghegan
Re: [HACKERS] SortSupport for UUID type
On Fri, Nov 6, 2015 at 9:19 AM, Robert Haas wrote: > This is a good catch, so I pushed a fix. Thanks for your help. -- Peter Geoghegan
Re: [HACKERS] patch for geqo tweaks
Nathan Wagner writes:
> On Fri, Nov 06, 2015 at 11:45:38AM -0500, Tom Lane wrote:
>> (There's a fair amount of dead code in /geqo/, which I've never had
>> the energy to clean up, but maybe we should do that sometime. It
>> seems unlikely that anyone will ever be interested in experimenting
>> with the ifdef'ed-out code paths.)

> I also note that in src/backend/optimizer/path/allpaths.c there is a
> join_search_hook variable apparently intended for plugins (extensions?)
> to be able to control the search path optimizer. And the geqo code is
> AFAICT turned off by default anyway, so none of the code is used in
> probably the vast majority of systems, with standard_join_search() being
> called instead.

Uh, what? It's not by any means turned off by default.

postgres=# select name,setting from pg_settings where name like '%geqo%';
        name         | setting
---------------------+---------
 geqo                | on
 geqo_effort         | 5
 geqo_generations    | 0
 geqo_pool_size      | 0
 geqo_seed           | 0
 geqo_selection_bias | 2
 geqo_threshold      | 12
(7 rows)

You do need at least 12 tables in the FROM list to get it to be exercised with the default settings, which among other things means that our regression tests don't stress it much at all. But it's definitely reachable.

> Would it be worth either of removing at least the non-ERX portions of
> the geqo code, or removing the geqo code entirely (presumably with a
> deprecation cycle) and moving it to an extension?

Removing it is right out, as you'll soon find if you try to plan a query with a couple dozen tables with geqo turned off. The standard exhaustive search just gets too slow.

I'm inclined to think that removing all the ifdefd-out-by-default logic would be a fine thing to do, though.

regards, tom lane
Re: [HACKERS] Getting sorted data from foreign server for merge join
On Friday, November 6, 2015 10:32 AM, Robert Haas wrote: > I think this approach is generally reasonable, but I suggested > parts of it, so may be biased. I would be interested in hearing > the opinions of others. Has anyone taken a close look at what happens if the two sides of the merge join have different implementations of the same collation name? Is there anything we should do to defend against the problem? We already face the issue of corrupted indexes when we have different revisions of glibc on a primary and a standby or when the OS on a server is updated, so this wouldn't be entirely a *new* problem: http://www.postgresql.org/message-id/ba6132ed-1f6b-4a0b-ac22-81278f5ab...@tripadvisor.com ... but it would be a brand-new way to hit it, and we might be able to spot the problem in a merge join by watching for rows being fed to either side of the join which are not in order according to the machine doing the join. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
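The suggested defense - noticing when either input stream violates the sort order the joining machine expects - amounts to a cheap monotonicity check on each sorted input (an illustrative sketch, with a hypothetical comparator standing in for the local collation):

```python
def check_sorted(rows, key, collation_cmp):
    """Pass rows through unchanged, raising if any row sorts before its
    predecessor under the local machine's idea of the collation."""
    prev = None
    for row in rows:
        if prev is not None and collation_cmp(key(row), key(prev)) < 0:
            raise ValueError(
                "merge join input not in expected sort order; possible "
                "collation mismatch between local and remote server")
        prev = row
        yield row
```

The check is O(1) per row and only compares adjacent rows, so it would catch many - though not all - collation disagreements.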
Re: [HACKERS] Adjust errorcode in background worker code
On Sun, Jun 28, 2015 at 10:43 PM, Amit Langote wrote:
> On 2015-06-29 AM 11:36, Amit Langote wrote:
>> Hi,
>>
>> How about the attached that adjusts errorcode for the error related to
>> checking the flag bgw_flags in BackgroundWorkerInitializeConnection*()
>> functions so that it matches the treatment in SanityCheckBackgroundWorker()?
>>
>> s/ERRCODE_PROGRAM_LIMIT_EXCEEDED/ERRCODE_INVALID_PARAMETER_VALUE/g
>>
>> There is already a "/* XXX is this the right errcode? */" there.
>
> Oops, a wrong thing got attached.
>
> Please find correct one attached this time.

Well, I'm just catching up on some old email and saw this thread. I like the idea of trying to use the best possible error code, but I'm not so sure this is an improvement. One problem with ERRCODE_INVALID_PARAMETER_VALUE is that we use it, uh, a lot:

[rhaas pgsql]$ git grep ERRCODE_ | sed 's/.*ERRCODE_/ERRCODE_/; s/[^A-Z0-9_].*//;' | sort | uniq -c | sort -n -r | head
 540 ERRCODE_FEATURE_NOT_SUPPORTED
 442 ERRCODE_INVALID_PARAMETER_VALUE
 380 ERRCODE_SYNTAX_ERROR
 194 ERRCODE_WRONG_OBJECT_TYPE
 194 ERRCODE_UNDEFINED_OBJECT
 181 ERRCODE_DATATYPE_MISMATCH
 180 ERRCODE_INSUFFICIENT_PRIVILEGE
 150 ERRCODE_INVALID_TEXT_REPRESENTATION
 137 ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE
 123 ERRCODE_PROGRAM_LIMIT_EXCEEDED

I wonder if we need to think about inventing some new error codes. I can sort of understand that "feature not supported" is something that can come in a large number of different contexts and mean pretty much the same all the time, but I'm betting that things like "invalid parameter value" and "invalid text representation" and "object not in prerequisite state" cover an amazing breadth of errors that may not actually be that similar to each other.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Wed, Aug 12, 2015 at 6:25 AM, Ashutosh Bapat wrote: > The previous patch would not compile on the latest HEAD. Here's updated > patch. Perhaps unsurprisingly, this doesn't apply any more. But we have bigger things to worry about. The recent eXtensible Transaction Manager and the slides shared at the Vienna sharding summit, now posted at https://drive.google.com/file/d/0B8hhdhUVwRHyMXpRRHRSLWFXeXc/view make me think that some careful thought is needed here about what we want and how it should work. Slide 10 proposes a method for the extensible transaction manager API to interact with FDWs. The FDW would do this: select dtm_join_transaction(xid); begin transaction; update...; commit; I think the idea here is that the commit command doesn't really commit; it just escapes the distributed transaction while leaving it marked not-committed. When the transaction subsequently commits on the local server, the XID is marked committed and the effects of the transaction become visible on all nodes. I think that this API is intended to provide not only consistent cross-node decisions about whether a particular transaction has committed, but also consistent visibility. If the API is sufficient for that and if it can be made sufficiently performant, that's a strictly stronger guarantee than what this proposal would provide. On the other hand, I see a couple of problems: 1. The extensible transaction manager API is meant to be pluggable. Depending on which XTM module you choose to load, the SQL that needs to be executed by postgres_fdw on the remote node will vary. postgres_fdw shouldn't have knowledge of all the possible XTMs out there, so it would need some way to know what SQL to send. 2. If the remote server isn't running the same XTM as the local server, or if it is running the same XTM but is not part of the same group of cooperating nodes as the local server, then we can't send a command to join the distributed transaction at all. 
In that case, the 2PC for FDW approach is still, maybe, useful. On the whole, I'm inclined to think that the XTM-based approach is probably more useful and more general, if we can work out the problems with it. I'm not sure that I'm right, though, nor am I sure how hard it will be. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
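The commit/vote flow as read from the slides - each node's local COMMIT sends a ready-to-commit vote to an arbiter, and only when all participants have voted does "ok" come back and the XID become visible everywhere - can be sketched as follows (purely illustrative of the described protocol, not of any actual XTM code):

```python
class Arbiter:
    """Collects ready-to-commit votes from all transaction participants."""
    def __init__(self, participants):
        self.participants = set(participants)
        self.votes = set()

    def vote_ready(self, node):
        # A node's local COMMIT is really a vote, not a commit.
        self.votes.add(node)
        # "ok" goes back only once every participant has voted.
        return self.votes == self.participants

def run_distributed_commit(nodes):
    arbiter = Arbiter(nodes)
    committed = {n: False for n in nodes}
    for node in nodes:
        if arbiter.vote_ready(node):
            # On "ok", each node marks the XID committed, and the
            # transaction's effects become visible on all nodes at once.
            for n in nodes:
                committed[n] = True
    return committed
```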
Re: [HACKERS] patch for geqo tweaks
On Fri, Nov 06, 2015 at 11:45:38AM -0500, Tom Lane wrote: > However, really the whole argument is moot, because I notice that > geqo_mutation() is only called in the "#ifdef CX" code path, which > we don't use. I suppose someone could turn it on via a compiler define. > So there's little point in improving it. No, probably not. > (There's a fair amount of dead code in /geqo/, which I've never had > the energy to clean up, but maybe we should do that sometime. It > seems unlikely that anyone will ever be interested in experimenting > with the ifdef'ed-out code paths.) I also note that in src/backend/optimizer/path/allpaths.c there is a join_search_hook variable apparently intended for plugins (extensions?) to be able to control the search path optimizer. And the geqo code is AFAICT turned off by default anyway, so none of the code is used in probably the vast majority of systems, with standard_join_search() being called instead. Would it be worth either of removing at least the non-ERX portions of the geqo code, or removing the geqo code entirely (presumably with a deprecation cycle) and moving it to an extension? If there's any interest, I can work up a patch for either or both. There is only one test in the regression suite that turns on geqo that I could find. It's labeled "check for failure to generate a plan with multiple degenerate IN clauses" in join.sql. -- nw
Re: [HACKERS] patch for geqo tweaks
On Fri, Nov 06, 2015 at 11:19:00AM -0500, Tom Lane wrote: > Nathan Wagner writes: > > I see you committed a modified version of my patch in commit > > 59464bd6f928ad0da30502cbe9b54baec9ca2c69. > > > You changed the tour[0] to be hardcoded to 1, but it should be any > > of the possible gene numbers from 0 to remainder. > > How so? The intent is to replace the first iteration of the > Fisher-Yates loop, not the old loop. That iteration will certainly > end by assigning 1 to tour[0], because it must choose j = i = 0. You are correct. I got confused between reading the original code, my patch, and your modified patch. I wonder why the algorithm bothers with the first iteration at all; in the case of an initialized array, it would just swap the first element with itself. I must be missing something. I'll need to do some more reading. -- nw -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
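[Editor's note: the "first iteration swaps with itself" point above can be seen in a standalone sketch of the inside-out Fisher-Yates construction. The PRNG and names below are invented for illustration; this is not geqo's actual code.]

```c
#include <assert.h>

/* Tiny deterministic PRNG standing in for geqo's random source. */
static unsigned
next_rand(unsigned *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fff;
}

/* Uniform integer in [lo, hi]; a stand-in for geqo_randint(). */
static int
rand_in(unsigned *state, int lo, int hi)
{
    return lo + (int) (next_rand(state) % (unsigned) (hi - lo + 1));
}

/*
 * Inside-out Fisher-Yates construction of a random tour of genes 1..n.
 * The first iteration (i == 0) can only choose j == 0, so it always ends
 * by placing gene 1 in tour[0] -- which is why that iteration can be
 * replaced outright by the constant assignment below.  Later iterations
 * may still move gene 1 elsewhere, so the final result remains a
 * uniformly random permutation.
 */
static void
init_tour_sketch(unsigned *state, int *tour, int n)
{
    int i, j;

    tour[0] = 1;                /* the first iteration, made explicit */
    for (i = 1; i < n; i++)
    {
        j = rand_in(state, 0, i);
        if (j != i)
            tour[i] = tour[j];  /* move an earlier gene up */
        tour[j] = i + 1;        /* place the new gene */
    }
}
```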
Re: [HACKERS] [BUGS] BUG #12989: pg_size_pretty with negative values
On Fri, Nov 6, 2015 at 12:44 PM, Adrian Vondendriesch wrote: > Am 06.11.2015 um 17:06 schrieb Robert Haas: >> On Thu, Nov 5, 2015 at 4:19 PM, Adrian.Vondendriesch >> wrote: >>> New patch attached and rebased on HEAD >>> (8c75ad436f75fc629b61f601ba884c8f9313c9af). >> >> I've committed this with some modifications: >> >> - I changed the comment for the half_rounded() macros because the one >> you had just restated the code. >> - I tightened up the coding in numeric_half_rounded() very slightly. >> - You didn't, as far as I can see, modify the regression test schedule >> to execute the files you added. I consolidated them into one file, >> added it to the schedule, and tightened up the SQL a bit. > > Looks much better now. > >> >> Thanks for the patch, and please let me know if I muffed anything. > > Thanks for reviewing and improving the changes. > > I changed the status to committed. Great, thank you for working on this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Refactoring of LWLock tranches
On Fri, Nov 6, 2015 at 6:27 AM, Ildus Kurbangaliev wrote: > There is a patch that splits SLRU LWLocks to separate tranches and > moves them to SLRU Ctl. It does some work from the main patch from > this thread, but can be committed separately. It also simplifies > lwlock.c. Thanks. I like the direction this is going.

-	char *ptr;
-	Size		offset;
-	int		slotno;
+	char	   *ptr;
+	Size		offset;
+	int			slotno;
+	int			tranche_id;
+	LWLockPadded *locks;

Please don't introduce this kind of churn. pgindent will undo it. This isn't going to work for EXEC_BACKEND builds, I think. It seems to rely on the LWLockRegisterTranche() performed when !IsUnderPostmaster being inherited by subsequent children, which won't work under EXEC_BACKEND. Instead, store the tranche ID in SlruSharedData. Move the LWLockRegisterTranche call out from the (!IsUnderPostmaster) case and call it based on the tranche ID from SlruSharedData. I would just drop the add_postfix stuff. I think it's fine if the names of the shared memory chunks are just "CLOG" etc. rather than "CLOG Slru Ctl", and similarly I think the locks can be registered without the "Locks" suffix. It'll be clear from context that they are locks. I suggest also that we just change all of these names to be lower case, though I realize that's a debatable and cosmetic point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] BUG #12989: pg_size_pretty with negative values
On 06.11.2015 at 17:06, Robert Haas wrote: > On Thu, Nov 5, 2015 at 4:19 PM, Adrian.Vondendriesch > wrote: >> New patch attached and rebased on HEAD >> (8c75ad436f75fc629b61f601ba884c8f9313c9af). > > I've committed this with some modifications: > > - I changed the comment for the half_rounded() macros because the one > you had just restated the code. > - I tightened up the coding in numeric_half_rounded() very slightly. > - You didn't, as far as I can see, modify the regression test schedule > to execute the files you added. I consolidated them into one file, > added it to the schedule, and tightened up the SQL a bit. Looks much better now. > > Thanks for the patch, and please let me know if I muffed anything. Thanks for reviewing and improving the changes. I changed the status to committed. Regards, - Adrian
Re: [HACKERS] extend pgbench expressions with functions
On Fri, Nov 6, 2015 at 5:00 AM, Fabien COELHO wrote: >> Those can be avoided in other ways. For example: > > Ok, ok, I surrender:-) > > Here is a v15 which hides conversions and assignment details in macros and > factors out type testing of overloaded operators so that the code expansion > is minimal (basically the operator evaluation is duplicated for int & > double, but the rest is written once). The evaluation cost is probably > slightly higher than the previous version because of the many hidden type > tests. > > Note that variables are only int stored as text. Another patch may try to > propagate the value structure for variables, but then it changes the query > expansion code, it is more or less orthogonal to add functions. Moreover > double variables would not be really useful anyway. OK, comments on this version: 1. It's really not appropriate to fold the documentation changes raised on the other thread into this patch. I'm not going to commit something where the commit message is a laundry list of unrelated changes. Please separate out the documentation changes as a separate patch. Let's do that first, and then make this patch apply on top of those changes. The related changes in getGaussianRand etc. should also be part of that patch, not this one. 2. Please reduce the churn in the pgbench output example. Most of the lines that you've changed don't actually need to be changed. 3. I think we should not have ENODE_INTEGER_CONSTANT and ENODE_DOUBLE_CONSTANT. We should just have ENODE_CONSTANT, and it should store the same datatype we use to represent values in the evaluator. 4. The way you've defined the value type is not great. int64_t isn't used anywhere in our source base. Don't start here. I think the is_none, is_int, is_double naming is significantly inferior to what I suggested, and unlike what we do through the rest of our code. 
Similarly, the coercion functions are not very descriptively named, and I don't see why we'd want those to be macros rather than static functions. 5. The declaration of PgBenchExprList has a cuddled opening brace, which is not PostgreSQL style. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
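[Editor's note: one possible shape for points 3 and 4 above -- a single constant node kind whose payload is the evaluator's tagged value type, with coercions as static functions rather than macros. All names here are illustrative, not taken from the patch.]

```c
#include <assert.h>

/* Tagged value type for the expression evaluator (names invented). */
typedef enum
{
    PGBENCH_VALUE_NONE,
    PGBENCH_VALUE_INT,
    PGBENCH_VALUE_DOUBLE
} PgBenchValueType;

typedef struct
{
    PgBenchValueType type;
    union
    {
        long        ival;
        double      dval;
    }           u;
} PgBenchValue;

/* Coerce a value to integer, truncating a double toward zero. */
static long
coerce_to_int(const PgBenchValue *v)
{
    return (v->type == PGBENCH_VALUE_INT) ? v->u.ival : (long) v->u.dval;
}

/* Coerce a value to double, widening an integer exactly. */
static double
coerce_to_double(const PgBenchValue *v)
{
    return (v->type == PGBENCH_VALUE_DOUBLE) ? v->u.dval : (double) v->u.ival;
}
```

A single ENODE_CONSTANT node would then carry a PgBenchValue directly, so the parser does not need separate integer and double constant node kinds.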
Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
On Fri, Nov 6, 2015 at 12:26 PM, Andres Freund wrote: > On November 6, 2015 6:21:50 PM GMT+01:00, Robert Haas > wrote: >>On Fri, Nov 6, 2015 at 11:52 AM, Andres Freund >>wrote: >>> Seems like it'd be doable in ReserveXLogInsertLocation(). >>> >>> Whether it's actually worthwhile I'm not all that sure tho. >> >>Why not? > > Adds another instruction in one of the hottest spinlock protected sections of > PG. Probably won't be significant, but... Oh. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On Fri, Nov 6, 2015 at 9:42 AM, Kouhei Kaigai wrote: > This patch needs to be rebased. > One thing different from the latest version is that fdw_recheck_quals of > ForeignScan was added. So, ... > > (1) The principle is that the FDW driver knows what qualifiers were pushed down > and how they are kept in the private field. So, fdw_recheck_quals is > redundant and to be reverted. > > (2) Even though the principle is as described in (1), the > wired-in logic in ForeignRecheck() and fdw_recheck_quals are a useful > default for most FDW drivers. So, it shall be kept and valid > only if the RecheckForeignScan callback is not defined. > > Which is the better approach for the v3 patch? > My preference is (1), because fdw_recheck_quals is a new feature, > thus the FDW driver has to be adjusted in v9.5 more or less, even if > it already supports qualifier push-down. > In general, an interface becomes more graceful when it sticks to its principle. fdw_recheck_quals seems likely to be very convenient for FDW authors, and I think ripping it out would be a terrible decision. I think ForeignRecheck should first call ExecQual to test fdw_recheck_quals. If it returns false, return false. If it returns true, then give the FDW callback a chance, if one is defined. If that returns false, return false. If we haven't yet returned false, return true. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
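[Editor's note: the recheck control flow proposed above can be sketched as a pure function. Names are invented; in the real executor the first argument would be an ExecQual() call on fdw_recheck_quals against the scan state, and the callback would receive a ForeignScanState.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*RecheckForeignScanCB) (void *node);

/* Two toy callbacks used for demonstration below. */
static bool always_true(void *node)  { (void) node; return true; }
static bool always_false(void *node) { (void) node; return false; }

/*
 * Proposed ForeignRecheck() flow: first evaluate fdw_recheck_quals;
 * only if they pass does an optional FDW callback get a chance to
 * reject the tuple.  If neither rejects, the tuple survives.
 */
static bool
recheck_sketch(bool quals_pass, RecheckForeignScanCB cb, void *node)
{
    if (!quals_pass)
        return false;           /* fdw_recheck_quals failed */
    if (cb != NULL && !cb(node))
        return false;           /* FDW callback vetoed the tuple */
    return true;                /* no one returned false */
}
```

This keeps fdw_recheck_quals as the convenient default while still letting a driver that defines the callback apply arbitrary extra checks.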
Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
On November 6, 2015 6:21:50 PM GMT+01:00, Robert Haas wrote: >On Fri, Nov 6, 2015 at 11:52 AM, Andres Freund >wrote: >> Seems like it'd be doable in ReserveXLogInsertLocation(). >> >> Whether it's actually worthwhile I'm not all that sure tho. > >Why not? Adds another instruction in one of the hottest spinlock protected sections of PG. Probably won't be significant, but... --- Please excuse brevity and formatting - I am writing this on my mobile phone. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
On Fri, Nov 6, 2015 at 11:52 AM, Andres Freund wrote: > On 2015-11-06 11:42:56 -0500, Robert Haas wrote: >> On Fri, Nov 6, 2015 at 2:47 AM, Michael Paquier >> wrote: >> > I have as well thought a bit about adding a space-related constraint >> > on the standby snapshot generated by the bgwriter, so as to not rely >> > entirely on the interval of 15s. I finished with the attached that >> > uses a check based on CheckPointSegments / 8 to be sure that at least >> > this number of segments has been generated since the last checkpoint >> > before logging a new snapshot. I guess that's less brittle than the >> > last patch. Thoughts? >> >> I can't see why that would be a good idea. My understanding is that >> the logical decoding code needs to get those messages pretty >> regularly, and I don't see why that need would be reduced on systems >> where CheckPointSegments is large. > > Precisely. > > What I'm thinking of right now is a marker somewhere in shared memory, > that tells whether anything worthwhile has happened since the last > determination of the redo pointer. Where standby snapshots don't > count. That seems like it'd be easier to maintain going forward than doing > precise size calculations like CreateCheckPoint() already does, which > would additionally need to handle its own standby snapshot, not to speak > of the background ones. Good idea. > Seems like it'd be doable in ReserveXLogInsertLocation(). > > Whether it's actually worthwhile I'm not all that sure tho. Why not? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SortSupport for UUID type
On Fri, Nov 6, 2015 at 3:35 AM, Kyotaro HORIGUCHI wrote: > Hello, I tried to look on this as far as I can referring to > numeric.c.. Oops, I didn't see this review before committing. > 6. uuid_abbrev_convert() > > > memcpy((char *) &res, authoritative->data, sizeof(Datum)); > > memcpy's prototype is "memcpy(void *dest..." so the cast to > (char *) is not necessary. This is a good catch, so I pushed a fix. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
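[Editor's note: the fix above works because memcpy() is declared as memcpy(void *dest, const void *src, size_t n), and any object pointer converts to void * implicitly, so no (char *) cast is needed. A minimal illustration, with Datum modelled as a pointer-sized integer as in PostgreSQL:]

```c
#include <string.h>
#include <stdint.h>

/* Datum is a pointer-sized integer in PostgreSQL; uintptr_t stands in here. */
typedef uintptr_t Datum;

/* Copy the first sizeof(Datum) bytes out of a buffer into a Datum. */
static Datum
datum_from_bytes(const char *data)
{
    Datum res;

    memcpy(&res, data, sizeof(Datum));  /* no (char *) cast required */
    return res;
}
```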
Re: [HACKERS] SortSupport for UUID type
On Thu, Nov 5, 2015 at 7:10 PM, Peter Geoghegan wrote: > On Thu, Oct 8, 2015 at 5:27 PM, Peter Geoghegan wrote: >> This is more or less lifted from numeric_abbrev_convert_var(). Perhaps >> you should change it there too. The extra set of parenthesis are >> removed in the attached patch. The patch also mechanically updates >> things to be consistent with the text changes on the text thread [1] >> -- I had to rebase. > > Attached is almost the same patch, but rebased. This was required > because the name of our new macro was changed to > DatumBigEndianToNative() at the last minute. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
On 2015-11-06 11:42:56 -0500, Robert Haas wrote: > On Fri, Nov 6, 2015 at 2:47 AM, Michael Paquier > wrote: > > I have as well thought a bit about adding a space-related constraint > > on the standby snapshot generated by the bgwriter, so as to not rely > > entirely on the interval of 15s. I finished with the attached that > > uses a check based on CheckPointSegments / 8 to be sure that at least > > this number of segments has been generated since the last checkpoint > > before logging a new snapshot. I guess that's less brittle than the > > last patch. Thoughts? > > I can't see why that would be a good idea. My understanding is that > the logical decoding code needs to get those messages pretty > regularly, and I don't see why that need would be reduced on systems > where CheckPointSegments is large. Precisely. What I'm thinking of right now is a marker somewhere in shared memory, that tells whether anything worthwhile has happened since the last determination of the redo pointer. Where standby snapshots don't count. That seems like it'd be easier to maintain going forward than doing precise size calculations like CreateCheckPoint() already does, which would additionally need to handle its own standby snapshot, not to speak of the background ones. Seems like it'd be doable in ReserveXLogInsertLocation(). Whether it's actually worthwhile I'm not all that sure tho. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
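[Editor's note: the shared-memory marker idea above can be sketched in miniature. In the real system the flag would live in shared memory and be maintained by ReserveXLogInsertLocation(); standby snapshots themselves must not set it, or the system would never look idle. All names are invented.]

```c
#include <assert.h>
#include <stdbool.h>

/* Has any worthwhile WAL record been inserted since the last snapshot? */
static bool progress_since_last_snapshot = false;

/* Called (conceptually) from ReserveXLogInsertLocation() on every insert. */
static void
note_wal_insert(bool is_standby_snapshot)
{
    if (!is_standby_snapshot)   /* standby snapshots don't count */
        progress_since_last_snapshot = true;
}

/* Called by the periodic snapshot logger; consults and clears the flag. */
static bool
should_log_standby_snapshot(void)
{
    if (!progress_since_last_snapshot)
        return false;           /* nothing happened: stay idle */
    progress_since_last_snapshot = false;
    return true;
}
```

The cost Andres worries about is the single store in note_wal_insert(), which lands inside one of the hottest spinlock-protected sections.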
Re: [HACKERS] patch for geqo tweaks
Nathan Wagner writes: > On Wed, Nov 04, 2015 at 12:51:52PM -0500, Tom Lane wrote: >> I'm not very impressed with the first patch: it might save a few >> geqo_randint() calls, but it seems to do so at the price of making the >> swap choices less random --- for instance it sure looks to me like the >> last array element is now less likely to participate in swaps than >> other elements. Unless you can prove that actually the swapping is >> still unbiased, I'm inclined to reject this part. > If I have understood the original code correctly, we need to select two > different random integers between 0 and num_gene-1, inclusive. That > happens to be num_gene possible results. > Having chosen the first one, which I will call "swap1", we now only have > num_gene-1 possible results, which need to range from either 0 to > swap1-1 or from swap1+1 to num_gene-1, which is num_gene-1 possible > results. I treat this as a single range from 0 to num_gene-2 and > generate a number within that range, which I will call "swap2". > If swap2 is between 0 and swap1-1, it is in the first range, and no > adjustment is necessary. If it is greater than or equal to swap1, then > it is in the second range. However the generated swap2 in the second > range will be between swap1 and num_gene-2, whereas we need it to be > between swap1+1 and num_gene-1, so I add one to swap2, adjusting the > range to the needed range. Ah, after thinking some more, I see how that works. I tend to think that your other proposal of

    swap1 = geqo_randint(root, num_gene - 1, 0);
    swap2 = geqo_randint(root, num_gene - 2, 0);
    if (swap2 == swap1)
        swap2 = num_gene - 1;

would be clearer, since only the forbidden case gets remapped. However, really the whole argument is moot, because I notice that geqo_mutation() is only called in the "#ifdef CX" code path, which we don't use. So there's little point in improving it.
(There's a fair amount of dead code in /geqo/, which I've never had the energy to clean up, but maybe we should do that sometime. It seems unlikely that anyone will ever be interested in experimenting with the ifdef'ed-out code paths.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
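[Editor's note: the uniformity of the remapping discussed in this thread can be checked exhaustively. With swap1 uniform on [0, n-1] and swap2 uniform on [0, n-2], remapping only the collision swap2 == swap1 to n-1 produces each ordered pair of distinct indices exactly once. The function name below is invented; this is not geqo code.]

```c
#include <assert.h>

/*
 * Pick two distinct indices in [0, n-1] from two independent uniform
 * draws: r1 in [0, n-1] and r2 in [0, n-2].  Only the forbidden
 * collision r2 == r1 is remapped, to n-1.  Each of the n * (n-1)
 * ordered pairs of distinct indices is produced by exactly one
 * (r1, r2) input, so the result is uniform.
 */
static void
pick_two_distinct(int n, int r1, int r2, int *swap1, int *swap2)
{
    *swap1 = r1;                /* assumed uniform in [0, n-1] */
    *swap2 = r2;                /* assumed uniform in [0, n-2] */
    if (*swap2 == *swap1)
        *swap2 = n - 1;         /* remap the single forbidden case */
}
```

Enumerating all inputs for a small n confirms every ordered pair of distinct indices appears exactly once, which is the "still unbiased" property Tom asked about.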
Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
On Fri, Nov 6, 2015 at 2:47 AM, Michael Paquier wrote: > I have as well thought a bit about adding a space-related constraint > on the standby snapshot generated by the bgwriter, so as to not rely > entirely on the interval of 15s. I finished with the attached that > uses a check based on CheckPointSegments / 8 to be sure that at least > this number of segments has been generated since the last checkpoint > before logging a new snapshot. I guess that's less brittle than the > last patch. Thoughts? I can't see why that would be a good idea. My understanding is that the logical decoding code needs to get those messages pretty regularly, and I don't see why that need would be reduced on systems where CheckPointSegments is large. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CustomScan support on readfuncs.c
On Fri, Nov 6, 2015 at 2:02 AM, Kouhei Kaigai wrote: > I tried to split the previous version into two portions. > > - custom-scan-on-readfuncs.v2.patch > It allows to serialize/deserialize CustomScan node as discussed upthread. > Regarding of get_current_library_filename(), I keep this feature as > the previous version right now, because I have no good alternatives. Why can't the library just pass its name as a constant string? > In this patch, the role of TextReadCustomScan callback is to clean out > any tokens generated by TextOutCustomScan. The CustomScan node itself > can be reconstructed with common portion because we expect custom_exprs > and custom_private have objects which are safe to copyObject(). Some of the documentation changes for the embed-the-struct changes are still present in the readfuncs patch. Rather than adding TextReadCustomScan, I think we should rip TextOutCustomScan out. It seems like a useless appendage. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Within CF app, "Bug Fixes" should be "Bug Fixes/Refactoring"
On Fri, Nov 6, 2015 at 12:52 AM, Michael Paquier wrote: >> I guess I'm wondering whether there's really enough of this to need >> its own category. > > We have a category "Code comments" as well. Let's give it a shot so I > am adding it. We could always remove it later if necessary. Ugh, OK, whatever. That sounds like we have too many categories. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting sorted data from foreign server for merge join
On Thu, Nov 5, 2015 at 11:54 PM, Ashutosh Bapat wrote: > Hi All, > PFA patch to get data sorted from the foreign server (postgres_fdw) > according to the pathkeys useful for merge join. > > For a given base relation (extendable to join when that becomes available in > postgres_fdw), the patch tries to find merge joinable clauses. It then adds > paths with pathkeys extracted out of these merge joinable clauses. The merge > joinable clauses form equivalence classes. The patch searches in > root->eq_classes for equivalence members belonging to the given relation. > For every such expression it creates a single member pathkey list and > corresponding path. The test postgres_fdw.sql has an existing join which > uses merge join. With this patch the data is sorted on the foreign server > rather than locally. > > Mergejoinable clauses can be obtained from rel->joininfo as well, but > rel->joininfo contains other clauses too, and we would need extra effort to > remove duplicates if the same expression appears in multiple merge joinable > clauses. > > Two joining relations can have multiple merge joinable clauses, requiring > multi-member pathkeys so that merge join is efficient to the maximum extent. > The order in which the expressions appear in pathkeys can change the costs > of sorting the data according to the pathkeys, depending upon the > expressions and the presence of indexes containing those expressions. Thus > ideally we would need to club all the expressions appearing in all the > clauses for given two relations and create paths with pathkeys for every > order of these expressions. That explodes the number of possible paths. We > may restrict the number of paths created by considering only certain orders > like sort_inner_and_outer(). In any case, costing such paths increases the > planning time which may not be worth it. So, this patch uses a heuristic > approach of creating a single member pathkeys path for every merge joinable > expression.
> > The pathkeys need to be canonicalised using make_canonical_pathkey(), which > is a static function. I have added a TODO and comments in the patch > explaining possible ways to avoid "extern"alization of this function. > > Comments/suggestions are welcome. I think this approach is generally reasonable, but I suggested parts of it, so may be biased. I would be interested in hearing the opinions of others. Random notes: "possibily" is a typo. usable_pklist is confusing because it seems like it might be talking about primary keys rather than pathkeys. Also, I realize now, looking at this again, that you're saying "usable" when what I really think you mean is "useful". Lots of pathkeys are usable, but only a few of those are useful. I suggest renaming usable_pathkeys to query_pathkeys and usable_pklist to useful_pathkeys. Similarly, let's rename generate_pathkeys_for_relation() to get_useful_pathkeys_for_relation(). Although I'm usually on the side of marking things as extern whenever we find it convenient, I'm nervous about doing that to make_canonical_pathkey(), because it has side effects. Searching the list of canonical pathkeys for the one we need is reasonable, but is it really right to ever think that we might create a new one at this stage? Maybe it is, but if so I'd like to hear a good explanation as to why. Is the comment "Equivalence classes covering relations other than the current one are of interest here" missing a "not"? I don't find this comment illuminating: + * In case of child relation, we need to check that the + * equivalence class indicates a join to a relation other than + * parents, other children and itself (something similar to above). + * Otherwise we will end up creating useless paths. The code below is + * similar to generate_implied_equalities_for_column(), which might + * give a hint. That basically just says that we have to do it this way because the other way would be wrong. But it doesn't say WHY the other way would be wrong. 
Then a few lines later, you have another comment which says the same thing again: +/* + * Ignore equivalence members which correspond to children + * or same relation or to parent relations + */ -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for geqo tweaks
Nathan Wagner writes: > I see you committed a modified version of my patch in commit > 59464bd6f928ad0da30502cbe9b54baec9ca2c69. > You changed the tour[0] to be hardcoded to 1, but it should be any of > the possible gene numbers from 0 to remainder. How so? The intent is to replace the first iteration of the Fisher-Yates loop, not the old loop. That iteration will certainly end by assigning 1 to tour[0], because it must choose j = i = 0. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] BUG #12989: pg_size_pretty with negative values
On Thu, Nov 5, 2015 at 4:19 PM, Adrian.Vondendriesch wrote: > New patch attached and rebased on HEAD > (8c75ad436f75fc629b61f601ba884c8f9313c9af). I've committed this with some modifications: - I changed the comment for the half_rounded() macros because the one you had just restated the code. - I tightened up the coding in numeric_half_rounded() very slightly. - You didn't, as far as I can see, modify the regression test schedule to execute the files you added. I consolidated them into one file, added it to the schedule, and tightened up the SQL a bit. Thanks for the patch, and please let me know if I muffed anything. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
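[Editor's note: the fix in this thread hinges on halving a size while rounding symmetrically for negative values, so that pg_size_pretty(-x) mirrors pg_size_pretty(x). A sketch of that behavior follows; it is illustrative, and the committed macro's exact form may differ.]

```c
/*
 * Round x/2 to the nearest integer, with halves rounded away from zero.
 * Adding +1 before dividing rounds positive halves up; adding -1 does the
 * mirror-image adjustment for negative values (C integer division
 * truncates toward zero), so half_rounded(-x) == -half_rounded(x).
 */
static long long
half_rounded(long long x)
{
    return (x + (x < 0 ? -1 : 1)) / 2;
}
```

With plain truncating division, -3/2 would give -1 while half_rounded(3) gives 2, breaking the symmetry the bug report was about.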
Re: [HACKERS] Patch: Implement failover on libpq connect level.
Craig Ringer wrote: > On 6 November 2015 at 13:34, Robert Haas wrote: > > >> But some options control how > >> next host should be chosen (i.e. use random order for load-balancing > >> or sequential order for high availability), so they should be specified > >> only once per connect string. > > > > But this seems like a point worthy of consideration. > > This makes me think that trying to wedge this into the current API > using a funky connection string format might be a mistake. > > Lots of these issues would go away if you could provide more than just > a connstring. Yes, I agree. I wonder if the failover aspect couldn't be better covered by something more powerful than a single URI, such as the service file format. Maybe just allow the contents of a service file to be passed as a "connection string", so that the application/environment can continue to maintain the connection info as a string somewhere instead of having to have an actual file. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
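[Editor's note: the service file format Álvaro refers to already exists in libpq (pg_service.conf); the suggestion is to accept its keyword/value contents inline instead of requiring an actual file. For reference, a service entry looks like the following, with host names and the service name purely illustrative:]

```ini
# ~/.pg_service.conf
[mydb]
host=primary.example.com
port=5432
dbname=appdb
connect_timeout=10
```

An application then connects with a connection string of just `service=mydb` (or by setting the PGSERVICE environment variable), and libpq reads the remaining parameters from the file.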
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
> -Original Message- > From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas > Sent: Friday, November 06, 2015 9:40 PM > To: Kaigai Kouhei(海外 浩平) > Cc: Etsuro Fujita; Tom Lane; Kyotaro HORIGUCHI; pgsql-hackers@postgresql.org; > Shigeru Hanada > Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual > > On Tue, Nov 3, 2015 at 8:15 AM, Kouhei Kaigai wrote: > > A challenge is that junk wholerow references on behalf of ROW_MARK_COPY > > are injected by preprocess_targetlist(). It is earlier than the main path > > consideration by query_planner(), thus, it is not predictable how the remote > > query shall be executed at this point. > > Oh, dear. That seems like a rather serious problem for my approach. > > > If ROW_MARK_COPY, the base tuple image is fetched using this junk attribute. > > So, here are two options if we allow putting the joined tuple on either of > > es_epqTuple[]. > > Neither of these sounds viable to me. > > I'm inclined to go back to something like what you proposed here: > Good :-) > http://www.postgresql.org/message-id/9A28C8860F777E439AA12E8AEA7694F80114B89 > d...@bpxm15gp.gisp.nec.co.jp > This patch needs to be rebased. One thing different from the latest version is that fdw_recheck_quals of ForeignScan was added. So, ... (1) The principle is that the FDW driver knows what qualifiers were pushed down and how they are kept in the private field. So, fdw_recheck_quals is redundant and to be reverted. (2) Even though the principle is as described in (1), the wired-in logic in ForeignRecheck() and fdw_recheck_quals are a useful default for most FDW drivers. So, it shall be kept and valid only if the RecheckForeignScan callback is not defined. Which is the better approach for the v3 patch? My preference is (1), because fdw_recheck_quals is a new feature, thus the FDW driver has to be adjusted in v9.5 more or less, even if it already supports qualifier push-down.
In general, an interface becomes more graceful when it sticks to its principle. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On Tue, Nov 3, 2015 at 8:15 AM, Kouhei Kaigai wrote: > A challenge is that junk wholerow references on behalf of ROW_MARK_COPY > are injected by preprocess_targetlist(). It is earlier than the main path > consideration by query_planner(), thus, it is not predictable how the remote > query shall be executed at this point. Oh, dear. That seems like a rather serious problem for my approach. > If ROW_MARK_COPY, the base tuple image is fetched using this junk attribute. > So, here are two options if we allow putting the joined tuple on either of > es_epqTuple[]. Neither of these sounds viable to me. I'm inclined to go back to something like what you proposed here: http://www.postgresql.org/message-id/9a28c8860f777e439aa12e8aea7694f80114b...@bpxm15gp.gisp.nec.co.jp -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Refactoring of LWLock tranches
On Wed, 23 Sep 2015 11:46:00 -0400 Robert Haas wrote: > On Wed, Sep 23, 2015 at 11:22 AM, Alvaro Herrera > wrote: > > Robert Haas wrote: > >> On Tue, Sep 22, 2015 at 5:16 AM, Ildus Kurbangaliev > >> wrote: > >> > Yes, probably. > >> > I'm going to change API calls as you suggested earlier. > >> > How do you think the tranche registration after initialization > >> > should look? > >> > >> I don't see any need to change anything there. The idea there is > >> that an extension allocates a tranche ID and is responsible for > >> making sure that every backend that uses that tranche finds out > >> about the ID that was chosen and registers a matching tranche > >> definition. How to do that is the extension's problem. Maybe > >> eventually we'll provide some tools to make that easier, but > >> that's separate from the work we're trying to do here. > > > > FWIW I had assumed, when you created the tranche stuff, that SLRU > > users would all allocate their lwlocks from a tranche provided by > > slru.c itself, and the locks would be stored in the slru Ctl > > struct. Does that not work for some reason? > > I think that should work and that it's a good idea. I think it's just > a case of nobody having done the work. > There is a patch that splits SLRU LWLocks to separate tranches and moves them to SLRU Ctl. It does some work from the main patch from this thread, but can be committed separately. It also simplifies lwlock.c. 
--
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 3a58f1e..887efc9 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -456,7 +456,7 @@ void
 CLOGShmemInit(void)
 {
 	ClogCtl->PagePrecedes = CLOGPagePrecedes;
-	SimpleLruInit(ClogCtl, "CLOG Ctl", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
+	SimpleLruInit(ClogCtl, "CLOG", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
 				  CLogControlLock, "pg_clog");
 }
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index b21a313..3c8291c 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -478,7 +478,7 @@ CommitTsShmemInit(void)
 	bool		found;

 	CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
-	SimpleLruInit(CommitTsCtl, "CommitTs Ctl", CommitTsShmemBuffers(), 0,
+	SimpleLruInit(CommitTsCtl, "CommitTs", CommitTsShmemBuffers(), 0,
 				  CommitTsControlLock, "pg_commit_ts");

 	commitTsShared = ShmemInitStruct("CommitTs shared",
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 7d97085..1341c00 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -1838,10 +1838,10 @@ MultiXactShmemInit(void)
 	MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;

 	SimpleLruInit(MultiXactOffsetCtl,
-				  "MultiXactOffset Ctl", NUM_MXACTOFFSET_BUFFERS, 0,
+				  "MultiXactOffset", NUM_MXACTOFFSET_BUFFERS, 0,
 				  MultiXactOffsetControlLock, "pg_multixact/offsets");
 	SimpleLruInit(MultiXactMemberCtl,
-				  "MultiXactMember Ctl", NUM_MXACTMEMBER_BUFFERS, 0,
+				  "MultiXactMember", NUM_MXACTMEMBER_BUFFERS, 0,
 				  MultiXactMemberControlLock, "pg_multixact/members");

 	/* Initialize our shared state struct */
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index 90c7cf5..1eb9a25 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -136,6 +136,16 @@ static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int segpage,
 					  void *data);
 static void SlruInternalDeleteSegment(SlruCtl ctl, char *filename);

+/* Add a postfix to a some string */
+static char *
+add_postfix(const char *name, const char *postfix)
+{
+	int		len = strlen(name) + strlen(postfix) + 1;
+	char   *buf = (char *) palloc(len);
+
+	snprintf(buf, len, "%s%s", name, postfix);
+	return buf;
+}
+
 /*
  * Initialization of shared memory
  */
@@ -157,6 +167,8 @@ SimpleLruShmemSize(int nslots, int nlsns)
 	if (nlsns > 0)
 		sz += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));	/* group_lsn[] */

+	sz += MAXALIGN(nslots * sizeof(LWLockPadded));	/* lwlocks[] */
+
 	return BUFFERALIGN(sz) + BLCKSZ * nslots;
 }

@@ -164,19 +176,23 @@ void
 SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
 			  LWLock *ctllock, const char *subdir)
 {
-	SlruShared	shared;
-	bool		found;
+	SlruShared	shared;
+	bool		found;
+	char	   *shared_key = add_postfix(name, " SLRU Ctl");

-	shared = (SlruShared) ShmemInitStruct(name,
+	shared = (SlruShared) ShmemInitStruct(shared_key,
 										  SimpleLruShmemSize(nslots, nlsns),
 										  &found);
+	pfree(shared_key);

 	if (!IsUnderPostmaster)
 	{
 		/* Initialize locks and shared memory area */
-		char	   *ptr;
-		Size		offset;
-		int			slotno;
+		char	   *ptr;
+		Size		offs
Re: [HACKERS] extend pgbench expressions with functions
Those can be avoided in other ways. For example:

Ok, ok, I surrender :-)

Here is a v15 which hides conversions and assignment details in macros and factors out the type testing of overloaded operators, so that code expansion is minimal (basically the operator evaluation is duplicated for int and double, but the rest is written once). The evaluation cost is probably slightly higher than in the previous version because of the many hidden type tests.

Note that variables are still only integers stored as text. Another patch may try to propagate the value structure into variables, but that changes the query expansion code and is more or less orthogonal to adding functions. Moreover, double variables would not be really useful anyway.

--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 0ac40f1..59445de 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -771,24 +771,28 @@ pgbench options dbname Sets variable varname to an integer value calculated from expression. The expression may contain integer constants such as 5432, - references to variables :variablename, + double constants such as 3.14156, + references to integer variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, function calls and + parentheses. + shows the available + functions. Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 - \setrandom varname min max [ uniform | { gaussian | exponential } param ] @@ -801,57 +805,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability.
Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory threshold which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -threshold on the left and +threshold - on the right. - To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, then value i - between min and max inclusive is drawn - with probability: - -(PHI(2.0 * threshold * (i - min - mu + 0.5) / (max - min + 1)) - - PHI(2.0 * threshold * (i - min - mu - 0.5) / (max - min + 1))) / - (2.0 * PHI(threshold) - 1.0). - Intuitively, the larger the threshold, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. - About 67% of values are drawn from the middle 1.0 / threshold - and 95% in the middle 2.0 / threshold; for instance, if - threshold is 4.0, 67% of values are drawn from the middle - quarter and 95% from the middle half of the interval. - The minimum threshold is 2.0 for performance of - the Box-Muller transform. - + + +\setrandom n 1 10 exponential 3.0 is equivalent to +\set n random_exponential(1, 10, 3.0) and uses an +exponential distribution. + + - - For an exponential distribution, the threshold - parameter controls the distribution by truncating a quickly-decreasing - exponential distribution at threshold, and then - projecting onto integers between the bounds. - To be precise, value i between min and - max inclusive is drawn with probability: - (exp(-threshold*(i-min)/(max+1-min)) - - exp(-threshold*(i+1-min)/(max+1-min))) / (1.0 - exp(-threshold)). 
- Intuitively, the larger the threshold, the more - frequently values close to min are accessed, and the - less frequently values close to max are accessed. - The closer to 0 the threshold, the flatter (more uniform) the access - distribution. - A crude approximation of the distribution is that the most frequent 1% - values in the range, close to min, are drawn - threshold% of the time. - The threshold value must be strictly positive. + + +\setrandom n 1 10 gaussian 2.0 is equivalent to +\set n random_gaussian(1, 10, 2.0), and uses
Re: [HACKERS] [PROPOSAL] Improvements of Hunspell dictionaries support
On 06.11.2015 12:33, Artur Zakirov wrote:
> Hello again!
>
> Patches
> ===

Link to commitfest: https://commitfest.postgresql.org/8/420/

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Re: [HACKERS] [PATCH] RFC: Add length parameterised dmetaphone functions
Christian Marie wrote:
> A developer I work with was trying to use dmetaphone to group people names
> into equivalence classes. He found that many long names would be grouped
> together when they shouldn't be; this turned out to be because dmetaphone
> has an undocumented upper bound on its output length, of four. This is
> obviously impractical for many use cases.
>
> This patch addresses this by adding and documenting an optional argument to
> dmetaphone and dmetaphone_alt that specifies the maximum output length. This
> makes it possible to use dmetaphone on much longer inputs.
>
> Backwards compatibility is catered for by making the new argument optional,
> defaulting to the old, hard-coded value of four. We now have:
>
> dmetaphone(text source) returns text
> dmetaphone(text source, int max_output_length) returns text
> dmetaphone_alt(text source) returns text
> dmetaphone_alt(text source, int max_output_length) returns text

I like the idea. How about:

dmetaphone(text source, int max_output_length DEFAULT 4) returns text
dmetaphone_alt(text source, int max_output_length DEFAULT 4) returns text

Saves two functions and is self-documenting.

> +postgres=# select dmetaphone('unicorns');
> + dmetaphone
> +
> + ANKR
> +(1 row)
> +
> +postgres=# select dmetaphone('unicorns', 8);
> + dmetaphone
>
> - KMP
> + ANKRNS
> (1 row)

Yeah, "ponies" would have been too short...

Yours,
Laurenz Albe
Re: [HACKERS] [PROPOSAL] Improvements of Hunspell dictionaries support
Hello again!

Patches
===

I have implemented support for the FLAG long, FLAG num and AF parameters. I attached the patch to this e-mail (hunspell-dict.patch). It allows the use of the Hunspell dictionaries listed in the previous e-mail: ar, br_fr, ca, ca_valencia, en_ca, en_gb, en_us, fr, gl_es, hu_hu, is, ne_np, nl_nl, si_lk.

Most of the changes are in the affix file parsing code in spell.c. The dictionary structure changes are:
- the useFlagAliases and flagMode fields were added to the IspellDict struct;
- the flagval array size was increased from 256 to 65000;
- the flag field of the AFFIX struct was also widened.

I have also implemented a patch that fixes the error from the e-mail http://www.postgresql.org/message-id/562e1073.8030...@postgrespro.ru That patch simply ignores the error.

Tests
=

Extension test dictionaries, for loading into PostgreSQL and for normalizing with the ts_lexize function, can be downloaded from https://dl.dropboxusercontent.com/u/15423817/HunspellDictTest.tar.gz

It would be nice if somebody could do additional tests with dictionaries of well-known languages, because I do not know many of them.

Other Improvements
==

There are also some parameters for compound words, but I am not sure that we want to use them.
--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

*** a/src/backend/tsearch/spell.c
--- b/src/backend/tsearch/spell.c
*************** cmpaffix(const void *s1, const void *s2)
*** 237,242 ****
--- 237,309 ----
  					   (const unsigned char *) a2->repl);
  }

+ static unsigned short
+ decodeFlag(IspellDict *Conf, char *sflag, char **sflagnext)
+ {
+ 	unsigned short s;
+ 	char	   *next;
+
+ 	switch (Conf->flagMode)
+ 	{
+ 		case FM_LONG:
+ 			s = (int) sflag[0] << 8 | (int) sflag[1];
+ 			if (sflagnext)
+ 				*sflagnext = sflag + 2;
+ 			break;
+ 		case FM_NUM:
+ 			s = (unsigned short) strtol(sflag, &next, 10);
+ 			if (sflagnext)
+ 			{
+ 				if (next)
+ 				{
+ 					*sflagnext = next;
+ 					while (**sflagnext)
+ 					{
+ 						if (**sflagnext == ',')
+ 						{
+ 							*sflagnext = *sflagnext + 1;
+ 							break;
+ 						}
+ 						*sflagnext = *sflagnext + 1;
+ 					}
+ 				}
+ 				else
+ 					*sflagnext = 0;
+ 			}
+ 			break;
+ 		default:
+ 			s = (unsigned short) *((unsigned char *) sflag);
+ 			if (sflagnext)
+ 				*sflagnext = sflag + 1;
+ 	}
+
+ 	return s;
+ }
+
+ static bool
+ isAffixFlagInUse(IspellDict *Conf, int affix, unsigned short affixflag)
+ {
+ 	char	   *flagcur;
+ 	char	   *flagnext = 0;
+
+ 	if (affixflag == 0)
+ 		return true;
+
+ 	flagcur = Conf->AffixData[affix];
+
+ 	while (*flagcur)
+ 	{
+ 		if (decodeFlag(Conf, flagcur, &flagnext) == affixflag)
+ 			return true;
+ 		if (flagnext)
+ 			flagcur = flagnext;
+ 		else
+ 			break;
+ 	}
+
+ 	return false;
+ }
+
  static void
  NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
  {
*************** FindWord(IspellDict *Conf, const char *word, int affixflag, int flag)
*** 355,361 ****
  			else if ((flag & StopMiddle->compoundflag) == 0)
  				return 0;
! 			if ((affixflag == 0) || (strchr(Conf->AffixData[StopMiddle->affix], affixflag) != NULL))
  				return 1;
  		}
  		node = StopMiddle->node;
--- 422,428 ----
  			else if ((flag & StopMiddle->compoundflag) == 0)
  				return 0;
! 			if (isAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
  				return 1;
  		}
  		node = StopMiddle->node;
*************** NIAddAffix(IspellDict *Conf, int flag, char flagflags, const char *mask, const c
*** 394,400 ****
  	Affix = Conf->Affix + Conf->naffixes;

! 	if (strcmp(mask, ".") == 0)
  	{
  		Affix->issimple = 1;
  		Affix->isregis = 0;
--- 461,467 ----
  	Affix = Conf->Affix + Conf->naffixes;

! 	if (strcmp(mask, ".") == 0 || *mask == '\0')
  	{
  		Affix->issimple = 1;
  		Affix->isregis = 0;
*************** addFlagValue(IspellDict *Conf, char *s, uint32 val)
*** 595,604 ****
  				(errcode(ERRCODE_CONFIG_FILE_ERROR),
  				 errmsg("multibyte flag character is not allowed")));

! 	Conf->flagval[*(unsigned char *) s] = (unsigned char) val;
  	Conf->usecompound = true;
  }

  /*
   * Import an affix file that follows MySpell or Hunspell format
   */
--- 662,719 ----
  				(errcode(ERRCODE_CONFIG_FILE_ERROR),
  				 errmsg("multibyte flag character is not allowed")));

! 	Conf->flagval[decodeFlag(Conf, s, (char **) NULL)] = (unsigned char) val;
  	Conf->usecompound = true;
  }

+ static int
+ getFlagValues(IspellDict *Conf, char *s)
+ {
+ 	uint32		flag = 0;
+ 	char	   *flagcur;
+ 	char	   *flagnext = 0;
+
+ 	flagcur = s;
+ 	while (*flagcur)
+ 	{
+ 		flag |= Conf->flagval[decodeFlag(Conf, flagcur, &flagnext)];
+ 		if (flagnext)
+ 			flagcur = flagnext;
+ 		else
+ 			break;
+ 	}
+
+ 	return flag;
+ }
+
+ /*
+  * Get flag set from "s".
+  *
+  * Returns flag set from AffixData array if AF parameter used (useFlagAliases is true).
+  * In this case "s" is alias for flag
Re: [HACKERS] SortSupport for UUID type
Hello, I tried to review this as far as I could, referring to numeric.c.

1. Patch application
This patch applies cleanly on the current master and looks to work as expected.

2. uuid.c
pg_bswap.h is already included via hash.h, so it does not need to be included again, but I don't object if you include it explicitly.

3. uuid_sortsupport()
The comment for PrepareSortSupportFromIndexRel says:

> * Caller must previously have zeroed the SortSupportData structure and then
> * filled in ssup_cxt, ssup_attno, ssup_collation, and ssup_nulls_first. This

All callers comply with this, and numeric_sortsupport relies on this contract and does not initialize ssup->extra, but uuid_sortsupport does. I think these would be better in a uniform style: remove the ssup_extra = NULL assignment and instead add a comment that ssup_extra has been initialized by the caller.

4. uuid_cmp_abbrev()
Although I don't know whether it is right or not, the 3-way comparison between two Datums surely works.

5. uuid_abbrev_abort()
I don't know the validity of every threshold, but they work as expected.

6. uuid_abbrev_convert()

> memcpy((char *) &res, authoritative->data, sizeof(Datum));

memcpy's prototype is "memcpy(void *dest, ...", so the cast to (char *) is not necessary.

7. system catalogs
uuid_sortsupport looks to be properly registered, so that PrepareSortSupportFrom(OrderingOp|IndexRel)() can find it.

regards,

At Thu, 5 Nov 2015 16:10:21 -0800, Peter Geoghegan wrote:
> On Thu, Oct 8, 2015 at 5:27 PM, Peter Geoghegan wrote:
> > This is more or less lifted from numeric_abbrev_convert_var(). Perhaps
> > you should change it there too. The extra set of parentheses are
> > removed in the attached patch. The patch also mechanically updates
> > things to be consistent with the text changes on the text thread [1]
> > -- I had to rebase.
>
> Attached is almost the same patch, but rebased. This was required
> because the name of our new macro was changed to
> DatumBigEndianToNative() at the last minute.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [HACKERS] Some questions about the array.
On Thursday 05 November 2015 23:45:53 you wrote:
> On Thu, Nov 5, 2015 at 9:57 AM, YUriy Zhuravlev wrote:
> > Hello hackers.
> > Are there any comments on my patch? Maybe I should create a separate
> > thread? Thanks.
>
> You should add this on commitfest.postgresql.org.

I created one a couple of weeks ago: https://commitfest.postgresql.org/7/397/

> I'm sure I know your answer, but what do other people think?

I wonder the same thing. Thanks.

--
YUriy Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] Some questions about the array.
On Thursday 05 November 2015 22:33:37 you wrote:
> Would something like array[1:~1] as a syntax be acceptable to denote
> backward counting?

Very interesting idea! I could implement it. I just need to check for side effects.

--
YUriy Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company