Re: [HACKERS] Extension Templates S03E11
On Tue, 2013-12-03 at 08:44 -0500, Stephen Frost wrote:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > > On 3 December 2013 02:02, Dimitri Fontaine wrote:
> > > ISTM that the real solution to this particular problem is to decouple
> > > the extensions that are currently in contrib from a specific postgres
> > > version.
> >
> > "Problem"? It's not a bug that you get hstore 1.2 when you dump from 9.2
> > and reload into 9.3; that's a feature. You wanted an upgrade, presumably,
>
> I don't buy this argument at *all* and it's not going to fly when we've
> got multiple versions of an extension available concurrently.

It seems there's a use case for both behaviors; perhaps we should include
it in the control information?

    preserve_version_on_dump (boolean)

FWIW, I find the current behavior surprising when in the mindset of a SQL
extension. But it makes sense for things more closely tied to the backend.

Regards,
	Jeff Davis
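For illustration, the suggestion might look like this in an extension's
control file (preserve_version_on_dump is the proposed, hypothetical
parameter; the other fields follow the existing control-file format):

    # hstore.control -- sketch only; preserve_version_on_dump is a
    # proposed parameter, not an existing one
    comment = 'data type for storing sets of (key, value) pairs'
    default_version = '1.2'
    module_pathname = '$libdir/hstore'
    relocatable = true
    preserve_version_on_dump = true

Presumably, with it set to false, pg_dump would emit a plain CREATE
EXTENSION hstore (yielding whatever version the target server considers
current); with it set to true, it would emit CREATE EXTENSION hstore
VERSION '1.2'.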
Re: [HACKERS] ruleutils vs. empty targetlists
On 3 December 2013 23:37, Tom Lane wrote:
> Thinking some more about bug #8648, it occurred to me that ruleutils.c
> isn't exactly prepared for the case either:
>
> regression=# create table nocols();
> CREATE TABLE
> regression=# create view vv1 as select exists (select * from nocols);
> CREATE VIEW
> regression=# \d+ vv1
>                    View "public.vv1"
>  Column |  Type   | Modifiers | Storage | Description
> --------+---------+-----------+---------+-------------
>  exists | boolean |           | plain   |
> View definition:
>  SELECT (EXISTS ( SELECT
>            FROM nocols)) AS "exists";
>
> which of course is illegal syntax. I thought at first that this could be
> fixed trivially by emitting "*" if get_target_list found no columns.
> However, that doesn't quite work; consider
>
> create view vv2 as select exists (select nocols.* from nocols, somecols);
>
> If we regurgitate this as "exists (select * from nocols, somecols)", it
> wouldn't mean the same thing.
>
> But on the third hand, at least in the context of an EXISTS() subquery,
> it doesn't really matter because the tlist is irrelevant, at least as long
> as it only contains Vars. So you could argue that emitting "*" is plenty
> good enough even in the above example. I'm not certain that that applies
> everywhere that a tlist could appear, though.
>
> I experimented with code that would attempt to regurgitate "nocols.*"
> in the above example; see attached draft patch. I don't like this patch
> much: it's ugly, it'd be a pain to backport (because it's digging into
> data structures that have changed repeatedly), and I'm not sure how much
> I trust it anyway. So I'm leaning towards just doing
>
> +       if (colno == 0)
> +               appendStringInfoString(buf, " *");
>
> at least till such time as somebody shows a case where this isn't good
> enough.

Well here's a contrived example with grouping:

create table nocols();
create table somecols(a int, b int);
create view v as select exists(select nocols.* from nocols, somecols
group by somecols.a);

Simply turning that tlist into "*" makes the query invalid because
somecols.b doesn't appear in the group by clause. So in this case, it
needs to try to output a tlist that contains no columns.

Regards,
Dean

> Note that I wouldn't be thinking of this as something to back-patch
> at all, except that if someone did have a view that looked like this,
> they'd find that pg_dump output wouldn't restore. You could construct
> scenarios where that could seem like a denial of service, particularly
> if the DBA wasn't smart enough to figure out what was wrong with his
> dump. (And who wants to deal with such a thing at 4AM anyway ...)
>
> Comments?
>
>                         regards, tom lane
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
> I've cleaned this up - revision attached - and marked it "ready for
> committer".

Thank you for this.

I did the basic hygiene test. The patch applies correctly and compiles
with no warnings, and I did not find anything broken in basic
functionality.

In the documentation I have one minor suggestion: replace the phrase
"might judge to be a non-distinct" with "may judge to be non-distinct".

regards
Sameer
Re: [HACKERS] logical changeset generation v6.7
Hello, this is the continuation of my comments.

> 0008 and after to come later..

I had nothing to comment on for patch 0008.

= 0009:

- In repl_scanner.l, you omitted double-doublequote handling for
  replication, but it should be implemented. A zero-length identifier
  check might be needed, depending on the upper layer.

- In walsender.c, the log messages "Initiating logical rep.." and
  "Starting logical replication.." should be INFO or LOG in log level,
  not WARNING. And 'rep' in the former message would be better left
  unabbreviated, since it is not abbreviated in the latter.

- In walsender.c, StartLogicalReplication seems to abort itself on a
  timeline change. But timeline changes in 9.3+ don't need such an aid.
  You'd better consult StartReplication in current master for details.
  There might be other differences.

- In walsender.c, the typedef name WalSndSendData doesn't suggest a
  function pointer. I suppose passing a bare function pointer to
  WalSndLoop and WalSndDone is not good practice. It'd be better to wrap
  it in a struct for callbacks, say, LogicalDecodingContext. It'd be even
  better if it could be a struct shared with 'physical' replication.
  (A rough sketch follows below.)

- In walsender.c, I wonder whether the differences between logical and
  physical replication in fetching the latest WAL, in the construction
  of the WAL-sending loop, and so on are really necessary. The logical
  walsender seems to be implemented in a somewhat ad-hoc way on the
  whole. I believe its base structure could be unified further with the
  physical one.

- In procarray.c, the two added includes, which are not accompanied by
  any other modification, are needless. make emits no error or warning
  without them.

...Time's up. To be continued later, from 0010..

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
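As a rough illustration of the callback-struct suggestion above, here is a
minimal sketch; none of these names are taken from the patch, and the real
LogicalDecodingContext would carry far more state:

    /* Hypothetical sketch only -- names are illustrative, not from the
     * patch under review. */
    typedef struct WalSndSendContext
    {
        /* assembles and sends the next chunk of data; returns false
         * when there is nothing further to send */
        bool        (*send_data) (struct WalSndSendContext *ctx,
                                  bool *caughtup);
        void       *private_data;   /* e.g. decoding state */
    } WalSndSendContext;

    static void
    WalSndLoopSketch(WalSndSendContext *ctx)
    {
        bool        caughtup = false;

        for (;;)
        {
            if (!ctx->send_data(ctx, &caughtup))
                break;          /* done, or connection lost */
            /* ... wait for new WAL, check for shutdown requests ... */
        }
    }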
Re: [HACKERS] Extension Templates S03E11
On Mon, 2013-12-02 at 15:44 -0500, Stephen Frost wrote:
> How are we going to handle new keywords
> being added in new major versions? A pg_dump of the extension template
> script is then going to be loaded into the new major version but will
> not actually be able to be run because it'll error out...

Elsewhere in the thread you argued that the version of an extension
should be preserved across dump/reload. Surely a given version of the
extension corresponds to a specific set of SQL commands (specifically,
the SQL text blob on PGXN), so it *should* error out.

Otherwise you end up with a weird situation where upgrading a 9.4
install to 9.5 allows you to keep version 1.2 of some extension, but 1.2
won't install directly to 9.5. (By the way, I think this is a problem
with pg_upgrade currently.)

You're fighting pretty hard against text blobs, but at the same time
saying that we should be able to fully make use of existing PGXN
extensions, which contain text blobs of SQL. And extension authors are
versioning their SQL blobs, not some abstract concepts internal to
postgres and the catalogs.

Just because we start with blobs from PGXN doesn't mean we need to use
blobs everywhere; but I think you're too quick to rule them out.

Regards,
	Jeff Davis
Re: [HACKERS] Extension Templates S03E11
On Tue, 2013-12-03 at 10:23 -0500, Robert Haas wrote:
> In more normal cases, however, the system can (and probably should)
> figure out what was intended by choosing the *shortest* path to get to
> the intended version. For example, if someone ships 1.0, 1.0--1.1,
> 1.1, and 1.1--1.2, the system should choose to run 1.1 and then
> 1.1--1.2, not 1.0 and then 1.0--1.1 and then 1.1--1.2. But that can
> be automatic: only if there are two paths of equal length (as in the
> example in the previous paragraph) do we need help from the user to
> figure out what to do.

Why do we need help from the user? Just pick a path.

For an extension update, I understand why someone wouldn't want to
accidentally downgrade 5 versions (dropping all of their dependent
objects) before updating to the latest. But this doesn't apply to
creation.

And it just seems really awkward to document, and it's a constant
maintenance burden on extension authors to specify their upgrade paths
every time they release a new version.

> Putting all that together, I'm inclined to suggest that what we really
> need is a LIST of version numbers, rather than just one. If one path
> to the version we're installing is shorter than any other, we
> choose that, period. If there are multiple paths of equal length, we
> break the tie by choosing which version number appears first in the
> aforementioned list. If that still doesn't break the tie, either
> because none of the starting points are mentioned in that list or
> because there are multiple equal-length paths starting in the same
> place, we give up and emit an error.

That seems like extreme overkill, and still doesn't give users full
control over upgrade paths.

Regards,
	Jeff Davis
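To make Robert's example concrete, the shipped files would look like this
on disk (the extension name is hypothetical; the file names follow the
existing extension script naming convention):

    $ ls $SHAREDIR/extension/myext*
    myext--1.0.sql         # full install script for 1.0
    myext--1.0--1.1.sql    # upgrade script 1.0 -> 1.1
    myext--1.1.sql         # full install script for 1.1
    myext--1.1--1.2.sql    # upgrade script 1.1 -> 1.2

With this layout, CREATE EXTENSION myext VERSION '1.2' can be satisfied
either by running 1.0, then 1.0--1.1, then 1.1--1.2 (three steps), or by
running 1.1, then 1.1--1.2 (two steps); the shortest-path rule picks the
latter automatically.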
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 01:08 AM, Tom Lane wrote:
> Magnus Hagander writes:
>> On Tue, Dec 3, 2013 at 11:44 PM, Josh Berkus wrote:
>>> Would certainly be nice. Realistically, getting good automated
>>> performance tests will require paying someone like Greg S., Mark or
>>> me for 6 solid months to develop them, since worthwhile open source
>>> performance test platforms currently don't exist. That money has
>>> never been available; maybe I should do a kickstarter.
>>
>> So in order to get *testing* we need to pay somebody. But to build a
>> great database server, we can rely on volunteer efforts or sponsorship
>> from companies who are interested in moving the project forward?
>
> And even more to the point, volunteers to reinvent the kernel I/O stack
> can be found on every street corner?

Actually, yes, I think so. That's a lot more exciting to work on than a
regression test suite.

> And those volunteers won't need any test scaffolding to be sure that
> *their* version never has performance regressions? (Well, no, they
> won't, because no such thing will ever be built. But we do need better
> test scaffolding for real problems.)
>
> Maybe we should lie, and *say* that we want direct I/O, but require
> that all submissions come with a test suite to prove that it's a gain.
> Then someone might actually write one, as a sidekick of a direct I/O
> patch. Then we could toss out the direct I/O stuff and take only the
> test framework.

FWIW, I also think that it'd be folly to reimplement the I/O stack. The
kernel does a lot of things for us. It might not do a great job, but it's
good enough.

As one data point, before my time, the VMware vPostgres team actually did
use direct I/O in vPostgres. We shipped that in a few releases. It was a
lot of effort to get the code right, and for DBAs, it made correct tuning
of shared_buffers a lot more important - set it too low and you won't
take full advantage of your RAM, set it too high and you won't have
memory available for other things. To be a good VM citizen, they also had
to implement a memory ballooning module inside Postgres, to release
shared buffers if the system hosting the VM is under memory pressure.
What did we gain by doing all that, compared to just letting the kernel
handle it? Some extra performance in some use cases, and a loss in
others. Not worth the trouble.

- Heikki
Re: [HACKERS] logical changeset generation v6.7
On 2013-12-04 17:31:50 +0900, Kyotaro HORIGUCHI wrote:
> = 0009:
>
> - In repl_scanner.l, you omitted double-doublequote handling for
>   replication, but it should be implemented. A zero-length identifier
>   check might be needed, depending on the upper layer.

I am not sure what you mean here. IDENT can be double quoted, and so can
the option names?

> - In walsender.c, the log messages "Initiating logical rep.."
>   and "Starting logical replication.." should be INFO or LOG in
>   log level, not WARNING. And 'rep' in the former message would
>   be better left unabbreviated, since it is not abbreviated in
>   the latter.

Agreed.

> - In walsender.c, StartLogicalReplication seems to abort itself
>   on a timeline change. But timeline changes in 9.3+ don't need
>   such an aid. You'd better consult StartReplication in current
>   master for details. There might be other differences.

Timeline increases currently need work, yes; that error message is the
smallest part...

> - In walsender.c, the typedef name WalSndSendData doesn't suggest
>   a function pointer. I suppose passing a bare function pointer
>   to WalSndLoop and WalSndDone is not good practice. It'd be
>   better to wrap it in a struct for callbacks, say,
>   LogicalDecodingContext. It'd be even better if it could be a
>   struct shared with 'physical' replication.

I don't see that as being realistic/advantageous. Wrapping a function
pointer in a struct doesn't improve anything in itself. I was thinking we
might want to just decouple the entire event loop and not reuse that
code, but that's ugly as well.

> - In walsender.c, I wonder whether the differences between logical
>   and physical replication in fetching the latest WAL, in the
>   construction of the WAL-sending loop, and so on are really
>   necessary. The logical walsender seems to be implemented in a
>   somewhat ad-hoc way on the whole. I believe its base structure
>   could be unified further with the physical one.

That's because the xlogreader.h interface - over my loud protests -
doesn't support chunk-wise reading of the WAL stream and necessitates
blocking inside the reader callback. So the event loop needs to be in
several individual functions (WalSndLoop, WalSndWaitForWal,
WalSndWriteData) instead of once in WalSndLoop…

> - In procarray.c, the two added includes, which are not
>   accompanied by any other modification, are needless. make emits
>   no error or warning without them.

Right. Will remove.

Thanks,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist
On 2 December 2013 04:55, Pavel Stehule wrote:
> Hello
>
> it looks well, thank you
>
> Regards
>
> Pavel

I've been thinking about this some more, and there's another case that
concerns me slightly. We're now making some of the DROP...IF EXISTS
commands tolerate non-existent types as well as non-existent schemas ---
functions, aggregates, casts and operators all have type names in their
specifications. Of course it's possible that the type is missing because
it was in a schema that was dropped, so this change seems to be in the
spirit of what was discussed, but it seems like a change that might catch
some people out.

I think that, on balance, it is a sensible change, since if the type
doesn't exist, the dependent object can't exist either, so DROP...IF
EXISTS shouldn't be raising an error. However, I wonder if we should be
issuing a more specific NOTICE in this case too --- i.e., check for
non-existent types in the same way as we check for non-existent parent
objects --- type_does_not_exist_skipping() and
type_list_does_not_exist_skipping().

Regards,
Dean
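For illustration, the case in question looks something like this (the
NOTICE wording shown is hypothetical, mirroring the existing
"schema ... does not exist, skipping" messages):

    -- The argument type no longer exists, e.g. because its schema was
    -- dropped.  With the patch this succeeds rather than erroring out:
    DROP FUNCTION IF EXISTS my_func(dropped_schema.my_type);

    -- A more specific NOTICE might read something like:
    -- NOTICE:  type "dropped_schema.my_type" does not exist, skipping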
Re: [HACKERS] Time-Delayed Standbys
On 2013-12-04 11:13:58 +0900, KONDO Mitsumasa wrote:
> > 4) Start the slave and connect to it using psql and in another
> > session I can see all archive recovery log
>
> Hmm... I had thought I'd misread your email, but it reproduces again.
> When I set a small recovery_time_delay (=3), it seems to work
> correctly. However, when I set a long recovery_time_delay (=300), it
> didn't work.
>
> My reproduced operation log is as follows.
>
> >[mitsu-ko@localhost postgresql]$ bin/pgbench -T 30 -c 8 -j4 -p5432
> >starting vacuum...end.
> >transaction type: TPC-B (sort of)
> >scaling factor: 10
> >query mode: simple
> >number of clients: 8
> >number of threads: 4
> >duration: 30 s
> >number of transactions actually processed: 68704
> >latency average: 3.493 ms
> >tps = 2289.196747 (including connections establishing)
> >tps = 2290.175129 (excluding connections establishing)
> >[mitsu-ko@localhost postgresql]$ vim slave/recovery.conf
> >[mitsu-ko@localhost postgresql]$ bin/pg_ctl -D slave start
> >server starting
> >[mitsu-ko@localhost postgresql]$ LOG:  database system was shut down in recovery at 2013-12-03 10:26:41 JST
> >LOG:  entering standby mode
> >LOG:  consistent recovery state reached at 0/5C4D8668
> >LOG:  redo starts at 0/5C4000D8
> >[mitsu-ko@localhost postgresql]$ FATAL:  the database system is starting up
> >FATAL:  the database system is starting up
> >FATAL:  the database system is starting up
> >FATAL:  the database system is starting up
> >FATAL:  the database system is starting up
> >[mitsu-ko@localhost postgresql]$ bin/psql -p6543
> >psql: FATAL:  the database system is starting up
> >[mitsu-ko@localhost postgresql]$ bin/psql -p6543
> >psql: FATAL:  the database system is starting up
>
> I attached my postgresql.conf and recovery.conf. It will be
> reproducible.

So, you brought up a standby and it took more time to become consistent
because it waited on commits? That's the problem? If so, I don't think
that's a bug?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio
On Mon, Dec 2, 2013 at 10:42 PM, Peter Eisentraut wrote:
> On 11/19/13, 11:30 PM, Peter Geoghegan wrote:
>>>> +1 from me.
>>>
>>> That's +1 for *not* including this?
>>
>> Right.
>
> I agree with not including this.

If you're looking for more of those, here's another +1 for not including
it. (And of course also for Peter's comment that we need to figure out
whether something is actually missing for building the higher-level tool
that could provide this information and more.)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Re: [HACKERS] Time-Delayed Standbys
Hi,

On 04/12/13 11:13, KONDO Mitsumasa wrote:
> > 1) Clusters
> > - build master
> > - build slave, attach it to the master using SR, and configure
> >   recovery_time_delay to 1min.
> >
> > 2) Stop the slave
> >
> > 3) Run some transactions on the master using pgbench to generate a
> >   lot of archives
> >
> > 4) Start the slave and connect to it using psql and in another
> >   session I can see all archive recovery log
>
> Hmm... I had thought I'd misread your email, but it reproduces again.
> When I set a small recovery_time_delay (=3), it seems to work
> correctly. However, when I set a long recovery_time_delay (=300), it
> didn't work.
> […]

I'm not sure if I understand your problem correctly. Let me try to
summarize; please correct me if I'm wrong: you created a master node and
a hot standby with 300 delay. Then you stopped the standby, ran the
pgbench and started the hot standby again. It did not get in line with
the master. Is this correct?

I don't see a problem here… the standby should not be in sync with the
master, it should be delayed. I did step by step what you did, and after
50 minutes (300ms) the standby was at the same level the master was at.

Did I misunderstand you?

Regards,
Christian Kruse

--
 Christian Kruse                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Problem with displaying "wide" tables in psql
On Wed, Dec 4, 2013 at 11:20 AM, Sergey Muraviov
<sergey.k.murav...@gmail.com> wrote:
> Thank you for this trick.
> It would be nice if this trick was documented.
>
> However, with the pager I can't see a wide value on one screen, or
> select and copy it entirely. And I have to press many keys to find the
> necessary part of the value. There are no such problems with the patch.

I think the solution is to provide the proposed behaviour as an option.

--
With best regards,
Alexander Korotkov.
Re: [HACKERS] [bug fix] "pg_ctl stop" times out when it should respond quickly
From: "Tom Lane" I think the reason why it was coded like that was that we hadn't written postmaster_is_alive() yet, or maybe we had but didn't want to trust it. However, with the coding you have here, we're fully exposed to any failure modes postmaster_is_alive() may have; so there's not a lot of value in accepting those and get_pgpid's failure modes too. Thank you for reviewing the patch so quickly. You are right, I don't think get_pgpid() here is no longer necessary. If the pid changes in one second, i.e. the original postgres terminates and "pg_ctl start" starts another one, "pg_ctl stop" can terminate successfully because the original postgres it was waiting for actually terminated. I'll submit the revised patch tomorrow. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [bug fix or improvement?] Correctly place DLLs for ECPG apps in bin folder
From: "MauMau" In addition, I'll remove libpq.dll from lib folder unless somebody objects. Currently, libpq.dll is placed in both bin and lib. I guess libpq.dll was left in lib because it was considered necessary for ECPG DLLs. The attached patch also removes libpq.dll from lib folder. I don't mind whether this patch or yesterday's one will be adopted. I'll add this to 2014-1 commitfest this weekend if the patch is not committed until then. Regards MauMau ECPG_DLL.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
> I think all of this data cannot fit in shared_buffers, you might want
> to increase shared_buffers to larger size (not 30GB but close to your
> data size) to see how it behaves.

When I use shared_buffers larger than my data size, such as 10 GB,
results scale nearly as expected, at least for this instance type.

> You queries have Aggregation, ORDER/GROUP BY, so there is a chance
> that I/O can happen for those operations if PG doesn't have sufficient
> memory (work_mem) to perform such operations.

I used a work_mem of 32 MB, which should be enough for these queries. I
also tested with higher values of work_mem and didn't observe any
difference.

> Can you simplify your queries (simple scan or in other words no
> aggregation or other things) to see how they behave in your env.; once
> you are able to see simple queries scaling as per your expectation, you
> can try with complex ones.

Actually, we observe the problem when queries get simpler, such as select
count(*). Here is the results table in a more compact format:

                       select count(*)  TPC-H Simple (#6)  TPC-H Complex (#1)
 1 Table  / 1 query    1.5 s            2.5 s              8.4 s
 2 Tables / 2 queries  1.5 s            2.5 s              8.4 s
 4 Tables / 4 queries  2.0 s            2.9 s              8.8 s
 8 Tables / 8 queries  3.3 s            4.0 s              9.6 s

> Can we have the explain analyze of those queries, postgres
> configuration, perhaps vmstat output during execution?

postgres=# explain analyze SELECT count(*) from lineitem_1;
                                  QUERY PLAN
--------------------------------------------------------------------------
 Aggregate  (cost=199645.01..199645.02 rows=1 width=0) (actual time=11317.391..11317.393 rows=1 loops=1)
   ->  Seq Scan on lineitem_1  (cost=0.00..184641.81 rows=6001281 width=0) (actual time=0.011..5805.255 rows=6001215 loops=1)
 Total runtime: 11317.440 ms
(3 rows)

postgres=# explain analyze SELECT
postgres-#     sum(l_extendedprice * l_discount) as revenue
postgres-# FROM
postgres-#     lineitem_1
postgres-# WHERE
postgres-#     l_shipdate >= date '1994-01-01'
postgres-#     AND l_shipdate < date '1994-01-01' + interval '1' year
postgres-#     AND l_discount between 0.06 - 0.01 AND 0.06 + 0.01
postgres-#     AND l_quantity < 24;
                                  QUERY PLAN
--------------------------------------------------------------------------
 Aggregate  (cost=260215.36..260215.37 rows=1 width=16) (actual time=1751.775..1751.776 rows=1 loops=1)
   ->  Seq Scan on lineitem_1  (cost=0.00..259657.82 rows=111508 width=16) (actual time=0.031..1630.449 rows=114160 loops=1)
         Filter: ((l_shipdate >= '1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp without time zone) AND (l_discount >= 0.05::double precision) AND (l_discount <= 0.07::double precision) AND (l_quantity < 24::double precision))
         Rows Removed by Filter: 5887055
 Total runtime: 1751.830 ms
(5 rows)

postgres=# explain analyze SELECT
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
FROM
    lineitem_1
WHERE
    l_shipdate <= date '1998-12-01' - interval '90' day
GROUP BY
    l_returnflag,
    l_linestatus
ORDER BY
    l_returnflag,
    l_linestatus;
                                  QUERY PLAN
--------------------------------------------------------------------------
 Sort  (cost=436342.68..436342.69 rows=6 width=36) (actual time=18720.932..18720.936 rows=4 loops=1)
   Sort Key: l_returnflag, l_linestatus
   Sort Method: quicksort  Memory: 25kB
   ->  HashAggregate  (cost=436342.49..436342.60 rows=6 width=36) (actual time=18720.887..18720.892 rows=4 loops=1)
         ->  Seq Scan on lineitem_1  (cost=0.00..199645.01 rows=5917437 width=36) (actual time=0.011..6754.619 rows=5916591 loops=1)
               Filter: (l_shipdate <= '1998-09-02 00:00:00'::timestamp without time zone)
               Rows Removed by Filter: 84624
 Total runtime: 18721.021 ms
(8 rows)

Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple
(#6) queries. Although there is no need for I/O, "wa" fluctuates between
0 and 1:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd     free   buff    cache   si   so  bi  bo    in    cs us sy id wa st
 0  0      0 30093568  84892 38723896    0    0   0   0    22    14  0  0 100 0  0
 8  1      0 30043056  84892 38723896    0    0   0   0 27080 52708 16
[HACKERS] [bug fix] psql's \conninfo reports incorrect destination on Windows
Hello,

I've found a bug that makes psql's \conninfo display incorrect
information on Windows. Please find the patch attached, and please commit
it.

[Problem]
When I run "psql postgres" on Windows and execute \conninfo, it outputs
the text below. It reports that psql connected to the server via a
Unix-domain socket, but the actual connection is of course via a TCP/IP
socket to localhost.

    You are connected to database "postgres" as user "Administrator" via
    socket in "/tmp" at port "5432".

It should output:

    You are connected to database "postgres" as user "Administrator" on
    host "localhost" at port "5432".

[Cause]
\conninfo calls PQhost(conn) to get the destination info. PQhost() in
this case returns NULL because conn->pghost and conn->pghostaddr are
NULL. When \conninfo receives NULL from PQhost(), it assumes /tmp.

[Fix]
PQhost() should return the actual destination.

Regards
MauMau
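For illustration, the fix amounts to making PQhost() fall back to
something meaningful instead of NULL. A minimal sketch only, not the
attached patch (DefaultHost is assumed here to be libpq's "localhost"
default):

    /* Sketch only -- not the actual patch. */
    char *
    PQhost(const PGconn *conn)
    {
        if (conn == NULL)
            return NULL;
        if (conn->pghost != NULL && conn->pghost[0] != '\0')
            return conn->pghost;
        if (conn->pghostaddr != NULL && conn->pghostaddr[0] != '\0')
            return conn->pghostaddr;
    #ifdef WIN32
        /* no Unix-domain sockets on Windows: the connection used TCP/IP */
        return DefaultHost;
    #else
        return NULL;        /* caller assumes the default socket dir */
    #endif
    }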
Re: [HACKERS] [PERFORM] Parallel Select query performance and shared buffers
> Maybe you could help test this patch:
> http://www.postgresql.org/message-id/20131115194725.gg5...@awork2.anarazel.de

To which tree should I apply these patches? I tried the main repository
(master), the 9.3 stable branch, and the source code of 9.3.1, and in
each of my attempts at least one of the patches failed to apply. What
patch command should I use?
Re: [HACKERS] ruleutils vs. empty targetlists
Dean Rasheed writes:
> On 3 December 2013 23:37, Tom Lane wrote:
>> Thinking some more about bug #8648, it occurred to me that ruleutils.c
>> isn't exactly prepared for the case either:
>> ... So I'm leaning towards just doing
>>
>> +       if (colno == 0)
>> +               appendStringInfoString(buf, " *");
>>
>> at least till such time as somebody shows a case where this isn't good
>> enough.

> Well here's a contrived example with grouping:

> create table nocols();
> create table somecols(a int, b int);
> create view v as select exists(select nocols.* from nocols, somecols
> group by somecols.a);

Hm, that's cute :-(. And I realized last night that my patch's approach
of seizing on some random zero-column RTE isn't too bulletproof against
ALTER TABLE ADD COLUMN operations, either; ie, the referenced table might
have some columns by now. Which is more or less the entire reason we
expand "*" at parse time in the first place.

What I'm thinking about this today is that really the *right* solution is
to allow syntactically-empty SELECT lists; once we've bought into the
notion of zero-column tables, the notion that you can't have an empty
select list is just fundamentally at odds with that. And since you can
already have semantically-empty SELECT lists, this should in theory not
create much risk of new bugs. If we did that, the existing ruleutils code
is just fine, as are any existing dump files containing this sort of
query.

That change might still be thought too aggressive for a back-patch,
though. Comments?

                        regards, tom lane
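Concretely, the proposal would make statements like the following legal
(today the parser rejects the empty target list with a syntax error):

    -- Syntactically-empty SELECT lists, mirroring the already-legal
    -- semantically-empty case:
    SELECT FROM nocols;
    SELECT EXISTS (SELECT FROM nocols, somecols GROUP BY somecols.a);

which is exactly the shape of output that ruleutils.c currently
regurgitates for such views.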
Re: [HACKERS] Time-Delayed Standbys
2013/12/4 Andres Freund
> On 2013-12-04 11:13:58 +0900, KONDO Mitsumasa wrote:
> > > 4) Start the slave and connect to it using psql and in another
> > > session I can see all archive recovery log
> >
> > Hmm... I had thought I'd misread your email, but it reproduces again.
> > When I set a small recovery_time_delay (=3), it seems to work
> > correctly. However, when I set a long recovery_time_delay (=300), it
> > didn't work.
> >
> > My reproduced operation log is as follows.
> >
> > >[mitsu-ko@localhost postgresql]$ bin/pgbench -T 30 -c 8 -j4 -p5432
> > >starting vacuum...end.
> > >transaction type: TPC-B (sort of)
> > >scaling factor: 10
> > >query mode: simple
> > >number of clients: 8
> > >number of threads: 4
> > >duration: 30 s
> > >number of transactions actually processed: 68704
> > >latency average: 3.493 ms
> > >tps = 2289.196747 (including connections establishing)
> > >tps = 2290.175129 (excluding connections establishing)
> > >[mitsu-ko@localhost postgresql]$ vim slave/recovery.conf
> > >[mitsu-ko@localhost postgresql]$ bin/pg_ctl -D slave start
> > >server starting
> > >[mitsu-ko@localhost postgresql]$ LOG:  database system was shut down in recovery at 2013-12-03 10:26:41 JST
> > >LOG:  entering standby mode
> > >LOG:  consistent recovery state reached at 0/5C4D8668
> > >LOG:  redo starts at 0/5C4000D8
> > >[mitsu-ko@localhost postgresql]$ FATAL:  the database system is starting up
> > >FATAL:  the database system is starting up
> > >FATAL:  the database system is starting up
> > >FATAL:  the database system is starting up
> > >FATAL:  the database system is starting up
> > >[mitsu-ko@localhost postgresql]$ bin/psql -p6543
> > >psql: FATAL:  the database system is starting up
> > >[mitsu-ko@localhost postgresql]$ bin/psql -p6543
> > >psql: FATAL:  the database system is starting up
> >
> > I attached my postgresql.conf and recovery.conf. It will be
> > reproducible.
>
> So, you brought up a standby and it took more time to become consistent
> because it waited on commits? That's the problem? If so, I don't think
> that's a bug?

When it happened, psql could not connect to the standby server at all. I
think this behavior is not good. It should only delay the recovery
position, so that the old, delayed table data can still be seen. Not
being able to connect to the server at all is not the desired behavior.

If you think the current behavior is best, I will mark the patch ready
for committer, and the committer can improve it.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
Re: [HACKERS] Time-Delayed Standbys
On 2013-12-04 22:47:47 +0900, Mitsumasa KONDO wrote:
> 2013/12/4 Andres Freund
> When it happened, psql could not connect to the standby server at all.
> I think this behavior is not good. It should only delay the recovery
> position, so that the old, delayed table data can still be seen.

That doesn't sound like a good plan - even if the clients cannot connect
yet, you can still promote the server. Just not taking the delay into
consideration at that point seems like it would possibly surprise users
rather badly, in situations where they really cannot use such surprises.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] ruleutils vs. empty targetlists
Tom Lane escribió:

> What I'm thinking about this today is that really the *right* solution
> is to allow syntactically-empty SELECT lists; once we've bought into
> the notion of zero-column tables, the notion that you can't have an
> empty select list is just fundamentally at odds with that. And since
> you can already have semantically-empty SELECT lists, this should in
> theory not create much risk of new bugs. If we did that, the existing
> ruleutils code is just fine, as are any existing dump files containing
> this sort of query.

Wow, as strange-sounding as that is, you're probably correct. This might
be seen as a deviation from the standard, but then so are zero-column
tables. Of course, syntactically-empty select lists would also work with
(standard-conforming) tables containing columns, but it's hard to see
that that would be a problem in practice.

> That change might still be thought too aggressive for a back-patch,
> though. Comments?

Well, no correct query will start failing due to this change; the only
visible change would be that queries which previously threw errors would
start working. It's hard to see that as a backward incompatibility.

--
Álvaro Herrera                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Time-Delayed Standbys
2013/12/4 Christian Kruse
> You created a master node and a hot standby with 300 delay. Then
> you stopped the standby, did the pgbench and started the hot standby
> again. It did not get in line with the master. Is this correct?

No. First, I start the master and execute pgbench. Second, I start the
standby with a 300ms (50min) delay. Then psql cannot connect to the
standby server at all. I'm not sure why the standby did not start; it
might be because the delay feature interferes with the REDO loop during
the standby's initial startup.

> I don't see a problem here… the standby should not be in sync with the
> master, it should be delayed. I did step by step what you did and
> after 50 minutes (300ms) the standby was at the same level the
> master was.

I think we should be able to connect to the standby server at any time,
regardless of the delay option.

> Did I misunderstand you?

I'm not sure... You might be right, or there might be a better way.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
Re: [HACKERS] Time-Delayed Standbys
Hi,

On 2013-12-03 19:33:16 +0000, Simon Riggs wrote:
> > - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
> > XLOG_XACT_COMMIT_COMPACT checks
>
> Why just those? Why not aborts and restore points also?

What would the advantage of waiting on anything but commits be? If it's
not a commit, the action won't change the state of the database (yes,
yes, there are exceptions, but those don't have a timestamp)...

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Time-Delayed Standbys
2013/12/4 Andres Freund
> On 2013-12-04 22:47:47 +0900, Mitsumasa KONDO wrote:
> > When it happened, psql could not connect to the standby server at
> > all. I think this behavior is not good. It should only delay the
> > recovery position, so that the old, delayed table data can still be
> > seen.
>
> That doesn't sound like a good plan - even if the clients cannot
> connect yet, you can still promote the server.

I'm not sure I follow your argument - doesn't that lose sight of the
purpose of this patch?

> Just not taking the delay into consideration at that point seems like
> it would possibly surprise users rather badly, in situations where
> they really cannot use such surprises.

Hmm... I agree that users would be surprised... I think it is easy to
fix this behavior using a recovery flag, so we had better wait for other
comments.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
Re: [HACKERS] Changes in Trigger Firing
Sameer Kumar wrote:
> CreateTrigStmt is passed to the CreateTrigger function as an argument.
> I am struggling to understand how the values for various members of
> trigger are set and where [which file] calls CreateTrigStmt.
>
> Can someone provide some help on this?

I think you need better tools to guide you in exploring the source code.
For example, you can use cscope to tell you where CreateTrigStmt is used,
and you would find gram.y; and use it to tell you where CreateTrigger is
used, and you would find utility.c.

Any half-decent code editor should be able to generate a "database" of
symbols and let you frolic around the various files quickly. Without
that, anyone would be completely lost in developing new features of even
the lowest complexity.

--
Álvaro Herrera                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
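For example, a minimal cscope workflow over the PostgreSQL tree might
look like this (ctags/etags or an IDE symbol index work just as well):

    # Build a cscope database over the source tree
    cd postgresql/src
    cscope -Rbq        # -R recurse, -b build only, -q fast index

    # Then, in the cscope UI (or an editor integration), look up the
    # symbol CreateTrigStmt to land in gram.y, and CreateTrigger to
    # land in utility.c.  A plain grep gets you most of the way, too:
    grep -rn "CreateTrigger(" backend/ include/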
[HACKERS] [bug fix] pg_ctl fails with config-only directory
Hello,

I've found a bug and would like to fix it, but I cannot figure out how to
do that well. Could you give me any advice? I encountered this on PG 9.2,
but it will probably exist in later versions too.

[Problem]
On Windows, a user with Administrator privileges can start the database
server. However, when he uses a config-only directory, the database
server cannot be started. "pg_ctl start" fails with the following
messages:

    Execution of PostgreSQL by a user with administrative permissions is
    not permitted.
    The server must be started under an unprivileged user ID to prevent
    possible system security compromises. See the documentation for more
    information on how to properly start the server.

[Cause]
pg_ctl runs "postgres -C data_directory" to find the data directory. But
postgres cannot be run by a user with Administrator privileges, so it
displays the above messages.

[Fix]
Ideally, users with administrative privileges should be able to start
postgres with the Administrator privileges removed. Currently, initdb and
pg_ctl take the trouble to invoke postgres in a process with restricted
privileges. I understand this improvement was made in 8.2 or 8.3 for
convenience. The same convenience should be available when running
postgres directly, at least for "postgres -C",
"postgres --describe-config", and "postgres --single".

Then, how can we do this? Which approach should we take?

* Approach 1
When postgres starts, it removes Administrator privileges from its own
process. But is this possible at all? The Windows security API is complex
and provides many functions; it seems difficult to understand them. I'm
afraid it would take a long time to figure out the solution. Is there any
good web page to look at?

* Approach 2
Do not call check_root() on Windows when -C, --describe-config, or
--single is specified when running postgres. This would be easy, and it
should not be dangerous in terms of security, because attackers cannot
get into the server process via the network in those modes.

I'll try to find a solution based on approach 1, but I doubt there's one.
If that's okay, I want to take approach 2. Could you give me your
thoughts?

Regards
MauMau
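A minimal sketch of what Approach 2 might look like in main.c (the
option-detection variable here is hypothetical, for illustration only):

    /* Sketch of Approach 2 -- not an actual patch. */
    bool        limited_mode = false;   /* set while scanning argv when
                                         * -C, --describe-config or
                                         * --single is given */
    ...
    #ifdef WIN32
        if (!limited_mode)
            check_root(progname);   /* rejects Administrator on Windows */
    #else
        check_root(progname);       /* always reject root on Unix */
    #endif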
Re: [HACKERS] Extension Templates S03E11
* Jeff Davis (pg...@j-davis.com) wrote:
> On Tue, 2013-12-03 at 14:31 -0500, Tom Lane wrote:
> > Stephen Frost writes:
> > > When it comes to dump/reload, I'd much rather see a mechanism which
> > > uses our deep understanding of the extension's objects (as database
> > > objects) to implement the dump/reload than a text blob which is
> > > carried forward from major version to major version and may even
> > > fail to run.
> >
> > Note that we're already doing that in the binary_upgrade code path.
> > I agree that generalizing that approach sounds like a better idea
> > than keeping a text blob around.
>
> So does this take us fully back to Inline Extensions, or is there a
> distinction that I'm missing?

I've not really looked at the inline extensions patch/proposal in depth,
but I do think that's a lot closer than this. As I understand it, Dimitri
had a patch for this, though what I've found is the blog post. Also,
there were a lot of discussions about the idea a year or so ago,
including folks who haven't spoken up in this discussion.

> I still don't see that Extension Templates are all bad:
>
>   * They preserve the fact that two instances of the same extension
>     (e.g. in different databases) were created from the same template.

This is only true if we change the extension templates to be shared
catalogs, which they aren't today..

>   * They mirror the file-based templates, so it seems easier to get
>     consistent behavior.

While it might seem easier and perhaps simpler, I'm not sure that I
really buy into the idea that we'd actually be more consistent. Even if
we are, I'm not convinced that's what we want here.. The only thing
driving us in that direction is that we're calling these 'extensions'
too. While I don't want five different extension-like things, I'd rather
use a different name from 'extensions' if we feel that the differences
between catalog-only extensions and filesystem extensions will cause
'extensions' overall to have terribly inconsistent behavior.

Thanks,

Stephen
Re: [HACKERS] FDW: possible resjunk columns in AddForeignUpdateTargets
2013/11/8 Tom Lane:
> Albe Laurenz writes:
>> What I would like to do is add a custom resjunk column
>> (e.g. a bytea) in AddForeignUpdateTargets that carries a row
>> identifier from the scan state to the modify state.
>> Would that be possible? Can I have anything else than a Var
>> in a resjunk column?
>
> [ thinks for awhile... ] Hm. In principle you can put any expression
> you want into the tlist during AddForeignUpdateTargets. However, if
> it's not a Var then the planner won't understand that it's something
> that needs to be supplied by the table scan, so things won't work right
> in any but the most trivial cases (maybe not even then :-().
>
> What I'd try is creating a Var that has the attno of ctid
> (ie, SelfItemPointerAttributeNumber) but the datatype you want, ie
> bytea. This won't match what the catalogs say your table's ctid is, but
> I think that nothing will care much about that.

Apologies for reinvigorating this thread, but I'm running into a similar
wall myself and would like to clarify whether this approach will work at
all.

My foreign data source returns a fixed-length string as a unique row
identifier; in AddForeignUpdateTargets() I can create a Var like this:

    var = makeVar(parsetree->resultRelation,
                  SelfItemPointerAttributeNumber,
                  BPCHAROID,
                  32,
                  InvalidOid,
                  0);

but is it possible to store something other than a TIDOID here, and if
so, how?

Regards
Ian Barwick
Re: [HACKERS] Problem with displaying "wide" tables in psql
Hello

postgres=# \pset format wrapped
Output format (format) is wrapped.
postgres=# select 'afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf sadf sa df sadfsadfasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f sadf sad fadsf';
                            ?column?
-----------------------------------------------------------------
 afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf sadf sa df
 sadfsadfasd fsad fsa df sadf asd fa sfd sadfsadf a.
.sdf sad f sadf sad fadsf
(1 row)

It works as expected, but it is not supported in expanded (row) view, so
any fix of that mode would be nice.

Regards
Pavel

2013/12/4 Sergey Muraviov
> Thank you for this trick.
> It would be nice if this trick was documented.
>
> However, with the pager I can't see a wide value on one screen, or
> select and copy it entirely. And I have to press many keys to find the
> necessary part of the value. There are no such problems with the patch.
>
> 2013/12/3 Pavel Stehule
>> Hello
>>
>> do you know the pager "less" trick?
>>
>> http://merlinmoncure.blogspot.cz/2007/10/better-psql-with-less.html
>>
>> Regards
>> Pavel Stehule
>>
>> 2013/12/3 Sergey Muraviov
>>> Hi.
>>>
>>> Psql definitely has a problem with displaying "wide" tables. Even in
>>> expanded mode, they look horrible. So I tried to solve this problem.
>>>
>>> Before the patch:
>>> postgres=# \x 1
>>> Expanded display (expanded) is on.
>>> postgres=# \pset border 2
>>> Border style (border) is 2.
>>> postgres=# select * from pg_stats;
>>> [extremely wide expanded-mode output snipped]
Re: [HACKERS] Extension Templates S03E11
* Jeff Davis (pg...@j-davis.com) wrote:
> On Mon, 2013-12-02 at 15:44 -0500, Stephen Frost wrote:
> > How are we going to handle new keywords
> > being added in new major versions? A pg_dump of the extension
> > template script is then going to be loaded into the new major version
> > but will not actually be able to be run because it'll error out...
>
> Elsewhere in the thread you argued that the version of an extension
> should be preserved across dump/reload. Surely a given version of the
> extension corresponds to a specific set of SQL commands (specifically,
> the SQL text blob on PGXN), so it *should* error out.

I *do* think the version should be preserved (though I admit that the
argument about things released with PG does make some sense). My point
above is that if we dump the text blob out and then just rerun it, it
might not work; but if we use pg_dump and dump the extension out as
database objects (using the newer version of pg_dump, as we always
recommend..), then it's certainly more likely to work, even in the face
of new keywords and the like which change between releases.

> Otherwise you end up with a weird situation where upgrading a 9.4
> install to 9.5 allows you to keep version 1.2 of some extension, but
> 1.2 won't install directly to 9.5. (By the way, I think this is a
> problem with pg_upgrade currently.)

Hmm. I'll grant, that's an interesting situation to consider, but I'm
trying to figure out why it's better to make it always break, both on
initial installation and when restoring from a backup, than to only have
it (most likely, anyway) break on initial installation (to a newer
version that may not have existed originally).

> You're fighting pretty hard against text blobs, but at the same time
> saying that we should be able to fully make use of existing PGXN
> extensions, which contain text blobs of SQL. And extension authors are
> versioning their SQL blobs, not some abstract concepts internal to
> postgres and the catalogs.

I don't want text blobs in the backend catalogs. I'm not arguing against
text blobs in general (that'd be kinda hard to do..).

> Just because we start with blobs from PGXN doesn't mean we need to use
> blobs everywhere; but I think you're too quick to rule them out.

Perhaps, but I really don't see the point of putting a text blob into the
database for a set of objects that we're just going to create in the next
moment. That would be, from my point of view anyway, akin to storing
'CREATE TABLE' statements in the catalog next to the actual definition of
the table in pg_class/pg_attribute.

Thanks,

Stephen
Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist
2013/12/4 Dean Rasheed
> On 2 December 2013 04:55, Pavel Stehule wrote:
> > Hello
> >
> > it looks well, thank you
> >
> > Regards
> >
> > Pavel
>
> I've been thinking about this some more, and there's another case that
> concerns me slightly. We're now making some of the DROP...IF EXISTS
> commands tolerate non-existent types as well as non-existent schemas
> --- functions, aggregates, casts and operators all have type names in
> their specifications. Of course it's possible that the type is missing
> because it was in a schema that was dropped, so this change seems to be
> in the spirit of what was discussed, but it seems like a change that
> might catch some people out.
>
> I think that, on balance, it is a sensible change, since if the type
> doesn't exist, the dependent object can't exist either, so DROP...IF
> EXISTS shouldn't be raising an error. However, I wonder if we should
> be issuing a more specific NOTICE in this case too --- i.e., check for
> non-existent types in the same way as we check for non-existent parent
> objects --- type_does_not_exist_skipping() and
> type_list_does_not_exist_skipping().

+1

Pavel
Re: [HACKERS] [patch] Adding EXTRA_REGRESS_OPTS to all pg_regress invocations
On Mon, May 6, 2013 at 11:51:47PM -0700, Christoph Berg wrote:
> "make check" supports EXTRA_REGRESS_OPTS to pass extra options to
> pg_regress, but all the other places where pg_regress is used do not
> allow this. The attached patch adds EXTRA_REGRESS_OPTS to
> Makefile.global.in (for contrib modules) and two more special
> Makefiles (isolation and pg_upgrade).
>
> The use case here is that Debian needs to be able to redirect the unix
> socket directory used to /tmp, because /var/run/postgresql isn't
> writable for the buildd user. The matching part for this inside
> pg_regress is still in discussion here, but the addition of
> EXTRA_REGRESS_OPTS is an independent step that is also useful for
> others, so I'd like to propose it for inclusion.

Thanks, patch applied. This will appear in PG 9.4. I suppose we could
backpatch this but I would need community feedback on that.

--
 Bruce Momjian        http://momjian.us
 EnterpriseDB         http://enterprisedb.com

 + Everyone has their own god. +
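For example, with the patch the same extra options now reach pg_regress
everywhere (the --host redirection shown here is illustrative; the exact
option for moving the socket directory depends on the companion
pg_regress patch still under discussion):

    # Main regression tests, contrib tests, and isolation tests, all
    # picking up the same extra pg_regress options:
    make check EXTRA_REGRESS_OPTS="--host=/tmp"
    make -C contrib installcheck EXTRA_REGRESS_OPTS="--host=/tmp"
    make -C src/test/isolation check EXTRA_REGRESS_OPTS="--host=/tmp"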
Re: [HACKERS] Time-Delayed Standbys
Robert Haas wrote:
> So, I proposed this patch previously and I still think it's a
> good idea, but it got voted down on the grounds that it didn't
> deal with clock drift. I view that as insufficient reason to
> reject the feature, but others disagreed. Unless some of those
> people have changed their minds, I don't think this patch has
> much future here.

There are many things that a system admin can get wrong. Failing to
supply this feature because the sysadmin might not be running ntpd (or
equivalent) correctly seems to me to be like not having the software do
fsync because the sysadmin might not have turned off write-back
buffering on drives without persistent storage. Either way, poor system
management can defeat the feature. Either way, I see no reason to
withhold the feature from those who manage their systems in a sane
fashion.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
Hanada-san,

Thanks for your review.

2013/12/4 Shigeru Hanada:
> I first reviewed the postgres_fdw portion of the patches to learn the
> outline of Custom Plan. The wiki page is also a good textbook for the
> feature. I have some random comments about the basic design of Custom
> Plan:
>
> (1) IIUC, add_join_path and add_scan_path are added to allow extensions
> to plug their code into the planner.

Almost, yes. More precisely, these hooks allow extensions to plug the
paths they can provide into a particular join or scan. The planner will
then choose the cheapest one according to the cost values.

> (2) The FDW framework has executor callbacks based on existing executor
> nodes. Is there any plan to integrate them into one, or wrap one with
> the other? I'm not sure that we should have two similar frameworks
> side by side.
> # I'm sorry if I've missed the past discussion about this issue.

Probably not; FDWs have a different role from the CustomScan API. As the
name implies, an FDW acts as a bridge between the relation form and an
opaque external data source, mediating between two different worlds on
behalf of a foreign table. CustomScan, on the other hand, allows
alternative logic to scan or join particular relations, in addition to
the built-in ones, but does not act on behalf of foreign tables.

The existing FDW machinery is designed to implement a scan on an
intangible relation, so it can assume certain things: that a tuple
returned from an FDW has a TupleDesc equivalent to the table definition,
or that it can always use ExecScan() in all cases. So, I don't think
these two frameworks should be consolidated, because that would confuse
existing extensions which assume FDW callbacks always have a particular
foreign table definition.

> (3) Internal routines such as is_self_managed_relation and
> has_wholerow_reference seem to be useful for other FDWs. Is it
> possible to move them into core?

Probably; src/backend/foreign/foreign.c would be a good home for them.

> (4) postgres_fdw estimates the costs of a join by calculating local
> numbers. How about supporting remote estimation by issuing an EXPLAIN
> query when use_remote_estimates = true?

I'm uncertain whether the cost value from a remote EXPLAIN represents the
right cost on the local side: it represents the difficulty of joining two
relations on the remote side, but not the local work, which just fetches
tuples from the result set of the remote query with the tables already
joined. What is your opinion? Is the remote cost estimation value
comparable with the local one?

Thanks,
--
KaiGai Kohei
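For reference, remote estimation can already be toggled per server or per
table in postgres_fdw with the existing use_remote_estimate option (the
server and table names below are illustrative):

    -- Issue EXPLAIN on the remote server for cost estimates instead of
    -- computing them locally:
    ALTER SERVER remote_pg OPTIONS (ADD use_remote_estimate 'true');
    ALTER FOREIGN TABLE remote_orders
        OPTIONS (ADD use_remote_estimate 'true');

The open question above is whether the same mechanism gives meaningful
numbers for remote *joins*, where the remote plan's cost does not include
the local cost of fetching the joined result set.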
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
Thanks for fixing so many instances of my carelessness. I didn't know
"seek" was an irregular verb...

Best regards,

2013/12/4 Shigeru Hanada:
> 2013/11/29 Kohei KaiGai:
>> I merged all the propositions from Jim. Thanks, it made the
>> documentation quality better. Also, I fixed up cosmetic stuff around
>> whitespace <-> tab.
>
> I found some typos in documents and comments. Please see the attached
> patch for details.
>
> --
> Shigeru HANADA

--
KaiGai Kohei
Re: [HACKERS] Time-Delayed Standbys
Hi,

On 04/12/13 07:22, Kevin Grittner wrote:
> There are many things that a system admin can get wrong. Failing
> to supply this feature because the sysadmin might not be running
> ntpd (or equivalent) correctly seems to me to be like not having
> the software do fsync because the sysadmin might not have turned
> off write-back buffering on drives without persistent storage.
> Either way, poor system management can defeat the feature. Either
> way, I see no reason to withhold the feature from those who manage
> their systems in a sane fashion.

I agree. But maybe we should add a warning in the documentation about
time syncing?

Greetings,
CK

--
 Christian Kruse                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/4/13, 2:14 AM, Stefan Kaltenbrunner wrote: > running a > few kvm instances that get bootstrapped automatically is something that > is a solved problem. Is it sound to run performance tests on kvm? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
On Tue, 03 Dec 2013 10:44:15 -0800 Josh Berkus wrote: > It seems clear that Kernel.org, since 2.6, has been in the business of > pushing major, hackish, changes to the IO stack without testing them or > even thinking too hard about what the side-effects might be. This is > perhaps unsurprising given that two of the largest sponsors of the > Kernel -- who, incidentally, do 100% of the performance testing -- don't > use the IO stack. > > This says to me that Linux will clearly be an undependable platform in > the future with the potential to destroy PostgreSQL performance without > warning, leaving us scrambling for workarounds. Too bad the > alternatives are so unpopular. Wow, Josh, I'm surprised to hear this from you. The active/inactive list mechanism works great for the vast majority of users. The second-use algorithm prevents a lot of pathological behavior, like wiping out your entire cache by copying a big file or running a backup. We *need* that kind of logic in the kernel. Now, back in 2012, Johannes (working for one of those big contributors) hit upon an issue where second-use falls down. So he set out to fix it: https://lwn.net/Articles/495543/ This code has been a bit slow getting into the mainline for a few reasons, but one of the chief ones is this: nobody is saying from the sidelines that they need it! If somebody were saying "Postgres would work a lot better with this code in place" and had some numbers to demonstrate that, we'd be far more likely to see it get into an upcoming release. In the end, Linux is quite responsive to the people who participate in its development, even as testers and bug reporters. It responds rather less well to people who find problems in enterprise kernels years later, granted. The amount of automated testing, including performance testing, has increased markedly in the last couple of years. I bet that it would not be hard at all to get somebody like Fengguang Wu to add some Postgres-oriented I/O tests to his automatic suite: https://lwn.net/Articles/571991/ Then we would all have a much better idea of how kernel releases are affecting one of our most important applications; developers would pay attention to that information. Or you could go off and do your own thing, but I believe that would leave us all poorer. Thanks, jon -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 04:30 PM, Peter Eisentraut wrote: On 12/4/13, 2:14 AM, Stefan Kaltenbrunner wrote: running a few kvm instances that get bootstrapped automatically is something that is a solved problem. Is it sound to run performance tests on kvm? As sound as on any other platform, imho. The performance characteristics will differ between bare metal and other virtualisation platforms, but the future is virtual, and that is what a lot of stuff runs on... Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 12/4/13, 1:42 AM, Álvaro Hernández Tortosa wrote: > IMHO, a data structure like the above would be completely > self-contained and allow any autoconfiguring tool or GUI tool to be > easily created, if the syntax is programmable. It would certainly make > the config file more verbose, but at the same time would help a lot of > users to configure postgres providing much more information. What you are describing appears to be isomorphic to XML and XML Schema. Note that you are not required to maintain your configuration data in a postgresql.conf-formatted file. You can keep it anywhere you like, GUI around in it, and convert it back to the required format. Most of the metadata is available through postgres --describe-config, which is the result of a previous attempt in this area, which never really went anywhere. It's not like there are a bunch of GUI and autotuning tools that people are dying to use or developers are dying to create, but couldn't because editing configuration files programmatically is hard. Let's also not forget the two main use cases (arguably) of the configuration files: hand editing, and generation by configuration management tools. Anything that makes these two harder is not going to be well-received. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Minor patch for the uuid-ossp extension
On 11/23/13, 7:12 AM, Mario Weilguni wrote: > Well, in that case and since this is a rarely used extension (I guess > so), maybe it would be the best to simply rename that extension to > uuidossp (or whatever) and don't make any special treatment for it? Why? This is a solved problem, and renaming the extension would only cause unnecessary work. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FDW: possible resjunk columns in AddForeignUpdateTargets
Ian Lawrence Barwick wrote: > 2013/11/8 Tom Lane : >> [ thinks for awhile... ] Hm. In principle you can put any expression >> you want into the tlist during AddForeignUpdateTargets. However, if it's >> not a Var then the planner won't understand that it's something that needs >> to be supplied by the table scan, so things won't work right in any but >> the most trivial cases (maybe not even then :-(). >> >> What I'd try is creating a Var that has the attno of ctid >> (ie, SelfItemPointerAttributeNumber) but the datatype you want, ie bytea. >> This won't match what the catalogs say your table's ctid is, but I think >> that nothing will care much about that. > > Apologies for reinvigorating this thread, but I'm running into a similar wall > myself and would like to clarify if this approach will work at all. > > My foreign data source is returning a fixed-length string as a unique row > identifier; in AddForeignUpdateTargets() I can create a Var like this: > > var = makeVar(parsetree->resultRelation, >SelfItemPointerAttributeNumber, >BPCHAROID, >32, >InvalidOid, >0); > > but is it possible to store something other than a TIDOID here, and if so how? Subsequent analysis showed that this won't work as you have no way to populate such a resjunk column. resjunk columns seem to get filled with the values from the column of the same name, so currently there is no way to invent your own column, fill it and pass it on. See thread 8b848b463a71b7a905bc5ef18b95528e.squir...@sq.gransy.com What I ended up doing is introduce a column option that identifies a primary key column. I add a resjunk entry for each of those and use them to identify the correct row during an UPDATE or DELETE. That only works for foreign data sources that have a concept of a primary key, but maybe you can do something similar. Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
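For anyone else hitting this wall, a minimal sketch of the approach Laurenz describes (the column_is_remote_key() helper is an assumption standing in for whatever per-column FDW option lookup you implement; the makeVar/makeTargetEntry pattern is the same one postgres_fdw uses for its ctid junk column):

static void
myAddForeignUpdateTargets(Query *parsetree,
                          RangeTblEntry *target_rte,
                          Relation target_relation)
{
    TupleDesc   tupdesc = RelationGetDescr(target_relation);
    int         i;

    for (i = 0; i < tupdesc->natts; i++)
    {
        Form_pg_attribute attr = tupdesc->attrs[i];
        Var        *var;
        TargetEntry *tle;

        /* Assumption: a helper that checks the FDW column option
         * marking this column as part of the remote primary key. */
        if (!column_is_remote_key(target_rte->relid, attr->attnum))
            continue;

        /* A plain Var referencing the key column itself ... */
        var = makeVar(parsetree->resultRelation,
                      attr->attnum,
                      attr->atttypid,
                      attr->atttypmod,
                      attr->attcollation,
                      0);

        /* ... added as a resjunk tlist entry, so its value is carried
         * through the scan and is available to identify the correct
         * row during the UPDATE or DELETE. */
        tle = makeTargetEntry((Expr *) var,
                              list_length(parsetree->targetList) + 1,
                              pstrdup(NameStr(attr->attname)),
                              true);
        parsetree->targetList = lappend(parsetree->targetList, tle);
    }
}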
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 04/12/13 16:51, Peter Eisentraut wrote: On 12/4/13, 1:42 AM, Álvaro Hernández Tortosa wrote: IMHO, a data structure like the above would be completely self-contained and allow any autoconfiguring tool or GUI tool to be easily created, if the syntax is programmable. It would certainly make the config file more verbose, but at the same time would help a lot of users to configure postgres providing much more information. What you are describing appears to be isomorphic to XML and XML Schema. I don't think XML would be a good idea. Even if it is both programmatically and humanly editable (two of the features I was suggesting), it is messy and very verbose for this purpose. Note that you are not required to maintain your configuration data in a postgresql.conf-formatted file. You can keep it anywhere you like, GUI around in it, and convert it back to the required format. Most of the I think it is not a very good idea to encourage GUI tools or auto-configuration tools to use a separate configuration file and then convert it to postgresql.conf. That introduces duplication, with nasty problems if either source of data is modified in unexpected ways. That's why I'm suggesting a config file that is usable, at the same time, by both postgres and external tools. That also enables other features, such as editing the config file persistently from a SQL session. metadata is available through postgres --describe-config, which is the result of a previous attempt in this area, which never really went anywhere. It's not like there are a bunch of GUI and autotuning tools that people are dying to use or developers are dying to create, but couldn't because editing configuration files programmatically is hard. It might be a chicken-and-egg problem. Maybe it's hard and futile to write such config tools precisely because postgresql.conf doesn't support the required features. I don't know how to measure the "interest of people", but I have seen many comments on this mailing list about features like this. IMHO it would be a great addition :) Let's also not forget the two main use cases (arguably) of the configuration files: hand editing, and generation by configuration management tools. Anything that makes these two harder is not going to be well-received. 100% agreed :) That's why I suggested that the format of the config file should adhere to requisites a) to e) mentioned in my original email (http://www.postgresql.org/message-id/529b8d01.6060...@nosys.es). Would a new file format be well received if it kept both hand editing and generation of the configuration simple, and at the same time offered the features I have mentioned? Thanks for your comments, aht -- Álvaro Hernández Tortosa --- NOSYS Networked Open SYStems -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
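As a purely illustrative sketch of what such a self-describing entry could look like (the attribute names here are assumptions for illustration, not part of the formal proposal):

shared_buffers = 2GB {
    type: memory;
    context: postmaster;
    min: 128kB;
    comment: "raised from the default for the reporting workload";
}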
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On Tue, Dec 3, 2013 at 5:57 PM, Tom Dunstan wrote: > On 4 December 2013 01:24, Robert Haas wrote: >> Yeah, more or less, but the key is ensuring that it wouldn't let you >> create the constraint in the first place if the partial index >> specified *didn't* match the WHERE clause. For example, suppose the >> partial index says WHERE parent_entity = 'event' but the constraint >> definition is WHERE parent_event = 'somethingelse'. That ought to >> fail, just as creating a regular foreign constraint will fail if >> there's no matching unique index. > > The where clause only applies to queries against the FK table, and we > don’t currently fail if there isn’t a matching index on the fk column > when creating a FK (I’ve been bitten by that before). > > We fail if there isn’t a unique index on the referenced > table/column(s), but queries against that table on insert/update not > the FK table are unchanged (save that we don’t bother with them at all > if the where clause expression fails for the given tuple). Oh. I misinterpreted what this feature was about, then. I thought it was about restricting the reference to a subset of the *referenced* table, but it seems to be about restricting the constraint to a subset of the *referencing* table. I guess they're both useful, but the syntax... REFERENCES tab(col) WHERE (stuff) ...sure looks like the WHERE clause is syntactically associated with the table being referenced. What would we do if we eventually wanted to support both variants? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu wrote: > Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple > (#6) queries: Although there is no need for I/O, "wa" fluctuates between 0 > and 1. >
> procs -----------memory---------- ---swap-- -----io---- --system-- ------cpu-----
>  r  b   swpd     free  buff    cache   si   so    bi    bo    in     cs us sy  id wa st
>  0  0      0 30093568 84892 38723896    0    0     0     0    22     14  0  0 100  0  0
>  8  1      0 30043056 84892 38723896    0    0     0     0 27080  52708 16 14  70  0  0
>  8  1      0 30006600 84892 38723896    0    0     0     0 44952 118286 43 44  12  1  0
>  8  0      0 29986264 84900 38723896    0    0     0    20 28043  95934 49 42   8  1  0
>  7  0      0 29991976 84900 38723896    0    0     0     0  8308  73641 52 42   6  0  0
>  0  0      0 30091828 84900 38723896    0    0     0     0  3996  30978 23 24  53  0  0
>  0  0      0 30091968 84900 38723896    0    0     0     0    17     23  0  0 100  0  0

Notice the huge %sy. What kind of VM are you using? HVM or paravirtual? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 14:27:10 -0200, Claudio Freire wrote: > On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu wrote: > > Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple > > (#6) queries: Although there is no need for I/O, "wa" fluctuates between 0 > > and 1. > > > > [vmstat output trimmed; see upthread] > > Notice the huge %sy My bet is on transparent hugepage defragmentation. Alternatively it's scheduler overhead, due to superfluous context switches around the buffer mapping locks. I'd strongly suggest doing a "perf record -g -a ; perf report" run to check what's eating up the time. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
>Notice the huge %sy >What kind of VM are you using? HVM or paravirtual? This instance is paravirtual.
Re: [HACKERS] Parallel Select query performance and shared buffers
> I'd strongly suggest doing a "perf record -g -a ; > perf report" run to check what's eating up the time. Here is one example: + 38.87% swapper [kernel.kallsyms] [k] hypercall_page + 9.32% postgres [kernel.kallsyms] [k] hypercall_page + 6.80% postgres [kernel.kallsyms] [k] xen_set_pte_at + 5.83% postgres [kernel.kallsyms] [k] copy_user_generic_string + 2.06% postgres [kernel.kallsyms] [k] file_read_actor + 1.89% postgres postgres [.] heapgettup_pagemode + 1.83% postgres postgres [.] hash_search_with_hash_value + 1.33% postgres [kernel.kallsyms] [k] get_phys_to_machine + 1.25% postgres [kernel.kallsyms] [k] find_get_page + 1.00% postgres postgres [.] heapgetpage + 0.99% postgres [kernel.kallsyms] [k] radix_tree_lookup_element + 0.98% postgres postgres [.] advance_aggregates + 0.96% postgres postgres [.] ExecProject + 0.94% postgres postgres [.] advance_transition_function + 0.88% postgres postgres [.] ExecScan + 0.87% postgres postgres [.] HeapTupleSatisfiesMVCC + 0.86% postgres postgres [.] LWLockAcquire + 0.82% postgres [kernel.kallsyms] [k] put_page + 0.82% postgres postgres [.] MemoryContextReset + 0.80% postgres postgres [.] SeqNext + 0.78% postgres [kernel.kallsyms] [k] pte_mfn_to_pfn + 0.69% postgres postgres [.] ExecClearTuple + 0.57% postgres postgres [.] ExecProcNode + 0.54% postgres postgres [.] heap_getnext + 0.53% postgres postgres [.] LWLockRelease + 0.53% postgres postgres [.] ExecStoreTuple + 0.51% postgres libc-2.12.so [.] __GI___libc_read + 0.42% postgres [kernel.kallsyms] [k] xen_spin_lock + 0.40% postgres postgres [.] ReadBuffer_common + 0.38% postgres [kernel.kallsyms] [k] __do_fault + 0.37% postgres [kernel.kallsyms] [k] shmem_fault + 0.37% postgres [kernel.kallsyms] [k] unmap_single_vma + 0.35% postgres [kernel.kallsyms] [k] __wake_up_bit + 0.33% postgres postgres [.] StrategyGetBuffer + 0.33% postgres [kernel.kallsyms] [k] set_page_dirty + 0.33% postgres [kernel.kallsyms] [k] handle_pte_fault + 0.33% postgres postgres [.] ExecAgg + 0.31% postgres postgres [.] XidInMVCCSnapshot + 0.31% postgres [kernel.kallsyms] [k] __audit_syscall_entry + 0.31% postgres postgres [.] CheckForSerializableConflictOut + 0.29% postgres [kernel.kallsyms] [k] handle_mm_fault + 0.25% postgres [kernel.kallsyms] [k] shmem_getpage_gfp On Wed, Dec 4, 2013 at 6:33 PM, Andres Freund wrote: > On 2013-12-04 14:27:10 -0200, Claudio Freire wrote: > > On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu wrote: > > > Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple > > > (#6) queries: Although there is no need for I/O, "wa" fluctuates between 0 > > > and 1. > > > > > > [vmstat output trimmed; see upthread] > > > > Notice the huge %sy > > My bet is on transparent hugepage defragmentation. Alternatively it's > scheduler overhead, due to superfluous context switches around the buffer > mapping locks. > > I'd strongly suggest doing a "perf record -g -a ; > perf report" run to check what's eating up the time. 
> > Greetings, > > Andres Freund > > -- > Andres Freund http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On 12/04/2013 11:25 AM, Robert Haas wrote: On Tue, Dec 3, 2013 at 5:57 PM, Tom Dunstan wrote: On 4 December 2013 01:24, Robert Haas wrote: Yeah, more or less, but the key is ensuring that it wouldn't let you create the constraint in the first place if the partial index specified *didn't* match the WHERE clause. For example, suppose the partial index says WHERE parent_entity = 'event' but the constraint definition is WHERE parent_event = 'somethingelse'. That ought to fail, just as creating a regular foreign constraint will fail if there's no matching unique index. The where clause only applies to queries against the FK table, and we don’t currently fail if there isn’t a matching index on the fk column when creating a FK (I’ve been bitten by that before). We fail if there isn’t a unique index on the referenced table/column(s), but queries against that table on insert/update not the FK table are unchanged (save that we don’t bother with them at all if the where clause expression fails for the given tuple). Oh. I misinterpreted what this feature was about, then. I thought it was about restricting the reference to a subset of the *referenced* table, but it seems to be about restricting the constraint to a subset of the *referencing* table. I guess they're both useful, but the syntax... REFERENCES tab(col) WHERE (stuff) ...sure looks like the WHERE clause is syntactically associated with the table being referenced. What would we do if we eventually wanted to support both variants? Well I guess we could say something like: FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE (b-condition) But it's somewhat ugly. The case of restricting the allowed referent rows does look slightly like a solution in search of a problem, but maybe that's just because I haven't thought of a use for it yet. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 18:43:35 +0200, Metin Doslu wrote: > > I'd strongly suggest doing a "perf record -g -a ; > > perf report" run to check what's eating up the time. > > Here is one example: > > + 38.87% swapper [kernel.kallsyms] [k] hypercall_page > + 9.32% postgres [kernel.kallsyms] [k] hypercall_page > + 6.80% postgres [kernel.kallsyms] [k] xen_set_pte_at All that time is spent in your virtualization solution. One thing to try is to look on the host system, sometimes profiles there can be more meaningful. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On Wed, Dec 4, 2013 at 11:44 AM, Andrew Dunstan wrote: >> Oh. I misinterpreted what this feature was about, then. I thought it >> was about restricting the reference to a subset of the *referenced* >> table, but it seems to be about restricting the constraint to a subset >> of the *referencing* table. I guess they're both useful, but the >> syntax... >> >> REFERENCES tab(col) WHERE (stuff) >> >> ...sure looks like the WHERE clause is syntactically associated with >> the table being referenced. What would we do if we eventually wanted >> to support both variants? >> > > Well I guess we could say something like: > >FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE >(b-condition) > > But it's somewhat ugly. > > The case of restricting the allowed referent rows does look slightly like a > solution in search of a problem, but maybe that's just because I haven't > thought of a use for it yet. Interestingly, the variant for which you can't think of a use case is the one I've missed most. Typical examples in my experience are things like project.project_manager_id references person (id) where person.is_project_manager, or alert (device_id) references device (id) where not device.deleted. Different strokes for different folks, I guess. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
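To spell the two variants out in the syntax under discussion (proposed syntax only -- neither form exists in any released PostgreSQL):

-- Variant in Tom Dunstan's patch: constrain the *referencing* rows;
-- the constraint is checked only for comments whose parent is an event.
CREATE TABLE comment (
    id            serial PRIMARY KEY,
    parent_entity text NOT NULL,
    parent_id     int NOT NULL,
    FOREIGN KEY (parent_id) REFERENCES event (id)
        WHERE (parent_entity = 'event')
);

-- Variant described above: constrain the *referenced* rows; every
-- reference must point at a person who is a project manager.
CREATE TABLE project (
    id                 serial PRIMARY KEY,
    project_manager_id int REFERENCES person (id)
        WHERE (person.is_project_manager)
);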
Re: [HACKERS] Time-Delayed Standbys
src/backend/access/transam/xlog.c:5889: trailing whitespace. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On 12/04/2013 12:00 PM, Robert Haas wrote: On Wed, Dec 4, 2013 at 11:44 AM, Andrew Dunstan wrote: Oh. I misinterpreted what this feature was about, then. I thought it was about restricting the reference to a subset of the *referenced* table, but it seems to be about restricting the constraint to a subset of the *referencing* table. I guess they're both useful, but the syntax... REFERENCES tab(col) WHERE (stuff) ...sure looks like the WHERE clause is syntactically associated with the table being referenced. What would we do if we eventually wanted to support both variants? Well I guess we could say something like: FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE (b-condition) But it's somewhat ugly. The case of restricting the allowed referent rows does look slightly like a solution in search of a problem, but maybe that's just because I haven't thought of a use for it yet. Interestingly, the variant for which you can't think of a use case is the one I've missed most. Typical examples in my experience are things like project.project_manager_id references person (id) where person.is_project_manager, or alert (device_id) references device (id) where not device.deleted. OK, those make sense. I wonder whether this should be done via a USING clause on the constraint that pointed to the partial unique index. Or would that be too obscure? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgsql: Fix a couple of bugs in MultiXactId freezing
On Tue, Dec 3, 2013 at 7:20 PM, Tom Lane wrote: > Magnus Hagander writes: > > On Tue, Dec 3, 2013 at 7:11 PM, Tom Lane wrote: > >> Maybe we should just bite the bullet and change the WAL format for > >> heap_freeze (inventing an all-new record type, not repurposing the old > >> one, and allowing WAL replay to continue to accept the old one). The > >> implication for users would be that they'd have to update slave servers > >> before the master when installing the update; which is unpleasant, but > >> better than living with a known data corruption case. > > > Agreed. It may suck, but it sucks less. > > > How badly will it break if they do the upgrade in the wrong order though. > > Will the slaves just stop (I assume this?) or is there a risk of a > > wrong-order upgrade causing extra breakage? > > I assume what would happen is the slave would PANIC upon seeing a WAL > record code it didn't recognize. Installing the updated version should > allow it to resume functioning. Would be good to test this, but if it > doesn't work like that, that'd be another bug to fix IMO. We've always > foreseen the possible need to do something like this, so it ought to > work reasonably cleanly. > > I wonder if, for the future, we should have the START_REPLICATION command (or IDENTIFY_SYSTEM, which would probably make more sense - or even a new command like IDENTIFY_CLIENT; the point is, something in the replication protocol) have the walreceiver send its version to the master. That way we could have the walsender identify a walreceiver that's too old and disconnect it right away - with a much nicer error message than a PANIC. Right now, the walreceiver knows the version of the walsender (through PQserverVersion()), but AFAICT there is no way for the walsender to know which version of the receiver is connected. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Extension Templates S03E11
On Wed, 2013-12-04 at 09:50 -0500, Stephen Frost wrote: > > I still don't see that Extension Templates are all bad: > > * They preserve the fact that two instances of the same extension > > (e.g. in different databases) were created from the same template. > > This is only true if we change the extension templates to be shared > catalogs, which they aren't today.. I agree with you about that -- I don't like per-DB templates. I guess the challenge is that we might want to use namespaces to support user-installable extensions, and namespaces reside within a DB. But I think we can find some other solution there (e.g. user names rather than schemas), and per-DB templates are just not a good solution anyway. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 04:33 PM, Jonathan Corbet wrote: > On Tue, 03 Dec 2013 10:44:15 -0800 > Josh Berkus wrote: > >> It seems clear that Kernel.org, since 2.6, has been in the business of >> pushing major, hackish, changes to the IO stack without testing them or >> even thinking too hard about what the side-effects might be. This is >> perhaps unsurprising given that two of the largest sponsors of the >> Kernel -- who, incidentally, do 100% of the performance testing -- don't >> use the IO stack. >> >> This says to me that Linux will clearly be an undependable platform in >> the future with the potential to destroy PostgreSQL performance without >> warning, leaving us scrambling for workarounds. Too bad the >> alternatives are so unpopular. > > Wow, Josh, I'm surprised to hear this from you. > > The active/inactive list mechanism works great for the vast majority of > users. The second-use algorithm prevents a lot of pathological behavior, > like wiping out your entire cache by copying a big file or running a > backup. We *need* that kind of logic in the kernel. > > Now, back in 2012, Johannes (working for one of those big contributors) > hit upon an issue where second-use falls down. So he set out to fix it: > > https://lwn.net/Articles/495543/ > > This code has been a bit slow getting into the mainline for a few reasons, > but one of the chief ones is this: nobody is saying from the sidelines > that they need it! If somebody were saying "Postgres would work a lot > better with this code in place" and had some numbers to demonstrate that, > we'd be far more likely to see it get into an upcoming release. > > In the end, Linux is quite responsive to the people who participate in its > development, even as testers and bug reporters. It responds rather less > well to people who find problems in enterprise kernels years later, > granted. > > The amount of automated testing, including performance testing, has > increased markedly in the last couple of years. I bet that it would not > be hard at all to get somebody like Fengguang Wu to add some > Postgres-oriented I/O tests to his automatic suite: > > https://lwn.net/Articles/571991/ > > Then we would all have a much better idea of how kernel releases are > affecting one of our most important applications; developers would pay > attention to that information. Hmm, interesting tool - I can see how that would be very useful for "early warning" style detection on the kernel development side, using a small set of postgresql "benchmarks". That would basically help with part of what Josh complained about - that it takes ages for regressions to be detected. From PostgreSQL's POV we would also need additional long-term and more complex testing, spanning different postgresql versions on various distribution platforms (because that is what people deploy in production; hand-built git-fetched kernels are rare), using tests that might have extended runtimes and/or require external infrastructure. > > Or you could go off and do your own thing, but I believe that would leave > us all poorer. fully agreed Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
On Tue, 2013-12-03 at 14:31 -0500, Tom Lane wrote: > Stephen Frost writes: > > When it comes to dump/reload, I'd much rather see a mechanism which uses > > our deep understanding of the extension's objects (as database objects) > > to implement the dump/reload than a text blob which is carried forward > > from major version to major version and may even fail to run. > > Note that we're already doing that in the binary_upgrade code path. > I agree that generalizing that approach sounds like a better idea > than keeping a text blob around. The reason for doing it that way in pg_upgrade was to preserve OIDs for types, etc.: http://www.postgresql.org/message-id/20783.1297184...@sss.pgh.pa.us That doesn't seem to apply to ordinary dump/reload. Do you think it's good for other reasons, as well? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On Wed, Dec 4, 2013 at 12:18 PM, Andrew Dunstan wrote: >> Interestingly, the variant for which you can't think of a use case is >> the one I've missed most. Typical examples in my experience are >> things like project.project_manager_id references person (id) where >> person.is_project_manager, or alert (device_id) references device (id) >> where not device.deleted. >> > OK, those make sense. I wonder whether this should be done via a USING > clause on the constraint that pointed to the partial unique index. Or would > that be too obscure? I wondered that, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changes in Trigger Firing
> > > > > > CreateTrigStmt is passed to the CreateTrigger function as an argument. I am > > struggling to understand how the values for the various members of the trigger are > > set and where [which file] CreateTrigStmt is created. > > > > > > Can someone provide some help on this? > > I think you need better tools to guide you in exploring the source code. > For example, you can use cscope to tell you where CreateTrigStmt is > used, and you would find gram.y; and use it to tell you where > CreateTrigger is used, and you would find utility.c. > > Thanks for the advice. I was relying on the PostgreSQL documentation, which has been quite helpful so far. Let me try some development tools. > Any half-decent code editor should be able to generate a "database" of > symbols and let you frolic around the various files quickly. Without > that, anyone would be completely lost in developing new features of even > the lowest complexity. > Got the point!
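For example, from the top of a source checkout, either of these locates the relevant spots quickly (a minimal illustration using standard tools, nothing PostgreSQL-specific):

$ git grep -n "CreateTrigStmt" -- src/ | head
$ cscope -b -R                    # build the cross-reference database once
$ cscope -d -L -0 CreateTrigger   # then list references to the symbol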
Re: [HACKERS] Parallel Select query performance and shared buffers
On Wed, Dec 4, 2013 at 1:54 PM, Andres Freund wrote: > On 2013-12-04 18:43:35 +0200, Metin Doslu wrote: >> > I'd strongly suggest doing a "perf record -g -a ; >> > perf report" run to check what's eating up the time. >> >> Here is one example: >> >> + 38.87% swapper [kernel.kallsyms] [k] hypercall_page >> + 9.32% postgres [kernel.kallsyms] [k] hypercall_page >> + 6.80% postgres [kernel.kallsyms] [k] xen_set_pte_at > > All that time is spent in your virtualization solution. One thing to try > is to look on the host system, sometimes profiles there can be more > meaningful. You cannot profile the host on EC2. You could try HVM. I've noticed it fares better under heavy CPU load, and it's not fully HVM (it still uses paravirtualized network and I/O). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
> You could try HVM. I've noticed it fares better under heavy CPU load, > and it's not fully HVM (it still uses paravirtualized network and > I/O). I already tried HVM (a cc2.8xlarge instance on Amazon EC2) and observed the same problem.
Re: [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 16:00:40 -0200, Claudio Freire wrote: > On Wed, Dec 4, 2013 at 1:54 PM, Andres Freund wrote: > > All that time is spent in your virtualization solution. One thing to try > > is to look on the host system, sometimes profiles there can be more > > meaningful. > > You cannot profile the host on EC2. Didn't follow the thread from the start. So, this is EC2? Have you checked, with a recent enough version of top or whatever, how much time is reported as "stolen"? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
> Didn't follow the thread from the start. So, this is EC2? Have you > checked, with a recent enough version of top or whatever, how much time > is reported as "stolen"? Yes, this is EC2. "stolen" is occasionally reported as 1, but mostly as 0.
Re: [HACKERS] Parallel Select query performance and shared buffers
Here is some extra information: - When we increased NUM_BUFFER_PARTITIONS to 1024, this problem disappeared on 8-core machines but came back on 16-core machines on Amazon EC2. Could it be related to the PostgreSQL locking mechanism? - I tried this test on 4-core machines, including my personal computer and some other instances on Amazon EC2, and I didn't see the problem there. I start to see it in PostgreSQL when the core count is 8 or more. - Here are the results of "vmstat 1" while running 8 parallel select count(*) queries. Normally I would expect zero idle time.

procs -----------memory---------- ---swap-- -----io---- --system-- ------cpu-----
 r  b   swpd     free  buff    cache   si   so    bi    bo    in     cs us sy  id wa st
 0  0      0 29838640 94000 38954740    0    0     0     0    22     21  0  0 100  0  0
 7  2      0 29788416 94000 38954740    0    0     0     0 53922 108490 14 24  60  1  1
 5  0      0 29747248 94000 38954740    0    0     0     0 68008 164571 22 48  27  2  1
 8  0      0 29725796 94000 38954740    0    0     0     0 43587 150574 28 54  16  1  1
 0  0      0 29838328 94000 38954740    0    0     0     0 15584 100459 26 55  18  1  0
 0  0      0 29838328 94000 38954740    0    0     0     0    42     15  0  0 100  0  0

- When I run 8 parallel wc commands or other scripts, they scale out as expected and utilize all CPUs. This leads me to think that the problem is related to PostgreSQL rather than the OS.
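(For reference, the constant mentioned above is a compile-time setting; as of 9.3 it lives in src/include/storage/lwlock.h, so raising it means rebuilding the server:)

/* src/include/storage/lwlock.h */
/* Number of partitions of the shared buffer mapping hashtable */
#define NUM_BUFFER_PARTITIONS  16   /* stock value; the experiment above
                                     * used a rebuild with 1024 */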
Re: [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 20:19:55 +0200, Metin Doslu wrote: > - When we increased NUM_BUFFER_PARTITIONS to 1024, this problem is > disappeared for 8 core machines and come back with 16 core machines on > Amazon EC2. Would it be related with PostgreSQL locking mechanism? You could try my lwlock-scalability improvement patches - for some workloads here, the improvements have been rather noticeable. Which version are you testing? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Select query performance and shared buffers
> You could try my lwlock-scalability improvement patches - for some > workloads here, the improvements have been rather noticeable. Which > version are you testing? I'm testing with PostgreSQL 9.3.1.
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 07:32 AM, Stefan Kaltenbrunner wrote: On 12/04/2013 04:30 PM, Peter Eisentraut wrote: On 12/4/13, 2:14 AM, Stefan Kaltenbrunner wrote: running a few kvm instances that get bootstrapped automatically is something that is a solved problem. Is it sound to run performance tests on kvm? as sounds as on any other platform imho, the performance characteristics will differ between bare metal or other virtualisation platforms but the future is virtual and that is what a lot of stuff runs on... In actuality you need both. We need to know what the kernel is going to do on bare metal. For example, 3.2 to 3.8 are total crap for random IO access. We will only catch that properly from bare metal tests or at least, we will only catch it easily on bare metal tests. If we know the standard bare metal tests are working then the next step up would be to test virtual. BTW: Virtualization is only one future and it is still a long way off from serving the needs that bare metal serves at the same level (speaking PostgreSQL specifically). JD Stefan -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 07:30 PM, Joshua D. Drake wrote: > > On 12/04/2013 07:32 AM, Stefan Kaltenbrunner wrote: >> >> On 12/04/2013 04:30 PM, Peter Eisentraut wrote: >>> On 12/4/13, 2:14 AM, Stefan Kaltenbrunner wrote: running a few kvm instances that get bootstrapped automatically is something that is a solved problem. >>> >>> Is it sound to run performance tests on kvm? >> >> As sound as on any other platform, imho. The performance characteristics >> will differ between bare metal and other virtualisation platforms, but the >> future is virtual, and that is what a lot of stuff runs on... > > In actuality you need both. We need to know what the kernel is going to > do on bare metal. For example, 3.2 to 3.8 are total crap for random IO > access. We will only catch that properly from bare metal tests or at > least, we will only catch it easily on bare metal tests. > > If we know the standard bare metal tests are working then the next step > up would be to test virtual. > > BTW: Virtualization is only one future and it is still a long way off > from serving the needs that bare metal serves at the same level > (speaking PostgreSQL specifically). We need to get that off the ground - and whatever makes it easier to get off the ground will help. And if we solve the automation for virtualisation, bare metal is just a small step away (or the other way round). Getting comparable performance levels across different postgresql versions (or patches), or across different operating systems with various workloads, is probably more valuable right now than getting absolute peak performance levels under specific tests long term. Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status of FDW pushdowns
On Mon, Dec 2, 2013 at 10:26 PM, David Fetter wrote: > On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote: >> On 11/28/2013 03:24 AM, David Fetter wrote: >> > WITH, or SRF, or whatever, the point is that we need to be able to >> > specify what we're sending--probably single opaque strings delimited >> > just as we do other strings--and what we might get back--errors only, >> > rows, [sets of] refcursors are the ones I can think of offhand. >> >> So, you're thinking of something like: >> >> WITH FOREIGN somecte AS $$... foreign query ...$$ >> SELECT ... >> FROM somecte; > > I was picturing something a little more like an SRF which would take > one opaque string, the remote command, some descriptor, perhaps an > enum, of what if anything might come back. Long ago, I implemented a > similar thing in DBI-Link. It was called > > remote_exec_dbh(data_source_id integer, query text, returns_rows bool) A couple of thoughts: *) Any 'pass through' API should support parameterization (the FDW may not support it, but many will, and the API should allow for it). Lack of parameterization is a major downside of dblink. The function could be set up to be variadic for the parameters. *) For connectivity APIs of this style, the dblink-ish mechanic of separating command execution from data-returning commands is likely the right way to go. Also, it's probably better to stick with SRF mechanics if we go the 'function route'. So basically we would be making dblink for FDW, adding parameterization and some concept of utilizing the foreign server. All this assumes we are adding a special remote execution function ('fdwlink'). While that would be great, it's a significant deviation from the standard into PostgreSQL-specific SRF syntax. If some of the qual pushdown deparsing functionality could be put inside the internal FDW API, then you'd get the best of both worlds. Maybe you'd still want a dblink-style extension anyway, but it wouldn't be as critical. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
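To illustrate (everything here is hypothetical -- the function name and signature are assumptions sketching what a parameterized, FDW-aware dblink could look like, not an existing API):

-- A dblink-style passthrough that targets a foreign server and takes
-- parameters variadically, as suggested above.
CREATE FUNCTION fdwlink(server_name text,
                        query text,
                        VARIADIC params "any")
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'fdwlink'
LANGUAGE C;

-- Usage would mirror dblink, with $n-style parameter references:
SELECT *
  FROM fdwlink('my_pg_server',
               'SELECT id, name FROM emp WHERE dept = $1', 42)
       AS t(id int, name text);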
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 12/4/13, 11:22 AM, Álvaro Hernández Tortosa wrote: > Would a new file format be well received if it kept both hand editing > and generation of the configuration simple, and at the same time > offered the features I have mentioned? I don't see how that would work exactly: You want to add various kinds of complex metadata to the configuration file, but make that metadata optional at the same time. The immediate result will be that almost no one will supply the optional metadata, and no tools will be able to rely on its presence. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC: programmable file format for postgresql.conf
On 04/12/13 19:49, Peter Eisentraut wrote: On 12/4/13, 11:22 AM, Álvaro Hernández Tortosa wrote: Would a new file format be well received if it kept both hand editing and generation of the configuration simple, and at the same time offered the features I have mentioned? I don't see how that would work exactly: You want to add various kinds of complex metadata to the configuration file, but make that metadata optional at the same time. The immediate result will be that almost no one will supply the optional metadata, and no tools will be able to rely on its presence. I wouldn't say the metadata is "complex". It looks quite similar to what pg_settings exposes (and besides, it was just brainstorming, not a formal proposal). The optional fields are basically nullable attributes, as in pg_settings. That is, they only make sense depending on other values (in this case, the parameter name). All of the attributes that are required for tools to work are marked as non-optional. So optional fields are either purely optional (i.e., only for tools that want to use them; everyone else may ignore, but preserve, them) or just nullable, depending on the parameter. In any case, my idea is just to open up the question and search for the best possible set of data to be represented, and then the best possible syntax / file format for it. aht -- Álvaro Hernández Tortosa --- NOSYS Networked Open SYStems -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 07:33 AM, Jonathan Corbet wrote: > Wow, Josh, I'm surprised to hear this from you. Well, I figured it was too angry to propose for an LWN article. ;-) > The active/inactive list mechanism works great for the vast majority of > users. The second-use algorithm prevents a lot of pathological behavior, > like wiping out your entire cache by copying a big file or running a > backup. We *need* that kind of logic in the kernel. There's a large body of research on 2Q algorithms going back to the 80s, which is what this is. As far as I can tell, the modification was performed without any reading of this research, since that would have easily shown that 50/50 was unlikely to be a good division, and that in fact there is nothing which would work except a tunable setting, because workloads are different. Certainly the "what happens if a single file is larger than the entire recency bucket" question is addressed and debated. As an example, PostgreSQL would want to shrink the frequency list to 0%, because we already implement our own frequency list, and we already demonstrated back in version 8.1 that a 3-list system was ineffective. I can save Johannes some time: don't implement ARC. Not only is it under IBM patent, it's not effective in real-world situations. Both Postgres and Apache tried it in the early aughts. However, this particular issue concerns me less than the general attitude that it's OK to push experimental IO changes which can't be disabled by users into release kernels, as exemplified by several problematic and inadequately tested IO changes in the 3.X kernels -- most notably the pdflush bug. It speaks of a policy that the Linux IO stack is not production software, and it's OK to tinker with it in ways that break things for many users. I also wasn't exaggerating the reception I got when I tried to talk about IO and PostgreSQL at LinuxCon and other events. The majority of Linux hackers I've talked to simply don't want to be bothered with PostgreSQL's performance needs, and I've heard similar things from my colleagues at the MySQL variants. Greg KH was the only real exception. Heck, I went to a meeting of filesystem geeks at LinuxCon and the main feedback I received, from Linux FS developers (Chris and Ted), was "PostgreSQL should implement its own storage and use DirectIO, we don't know why you're even trying to use the Linux IO stack." That's why I gave up on working through community channels; I face enough uphill battles in *this* project. > This code has been a bit slow getting into the mainline for a few reasons, > but one of the chief ones is this: nobody is saying from the sidelines > that they need it! If somebody were saying "Postgres would work a lot > better with this code in place" and had some numbers to demonstrate that, > we'd be far more likely to see it get into an upcoming release. Well, Citus did that; do you need more evidence? > In the end, Linux is quite responsive to the people who participate in its > development, even as testers and bug reporters. It responds rather less > well to people who find problems in enterprise kernels years later, > granted. All infrastructure software, including Postgres, has the issue that most enterprise users are using a version which was released years ago. As a result, some performance issues simply aren't going to be found until that version has been out for a couple of years. 
This leads to a Catch-22: enterprise users are reluctant to upgrade because of potential performance regressions, and as a result the median "enterprise" version gets further and further behind current development, and as a result the performance regressions are never fixed. We encounter this in PostgreSQL (I have customers who are still on 8.4 or 9.1 because of specific regressions), and it's even worse in the Linux world, where RHEL is still on 2.6. We work really hard to avoid performance regressions in Postgres versions, because we know we can't test for them adequately, and often can't fix them in release versions after the fact. But you know what? 2.6, overall, still performs better than any kernel in the 3.X series, at least for Postgres. > The amount of automated testing, including performance testing, has > increased markedly in the last couple of years. I bet that it would not > be hard at all to get somebody like Fengguang Wu to add some > Postgres-oriented I/O tests to his automatic suite: > > https://lwn.net/Articles/571991/ > > Then we would all have a much better idea of how kernel releases are > affecting one of our most important applications; developers would pay > attention to that information. Oh, good! I was working with Greg on having an automated pgBench run, but doing it on Wu's testing platform would be even better. I still need to get some automated stats digestion, since I want to at least make sure that the tests would show the three major issues which we encountered in recent Linux kernels
Re: [HACKERS] WITHIN GROUP patch
> "Tom" == Tom Lane writes: Tom> Well, okay, but you've not said anything that wouldn't be Tom> handled just as well by some logic that adds a fixed Tom> integer-constant-zero flag column to the rows going into the Tom> tuplesort. Adding such a column unconditionally even for non-hypothetical functions would break the optimization for sorting a single column (which is a big deal, something like 3x speed difference, for by-value types). Adding it only for hypothetical set functions is making a distinction in how functions are executed that I don't think is warranted - imagine for example a function that calculates some measure over a frequency distribution by adding a known set of boundary values to the sort; this would not be a hypothetical set function in terms of argument processing, but it would still benefit from the extra sort column. I did not want to unnecessarily restrict such possibilities. >> It would still be overloaded in some sense because a non-hypothetical >> ordered set function could still take an arbitrary number of args >> (using variadic "any") - there aren't any provided, but there's no >> good reason to disallow user-defined functions doing that - so you'd >> still need a special value like -1 for aggordnargs to handle that. Tom> Sure. But a -1 to indicate "not applicable" doesn't seem like it's Tom> too much of a stretch. It's the -2 business that's bothering me. Tom> Again, that seems unnecessarily non-orthogonal --- who's to say which Tom> functions would want to constrain the number of direct arguments and Tom> which wouldn't? (I wonder whether having this info in the catalogs Tom> isn't the wrong thing anyhow, as opposed to expecting the functions Tom> themselves to check the argument count at runtime.) Not checking the number of arguments to a function until runtime seems a bit on the perverse side. Having a fixed number of direct args is the "normal" case (as seen from the fact that all the non-hypothetical ordered set functions in the spec and in our patch have fixed argument counts). -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
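(For readers not following the patch, the distinction discussed above in SQL terms -- percentile_disc is an ordered-set aggregate proper, rank a hypothetical-set one; both are among the functions the patch provides:)

-- Ordered-set: the direct argument (0.5) is a fraction, not a row value.
SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY salary) FROM emp;

-- Hypothetical-set: what rank *would* a salary of 42000 have if added
-- to the sorted input? This is why the executor needs a way to tell
-- the hypothetical row apart from the real ones (the flag column).
SELECT rank(42000) WITHIN GROUP (ORDER BY salary) FROM emp;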
Re: [HACKERS] Status of FDW pushdowns
On Wed, Dec 04, 2013 at 12:43:44PM -0600, Merlin Moncure wrote: > On Mon, Dec 2, 2013 at 10:26 PM, David Fetter wrote: > > On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote: > >> On 11/28/2013 03:24 AM, David Fetter wrote: > >> > WITH, or SRF, or whatever, the point is that we need to be able to > >> > specify what we're sending--probably single opaque strings delimited > >> > just as we do other strings--and what we might get back--errors only, > >> > rows, [sets of] refcursors are the ones I can think of offhand. > >> > >> So, you're thinking of something like: > >> > >> WITH FOREIGN somecte AS $$... foreign query ...$$ > >> SELECT ... > >> FROM somecte; > > > > I was picturing something a little more like an SRF which would take > > one opaque string, the remote command, some descriptor, perhaps an > > enum, of what if anything might come back. Long ago, I implemented a > > similar thing in DBI-Link. It was called > > > > remote_exec_dbh(data_source_id integer, query text, returns_rows bool) > > Couple thoughts: > *) Any 'pass through' API should support parameterization (the FDW may > not support that, but many will and API should allow for it). Lack > of parameterization is a major downside of dblink. The function could > be set up to be variadic for the parameters. I don't know for sure that that needs to be in version 1 of this. It definitely shouldn't block implementing the non-parameterized one. > *) For a connectivity APIs of this style, Dblink-ish mechanic of > separating command execution from data returning commands is likely > the right way to go. Also, probably better to stick with SRF > mechanics if we go the 'function route'. So basically we are making > dblink for FDW, adding parameterization and some concept of utilizing > the foreign server. Yes, modulo the above. > All this is assuming we are adding a special remote execution function > ('fdwlink'). While that would be great, it's a significant deviation > from the standard into postgresql specific SRF syntax. What the standard has is literally insane. > If some of the qual pushdown deparsing functionality could be put > inside the internal FDW API, then you'd get the best of both worlds. If this were flawless on the PostgreSQL side (i.e. our path generator understood everything perfectly including aggregates) and trivial to implement correctly in FDWs, certainly. The idea here is that such a happy situation will not obtain until much later, if ever, and meanwhile, we need a way to get things accomplished even if it's inelegant, inefficient, etc. The alternative is that those things simply will not get accomplished at all. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposed feature: Selective Foreign Keys
Andrew Dunstan writes: >>> Well I guess we could say something like: >>> >>> FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE >>> (b-condition) >>> >>> But it's somewhat ugly. > OK, those make sense. I wonder whether this should be done via a USING > clause on the constraint that pointed to the partial unique index. Or > would that be too obscure? I like what you have above. Yeah, it requires the more verbose syntax for declaring a foreign key, but this feature is not going to be so heavily used that anyone will be in danger of worsening their carpal tunnel syndrome. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
On 12/04/2013 07:33 AM, Jonathan Corbet wrote: Wow, Josh, I'm surprised to hear this from you. The active/inactive list mechanism works great for the vast majority of users. The second-use algorithm prevents a lot of pathological behavior, like wiping out your entire cache by copying a big file or running a backup. We *need* that kind of logic in the kernel. The amount of automated testing, including performance testing, has increased markedly in the last couple of years. I bet that it would not be hard at all to get somebody like Fengguang Wu to add some Postgres-oriented I/O tests to his automatic suite: https://lwn.net/Articles/571991/ Then we would all have a much better idea of how kernel releases are affecting one of our most important applications; developers would pay attention to that information. Or you could go off and do your own thing, but I believe that would leave us all poorer. Thank you for your very well-thought-out and knowledgeable response. This is certainly helpful and highlights what a lot of us were already stating. Sincerely, Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgsql: Fix a couple of bugs in MultiXactId freezing
Magnus Hagander writes: > On Tue, Dec 3, 2013 at 7:20 PM, Tom Lane wrote: >> I assume what would happen is the slave would PANIC upon seeing a WAL >> record code it didn't recognize. > I wonder if we should for the future have the START_REPLICATION command (or > the IDENTIFY_SYSTEM would probably make more sense - or even adding a new > command like IDENTIFY_CLIENT. The point is, something in the replication > protocol) have walreceiver include its version when connecting to the master. That > way we could have the walsender identify a walreceiver that's too old and > disconnect it right away - with a much nicer error message than a PANIC. Meh. That only helps for the case of streaming replication, and not for the thirty-seven other ways that some WAL might arrive at something that wants to replay it. It might be worth doing anyway, but I can't get excited about it for this scenario. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
Jeff Davis writes: > On Tue, 2013-12-03 at 14:31 -0500, Tom Lane wrote: >> Stephen Frost writes: >>> When it comes to dump/reload, I'd much rather see a mechanism which uses >>> our deep understanding of the extension's objects (as database objects) >>> to implement the dump/reload than a text blob which is carried forward >>> from major version to major version and may even fail to run. >> Note that we're already doing that in the binary_upgrade code path. >> I agree that generalizing that approach sounds like a better idea >> than keeping a text blob around. > The reason for doing it that way in pg_upgrade was to preserve OIDs for > types, etc.: That was *a* reason, but not the only one, I believe. > That doesn't seem to apply to ordinary dump/reload. Do you think it's > good for other reasons, as well? I think Stephen has already argued why it could be a good idea here. But in a nutshell: it seems like there are two use-cases to be supported, one where you want "CREATE EXTENSION hstore" to give you some appropriate version of hstore, and one where you want to restore exactly what you had on the previous installation. It seems to me that "exploding" the extension by dumping, rather than suppressing, its component objects is by far the most reliable way of accomplishing the latter. To point out just one reason why, we've never made any effort to prohibit suitably-privileged users from modifying the objects within an extension. So even if you'd kept around the originally defining text string, it might not represent current reality. And as for relying on some URL or other --- whoever proposed that doesn't live in the same internet I do. URLs aren't immutable, even on days when you can get to them. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
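To illustrate that last point concretely (assuming the stock contrib hstore extension is installed; any suitably privileged user can do this):

    CREATE EXTENSION hstore;
    -- alter a member object after the fact; a stored creation-time script
    -- would now describe something other than what the database contains
    ALTER FUNCTION hstore_out(hstore) COST 25;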
Re: [HACKERS] Status of FDW pushdowns
David Fetter writes: > The idea here is that such a happy situation will not obtain until > much later, if ever, and meanwhile, we need a way to get things > accomplished even if it's inelegant, inefficient, etc. The > alternative is that those things simply will not get accomplished at > all. If that's the argument, why not just use dblink or dbilink, and be happy? This discussion sounds a whole lot like it's trending to a conclusion of wanting one of those in core, which is not where I'd like to end up. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
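For comparison, the existing contrib dblink pattern being alluded to (real interface; the connection details are illustrative, and note the query is a single opaque string with no parameter support):

    CREATE EXTENSION dblink;
    SELECT *
      FROM dblink('host=db1 dbname=remote',
                  'SELECT id, name FROM customers')
        AS t(id int, name text);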
Re: [HACKERS] Proposed feature: Selective Foreign Keys
On 12/04/2013 02:40 PM, Tom Lane wrote: Andrew Dunstan writes: Well I guess we could say something like: FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE (b-condition) But it's somewhat ugly. OK, those make sense. I wonder whether this should be done via a USING clause on the constraint that pointed to the partial unique index. Or would that be too obscure? I like what you have above. Yeah, it requires the more verbose syntax for declaring a foreign key, but this feature is not going to be so heavily used that anyone will be in danger of worsening their carpal tunnel syndrome. Fair enough. I guess in terms of *this* feature TomD would then need to adjust the location of his WHERE clause so it's before the REFERENCES clause. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
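Spelled out, the proposal would read something like this (hypothetical: this syntax is not implemented; the referenced-side condition would be backed by a partial unique index):

    CREATE TABLE shapes (id int, kind text);
    CREATE UNIQUE INDEX shapes_circle_uq ON shapes (id) WHERE kind = 'circle';

    CREATE TABLE labels (
        shape_id int,
        kind     text,
        FOREIGN KEY (shape_id) WHERE (kind = 'circle')
            REFERENCES shapes (id) WHERE (kind = 'circle')
    );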
Re: [HACKERS] Extension Templates S03E11
On Tue, Dec 3, 2013 at 11:44 AM, Dimitri Fontaine wrote: >> We should also consider the possibility of a user trying to >> deliberately install an older release. For example, if the user has >> 1.0, 1.0--1.1, 1.1, 1.1--1.2, and 1.2--1.0 (a downgrade script) with >> default_full_version = 1.2, an attempt to install 1.0 should run just >> the 1.0 script, NOT 1.2 and then 1.2--1.0. > > In what I did, if you want version 1.0 and we have a script --1.0.sql > around, then we just use that script, never kicking the path chooser. Oh, right. Duh. Sorry, bad example. I do think we want to avoid using a downgrade script as part of an install though - and to install from the newest possible full version (I kind of like the term "base" version) whenever possible. >> break the tie by choosing which version number appears first in the >> aforementioned list. If that still doesn't break the tie, either >> because none of the starting points are mentioned in that list or >> because there are multiple equal-length paths starting in the same >> place, we give up and emit an error. > > Jeff also mentioned tiebreakers without going into any level > of detail. > > We won't be able to just use default_version as the tiebreaker list > here, because of the following example: > > default_version = 1.2, 1.0 > > create extension foo version '1.1'; > > With such a setup it would prefer 1.2--1.1 to 1.0--1.1, which doesn't > look like what we want. Instead, we want > > default_version = 1.2 > create_from_version_candidates = 1.0 > > create extension foo version '1.1'; > > Then the tie breaker is the 1.0 in "create_from_version_candidates" so > we would run foo--1.0.sql and then foo--1.0--1.1.sql. I guess one way to skin this cat would be to just let the user provide an ordering for the versions i.e. version_ordering = 1.0 1.1 1.2 When the user asks for version X, we reject any paths that pass through a newer version (so that we never downgrade), and start with the path that begins as close to the target version as possible. For scenarios where people might be installing either an older or newer version, that might be easier to understand than a base-version preference list. > Barring objections, I'm going to prepare a new branch to support > developing that behavior against only file based extensions, and submit > a spin-off patch to the current CF entry. Not totally sure we're all on the same page yet, but that's not necessarily meant to dissuade you. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
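Putting the pieces together, the control file for that last idea might read as follows (hypothetical: "version_ordering" is only a proposal in this thread, not an existing control parameter):

    # foo.control (sketch; version_ordering is invented for illustration)
    default_version  = '1.2'
    version_ordering = '1.0 1.1 1.2'

With foo--1.0.sql, foo--1.0--1.1.sql, and foo--1.1--1.2.sql on disk (and no foo--1.1.sql base script), CREATE EXTENSION foo VERSION '1.1' would then run foo--1.0.sql followed by foo--1.0--1.1.sql; no path through 1.2 is considered, since installation must never pass through a newer version.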
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth writes: > "Tom" == Tom Lane writes: > Tom> Well, okay, but you've not said anything that wouldn't be > Tom> handled just as well by some logic that adds a fixed > Tom> integer-constant-zero flag column to the rows going into the > Tom> tuplesort. > Adding such a column unconditionally even for non-hypothetical > functions would break the optimization for sorting a single column > (which is a big deal, something like 3x speed difference, for by-value > types). Well, sure, but I was only suggesting adding it when the aggregate asks for it, probably via a new flag column in pg_aggregate. The question you're evading is what additional functionality could be had if the aggregate could demand a different datatype or constant value for the flag column. > Adding it only for hypothetical set functions is making a distinction > in how functions are executed that I don't think is warranted - That seems like rather a curious argument from someone who's willing to give up the ability to specify a regular transition value concurrently with the flag column. But anyway, what I'm thinking right now is that these questions would all go away if the aggregate transfunction were receiving the rows and sticking them into the tuplestore. It could add whatever columns it felt like. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status of FDW pushdowns
On Wed, Dec 4, 2013 at 1:39 PM, David Fetter wrote: > On Wed, Dec 04, 2013 at 12:43:44PM -0600, Merlin Moncure wrote: >> On Mon, Dec 2, 2013 at 10:26 PM, David Fetter wrote: >> > On Tue, Dec 03, 2013 at 11:15:36AM +0800, Craig Ringer wrote: >> >> On 11/28/2013 03:24 AM, David Fetter wrote: >> >> > WITH, or SRF, or whatever, the point is that we need to be able to >> >> > specify what we're sending--probably single opaque strings delimited >> >> > just as we do other strings--and what we might get back--errors only, >> >> > rows, [sets of] refcursors are the ones I can think of offhand. >> >> >> >> So, you're thinking of something like: >> >> >> >> WITH FOREIGN somecte AS $$... foreign query ...$$ >> >> SELECT ... >> >> FROM somecte; >> > >> > I was picturing something a little more like an SRF which would take >> > one opaque string, the remote command, some descriptor, perhaps an >> > enum, of what if anything might come back. Long ago, I implemented a >> > similar thing in DBI-Link. It was called >> > >> > remote_exec_dbh(data_source_id integer, query text, returns_rows bool) >> >> Couple thoughts: >> *) Any 'pass through' API should support parameterization (the FDW may >> not support that, but many will, and the API should allow for it). Lack >> of parameterization is a major downside of dblink. The function could >> be set up to be variadic for the parameters. > > I don't know for sure that that needs to be in version 1 of this. It > definitely shouldn't block implementing the non-parameterized one. I'm not making the case it should be version anything. But, if you went dblink style, you'd want to go variadic. It's not really any extra work and you can always embed the string if the FDW driver doesn't support parameterization. > What the standard has is literally insane. Not sure I agree. The guiding principle of the standard implementation AIUI is that it wants to do connectivity management via syntax and keep the DML abstractions clean (minus some un-implementable things like RI triggers). In other words, you write exactly the same queries for native and foreign tables. This makes things much easier for people who just want to write SQL the classical way and not get into funky vendor-specific APIs. The downside of SQL-MED, particularly the way postgres implemented the driver API, is that each driver is responsible for all optimization efforts, and I think this is bad. So I'm openly wondering if the FDW API should expose optional query rewriting hooks. The odbc-fdw and jdbc-fdw drivers, for example, could then benefit from those hooks so that qual pushdown could be implemented with far less code duplication and effort and a *much* broader set of problems could be addressed by FDW. For non- or exotic-SQL implementations those hooks could be implemented locally by the driver or disabled if it doesn't make sense to use them. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status of FDW pushdowns
Merlin Moncure writes: > The downside of SQL-MED, particularly the way postgres implemented the > driver API, is that each driver is responsible for for all > optimization efforts and I think this is bad. There was never any intention that that would be the final state of things. All the FDW APIs are quite experimental at this point, and subject to change, and one of the reasons for change is going to be to improve the optimization situation. At the same time, it's hard to say what might constitute optimization for FDWs that aren't backed by a remote SQL database. There are always going to be reasons why an FDW will have to do some of that work for itself. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
On Wed, Dec 4, 2013 at 3:39 AM, Jeff Davis wrote: > On Tue, 2013-12-03 at 10:23 -0500, Robert Haas wrote: >> In more normal cases, however, the system can (and probably should) >> figure out what was intended by choosing the *shortest* path to get to >> the intended version. For example, if someone ships 1.0, 1.0--1.1, >> 1.1, and 1.1--1.2, the system should choose to run 1.1 and then >> 1.1--1.2, not 1.0 and then 1.0--1.1 and then 1.1--1.2. But that can >> be automatic: only if there are two paths of equal length (as in the >> example in the previous paragraph) do we need help from the user to >> figure out what to do. > > Why do we need help from the user? Just pick a path. > > For an extension update, I understand why someone wouldn't want to > accidentally downgrade 5 versions (dropping all of their dependent > objects) before updating to the latest. But this doesn't apply to > creation. I suppose. But suppose we have 1.0, 1.1, 1.0--1.2, and 1.1--1.2. Suppose further that 1.1 drops some interfaces present in 1.0, and 1.2 adds new stuff. If the system chooses to run 1.0 and then 1.0--1.2, it'll create all the deprecated interfaces and then drop them again. Now maybe that won't cause any problems, but I bet it will. For example, consider hstore again. If we eventually disallow => as an operator altogether, the 1.0 script won't even run any more. Of course that doesn't matter for core because we've removed it entirely from our repository and don't ship it any more, but an out-of-core extension might well keep around more old scripts than we do, to make it easier to use the same bundle with multiple server versions. Imagine, for example, that 1.0 only works on 9.4 or earlier and 1.2 only works on releases 9.2 or later. The extension author wants to keep the 1.0 script around for the benefit of people who haven't upgraded, so that they can still install the older version that works there; but he also needs the 1.1 base version to be preferred to the 1.0 base version, else installation of 1.2 on 10.0+ will fail completely. My experience with software upgrades is that cases like this, and even weirder things, happen pretty routinely, so I think more control is good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
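Concretely, such a bundle might ship the following script set (illustrative, not from any real extension):

    foo--1.0.sql        -- legacy base script; only runs on older servers
    foo--1.1.sql        -- newer base script, which must be preferred
    foo--1.0--1.2.sql   -- upgrade path from the old base
    foo--1.1--1.2.sql   -- upgrade path from the new base

On a new server, CREATE EXTENSION foo VERSION '1.2' has to pick foo--1.1.sql plus foo--1.1--1.2.sql; choosing the 1.0 base there would fail outright.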
Re: [HACKERS] Why we are going to have to go DirectIO
On Wed, 04 Dec 2013 11:07:04 -0800 Josh Berkus wrote: > On 12/04/2013 07:33 AM, Jonathan Corbet wrote: > > Wow, Josh, I'm surprised to hear this from you. > > Well, I figured it was too angry to propose for an LWN article. ;-) So you're going to make us write it for you :) > > The active/inactive list mechanism works great for the vast majority of > > users. The second-use algorithm prevents a lot of pathological behavior, > > like wiping out your entire cache by copying a big file or running a > > backup. We *need* that kind of logic in the kernel. > > There's a large body of research on 2Q algorithms going back to the 80s, > which is what this is. As far as I can tell, the modification was > performed without any reading of this research, since that would have > easily shown that 50/50 was unlikely to be a good division, and that in > fact there is nothing which would work except a tunable setting, because > workloads are different. In general, the movement of useful information between academia and real-world programming seems to be minimal at best. Neither side seems to find much that is useful or interesting in what the other is doing. Unfortunate. For those interested in the details... (1) It's not quite 50/50, that's one bound for how the balance is allowed to go. (2) Anybody trying to add tunables to the kernel tends to run into resistance. Exposing thousands of knobs tends to lead to a situation where you *have* to be an expert on all those knobs to get decent behavior out of your system. So there is a big emphasis on having the kernel tune itself whenever possible. Here is a situation where that is not always happening, but a fix (which introduces no knob) is in the works. As an example, I've never done much with the PostgreSQL knobs on the LWN server. I just don't have the time to mess with it, and things Work Well Enough. > However, this particular issue concerns me less than the general > attitude that it's OK to push in experimental IO changes which can't be > disabled by users into release kernels, as exemplified by several > problematic and inadequately tested IO changes in the 3.X kernels -- > most notably the pdflush bug. It speaks of a policy that the Linux IO > stack is not production software, and it's OK to tinker with it in ways > that break things for many users. Bugs and regressions happen, and I won't say that we do a good enough job in that regard. There has been some concern recently that we're accepting too much marginal stuff. We have problems getting enough people to adequately review code — I think I've heard of another project or two with similar issues :). But nobody sees the kernel as experimental or feels that the introduction of bugs is an acceptable thing. > I also wasn't exaggerating the reception I got when I tried to talk > about IO and PostgreSQL at LinuxCon and other events. The majority of > Linux hackers I've talked to simply don't want to be bothered with > PostgreSQL's performance needs, and I've heard similar things from my > collegues at the MySQL variants. Greg KH was the only real exception. > > Heck, I went to a meeting of filesystem geeks at LinuxCon and the main > feedback I received, from Linux FS developers (Chris and Ted), was > "PostgreSQL should implement its own storage and use DirectIO, we don't > know why you're even trying to use the Linux IO stack." I think you're talking to the wrong people. Nothing you've described is a filesystem problem; you're contending with memory management problems. 
Chris and Ted weren't helpful because there's actually little they can do to help you. I would be happy to introduce you to some people who would be more likely to take your problems to heart. Mel Gorman, for example, is working on putting together a set of MM benchmarks in the hopes of quantifying changes and catching regressions before new code is merged. He's one of the people who has to deal with performance regressions when they show up in enterprise kernels, and I get the sense he'd rather do less of that. Perhaps even better: the next filesystem, storage, and memory management summit is March 24-25. A session on your pain points there would bring in a substantial portion of the relevant developers at all levels. LSFMM is arguably the most productive kernel event I see over the course of a year; it's where I would go first to make progress on this issue. I'm not an LSFMM organizer, but I would be happy to work to make such a session happen if somebody from the PostgreSQL community wanted to be there. > > This code has been a bit slow getting into the mainline for a few reasons, > > but one of the chief ones is this: nobody is saying from the sidelines > > that they need it! If somebody were saying "Postgres would work a lot > > better with this code in place" and had some numbers to demonstrate that, > > we'd be far more likely to see it get into an upcoming release. > > Well, Citus did that; do you need mor
Re: [HACKERS] Why we are going to have to go DirectIO
On Wed, Dec 4, 2013 at 9:31 PM, Jonathan Corbet wrote: > > I also wasn't exaggerating the reception I got when I tried to talk > > about IO and PostgreSQL at LinuxCon and other events. The majority of > > Linux hackers I've talked to simply don't want to be bothered with > > PostgreSQL's performance needs, and I've heard similar things from my > > collegues at the MySQL variants. Greg KH was the only real exception. > > > > Heck, I went to a meeting of filesystem geeks at LinuxCon and the main > > feedback I received, from Linux FS developers (Chris and Ted), was > > "PostgreSQL should implement its own storage and use DirectIO, we don't > > know why you're even trying to use the Linux IO stack." > > I think you're talking to the wrong people. Nothing you've described is a > filesystem problem; you're contending with memory management problems. > Chris and Ted weren't helpful because there's actually little they can do > to help you. I would be happy to introduce you to some people who would be > more likely to take your problems to heart. > > Mel Gorman, for example, is working on putting together a set of MM > benchmarks in the hopes of quantifying changes and catching regressions > before new code is merged. He's one of the people who has to deal with > performance regressions when they show up in enterprise kernels, and I get > the sense he'd rather do less of that. > > Perhaps even better: the next filesystem, storage, and memory management > summit is March 24-25. A session on your pain points there would bring in > a substantial portion of the relevant developers at all levels. LSFMM > is arguably the most productive kernel event I see over the course of a > year; it's where I would go first to make progress on this issue. I'm not > an LSFMM organizer, but I would be happy to work to make such a session > happen if somebody from the PostgreSQL community wanted to be there. > I think that's an excellent idea. If one of our developers could find the time to attend that, I think that could be very productive. While I'm not on the funds team, I'd definitely vote for funding such participation out of community funds if said developer can't do it on his own. But it should definitely be a developer with interest and skills in that particular area as well of course :) So don't think I'm proposing myself, I definitely am not :) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pgsql: Fix a couple of bugs in MultiXactId freezing
On Wed, Dec 4, 2013 at 8:43 PM, Tom Lane wrote: > Magnus Hagander writes: > > On Tue, Dec 3, 2013 at 7:20 PM, Tom Lane wrote: > >> I assume what would happen is the slave would PANIC upon seeing a WAL > >> record code it didn't recognize. > > > I wonder if we should for the future have the START_REPLICATION command > (or > > the IDENTIFY_SYSTEM would probably make more sense - or even adding a new > > command like IDENTIFY_CLIENT. The point is, something in the replication > > protocol) have walreceiver include its version when connecting to the master. That > > way we could have the walsender identify a walreceiver that's too old and > > disconnect it right away - with a much nicer error message than a PANIC. > > Meh. That only helps for the case of streaming replication, and not for > the thirty-seven other ways that some WAL might arrive at something that > wants to replay it. > > It might be worth doing anyway, but I can't get excited about it for this > scenario. > It does, but I bet it's by far one of the most common cases. I'd say it's that one and restore-from-backup that would cover a huge majority of all cases. If we can cover those, we don't have to be perfect - so unless it turns out to be ridiculously complicated, I think it would be worthwhile having. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
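A sketch of the exchange being proposed, on a replication connection (IDENTIFY_SYSTEM is a real replication-protocol command; IDENTIFY_CLIENT and its syntax are purely hypothetical here):

    IDENTIFY_SYSTEM;                  -- existing: returns system id, timeline,
                                      -- and current WAL position
    IDENTIFY_CLIENT VERSION '9.3.2';  -- hypothetical: lets the walsender reject
                                      -- a too-old walreceiver with a clean error
                                      -- instead of letting replay PANIC later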
Re: [HACKERS] Why we are going to have to go DirectIO
* Magnus Hagander (mag...@hagander.net) wrote: > I think that's an excellent idea. If one of our developers could find the > time to attend that, I think that could be very productive. While I'm not > on the funds team, I'd definitely vote for funding such participation out > of community funds if said developer can't do it on his own. > > But it should definitely be a developer with interest and skills in that > particular area as well of course :) So don't think I'm proposing myself, I > definitely am not :) For my part, I'm definitely interested and those dates currently look like they'd work for me. Not sure if I really meet Magnus' qualifications above, but I'd be happy to try. ;) Stark and I were having a pretty good discussion with Ted Ts'o at pgconf.eu and he certainly seemed interested and willing to at least discuss things with us.. Thanks, Stephen signature.asc Description: Digital signature
[HACKERS] Performance optimization of btree binary search
Having nothing better to do over the holiday weekend, I decided to pursue a number of ideas for improving performance that I thought about a long time ago. These include: * Pre-fetching list node pointers. This looks to be moderately promising, but I'm certainly not going to be the one to land it, given present obligations. Stephen Frost may wish to pick it up, given his previous interest in the matter. This is slightly controversial, because it uses a GCC intrinsic (__builtin_prefetch), but also because the Linux kernel removed this optimization from their generic list data structure [1]. However, that list was what we'd call an embedded list, so we probably shouldn't be totally deterred. The amount of effort that I put into this was, frankly, pretty low. A motivated person, willing to do the appropriate analysis, could probably bring it further. For one thing, every single foreach() has a call to this intrinsic, even where the list doesn't store pointers (which is not undefined). At the very least that's going to bloat things up, frequently for no conceivable gain, and yet with the patch applied we're still able to see quite tangible benefits, even if it isn't exactly a stellar improvement. I have an idea that prefetching the last element at the start of the loop could be much better than what I did, because we know that those lists are mostly pretty small in practice, and that's likely to help pipelining - prefetching too late or even too early makes the optimization useless, because you may still get a cache miss. * Optimizing index scans - I noticed that binary searching accounted for many cache misses during a pgbench select.sql benchmark, instrumented with "perf record -e cache-misses". This warranted further investigation. I won't say anything further about the former optimization, except to note that it's included for comparative purposes in the set of benchmarks I've run (I haven't included a patch). The benchmark results are here: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/results I took two approaches to the latter. This was the more interesting piece of work. Test sets include: * Master baseline (green) * List optimization (as mentioned above, not really relevant to the main topic of this mail) (red) * "fib btree", earlier patch, please disregard (blue) * "Fixed fib patch", Fibonacci search, no specialization (purple) * The interesting one, Fibonacci search + specialization - "fib + no jump" (turquoise, see below for details) Initially, I had a little bit of success with Fibonacci search [2] in place of binary search, in the hope that it would better take advantage of CPU cache characteristics - Fibonacci search is said to have advantages where non-uniform memory access is an issue - it minimizes the final interval. I wasn't all that optimistic that it would work that well given the smallish size of BLCKSZ relative to modern CPU L1 cache sizes [3], but it did make an appreciable dent on its own. I suppose old habits die hard, because next I hacked up _bt_compare and had it do an int4btcmp directly, in the event of encountering a scankey that had as its comparator the relevant pg_proc oid. This is very much in the style (and the spirit) of the grotty early draft patches for the inlining-comparators-for-sorting patch. Patch is attached. This is a draft, a POC, posted only to facilitate discussion and to allow my results to be independently duplicated/verified. 
Note that there is a bug (attributable to the new search code) that causes the regression tests to fail in exactly one place (i.e. one line of difference). I didn't think it was worth deferring discussion to deal with that, though, since I don't think it undermines anything. I'm not sure how valuable the comparator trick is if we stick with binary search - I didn't test that. I'm sure it's a question that must be considered, though. I have a fairly huge amount of data here, having run plenty of benchmarks over several nights. The short version is that the 'select' benchmark has just over 18% greater throughput on this machine at some client counts (in particular, when there are 4 clients - there are 4 cores, but 8 logical cores) with the attached patch. There is a 3.5% regression with one client, which is certainly not accounted for by noise. Note, however, that latency appears consistently better with the patch applied. This is a machine running on dedicated hardware: a 4-core i7-4770. The working set easily fits in its 32GiB of DDR3 RAM at all pgbench scales tested (1, 10 and 100). The kernel used is "3.8.0-31-generic #46~precise1-Ubuntu SMP". Postgres settings are typical for this kind of thing (6GiB shared_buffers), but you can refer to my pgbench-tools results for full details (drill down to an individual pgbench run for that - they're all the same). I'm kind of curious as to what this benchmark would look like on a server with many more cores. I guess I could write a proper patch to have code setting up a scankey also set a flag that indicated that it was acceptable to assume that the special built-in comparator would do fine.
Re: [HACKERS] WITHIN GROUP patch
> "Tom" == Tom Lane writes: Tom> Well, sure, but I was only suggesting adding it when the Tom> aggregate asks for it, probably via a new flag column in Tom> pg_aggregate. Sure, I was only pointing out the necessity. Tom> The question you're evading is what additional functionality Tom> could be had if the aggregate could demand a different datatype Tom> or constant value for the flag column. I don't really see a question there to answer - I simply chose to provide a general mechanism rather than make assumptions about what future users of the code would desire. I have no specific application in mind that would require some other type. >> Adding it only for hypothetical set functions is making a >> distinction in how functions are executed that I don't think is >> warranted - Tom> That seems like rather a curious argument from someone who's Tom> willing to give up the ability to specify a regular transition Tom> value concurrently with the flag column. In the current patch the idea of also specifying a regular transition value is meaningless since there is no transition function. Tom> But anyway, what I'm thinking right now is that these questions Tom> would all go away if the aggregate transfunction were receiving Tom> the rows and sticking them into the tuplestore. It could add Tom> whatever columns it felt like. True, but this ends up duplicating the sorting functionality of nodeAgg that we are leveraging off in the first place. I think this will be somewhat more intrusive and likely slower. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
Jonathan, > For those interested in the details... (1) It's not quite 50/50, that's one > bound for how the balance is allowed to go. (2) Anybody trying to add > tunables to the kernel tends to run into resistance. Exposing thousands of > knobs tends to lead to a situation where you *have* to be an expert on all > those knobs to get decent behavior out of your system. So there is a big > emphasis on having the kernel tune itself whenever possible. Here is a > situation where that is not always happening, but a fix (which introduces > no knob) is in the works. Yeah, we get into this argument all the time. The problem is when you run into situations where there is no optimal (or even acceptable) setting for all, or even most, users. And I'll say in advance that 2Q is one of those situations. > As an example, I've never done much with the PostgreSQL knobs on the LWN > server. I just don't have the time to mess with it, and things Work Well > Enough. Sure, and even when I teach fiddling with the knobs, there's only 12-20 knobs 95% of users need to have any interest in. But we have ~~220 settings for the other 5%, and those users would be screwed without them. > Bugs and regressions happen, and I won't say that we do a good enough job > in that regard. There has been some concern recently that we're accepting > too much marginal stuff. We have problems getting enough people to > adequately review code — I think I've heard of another project or two with > similar issues :). But nobody sees the kernel as experimental or feels > that the introduction of bugs is an acceptable thing. OK. The chain of events over the pdflush bug really felt like what I said earlier, especially since problems *were* reported shortly after kernel release and ignored. > I think you're talking to the wrong people. Quite possibly. > Perhaps even better: the next filesystem, storage, and memory management > summit is March 24-25. Link? I can't find anything Googling by that name. I'm pretty sure we can get at least one person there. > Gee, if only there were a web site where one could read about changes to > the Linux kernel :) Even you don't cover 100% of performance-changing commits. And I'll admit to missing issues of LWN when I'm travelling. > Seriously, though, one of the best things to do would be to make a point of > picking up a kernel around -rc3 (right around now, say, for 3.13) and > running a few benchmarks on it. If you report a performance regression at > that stage, it will get attention. Yeah, back to the "we need resources for good benchmarks" discussion fork ... -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
On Wed, Dec 4, 2013 at 2:31 PM, Jonathan Corbet wrote: > For those interested in the details... (1) It's not quite 50/50, that's one > bound for how the balance is allowed to go. (2) Anybody trying to add > tunables to the kernel tends to run into resistance. Exposing thousands of > knobs tends to lead to a situation where you *have* to be an expert on all > those knobs to get decent behavior out of your system. So there is a big > emphasis on having the kernel tune itself whenever possible. Here is a > situation where that is not always happening, but a fix (which introduces > no knob) is in the works. I think there are interesting parallels here with the 'query plan hints' debate. In both cases I think the conservative voices are correct: better not to go crazy adding knobs. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we are going to have to go DirectIO
On Wed, 04 Dec 2013 13:01:37 -0800 Josh Berkus wrote: > > Perhaps even better: the next filesystem, storage, and memory management > > summit is March 24-25. > > Link? I can't find anything Googling by that name. I'm pretty sure we > can get at least one person there. It looks like the page for the 2014 event isn't up yet. It will be attached (as usual) to the LF collaboration summit: http://events.linuxfoundation.org/events/collaboration-summit I'll make a personal note to send something here when the planning process begins and the CFP goes out. Napa Valley...one can do worse...:) jon -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance optimization of btree binary search
Peter Geoghegan writes: > I guess I could write a proper patch to have code setting up a scankey > also set a flag that indicated that it was acceptable to assume that > the special built-in comparator would do fine. ... > I'd be happy with a scheme with only one built-in comparator, and > allowed a few types to be cataloged such that it was indicated that > just using the "built-in" comparator was acceptable, knowledge that > could be passed to _bt_compare via the scankey. I'm thinking of just > int4, and maybe date and a few other such int4 "covariant" types. If what you're proposing is that we have a fast path that compares Datums as Datums, I should think that that would work fine for int2 as well, *and* for int8 on machines where int8 is pass-by-value. (Does anyone still care much about PG's performance on 32-bit hardware?) We might have to fool a bit with the fooGetDatum macros in some cases, eg I think Int16GetDatum isn't careful about sign extension. Admittedly, that might introduce an offsetting cost on some hardware, but I think on most machines sign-extension isn't noticeably more expensive than zero-extension. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance optimization of btree binary search
On Wed, Dec 4, 2013 at 4:28 PM, Tom Lane wrote: > Peter Geoghegan writes: >> I guess I could write a proper patch to have code setting up a scankey >> also set a flag that indicated that it was acceptable to assume that >> the special built-in comparator would do fine. ... >> I'd be happy with a scheme with only one built-in comparator, and >> allowed a few types to be cataloged such that it was indicated that >> just using the "built-in" comparator was acceptable, knowledge that >> could be passed to _bt_compare via the scankey. I'm thinking of just >> int4, and maybe date and a few other such int4 "covariant" types. > > If what you're proposing is that we have a fast path that compares Datums > as Datums, I should think that that would work fine for int2 as well, > *and* for int8 on machines where int8 is pass-by-value. (Does anyone > still care much about PG's performance on 32-bit hardware?) We might > have to fool a bit with the fooGetDatum macros in some cases, eg > I think Int16GetDatum isn't careful about sign extension. Admittedly, > that might introduce an offsetting cost on some hardware, but I think > on most machines sign-extension isn't noticeably more expensive than > zero-extension. Yeah, I think if we can make something like this work, it would be neat-o. Getting this working for int4 would be a good win, as Peter says, but getting it working for both int4 and int8 with the same code would be a significantly better one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITHIN GROUP patch
Andrew Gierth writes: > "Tom" == Tom Lane writes: > Tom> But anyway, what I'm thinking right now is that these questions > Tom> would all go away if the aggregate transfunction were receiving > Tom> the rows and sticking them into the tuplestore. It could add > Tom> whatever columns it felt like. > True, but this ends up duplicating the sorting functionality of > nodeAgg that we are leveraging off in the first place. I think this > will be somewhat more intrusive and likely slower. Hm, it's just a refactoring of the same code we'd have to have anyway, so I'm not seeing a reason to assume it'd be slower. If anything, this approach would open more opportunities for function-specific optimizations, which in the long run could be faster. (I'm not claiming that any such optimizations would be in the first version.) In hindsight I wonder if it wasn't a mistake to embed ordered-aggregate support in nodeAgg.c the way we did. We could have dumped that responsibility into some sort of wrapper around specific aggregates, with an option for some aggregates to skip the wrapper and handle it themselves. A trivial, and perhaps not very useful, example is that non-order-sensitive aggregates like MIN/MAX/COUNT could have been coded to simply ignore any ordering request. I can't immediately think of any examples that are compelling enough to justify such a refactoring now --- unless it turns out to make WITHIN GROUP easier. Anyway, I'm going to go off and look at the WITHIN GROUP patch with these ideas in mind. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
On 12/2/13, 9:14 AM, Dimitri Fontaine wrote: > What I want to build is an “extension distribution” software that knows > how to prepare anything from PGXN (and other places) so that it's fully > ready for being used in the database. Then the main client would run as > a CREATE EXTENSION "ddl_command_start" Event Trigger and would fetch the > prepared extension for you and make it available, then leaving the main > command operate as intended. > > Which is what I think the pex extension is doing, and that's not > coincidental, but it runs the build step on the PostgreSQL server itself > and needs to have a non-trivial set of file-system privileges to be > doing so, and even needs to get root privileges with sudo for some of > its operations. You're thinking of autopex, and while that works, and can be made to work better with certain small changes, I don't think it can ever be the only solution. Many interesting extensions will have external package build- and run-time dependencies, and you need file-system level access to manage that. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Templates S03E11
On 12/2/13, 2:33 PM, Greg Stark wrote: > Just tossing an idea out there. What if you could install an extension > by specifying not a local file name but a URL. Obviously there's a > security issue but for example we could allow only https URLs with > verified domain names that are in a list of approved domain names > specified by a GUC. This is similar to what autopex does (https://github.com/petere/autopex). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
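At the SQL level, the URL idea might surface as something like this (entirely hypothetical: neither the GUC nor the syntax exists; shown only to make the allow-list notion concrete):

    SET extension_url_domains = 'api.pgxn.org';  -- hypothetical https allow-list
    CREATE EXTENSION myext
        USING URL 'https://api.pgxn.org/dist/myext/myext-1.0.0.zip';  -- hypothetical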