Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Greg Stark
On Fri, Apr 25, 2014 at 11:15 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> Since there's absolutely no sensible scenario for setting
> max_connections that high, I'd like to change the limit to 2^16, so we
> can use a uint16 in BufferDesc->refcount.

Clearly there's no sensible way to run 64k backends in the current
architecture. But I don't think it's beyond the realm of possibility
that we'll reduce the overhead in the future with an eye to being able
to do that. Is it that helpful that it's worth baking in more
dependencies on that limitation?


-- 
greg




Re: [HACKERS] Hashable custom types

2014-04-26 Thread David Fetter
On Fri, Apr 25, 2014 at 04:47:49PM -0700, Paul Ramsey wrote:
> When trying to write a recursive CTE using the PostGIS geometry type,
> I was told this:
> 
> ERROR:  could not implement recursive UNION
> DETAIL:  All column datatypes must be hashable.

This leads to an interesting question, which is why does our
implementation require this.  I'm guessing it's a performance
optimization.

Quoth src/backend/executor/nodeRecursiveunion.c:

/*
 * To implement UNION (without ALL), we need a hashtable that stores tuples
 * already seen.  The hash key is computed from the grouping columns.
 */
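
In toy form, the mechanism that comment describes looks something like the
sketch below (standalone C, not PostgreSQL's actual executor code; names and
sizes are made up for illustration, and a real implementation re-checks full
tuple equality on a hash match rather than trusting the hash alone):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy open-addressing set of tuple hashes.  NSLOTS must be a power of two,
 * and the table must never fill up (the real code grows and can spill). */
#define NSLOTS 1024
static uint64_t seen[NSLOTS];
static bool     used[NSLOTS];

/* Returns true the first time a hash is seen, false for apparent duplicates.
 * A real executor compares the full tuples on a hash match, since equal
 * hashes do not prove equal tuples. */
static bool
remember_tuple(uint64_t hash)
{
    size_t i = hash & (NSLOTS - 1);

    while (used[i])
    {
        if (seen[i] == hash)
            return false;           /* duplicate: skip for UNION without ALL */
        i = (i + 1) & (NSLOTS - 1); /* linear probing */
    }
    used[i] = true;
    seen[i] = hash;
    return true;
}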

As hashing can only approximately guarantee uniqueness (pigeonhole
principle, blah, blah), is there some other similarly performant
mechanism for tracking seen tuples that might work at least in cases
where we don't have a hash function for the data type?  Some kind of
tree, perhaps, or does that require too many other things (total
ordering, e.g.)?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




[HACKERS] includedir_internal headers are not self-contained

2014-04-26 Thread Christoph Berg
Debian is shipping client headers in /usr/include/postgresql in the
libpq-dev package. The server headers go into
/usr/include/postgresql/major/server in postgresql-server-dev-major,
so we can have the headers for several majors installed in parallel.

Historically, a few server headers were also included in libpq-dev
because 9 years ago, there were some client apps that needed them.
We've finally gotten around to fixing that [1]; now the layout is:

libpq-dev:
  /usr/include/postgresql/internal/*
  /usr/include/postgresql/libpq-fe.h
  /usr/include/postgresql/libpq-events.h
  /usr/include/postgresql/libpq/libpq-fs.h
  /usr/include/postgresql/pg_config*.h
  /usr/include/postgresql/postgres_ext.h

postgresql-server-dev-major:
  /usr/include/postgresql/major/server/*

Unfortunately the files in internal/ are not self-contained:
  internal/postgres_fe.h includes
  common/fe_memutils.h which includes
  utils/palloc.h

Both common/ and utils/ are server-only, so you can't build client
apps which need postgres_fe.h with only libpq-dev installed.
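
To make the breakage concrete: a minimal client such as the hypothetical
t.c below, built with only libpq-dev installed (say, cc -I/usr/include/postgresql
t.c -lpq), fails at the very first include, because postgres_fe.h pulls in
common/fe_memutils.h and thence the missing utils/palloc.h:

/* t.c -- will not compile with only libpq-dev installed */
#include <postgres_fe.h>    /* -> common/fe_memutils.h -> utils/palloc.h */
#include <libpq-fe.h>

int
main(void)
{
    PGconn *conn = PQconnectdb("dbname=postgres");

    PQfinish(conn);
    return 0;
}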

common/ was introduced in 8396447cdbdff0b62914748de2fec04281dc9114,
and added to src/include/Makefile in c153530dc10bf5ff6dc5a89249f9cb596dd71a63.

I believe common/ should also be installed by includedir_internal.
utils/ should probably also be installed there, or alternatively only
the headers referred to from common/; the files directly referred to being:

$ grep -r include 9.4/server/common/ | grep \"
9.4/server/common/fe_memutils.h:#include "utils/palloc.h"
9.4/server/common/relpath.h:#include "catalog/catversion.h" /* pgrminclude ignore */
9.4/server/common/relpath.h:#include "storage/relfilenode.h"

I'd write a patch for src/include/Makefile, but we'd need to sort out
the layout first.

On a sidenote, I don't see why utils/errcodes.h and utils/fmgroids.h
need a separate INSTALL_DATA call when they are installed into
utils/ anyway.

(Another issue is that client apps frequently seem to want
catalog/pg_type.h to get the OID definitions; it might make sense to
move that also to internal/.)

Christoph

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=314427
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread David Fetter
On Sat, Apr 26, 2014 at 12:15:40AM +0200, Andres Freund wrote:
> Hi,
> 
> Currently the maximum for max_connections (+ bgworkers + autovacuum) is
> defined by
> #define MAX_BACKENDS 0x7fffff
> which unfortunately means that some things like buffer reference counts
> need a full integer to store references.

Out of curiosity, where are you finding that a 32-bit integer is
causing problems that a 16-bit one would solve?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Andres Freund
On 2014-04-26 11:52:44 +0100, Greg Stark wrote:
> On Fri, Apr 25, 2014 at 11:15 PM, Andres Freund <and...@2ndquadrant.com>
> wrote:
>> Since there's absolutely no sensible scenario for setting
>> max_connections that high, I'd like to change the limit to 2^16, so we
>> can use a uint16 in BufferDesc->refcount.
> 
> Clearly there's no sensible way to run 64k backends in the current
> architecture.

The current limit is 2^24, I am only proposing to lower it to 2^16.

> But I don't think it's beyond the realm of possibility
> that we'll reduce the overhead in the future with an eye to being able
> to do that. Is it that helpful that it's worth baking in more
> dependencies on that limitation?

I don't think it's realistic that we'll ever have more than 2^16 full
blown backends. We might (I hope!) get a builtin pooler, but pooler
connections won't be full backends.
So I really don't see any practical problem with limiting the max
number of backends to 65k.

What I think it's necessary for is at least:

* Move the buffer content lock inline into the buffer descriptor,
  while still fitting into one cacheline.
* lockless/atomic Pin/Unpin Buffer.

Imo those are significant scalability advantages...
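
As a rough illustration of the layout argument (hypothetical names and field
choices, not the real BufferDesc; the point is only that a 16-bit refcount
leaves room for the tag, flags, and an inlined lock within 64 bytes):

#include <stdint.h>

/* compressed stand-ins for PostgreSQL's buffer tag fields */
typedef struct
{
    uint32_t spcNode, dbNode, relNode;  /* tablespace/database/relation */
    uint32_t forkNum;
    uint32_t blockNum;
} SketchBufferTag;                      /* 20 bytes */

typedef struct
{
    SketchBufferTag tag;                /* 20 bytes */
    uint16_t flags;                     /* \ packed into one atomically     */
    uint16_t refcount;                  /* /  updatable 32-bit word         */
    uint16_t usage_count;
    uint16_t wait_backend;              /* needs backend IDs to fit 16 bits */
    int32_t  buf_id;
    int32_t  freeNext;
    uint64_t content_lock;              /* stand-in for an inlined lwlock   */
} SketchBufferDesc;                     /* 48 bytes on a typical LP64 ABI   */

_Static_assert(sizeof(SketchBufferDesc) <= 64, "fits one cache line");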

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Andres Freund
On 2014-04-26 05:40:21 -0700, David Fetter wrote:
> On Sat, Apr 26, 2014 at 12:15:40AM +0200, Andres Freund wrote:
>> Hi,
>> 
>> Currently the maximum for max_connections (+ bgworkers + autovacuum) is
>> defined by
>> #define MAX_BACKENDS 0x7fffff
>> which unfortunately means that some things like buffer reference counts
>> need a full integer to store references.
> 
> Out of curiosity, where are you finding that a 32-bit integer is
> causing problems that a 16-bit one would solve?

Save space? For one, it allows us to shrink some structs (into one
cacheline!). For another, it allows combining flags and refcount in
buffer descriptors into one variable, manipulated atomically.
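
A sketch of what "combine flags and refcount, manipulated atomically" could
look like, using C11 atomics and a hypothetical layout (refcount in the low
16 bits, flags in the high 16):

#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint32_t buf_state;      /* flags | refcount in one word */

/* Pin without taking any spinlock: retry a CAS until it sticks. */
static void
pin_buffer(void)
{
    uint32_t oldval = atomic_load(&buf_state);

    for (;;)
    {
        /* refcount occupies the low bits; real code would check overflow */
        uint32_t newval = oldval + 1;

        if (atomic_compare_exchange_weak(&buf_state, &oldval, newval))
            break;                      /* oldval is refreshed on failure */
    }
}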

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Hashable custom types

2014-04-26 Thread Tom Lane
David Fetter <da...@fetter.org> writes:
> On Fri, Apr 25, 2014 at 04:47:49PM -0700, Paul Ramsey wrote:
>> ERROR:  could not implement recursive UNION
>> DETAIL:  All column datatypes must be hashable.

> This leads to an interesting question, which is why does our
> implementation require this.  I'm guessing it's a performance
> optimization.

Well, you clearly need to have a notion of equality for each column
datatype, or else UNION doesn't mean anything.

In general we consider that a datatype's notion of equality can
be defined either by its default btree opclass (which supports
sort-based query algorithms) or by its default hash opclass
(which supports hash-based query algorithms).

The plain UNION code supports either sorting or hashing, but
we've not gotten around to supporting a sort-based approach
to recursive UNION.  I'm not convinced that it's worth doing ...
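
For a type author, the C side of satisfying the hash-based path can be as
small as one support function.  A minimal sketch (the function name is
hypothetical, and it assumes the type is a varlena whose equality is plain
byte-equality of a canonical representation; it would then be wired up with
CREATE OPERATOR CLASS ... USING hash):

#include "postgres.h"
#include "fmgr.h"
#include "access/hash.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(mytype_hash);

/* Valid only if equal values always share one canonical byte representation */
Datum
mytype_hash(PG_FUNCTION_ARGS)
{
    bytea *val = PG_GETARG_BYTEA_PP(0); /* treat the value as raw bytes */

    return hash_any((unsigned char *) VARDATA_ANY(val),
                    VARSIZE_ANY_EXHDR(val));
}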

regards, tom lane




Re: [HACKERS] includedir_internal headers are not self-contained

2014-04-26 Thread Tom Lane
Christoph Berg <c...@df7cb.de> writes:
>   internal/postgres_fe.h includes
>   common/fe_memutils.h which includes
>   utils/palloc.h

Hm.  It seems rather fundamentally broken to me that frontend code is
including palloc.h --- that file was never intended to be frontend-safe,
and the #ifdefs that I see in it today don't fill me with any feeling of
quality workmanship.

I think what we ought to do about this is get rid of the dependency
on palloc.h.

> Both common/ and utils/ are server-only, so you can't build client
> apps which need postgres_fe.h with only libpq-dev installed.

Clearly, the idea that common/ is server-only is broken.

> I believe common/ should also be installed by includedir_internal.
> utils/ should probably also be installed there, or alternatively only
> the headers referred to from common/; the files directly referred to being:

> $ grep -r include 9.4/server/common/ | grep \"
> 9.4/server/common/fe_memutils.h:#include "utils/palloc.h"
> 9.4/server/common/relpath.h:#include "catalog/catversion.h" /* pgrminclude ignore */
> 9.4/server/common/relpath.h:#include "storage/relfilenode.h"

The catversion dependency also seems pretty damn brain-dead in this
context.  Let's see if we can get rid of that.  As for relfilenode,
if we need that in relpath.h maybe the answer is that relfilenode.h
has to be in common/.

Anyway, the bottom line for me is that utils/ is a server-only area and
therefore nothing in common/ ought to depend on it.

> (Another issue is that client apps frequently seem to want
> catalog/pg_type.h to get the OID definitions; it might make sense to
> move that also to internal/.)

That's not happening.  We do need some better solution for letting client
apps get hold of fixed type oids, but moving a catalog header someplace
else is not it.

regards, tom lane




Re: [HACKERS] Problem with displaying wide tables in psql

2014-04-26 Thread Tom Lane
Greg Stark <st...@mit.edu> writes:
> I expect this regression test to fail on platforms that don't support
> utf-8 client-side (I'm assuming we have such things?). I don't have such a
> platform here and I'm not sure how it would fail so I want to go ahead
> and apply it and grab the output to add the alternate output when it
> fails on the build-farm. Would that be ok?

Are you expecting to carry an alternate expected file for every possible
encoding choice?  That does not seem workable to me, and even if we could
do it the cost/benefit ratio would be pretty grim.  I think you should
drop the UTF8-dependent tests.

In other words: there are no encoding dependencies in the existing
standard regression tests.  This feature is not the place to start adding
them, and two weeks past feature freeze is not the time to start adding
them either.  We don't have time right now to shake out a whole new
set of platform dependencies in the regression tests.

If you feel these tests must be preserved someplace, you could add a
new regression test that isn't run by default, following in the
footsteps of collate.linux.utf8.

regards, tom lane




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Tom Lane
Andres Freund <and...@2ndquadrant.com> writes:
> On 2014-04-26 11:52:44 +0100, Greg Stark wrote:
>> But I don't think it's beyond the realm of possibility
>> that we'll reduce the overhead in the future with an eye to being able
>> to do that. Is it that helpful that it's worth baking in more
>> dependencies on that limitation?

> What I think it's necessary for is at least:

> * Move the buffer content lock inline into the buffer descriptor,
>   while still fitting into one cacheline.
> * lockless/atomic Pin/Unpin Buffer.

TBH, that argument seems darn weak, not to mention probably applicable
only to current-vintage Intel chips.  And you have not proven that
narrowing the backend ID is necessary to either goal, even if we
accepted that these goals were that important.

While I agree with you that it seems somewhat unlikely we'd ever get
past 2^16 backends, these arguments are not nearly good enough to
justify a hard-wired limitation.

regards, tom lane




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Tom Lane
Andres Freund <and...@2ndquadrant.com> writes:
> On 2014-04-26 05:40:21 -0700, David Fetter wrote:
>> Out of curiosity, where are you finding that a 32-bit integer is
>> causing problems that a 16-bit one would solve?

> Save space? For one, it allows us to shrink some structs (into one
> cacheline!).

And next week when we need some other field in a buffer header,
what's going to happen?  If things are so tight that we need to
shave a few bits off backend IDs, the whole thing is a house of
cards anyway.

regards, tom lane




Re: [HACKERS] Problem with displaying wide tables in psql

2014-04-26 Thread Greg Stark
Not sure what other encodings you mean. Psql uses utf8 for the border and
the test uses utf8 to test the formatting. I was only anticipating an error
on platforms where that didn't work.

I would lean towards having it but I'm fine following your judgement,
especially given the timing.

-- 
greg


Re: [HACKERS] Problem with displaying wide tables in psql

2014-04-26 Thread Tom Lane
Greg Stark <st...@mit.edu> writes:
> Not sure what other encodings you mean. Psql uses utf8 for the border and
> the test uses utf8 to test the formatting. I was only anticipating an error
> on platforms where that didn't work.

Well, there are two likely misbehaviors if the regression test is being
run in some other encoding:

1. If it's a single-byte encoding, you probably won't get any bad-encoding
complaints, but the code will think the utf8 characters represent multiple
logical characters, resulting in (at least) spacing differences.  It's
possible that all single-byte encodings would act the same, but I'm not
sure.

2. If it's a multi-byte encoding different from utf8, you're almost
certainly going to get badly-encoded-data complaints, at different places
depending on the particular encoding.

I don't remember how many different multibyte encodings we support,
but I'm pretty sure we'd need a separate expected file for each one.
Plus at least one for the single-byters.

The real problem is that I don't have a lot of confidence that the
buildfarm would provide us with full coverage of all the encodings
that somebody might use in the field.  So we might not find out about
omissions or wrong expected-files until after we ship.

Anyway, the bottom line for me is that this test isn't worth that
much trouble.  I'm okay with putting it in as a separate test file
that we don't support running in non-utf8 encodings.

regards, tom lane




Re: [HACKERS] includedir_internal headers are not self-contained

2014-04-26 Thread Tom Lane
I wrote:
> Christoph Berg <c...@df7cb.de> writes:
>> $ grep -r include 9.4/server/common/ | grep \"
>> 9.4/server/common/fe_memutils.h:#include "utils/palloc.h"
>> 9.4/server/common/relpath.h:#include "catalog/catversion.h" /* pgrminclude ignore */
>> 9.4/server/common/relpath.h:#include "storage/relfilenode.h"

> The catversion dependency also seems pretty damn brain-dead in this
> context.  Let's see if we can get rid of that.  As for relfilenode,
> if we need that in relpath.h maybe the answer is that relfilenode.h
> has to be in common/.

On closer inspection, the issue here is really that putting relpath.h/.c
in common/ was completely misguided from the get-go.  It's unnecessary:
there's nothing outside the backend that uses it, except for
contrib/pg_xlogdump which could very easily do without it.  And relpath.h
is a serious failure from a modularity standpoint anyway, because there is
code all over the backend that has intimate familiarity with the pathname
construction rules.  We could possibly clean that up to the extent of
being able to hide TABLESPACE_VERSION_DIRECTORY inside relpath.c, but what
then?  We'd still be talking about having CATALOG_VERSION_NO compiled into
frontend code for any frontend code that actually made use of relpath.c,
which is surely not such a great idea.

So it seems to me the right fix for the relpath end of it is to push most
of relpath.c back where it came from, which I think was backend/catalog/.

There might be some value in keeping the forkname-related code in common/;
that's not quite so intimately tied to the backend version as relpath()
itself.  (And indeed forkNames[] is the only thing that pg_xlogdump.c
needs.)  But I'm not really convinced that a module encapsulating just the
fork names is worth the trouble, and especially not convinced that 
frontend code needs to be dealing with fork names.  Thoughts?

regards, tom lane




Re: [HACKERS] Hashable custom types

2014-04-26 Thread Atri Sharma
> The plain UNION code supports either sorting or hashing, but
> we've not gotten around to supporting a sort-based approach
> to recursive UNION.  I'm not convinced that it's worth doing ...
> 
> regards, tom lane



Without sorting, isn't the scope of a recursive UNION with custom datatypes
pretty restrictive?

As is, even the sorting shall be a bit restrictive due to the costs
associated. I feel what David has suggested upthread should be good. Maybe
an experimental patch with a workload that should give a load factor > 1 for
the hash table should prove some performance points.

Even if that's not the case, we should really do something to improve the
scope of usability of recursive UNION with custom types.

Regards,

Atri


-- 
Regards,

Atri
*l'apprenant*


[HACKERS] make check-world problem

2014-04-26 Thread Vladimir Koković

Hi,

PostgreSQL build failed with current GIT source.

tail /home/src/postgresql-devel/dev-build/make-out-dev.log
cp ../../../contrib/dummy_seclabel/dummy_seclabel.so dummy_seclabel.so
rm -rf ./testtablespace
mkdir ./testtablespace
../../../src/test/regress/pg_regress
--inputdir=/home/src/postgresql-devel/postgresql-git/postgresql/src/test/regress
--temp-install=./tmp_check --top-builddir=../../.. --dlpath=.
--schedule=/home/src/postgresql-devel/postgresql-git/postgresql/src/test/regress/parallel_schedule
pg_regress: could not open file  
/home/src/postgresql-devel/postgresql-git/postgresql/src/test/regress/sql/security_label.sql  
for writing: Permission denied

make[2]: *** [check] Error 2
make[2]: Leaving directory  
`/home/src/postgresql-devel/dev-build/src/test/regress'

make[1]: *** [check-regress-recurse] Error 2
make[1]: Leaving directory `/home/src/postgresql-devel/dev-build/src/test'
make: *** [check-world-src/test-recurse] Error 2

My build environment:
-
dev-build.sh:
#!/bin/bash

set -v
set -e

POSTGRESQL=/home/src/postgresql-devel
BUILD=dev-build

cd $POSTGRESQL
rm -rf $BUILD
mkdir $BUILD
chown postgres:postgres $BUILD
cd $POSTGRESQL/$BUILD
su -c $POSTGRESQL/dev-build-postgres.sh postgres

exit 0

--
dev-build-postgres.sh:
#!/bin/bash

set -v
set -e

POSTGRESQL=/home/src/postgresql-devel
BUILD=dev-build

cd $POSTGRESQL/$BUILD

export CFLAGS="-g3 -gdwarf-2"

$POSTGRESQL/postgresql-git/postgresql/configure  
--srcdir=$POSTGRESQL/postgresql-git/postgresql '--enable-cassert' \
'--enable-nls' '--enable-integer-datetimes' '--with-perl' '--with-python'  
'--with-tcl' '--with-openssl' \
'--enable-thread-safety' '--with-ldap' '--with-gssapi' '--with-pam'  
'--with-libxml' '--with-libxslt' \

--prefix=$POSTGRESQL/dev-install > configure-out-dev.log 2>&1

make check-world > make-out-dev.log 2>&1

make installcheck-world > make-install-out-dev.log 2>&1

kdiff3 /home/src/pgadmin3-git/pgadmin3/pgadmin/pg_scanners/pg93/scan.l  
$POSTGRESQL/postgresql-git/postgresql/src/backend/parser/scan.l 
kdiff3  
/home/src/pgadmin3-git/pgadmin3/pgadmin/pg_scanners/pg93/src/backend/parser/93/parser/gram.h  
$POSTGRESQL/dev-install/include/server/parser/gram.h 


exit 0
---

Best regards
Vladimir Kokovic DP senior




Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table

2014-04-26 Thread Martijn van Oosterhout
On Fri, Apr 25, 2014 at 04:18:18PM +0100, Greg Stark wrote:
> Which isn't to say they're a bad idea but like everything else in
> engineering there are tradeoffs and no such thing as a free lunch.
> You can avoid depleting the entropy pool by including data you expect
> to be unique as a kind of fake entropy -- which quickly gets you back
> to looking for things like MAC address to avoid duplicates across
> systems.

ISTM you could use the database identifier we already have to at least
produce UUIDs which are unique amongst PostgreSQL instances. That
might be something worth aiming for?
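
As a sketch of that idea (hypothetical layout: the cluster's existing 64-bit
system identifier in the top half, with per-backend and per-call uniqueness
filling the rest; a real UUID would also need the RFC 4122 version and
variant bits set):

#include <stdint.h>

typedef struct { uint64_t hi, lo; } SketchUuid;

static SketchUuid
make_instance_uuid(uint64_t sysidentifier, uint32_t backend_pid,
                   uint32_t *counter)
{
    SketchUuid u;

    u.hi = sysidentifier;               /* distinguishes instances */
    u.lo = ((uint64_t) backend_pid << 32) | (*counter)++;
    return u;
}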

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread David Fetter
On Sat, Apr 26, 2014 at 11:20:56AM -0400, Tom Lane wrote:
> Andres Freund <and...@2ndquadrant.com> writes:
>> On 2014-04-26 11:52:44 +0100, Greg Stark wrote:
>>> But I don't think it's beyond the realm of possibility
>>> that we'll reduce the overhead in the future with an eye to being able
>>> to do that. Is it that helpful that it's worth baking in more
>>> dependencies on that limitation?
> 
>> What I think it's necessary for is at least:
> 
>> * Move the buffer content lock inline into the buffer descriptor,
>>   while still fitting into one cacheline.
>> * lockless/atomic Pin/Unpin Buffer.
> 
> TBH, that argument seems darn weak, not to mention probably applicable
> only to current-vintage Intel chips.  And you have not proven that
> narrowing the backend ID is necessary to either goal, even if we
> accepted that these goals were that important.
> 
> While I agree with you that it seems somewhat unlikely we'd ever get
> past 2^16 backends, these arguments are not nearly good enough to
> justify a hard-wired limitation.

Rather than hard-wiring one, could we do something clever with
bit-stuffing, or would that tank performance in some terrible ways?

I know we allow for gigantic numbers of backend connections, but I've
never found a win for > 2x the number of cores in the box, which at
least in my experience so far tops out in the 8-bit (in extreme cases
unsigned 8-bit) range.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table

2014-04-26 Thread Tom Lane
Martijn van Oosterhout <klep...@svana.org> writes:
> On Fri, Apr 25, 2014 at 04:18:18PM +0100, Greg Stark wrote:
>> Which isn't to say they're a bad idea but like everything else in
>> engineering there are tradeoffs and no such thing as a free lunch.
>> You can avoid depleting the entropy pool by including data you expect
>> to be unique as a kind of fake entropy -- which quickly gets you back
>> to looking for things like MAC address to avoid duplicates across
>> systems.

> ISTM you could use the database identifier we already have to at least
> produce UUIDs which are unique amongst PostgreSQL instances. That
> might be something worth aiming for?

It's worth noting in this connection that we've never tried hard to ensure
that database identifiers are actually unique.  One potentially serious
issue is that slave servers will have the same identifier as their master.

Also, I think there's a still-open issue that creation of the identifier
has a thinko about using OR instead of XOR, resulting in way fewer bits of
freedom than it should have even with the limited amount of entropy used.

regards, tom lane




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Andres Freund
On 2014-04-26 11:20:56 -0400, Tom Lane wrote:
> Andres Freund <and...@2ndquadrant.com> writes:
>> On 2014-04-26 11:52:44 +0100, Greg Stark wrote:
>>> But I don't think it's beyond the realm of possibility
>>> that we'll reduce the overhead in the future with an eye to being able
>>> to do that. Is it that helpful that it's worth baking in more
>>> dependencies on that limitation?
> 
>> What I think it's necessary for is at least:
> 
>> * Move the buffer content lock inline into the buffer descriptor,
>>   while still fitting into one cacheline.
>> * lockless/atomic Pin/Unpin Buffer.
> 
> TBH, that argument seems darn weak, not to mention probably applicable
> only to current-vintage Intel chips.

64 bytes has been the cacheline size for more than a decade, and it's not
just x86. ARM has also moved to it, as well as other architectures. And
even if it's 32 or 128 bytes - fitting datastructures to a power of 2 of
the cacheline size is still beneficial.
I don't think many datastructures in pg deserve attention to that, but
the buffer descriptors are one of the few. They're currently one of the
top 3 sources of cpu cache issues in pg.

> And you have not proven that
> narrowing the backend ID is necessary to either goal, even if we
> accepted that these goals were that important.

I am pretty sure there are other ways, but since the actual cost of that
restriction imo is just about zero, it seems like a quite sensible
solution.

> While I agree with you that it seems somewhat unlikely we'd ever get
> past 2^16 backends, these arguments are not nearly good enough to
> justify a hard-wired limitation.

Even if you include a lockless pin/unpin buffer? Besides the lwlock's
internal spinlock the buffer spinlocks are the hottest ones in PG by
far.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Andres Freund
On 2014-04-26 11:22:39 -0400, Tom Lane wrote:
> Andres Freund <and...@2ndquadrant.com> writes:
>> On 2014-04-26 05:40:21 -0700, David Fetter wrote:
>>> Out of curiosity, where are you finding that a 32-bit integer is
>>> causing problems that a 16-bit one would solve?
> 
>> Save space? For one, it allows us to shrink some structs (into one
>> cacheline!).
> 
> And next week when we need some other field in a buffer header,
> what's going to happen?  If things are so tight that we need to
> shave a few bits off backend IDs, the whole thing is a house of
> cards anyway.

The problem isn't so much that we need the individual bits, but that we
need something that has an alignment of two, instead of 4.

I don't think we need to decide this without benchmarks proving the
benefits. I basically want to know whether somebody has an actual
usecase - even if I really, really, can't think of one - of setting
max_connections even remotely that high. If there's something
fundamental out there that'd make changing the limit impossible, doing
benchmarks wouldn't be worthwhile.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] make check-world problem

2014-04-26 Thread Tom Lane
Vladimir Koković <vladimir.koko...@a-asoft.com> writes:
> PostgreSQL build failed with current GIT source.

Works for me ...

> pg_regress: could not open file
> /home/src/postgresql-devel/postgresql-git/postgresql/src/test/regress/sql/security_label.sql
> for writing: Permission denied
> make[2]: *** [check] Error 2

Hmmm.  Reading between the lines here, but are you attempting to do a
VPATH build as a user that doesn't have write permission on the source
tree?  AFAIK that's never worked, and isn't expected to work, because
architecture-independent derived files will be stored back into the
source tree.

regards, tom lane




Re: [HACKERS] small typo in src/backend/access/transam/xlog.c

2014-04-26 Thread Tom Lane
Bruce Momjian <br...@momjian.us> writes:
> On Mon, Jul 22, 2013 at 07:32:20PM -0400, Tom Lane wrote:
>> We could for instance keep the high half as tv_sec, while making the low
>> half be something like (tv_usec << 12) | (getpid() & 0xfff).  This would
>> restore the intended ability to reverse-engineer the exact creation time
>> from the sysidentifier, and also add a little more uniqueness by way of
>> the creating process's PID.  (Note tv_usec must fit in 20 bits.)

> Can someone make a change here so we can close the issue?

Done.
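
For reference, the scheme quoted above amounts to something like the
following standalone sketch (an illustration of the formula, not the exact
xlog.c code):

#include <stdint.h>
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

int
main(void)
{
    struct timeval tv;
    uint64_t sysidentifier;

    gettimeofday(&tv, NULL);
    sysidentifier = ((uint64_t) tv.tv_sec) << 32;   /* seconds: high half  */
    sysidentifier |= ((uint64_t) tv.tv_usec) << 12; /* < 10^6, fits 20 bits */
    sysidentifier |= getpid() & 0xFFF;              /* low 12 bits of PID  */

    printf("%016llx\n", (unsigned long long) sysidentifier);
    return 0;
}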

regards, tom lane




Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table

2014-04-26 Thread Josh Berkus
On 04/26/2014 11:18 AM, Tom Lane wrote:
> It's worth noting in this connection that we've never tried hard to ensure
> that database identifiers are actually unique.  One potentially serious
> issue is that slave servers will have the same identifier as their master.

Yeah, this is one of those things I've been thinking about.  The problem
is that we need a node ID, which identifies the PostgreSQL instance,
and a dataset ID, which identifies the chain of data, especially when
combined with the timeline ID.  So a master and replica would have
different node IDs, but the same dataset ID, until the replica is
promoted, at which point its dataset ID + timeline No. would change.
This would allow for relatively easy management of large clusters by
allowing automated identification of databases and their mirrors.

However, there's a fundamental problem with the concept of the dataset
ID in that there's absolutely no way for PostgreSQL to know when it has
a unique dataset.  Consider a database file clone taken during downtime,
for example; the two databases would have the same identifier and yet both
be standalones which quickly diverge.  So I haven't thought of a good
solution to that.

We could implement a NodeID based on some combination of IP/MAC
address and port, though.  Not entirely reliable, but better than nothing ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table

2014-04-26 Thread Josh Berkus
On 04/25/2014 11:46 AM, David Fetter wrote:
> On Fri, Apr 25, 2014 at 10:58:29AM -0700, Josh Berkus wrote:
>> You may say "oh, that's not the job of the identifier," but if it's not,
>> WTF is the identifier for, then?
> 
> Frequently, it's to provide some kind of opacity in the sense of not
> having an obvious predecessor or successor.

A far better solution to that is to not share the unadorned ID with the
user.

Basically, there's two different reasons to offer UUIDs in PostgreSQL:

1) because they actually serve a useful purpose in providing a globally
unique identifier;

2) because they work well with existing platforms and frameworks.

Given the state of the art, the above two goals are separate and
exclusive, apologists for poorly conceived UUID algorithms
notwithstanding.  So either we provide a UUID type which actually helps
identify unique entities between database servers, OR we supply a UUID
which just works with popular web frameworks, or we supply both *as
two or more different types*.  But claiming that types chosen because
they're popular are also technically sound is misleading at best.

Further, based on our experience with OSSP, if we're going to make a
UUID type in core because it's currently popular, we'd better be pretty
sure that it's still going to be popular 5 or 10 years from now.
Otherwise we're better off keeping it an extension.

I personally am interested in a UUID type which would support doing
multi-master replication of JSON databases built on PostgreSQL, and will
probably write one if nobody else does first, and I don't see existing,
naive randomization-based UUIDs as ever filling that role adequately.
Although, as I said, Andres' work in this area may have already taken
care of this.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Josh Berkus
On 04/26/2014 11:06 AM, David Fetter wrote:
> I know we allow for gigantic numbers of backend connections, but I've
> never found a win for > 2x the number of cores in the box, which at
> least in my experience so far tops out in the 8-bit (in extreme cases
> unsigned 8-bit) range.

For my part, I've found that anything over a few hundred backends on a
commodity server leads to serious performance degradation.  Even 2000 is
enough to make most servers fall over.  And with proper connection
pooling, I can pump 30,000 queries per second through about 45
connections, so the clear path to supporting large numbers of
connections is some form of built-in pooling.

However, I agree with Tom that Andres should show his hand before we
decrease MAX_BACKENDS by 256X.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Andres Freund
On 2014-04-26 13:16:38 -0700, Josh Berkus wrote:
> However, I agree with Tom that Andres should show his hand before we
> decrease MAX_BACKENDS by 256X.

I just don't want to invest time in developing and benchmarking
something that's not going to be accepted anyway. Thus my question.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

2014-04-26 Thread Tomas Vondra
On 25.4.2014 23:26, Tom Lane wrote:
> Tomas Vondra <t...@fuzzy.cz> writes:
>> On 23.4.2014 16:07, Tom Lane wrote:
>>> To be concrete: let's add a new boolean parameter with the
>>> semantics of "final function takes extra dummy arguments"
>>> (default false). There would need to be one for the separate
>>> moving-aggregate final function too, of course.
> 
>> Do we really need a separate parameter for this? Couldn't this be
>> decided simply using the signature of the final function? Either
>> it has a single parameter (current behavior), or it has the same
>> parameters as the state transition function (new behavior).
> 
> The problem is that the CREATE AGGREGATE syntax only specifies the
> name of the final function, not its argument list, so you have to
> make an assumption about the argument list in order to look up the
> final function in the first place.
> 
> I did consider the idea of looking for both signatures and using
> whatever we find, but that seems fairly dangerous: the same CREATE
> AGGREGATE command could give different results depending on what
> versions of the final function happen to exist. This would create an
> ordering hazard that pg_dump could not reliably cope with, for
> example.

Yeah. And it wouldn't be clear which function to use in case two
suitable functions (with different signatures) exist. So I guess this
actually requires a parameter.

I'd vote for finalfunc_extra - can't think of a better name, and I'm
not sure what the m in mfinalfunc_extra stands for.

regards
Tomas




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Noah Misch
On Sat, Apr 26, 2014 at 11:20:56AM -0400, Tom Lane wrote:
> Andres Freund <and...@2ndquadrant.com> writes:
>> What I think it's necessary for is at least:
> 
>> * Move the buffer content lock inline into the buffer descriptor,
>>   while still fitting into one cacheline.
>> * lockless/atomic Pin/Unpin Buffer.
> 
> TBH, that argument seems darn weak, not to mention probably applicable
> only to current-vintage Intel chips.  And you have not proven that
> narrowing the backend ID is necessary to either goal, even if we
> accepted that these goals were that important.
> 
> While I agree with you that it seems somewhat unlikely we'd ever get
> past 2^16 backends, these arguments are not nearly good enough to
> justify a hard-wired limitation.

I'm satisfied with the arguments Andres presented, which I presume were weak
only because he didn't expect a staunch defense of max_connections=7 use.
The new restriction will still permit settings an order of magnitude larger
than current *worst* practice and 2-3 orders of magnitude larger than current
good practice.  If the next decade sees database server core counts grow by
two orders of magnitude or sees typical cache architectures change enough to
make the compactness irrelevant, we'll have the usual opportunities to react.
Today, the harm from contention on buffer headers totally eclipses the benefit
of allowing max_connections=7.  There's no cause to predict a hardware
development radical enough to change that conclusion.

Sure, let's not actually commit a patch to impose this limit until the first
change benefiting from doing so is ready to go.  There remains an opportunity
to evaluate whether that beneficiary change is better done a different way.
By having this thread to first settle that the new max_connections limit is
essentially okay, the eventual thread concerning lock-free pin manipulation
need not inflate from discussion of this side issue.

On Sat, Apr 26, 2014 at 11:22:39AM -0400, Tom Lane wrote:
> And next week when we need some other field in a buffer header,
> what's going to happen?  If things are so tight that we need to
> shave a few bits off backend IDs, the whole thing is a house of
> cards anyway.

The buffer header has seen one change in nine years.  Making it an inviting
site for future patches is not important.

nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

2014-04-26 Thread Tom Lane
Tomas Vondra <t...@fuzzy.cz> writes:
> On 25.4.2014 23:26, Tom Lane wrote:
>> The problem is that the CREATE AGGREGATE syntax only specifies the
>> name of the final function, not its argument list, so you have to
>> make an assumption about the argument list in order to look up the
>> final function in the first place.

> Yeah. And it wouldn't be clear which function to use in case two
> suitable functions (with different signatures) exist. So I guess this
> actually requires a parameter.

Exactly.

> I'd vote for finalfunc_extra - can't think of a better name, and I'm
> not sure what the m in mfinalfunc_extra stands for.

Sorry for not being clear.  The m version is the alternate setting for
the moving-aggregate sub-implementation, which is new as of a couple weeks
ago:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a9d9acbf219b9e96585779cd5f99d674d4ccba74

regards, tom lane




[HACKERS] make check-world problem

2014-04-26 Thread Vladimir Koković

Hi,

Thanks Tom, postgresql source now belongs to user 'postgres' and make  
check-world passed.


But, installcheck-world failed:

tail /home/src/postgresql-devel/dev-build/make-install-out-dev.log

../../../src/test/regress/pg_regress
--inputdir=/home/src/postgresql-devel/postgresql-git/postgresql/src/test/regress
--psqldir='/home/src/postgresql-devel/dev-install/bin' --dlpath=.
--schedule=/home/src/postgresql-devel/postgresql-git/postgresql/src/test/regress/serial_schedule

(using postmaster on Unix socket, default port)
== dropping database regression ==
sh: 1: /home/src/postgresql-devel/dev-install/bin/psql: not found
command failed: "/home/src/postgresql-devel/dev-install/bin/psql" -X -c "DROP DATABASE IF EXISTS \"regression\"" "postgres"

make[2]: *** [installcheck] Error 2
make[2]: Leaving directory  
`/home/src/postgresql-devel/dev-build/src/test/regress'

make[1]: *** [installcheck-regress-recurse] Error 2
make[1]: Leaving directory `/home/src/postgresql-devel/dev-build/src/test'
make: *** [installcheck-world-src/test-recurse] Error 2

Best regards
Vladimir Kokovic DP senior




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Tom Lane
Noah Misch <n...@leadboat.com> writes:
> On Sat, Apr 26, 2014 at 11:20:56AM -0400, Tom Lane wrote:
>> While I agree with you that it seems somewhat unlikely we'd ever get
>> past 2^16 backends, these arguments are not nearly good enough to
>> justify a hard-wired limitation.

> I'm satisfied with the arguments Andres presented, which I presume were weak
> only because he didn't expect a staunch defense of max_connections=7 use.
> The new restriction will still permit settings an order of magnitude larger
> than current *worst* practice and 2-3 orders of magnitude larger than current
> good practice.  If the next decade sees database server core counts grow by
> two orders of magnitude or sees typical cache architectures change enough to
> make the compactness irrelevant, we'll have the usual opportunities to react.
> Today, the harm from contention on buffer headers totally eclipses the benefit
> of allowing max_connections=7.  There's no cause to predict a hardware
> development radical enough to change that conclusion.

Well, let me clarify my position: I'm not against reducing MAX_BACKENDS
if we get a significant improvement by doing so.  But the case for that
has not been made.

>> And next week when we need some other field in a buffer header,
>> what's going to happen?  If things are so tight that we need to
>> shave a few bits off backend IDs, the whole thing is a house of
>> cards anyway.

> The buffer header has seen one change in nine years.  Making it an inviting
> site for future patches is not important.

We were just a few days ago discussing (again) making changes to the
buffer allocation algorithms.  It hardly seems implausible that any
useful improvements there might need new or different fields in the
buffer headers.

regards, tom lane




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Peter Geoghegan
On Sat, Apr 26, 2014 at 1:30 PM, Noah Misch <n...@leadboat.com> wrote:
> Sure, let's not actually commit a patch to impose this limit until the first
> change benefiting from doing so is ready to go.  There remains an opportunity
> to evaluate whether that beneficiary change is better done a different way.
> By having this thread to first settle that the new max_connections limit is
> essentially okay, the eventual thread concerning lock-free pin manipulation
> need not inflate from discussion of this side issue.

I agree with your remarks here. This kind of thing is only going to
become more important.

> On Sat, Apr 26, 2014 at 11:22:39AM -0400, Tom Lane wrote:
>> And next week when we need some other field in a buffer header,
>> what's going to happen?  If things are so tight that we need to
>> shave a few bits off backend IDs, the whole thing is a house of
>> cards anyway.
> 
> The buffer header has seen one change in nine years.  Making it an inviting
> site for future patches is not important.

My prototype caching patch, which seems promising to me, adds an
instr_time to the BufferDesc struct. While that's obviously something
that isn't acceptable, and while I obviously could do better, it still
strikes me that that is the natural place to put such a piece of
state. That doesn't mean it's the best place, but it's still a point
worth noting in the context of this discussion.

As I mention on the thread concerning that work, the LRU-K paper
recommends a time-based delay throttling incrementation of usage_count
to address the problem of correlated references (5 seconds is
suggested there). At least one other major system implements a
configurable delay defaulting to 3 seconds. The 2Q paper also suggests
a correlated reference period.
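
In sketch form, the delay described above amounts to ignoring touches that
arrive within the correlated-reference window (hypothetical fields; the
5-second window follows the LRU-K suggestion, and the cap of 5 mirrors the
usual usage-count maximum):

#include <stdint.h>
#include <time.h>

typedef struct
{
    uint16_t usage_count;
    time_t   last_touch;
} SketchBuf;

#define CORRELATED_REF_PERIOD 5         /* seconds */
#define MAX_USAGE_COUNT       5

/* Count a burst of pins within the window as a single touch. */
static void
touch_buffer(SketchBuf *buf)
{
    time_t now = time(NULL);

    if (now - buf->last_touch >= CORRELATED_REF_PERIOD)
    {
        if (buf->usage_count < MAX_USAGE_COUNT)
            buf->usage_count++;
        buf->last_touch = now;
    }
}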

-- 
Peter Geoghegan




Re: [HACKERS] make check-world problem

2014-04-26 Thread Tom Lane
Vladimir Koković <vladimir.koko...@a-asoft.com> writes:
> Thanks Tom, postgresql source now belongs to user 'postgres' and make
> check-world passed.

> But, installcheck-world failed:

installcheck-world is supposed to test against an installed, running
server.  So you need to do make install-world (not to mention initdb
and starting the postmaster) first.  This looks like you didn't:

> sh: 1: /home/src/postgresql-devel/dev-install/bin/psql: not found

In practice, if you've done check-world, I don't see a lot of value in
doing installcheck-world as well.  (Unless you're checking a packaging
process, but in that case you'd want to construct and install the package,
not just do make install.)

regards, tom lane




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Peter Geoghegan
On Sat, Apr 26, 2014 at 1:58 PM, Peter Geoghegan <p...@heroku.com> wrote:
> The 2Q paper also suggests a correlated reference period.

I withdraw this. 2Q in fact does not have such a parameter, while
LRU-K does. But the other major system I mentioned very explicitly has
a configurable delay that serves this exact purpose. This prevents a
burst of pins on a buffer counting as many touches. The point is that
this approach is quite feasible, and may even be the best way of
addressing the general problem of correlated references.


-- 
Peter Geoghegan




Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table

2014-04-26 Thread Jim Nasby

On 4/25/14, 12:58 PM, Josh Berkus wrote:

> Well, I've already had collisions with UUID-OSSP, in production, with
> only around 20 billion values.  So clearly there aren't 122 bits of true
> randomness in OSSP.  I can't speak for other implementations because I
> haven't tried them.


Or perhaps you should be buying lottery tickets? ;)

Can you write this up in a blog post? I've argued with people more than once
about why it's a bad idea to trust 1-in-a-bazillion odds to protect your data
(though, usually in the context of SHA1), and it'd be good to be able to point
at a real world example of this failing.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] How can we make beta testing better?

2014-04-26 Thread Jim Nasby

On 4/17/14, 6:42 PM, Josh Berkus wrote:

> So we have some software we've been procrastinating on OSS'ing, which does:
> 
> 1) Takes full query CSV logs from a running postgres instance
> 2) Runs them against a target instance in parallel
> 3) Records response times for all queries


Is that the stuff you'd worked on for us forever ago? I thought that was just 
pgreplay based, but now I don't remember.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-26 Thread Jim Nasby

On 4/22/14, 5:01 PM, Alfred Perlstein wrote:


> Hey folks, I just spoke with our director of netops Tom Sparks here at Norse
> and we have a vested interest in Postgresql.  We can throw together a cluster
> of 4 machines with specs approximately in the range of dual quad core westmere
> with ~64GB of ram running FreeBSD 10 or 11. We can also do an Ubuntu install
> as well or another Linux distro.  Please let me know if this would be
> something that the project could make use of.
> 
> We also have colo space and power, etc.  So this would be the whole deal.  The
> cluster would be up for as long as needed.
> 
> Are the machine specs sufficient?  Any other things we should look for?
> 
> CC'd Tom on this email.


Did anyone respond to this off-list?

Would these machines be more useful as dedicated performance test servers for 
the community or generic BenchFarm members?
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-26 Thread Jim Nasby

On 4/26/14, 1:27 PM, Andres Freund wrote:

> I don't think we need to decide this without benchmarks proving the
> benefits. I basically want to know whether somebody has an actual
> usecase - even if I really, really, can't think of one - of setting
> max_connections even remotely that high. If there's something
> fundamental out there that'd make changing the limit impossible, doing
> benchmarks wouldn't be worthwhile.


Stupid question... how many OSes would actually support 65k active processes, 
let alone 2^24?
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-26 Thread Stephen Frost
Jim,

* Jim Nasby (j...@nasby.net) wrote:
> On 4/22/14, 5:01 PM, Alfred Perlstein wrote:
>> We also have colo space and power, etc.  So this would be the whole deal.
>> The cluster would be up for as long as needed.
>> 
>> Are the machine specs sufficient?  Any other things we should look for?
>> 
>> CC'd Tom on this email.
> 
> Did anyone respond to this off-list?

Yes, I did follow-up with Tom.  I'll do so again, as the discussion had
died down.

> Would these machines be more useful as dedicated performance test servers for
> the community or generic BenchFarm members?

I don't believe they would be terribly useful as buildfarm systems; we
could set up similar systems with VMs to just run the regression tests.
Where I see these systems being particularly valuable would be as the
start of our performance farm, and perhaps one of the systems as a PG
infrastructure server.

Thanks!

Stephen




Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-26 Thread Alfred Perlstein
JFYI we have 3 or 4 machines racked for the pgsql project in our DC. 

Tom informed me he would be lighting them up this week time permitting.  

Sent from my iPhone

> On Apr 26, 2014, at 6:15 PM, Stephen Frost <sfr...@snowman.net> wrote:
> 
> Jim,
> 
> * Jim Nasby (j...@nasby.net) wrote:
>> On 4/22/14, 5:01 PM, Alfred Perlstein wrote:
>>> We also have colo space and power, etc.  So this would be the whole deal.
>>> The cluster would be up for as long as needed.
>>> 
>>> Are the machine specs sufficient?  Any other things we should look for?
>>> 
>>> CC'd Tom on this email.
>> 
>> Did anyone respond to this off-list?
> 
> Yes, I did follow-up with Tom.  I'll do so again, as the discussion had
> died down.
> 
>> Would these machines be more useful as dedicated performance test servers
>> for the community or generic BenchFarm members?
> 
> I don't believe they would be terribly useful as buildfarm systems; we
> could set up similar systems with VMs to just run the regression tests.
> Where I see these systems being particularly valuable would be as the
> start of our performance farm, and perhaps one of the systems as a PG
> infrastructure server.
> 
> Thanks!
> 
> Stephen




Re: [HACKERS] Hashable custom types

2014-04-26 Thread Greg Stark
On Sat, Apr 26, 2014 at 6:39 PM, Atri Sharma <atri.j...@gmail.com> wrote:
> Without sorting, isn't the scope of a recursive UNION with custom datatypes
> pretty restrictive?

All the default data types are hashable. It's not hard to add a hash
operator class. In a clean slate design it would probably have been
simpler to just make it a requirement that any data type provide a
default hash operator (and probably a default btree comparator).
Postgres provides a lot of degrees of freedom but it should probably
be considered best practice to just provide both even if you don't
envision one or the other being used directly by users for indexes.


-- 
greg




Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-26 Thread Stephen Frost
Alfred,

* Alfred Perlstein (alf...@freebsd.org) wrote:
> JFYI we have 3 or 4 machines racked for the pgsql project in our DC.

Oh, great!

> Tom informed me he would be lighting them up this week, time permitting.

Excellent, many thanks!

Stephen




Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table

2014-04-26 Thread Greg Stark
On Sat, Apr 26, 2014 at 8:58 PM, Josh Berkus <j...@agliodbs.com> wrote:
> However, there's a fundamental problem with the concept of the dataset
> ID in that there's absolutely no way for PostgreSQL to know when it has
> a unique dataset.  Consider a database file clone taken during downtime,
> for example; the two databases would have the same identifier and yet both
> be standalones which quickly diverge.  So I haven't thought of a good
> solution to that.

If you're content to use random numbers then you could generate one
from system entropy on every startup. If you generated a new timeline
for every startup then the pair of system id and random startup id
(which would be the new timelineid) would let you look at any two
instances and determine if they're related and where they diverged
even if it was from a database clone.

I don't think MAC address or other hardware identifiers really saves
you from using system entropy anyways. You might very well install a
clone on the same machine and in an environment like Heroku you could
very easily end up restoring a database onto the same VM twice
entirely by accident. I actually think using /dev/urandom is a better
idea than depending on things like MAC address almost always.
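
A minimal sketch of drawing such a per-startup identifier from system
entropy (assumes a Unix /dev/urandom; error handling kept deliberately
trivial):

#include <stdint.h>
#include <stdio.h>

/* Returns 0 on success, filling *id with 64 random bits. */
static int
random_startup_id(uint64_t *id)
{
    FILE  *f = fopen("/dev/urandom", "rb");
    size_t n;

    if (f == NULL)
        return -1;
    n = fread(id, sizeof(*id), 1, f);
    fclose(f);
    return (n == 1) ? 0 : -1;
}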

-- 
greg




[HACKERS] Should pg_stat_bgwriter.buffers_backend_fsync be removed?

2014-04-26 Thread Peter Geoghegan
Backend fsyncs are theoretically still possible after the fsync
request queue compaction patch (which was subsequently back-patched to
all supported release branches). However, I'm reasonably confident
that that patch was so effective as to make a backend fsync all but
impossible. As such, it seems like the buffers_backend_fsync column in
the pg_stat_bgwriter view is more or less obsolete.

I suggest removing it for 9.5, and instead logging individual
occurrences of backend fsync requests within ForwardFsyncRequest(). It
seems fair to treat that as an anomaly to draw particular attention
to.

-- 
Peter Geoghegan




Re: [HACKERS] Hashable custom types

2014-04-26 Thread Tom Lane
Greg Stark <st...@mit.edu> writes:
> On Sat, Apr 26, 2014 at 6:39 PM, Atri Sharma <atri.j...@gmail.com> wrote:
>> Without sorting, isn't the scope of a recursive UNION with custom datatypes
>> pretty restrictive?

> All the default data types are hashable. It's not hard to add a hash
> operator class. In a clean slate design it would probably have been
> simpler to just make it a requirement that any data type provide a
> default hash operator (and probably a default btree comparator).
> Postgres provides a lot of degrees of freedom but it should probably
> be considered best practice to just provide both even if you don't
> envision one or the other being used directly by users for indexes.

A btree opclass requires that you invent some one-dimensional sort order
for the datatype, which might be a difficult thing; so I think it's fully
reasonable not to require datatypes to have btree support.  Hashing
doesn't require any semantic assumptions beyond having an equality rule,
which is clearly *necessary* if you want to do stuff like UNION or
DISTINCT.  So from that standpoint it's perfectly reasonable for recursive
UNION to require a hashable equality operator, whereas the other case of
requiring a sortable operator would be a lot harder to defend.

Having said that, I can also believe that there might be datatypes for
which implementing a hash function would be a lot harder than implementing
sorting; this could be true if your equality rule allows for a lot of
different physical representations of equal values.  But I'm not so
excited about such cases that I want to do the work of figuring out a
way to implement recursive UNION by sorting.

regards, tom lane




Re: [HACKERS] Should pg_stat_bgwriter.buffers_backend_fsync be removed?

2014-04-26 Thread Tom Lane
Peter Geoghegan <p...@heroku.com> writes:
> Backend fsyncs are theoretically still possible after the fsync
> request queue compaction patch (which was subsequently back-patched to
> all supported release branches). However, I'm reasonably confident
> that that patch was so effective as to make a backend fsync all but
> impossible.

What's your evidence for that claim?

regards, tom lane




Re: [HACKERS] Should pg_stat_bgwriter.buffers_backend_fsync be removed?

2014-04-26 Thread Peter Geoghegan
On Sat, Apr 26, 2014 at 9:16 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Peter Geoghegan <p...@heroku.com> writes:
>> Backend fsyncs are theoretically still possible after the fsync
>> request queue compaction patch (which was subsequently back-patched to
>> all supported release branches). However, I'm reasonably confident
>> that that patch was so effective as to make a backend fsync all but
>> impossible.
> 
> What's your evidence for that claim?

I don't have any evidence, but I think it unlikely that this is an
occurrence that is seen in the real world. I was not able to see any
instances of it on the entire Heroku fleet at one point a few months
back, for one thing. For another, I have never observed this with any
benchmark, even though pgbench-tools presents buffers_backend_fsync
for each test run. The queue compaction patch completely fixed Greg
Smith's original test case. Even then, I believe it was considered
more of a patch addressing an edge case than anything else. Even the
comments above ForwardFsyncRequest() consider the occurrence of backend
fsyncs to be only theoretically possible.

If anyone is aware of any cases where this is still actually known to
happen in production, I'd like to hear about them. However, ISTM that
if this actually does still happen, those cases would be better served
by surfacing the problem in the logs.

-- 
Peter Geoghegan

