Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte

2012-05-22 Thread Tatsuo Ishii
Hi Alexander,

It was good seeing you in Ottawa!

 Hello, Ishii-san!
 
 We talked at PGCon about my questions regarding the mule-to-wchar
 conversion. My questions about the pg_mule2wchar_with_len function are
 the following. In these parts of the code:
 else if (IS_LCPRV1(*from) && len >= 3)
 {
 	from++;
 	*to = *from++ << 16;
 	*to |= *from++;
 	len -= 3;
 }
 
 and
 
 else if (IS_LCPRV2(*from) && len >= 4)
 {
 	from++;
 	*to = *from++ << 16;
 	*to |= *from++ << 8;
 	*to |= *from++;
 	len -= 4;
 }
 
 we skip the first character of the original string. Are we able to restore it
 back from pg_wchar?

I think it's possible. The first characters are defined like this:

#define IS_LCPRV1(c)	((unsigned char)(c) == 0x9a || (unsigned char)(c) == 0x9b)
#define IS_LCPRV2(c)	((unsigned char)(c) == 0x9c || (unsigned char)(c) == 0x9d)

It seems IS_LCPRV1 is not used in any of the PostgreSQL-supported
encodings at this point, which means there's zero chance that existing
databases include LCPRV1. So you could safely ignore it.

For IS_LCPRV2, it is only used for Chinese encodings (EUC_TW and BIG5)
in backend/utils/mb/conversion_procs/euc_tw_and_big5/euc_tw_and_big5.c
and it is fixed to 0x9d.  So you can always restore the value to 0x9d.
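
For illustration, here is a minimal standalone sketch (hypothetical code of
mine, not the actual patch) of how the LCPRV2 case can be inverted, given
that the prefix byte is always 0x9d:

#include <stdio.h>

/* Sketch: rebuild the 4-byte LCPRV2 multibyte sequence from a pg_wchar.
 * The prefix byte was dropped during the forward conversion, but since
 * it is fixed at 0x9d it can simply be re-emitted. */
static int
pg_wchar2mule_lcprv2(unsigned int wc, unsigned char *out)
{
	out[0] = 0x9d;					/* the skipped prefix byte */
	out[1] = (wc >> 16) & 0xff;		/* leading byte (charset id) */
	out[2] = (wc >> 8) & 0xff;		/* second byte */
	out[3] = wc & 0xff;				/* third byte */
	return 4;						/* bytes written */
}

int
main(void)
{
	unsigned char buf[4];
	int		n = pg_wchar2mule_lcprv2(0x00f0a1a2, buf);

	printf("%d bytes: %02x %02x %02x %02x\n", n, buf[0], buf[1], buf[2], buf[3]);
	return 0;
}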

 Also in this part of the code we're shifting the first byte by 16 bits:
 
 if (IS_LC1(*from) && len >= 2)
 {
 	*to = *from++ << 16;
 	*to |= *from++;
 	len -= 2;
 }
 else if (IS_LCPRV1(*from) && len >= 3)
 {
 	from++;
 	*to = *from++ << 16;
 	*to |= *from++;
 	len -= 3;
 }
 
 Why don't we shift it by 8 bits?

Because we want the first byte of the LC1 case to be placed in the second
byte of the wchar, i.e.

0th byte: always 0
1st byte: leading byte (the first byte of the multibyte)
2nd byte: always 0
3rd byte: the second byte of the multibyte

Note that we always assume that the 1st byte (called the leading byte,
LB for short) represents the id of the character set (from 0x81 to
0xff) in MULE INTERNAL encoding. For the mapping between LB and
charsets, see pg_wchar.h.
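
As a quick check, this standalone snippet (my illustration, not PostgreSQL
source) shows how a hypothetical two-byte LC1 character 0x90 0xa1 lands in
that layout:

#include <stdio.h>

int
main(void)
{
	unsigned char from[2] = {0x90, 0xa1};	/* leading byte, second byte */
	unsigned int to = ((unsigned int) from[0] << 16) | from[1];

	/* Prints 0x009000a1: byte 0 = 0x00, byte 1 = the leading byte,
	 * byte 2 = 0x00, byte 3 = the second byte, as described above. */
	printf("0x%08x\n", to);
	return 0;
}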

 You can see my patch in this thread where I propose purely mechanical
 changes in this function which make inverse conversion possible.
 
 --
 With best regards,
 Alexander Korotkov.



Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier

2012-05-22 Thread Simon Riggs
On 22 May 2012 06:50, Tom Lane t...@sss.pgh.pa.us wrote:

 Currently, the planner keeps paths that appear to win on the grounds of
 either cheapest startup cost or cheapest total cost.  It suddenly struck
 me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor
 fast-start preference, etc) we could know a-priori that cheapest startup
 cost is not going to be interesting, and hence immediately discard any
 path that doesn't win on total cost.

My experience is that most people don't provide a LIMIT explicitly
even when they know that's the desired behaviour. That's because
either they simply don't understand that SQL can return lots of rows,
or their SQL knowledge isn't enough, or, worse, they don't even know
that specifying it would alter query plans.

Regrettably, the current planning of LIMIT clauses causes more problems,
so in many cases these clauses have been explicitly removed from SQL by
developers who know how many rows they wish to see.

I would have proposed a default-LIMIT parameter before now, but for
that last point.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] heap metapages

2012-05-22 Thread Simon Riggs
On 22 May 2012 02:50, Robert Haas robertmh...@gmail.com wrote:

 Not very sure why a metapage is better than a catalog table.

 Mostly because there's no chance of the startup process accessing a
 catalog table during recovery, but it can read a metapage.

OK, sounds reasonable.

Based upon all you've said, I'd suggest that we make a new kind of
fork, in a separate file for this, .meta. But we also optimise the VM
and FSM in the way you suggest so that we can replace .vm and .fsm
with just .meta in most cases. Big tables would get a .vm and .fsm
appearing when they get big enough, but that won't challenge the inode
limits. When .vm and .fsm do appear, we remove that info from the
metapage - that means we keep all code as it is currently, except for
an optimisation of .vm and .fsm when those are small enough to do so.

We can watermark data files using special space on block zero using
some code to sneak that in when the page is next written, but that is
regarded as optional, rather than an essential aspect of an
upgrade/normal operation.

Having pg_upgrade touch data files is both dangerous and difficult to
back out in case of mistake, so I am wary of putting the metapage at
block 0. Doing it the way I suggest means the .meta files would be
wholly new and can be deleted as a back-out. We can also clean away
any unnecessary .vm/.fsm files as a later step.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



[HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Simon Riggs
On 21 May 2012 20:40, Stephen Frost sfr...@snowman.net wrote:

 This is important. I like the idea of breaking down the barriers
 between databases to allow it to be an option for one backend to
 access tables in multiple databases. The current mechanism doesn't
 actually prevent looking at data from other databases using internal
 APIs, so full security doesn't exist. It's a very common user
 requirement to wish to join tables stored in different databases,
 which ought to be possible more cleanly with correct privileges.

 That's really a whole different ball of wax and I don't believe what
 Robert was proposing would actually allow that to happen due to the
 other database-level things which are needed to keep everything
 consistent...  That's my understanding, anyway.  I'd be happy as anyone
 if we could actually make it work, but isn't like the SysCache stuff per
 database?  Also, cross-database queries would actually make it more
 difficult to have per-database roles, which is one thing that I was
 hoping we might be able to work into this, though perhaps we could have
 a shared roles table and a per-database roles table and only 'global'
 roles would be able to issue cross-database queries..

So collecting a few requirements from various places:

* Ability to have a Role that can only access one Database

* Allow user info to be dumped with a database, to make a db
completely self-consistent

* Allow databases to be transportable

* Allow users to access tables in >1 database easily, with appropriate rights.


I don't see any reason why these things would work against each other.

The main objectives are to make a Database a more easily used
administrative grouping. At present, people who use multiple Databases
face many problems - they aren't as separate as you'd like, but
neither can they be ignored when required.

The idea of one main database per session is fine, but wiring it so
closely into the backend has a few disadvantages, many of them weird
internal things.

Are there arguments against those requirements before we spend time on
design/thinking?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte

2012-05-22 Thread Alexander Korotkov
On Tue, May 22, 2012 at 11:50 AM, Tatsuo Ishii is...@postgresql.org wrote:

 I think it's possible. The first characters are defined like this:

 #define IS_LCPRV1(c)	((unsigned char)(c) == 0x9a || (unsigned char)(c) == 0x9b)
 #define IS_LCPRV2(c)	((unsigned char)(c) == 0x9c || (unsigned char)(c) == 0x9d)

 It seems IS_LCPRV1 is not used in any of the PostgreSQL-supported
 encodings at this point, which means there's zero chance that existing
 databases include LCPRV1. So you could safely ignore it.

 For IS_LCPRV2, it is only used for Chinese encodings (EUC_TW and BIG5)
 in backend/utils/mb/conversion_procs/euc_tw_and_big5/euc_tw_and_big5.c
 and it is fixed to 0x9d.  So you can always restore the value to 0x9d.

  Also in this part of the code we're shifting the first byte by 16 bits:
 
  if (IS_LC1(*from) && len >= 2)
  {
  	*to = *from++ << 16;
  	*to |= *from++;
  	len -= 2;
  }
  else if (IS_LCPRV1(*from) && len >= 3)
  {
  	from++;
  	*to = *from++ << 16;
  	*to |= *from++;
  	len -= 3;
  }
 
  Why don't we shift it by 8 bits?

 Because we want the first byte of the LC1 case to be placed in the second
 byte of the wchar, i.e.

 0th byte: always 0
 1st byte: leading byte (the first byte of the multibyte)
 2nd byte: always 0
 3rd byte: the second byte of the multibyte

 Note that we always assume that the 1st byte (called the leading byte,
 LB for short) represents the id of the character set (from 0x81 to
 0xff) in MULE INTERNAL encoding. For the mapping between LB and
 charsets, see pg_wchar.h.


Thanks for your comments. They clarify a lot.
But I still don't see how we can distinguish IS_LCPRV2 from IS_LC2.
Isn't it possible for them to produce the same pg_wchar?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread José Luis Tallón

On 22/05/12 11:46, Simon Riggs wrote:

On 21 May 2012 20:40, Stephen Frost sfr...@snowman.net wrote:

This is important. I like the idea of breaking down the barriers
between databases to allow it to be an option for one backend to
access tables in multiple databases. The current mechanism doesn't
actually prevent looking at data from other databases using internal
APIs, so full security doesn't exist. It's a very common user
requirement to wish to join tables stored in different databases,
which ought to be possible more cleanly with correct privileges.

That's really a whole different ball of wax and I don't believe what
Robert was proposing would actually allow that to happen due to the
other database-level things which are needed to keep everything
consistent...  That's my understanding, anyway.  I'd be happy as anyone
if we could actually make it work, but isn't like the SysCache stuff per
database?  Also, cross-database queries would actually make it more
difficult to have per-database roles, which is one thing that I was
hoping we might be able to work into this, though perhaps we could have
a shared roles table and a per-database roles table and only 'global'
roles would be able to issue cross-database queries..


IMVHO:  s/database/schema/g does resolve many of the problems that you 
were referring to... and 'dblink' should solve the rest, right?
Please, feel free to point out what I am (most probably) not considering 
-- not experienced enough yet :)


On the other hand, the separation of databases allows what otherwise 
would only be possible by using multiple instances of the database 
server (à la Oracle, AFAIK) -- save for resource management, but that 
is another question altogether.



So collecting a few requirements from various places:

* Ability to have a Role that can only access one Database


Yes, please


* Allow user info to be dumped with a database, to make a db
completely self-consistent

+1


* Allow databases to be transportable
+1. Ideally, the binary format could be made platform-independent, so 
that a snapshot/rsync of the cluster can span architectures easily.
AFAIK, an endianness change is relatively cheap on current processors [1 
ASM instruction?] and it's not like we are memory-mapping tuples anyway 
(TOASTed values can certainly not be mapped), so it shouldn't be 
noticeable performance-wise.
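
As a rough check of that claim, a 32-bit byte swap does compile to a single
BSWAP instruction on x86 with GCC or Clang (assuming the __builtin_bswap32
builtin, which both compilers provide):

#include <stdint.h>
#include <stdio.h>

/* Compiles to one BSWAP instruction on x86 at -O2. */
static inline uint32_t
swap32(uint32_t x)
{
	return __builtin_bswap32(x);
}

int
main(void)
{
	printf("0x%08x\n", swap32(0x11223344u));	/* -> 0x44332211 */
	return 0;
}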



* Allow users to access tables in >1 database easily, with appropriate rights.


See above, but I am probably wrong ...



I don't see any reasons why these things would be against each other.


They look quite orthogonal to me.


The main objectives are to make a Database a more easily used
administrative grouping. At present, people who use multiple Databases
face many problems - they aren't as separate as you'd like, but
neither can they be ignored when required.

The idea of one main database per session is fine, but wiring it so
closely into the backend has a few disadvantages, many of them weird
internal things.

Are there arguments against those requirements before we spend time on
design/thinking?


OTOH, the postmaster/cluster and session/database coupling looks clean 
and simple to me... and seems to make the code simpler. This can only be 
good (but again, I don't know enough yet to be sure).


Regards,

Jose Luis Tallon






Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier

2012-05-22 Thread PostgreSQL - Hans-Jürgen Schönig

On May 22, 2012, at 9:57 AM, Simon Riggs wrote:

 On 22 May 2012 06:50, Tom Lane t...@sss.pgh.pa.us wrote:
 
 Currently, the planner keeps paths that appear to win on the grounds of
 either cheapest startup cost or cheapest total cost.  It suddenly struck
 me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor
 fast-start preference, etc) we could know a-priori that cheapest startup
 cost is not going to be interesting, and hence immediately discard any
 path that doesn't win on total cost.
 
 My experience is that most people don't provide a LIMIT explicitly
 even when they know that's the desired behaviour. That's because
 either they simply don't understand that SQL can return lots of rows,
 or their SQL knowledge isn't enough, or, worse, they don't even know
 that specifying it would alter query plans.
 
 Regrettably, the current planning of LIMIT clauses causes more problems,
 so in many cases these clauses have been explicitly removed from SQL by
 developers who know how many rows they wish to see.
 
 I would have proposed a default-LIMIT parameter before now, but for
 that last point.


this sounds like a total disaster to me ... 
why in the world should we have a default LIMIT parameter? 
i guess if somebody is not able to use LIMIT he'd better not touch the DB.
we clearly cannot fix incompetence by adding parameters. 

regards,

hans


--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de




Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Simon Riggs
On 22 May 2012 12:05, José Luis Tallón jltal...@nosys.es wrote:

 IMVHO:  s/database/schema/g does resolve many of the problems that you were
 referring to... and 'dblink' should solve the rest, right?
 Please, feel free to point out what I am (most probably) not considering --
 not experienced enough yet :)

The choice of schema/database is an important one. If you get it
wrong, you are in major difficulty. In many cases schemas would be a
better choice, but not in all cases. So I'm interested in solving the
problems for people who have multiple databases on the same server.

dblink is the only solution, but it's a very poor way to do this when we
have 2 databases on the same server.

My thinking is that reaching out to multiple databases is actually
mostly easy, except in a few places where dbid is hardwired into the
backend.

 On the other hand, the separation of databases allows what otherwise would
 only be possible by using multiple instances of the database server (à la
 Oracle, AFAIK) -- save for resource management, but that is another
 question altogether.

Separation of databases is fine. I have no intention to change that,
as long as the user wishes that.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte

2012-05-22 Thread Tatsuo Ishii
 Thanks for your comments. They clarify a lot.
 But I still don't see how we can distinguish IS_LCPRV2 from IS_LC2.
 Isn't it possible for them to produce the same pg_wchar?

If LB is in the 0x90 - 0x99 range, then it is LC2.
If LB is in the 0xf0 - 0xff range, then it is LCPRV2.
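
So a sketch of the disambiguation (hypothetical helper names of mine,
assuming the ranges above) could look like:

#include <stdio.h>

static int is_lc2_lead(unsigned char lb)    { return lb >= 0x90 && lb <= 0x99; }
static int is_lcprv2_lead(unsigned char lb) { return lb >= 0xf0 && lb <= 0xff; }

int
main(void)
{
	unsigned int wc = 0x00f5a1a2;			/* example pg_wchar */
	unsigned char lb = (wc >> 16) & 0xff;	/* leading byte sits in byte 1 */

	printf("%s\n", is_lc2_lead(lb) ? "LC2" :
		   is_lcprv2_lead(lb) ? "LCPRV2 (re-add the 0x9d prefix)" : "other");
	return 0;
}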
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier

2012-05-22 Thread Simon Riggs
On 22 May 2012 12:12, PostgreSQL - Hans-Jürgen Schönig
postg...@cybertec.at wrote:

 On May 22, 2012, at 9:57 AM, Simon Riggs wrote:

 On 22 May 2012 06:50, Tom Lane t...@sss.pgh.pa.us wrote:

 Currently, the planner keeps paths that appear to win on the grounds of
 either cheapest startup cost or cheapest total cost.  It suddenly struck
 me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor
 fast-start preference, etc) we could know a-priori that cheapest startup
 cost is not going to be interesting, and hence immediately discard any
 path that doesn't win on total cost.

 My experience is that most people don't provide a LIMIT explicitly
 even when they know that's the desired behaviour. That's because
 either they simply don't understand that SQL can return lots of rows,
 or their SQL knowledge isn't enough, or, worse, they don't even know
 that specifying it would alter query plans.

 Regrettably, the current planning of LIMIT clauses causes more problems,
 so in many cases these clauses have been explicitly removed from SQL by
 developers who know how many rows they wish to see.

 I would have proposed a default-LIMIT parameter before now, but for
 that last point.


 this sounds like a total disaster to me ...
 why in the world should we have a default LIMIT parameter?

It's common for people to write SQL they want a fast response from, yet
not specify that in their SQL code.

 i guess if somebody is not able to use LIMIT he'd better not touch the
 DB.
 we clearly cannot fix incompetence by adding parameters.

Not sure I understand that argument. Many things are set by default in
parameters.

Oracle allows the user to choose whether plans return rows quickly, or
execute plans efficiently overall. That is a relevant user choice, not
a hint. We simply don't know, when LIMIT is absent, whether the user
wishes fast or efficient plans. If they use LIMIT, we know they want a
fast plan.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Florian Pflug
On May22, 2012, at 11:46 , Simon Riggs wrote:
 * Ability to have a Role that can only access one Database
 
 * Allow user info to be dumped with a database, to make a db
 completely self-consistent

These two could be achieved by having database-local roles I think.

 * Allow databases to be transportable

That would be very useful, but extremely hard to do unless we switch to
per-database XID spaces. Or unless we're content with having databases
only be transportable after some special MAKE TRANSPORTABLE operation,
which would freeze all tuples and make the database read-only.

 * Allow users to access tables in >1 database easily, with appropriate rights.

That one I'm very sceptical about. In the long run, I think we want better
separation of databases, not less, and this requirement carries a huge risk
of standing in the way of that. Also, I think that once we integrate the 
postgres
FDW into core (that's the plan in the long run, right?), we're going to get
a good approximation of that essentially for free.

best regards,
Florian Pflug




Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Simon Riggs
On 22 May 2012 12:35, Florian Pflug f...@phlo.org wrote:

 * Allow users to access tables in >1 database easily, with appropriate
 rights.

 That one I'm very sceptical about. In the long run, I think we want better
 separation of databases, not less, and this requirement carries a huge risk
 of standing in the way of that. Also, I think that once we integrate the 
 postgres
 FDW into core (that's the plan in the long run, right?), we're going to get
 a good approximation of that essentially for free.

It's a poor approximation of it, free or not.

If it actually worked well, I'd be happy. It doesn't. No proper
transaction support, no session pool, poor planning, etc. At best
it's a band-aid, not something sufficiently good to recommend for
general production use.

FDWs are very good but aren't good enough for intra-database usage.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 7:35 AM, Florian Pflug f...@phlo.org wrote:
 * Allow users to access tables in >1 database easily, with appropriate
 rights.

 That one I'm very sceptical about. In the long run, I think we want better
 separation of databases, not less, and this requirement carries a huge risk
 of standing in the way of that. Also, I think that once we integrate the 
 postgres
 FDW into core (that's the plan in the long run, right?), we're going to get
 a good approximation of that essentially for free.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] Problem with error response message

2012-05-22 Thread Tatsuo Ishii
Hi,

The ErrorResponse message from the backend has several field type codes, including:

S
Severity: the field contents are ERROR, FATAL, or PANIC (in an
error message), or WARNING, NOTICE, DEBUG, INFO, or LOG (in a
notice message), or a localized translation of one of
these. Always present.

C
Code: the SQLSTATE code for the error (see Appendix A). Not
localizable. Always present.

If severity is ERROR, the current transaction is aborted but the
existing session continues. If it is FATAL or worse, the backend
disconnects the current session, or the postmaster even restarts all
backends (which results in session disconnection anyway). Cluster
management tools such as pgpool-II have to handle the message carefully
because there's a big difference between just aborting a transaction and
disconnecting a session. In the former case, pgpool just forwards the
error message to the frontend; in the latter, it should disconnect the
session. For this purpose, pgpool-II examines the SQLSTATE code.
Unfortunately some error codes, such as 40001 (serialization_failure),
come with different severities: ERROR or FATAL.

Here are some examples:

./tcop/postgres.c-			ereport(FATAL,
./tcop/postgres.c:					(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
./tcop/postgres.c-					 errmsg("terminating connection due to conflict with recovery"),
./tcop/postgres.c-					 errdetail_recovery_conflict()));

./commands/trigger.c-			ereport(ERROR,
./commands/trigger.c:					(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
./commands/trigger.c-					 errmsg("could not serialize access due to concurrent update")));
./commands/trigger.c-			if (!ItemPointerEquals(update_ctid, tuple.t_self))

So I tried to look at the severity code to find out whether it is a
fatal error or not. Unfortunately that is almost impossible because it
could be localized. My question is, why do we need to localize this?
Shouldn't it be a computer-recognizable message?
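
To make the problem concrete, here is a sketch (hypothetical helper, not
pgpool-II source) of scanning an ErrorResponse body -- a sequence of
one-byte field codes, each followed by a NUL-terminated string, ending
with a zero byte -- for the 'S' and 'C' fields. A localized 'S' value
defeats any string comparison at this point:

#include <stdio.h>
#include <string.h>

static void
scan_error_fields(const char *body, const char **sev, const char **code)
{
	*sev = *code = NULL;
	while (*body != '\0')
	{
		char	field = *body++;

		if (field == 'S')
			*sev = body;			/* severity: possibly localized! */
		else if (field == 'C')
			*code = body;			/* SQLSTATE: never localized */
		body += strlen(body) + 1;	/* skip value and its terminator */
	}
}

int
main(void)
{
	const char	body[] = "SFATAL\0C40001\0Mterminating connection\0";
	const char *sev, *code;

	scan_error_fields(body, &sev, &code);
	printf("severity=%s sqlstate=%s\n", sev, code);
	return 0;
}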

Also, I wonder why "conflict with recovery" generates the same error
code as a serialization error. That seems not a very consistent choice
to me. IMO, we should assign a different error code at least for errors
that bring a totally different effect to the frontend: aborting
transactions vs. aborting sessions.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread José Luis Tallón

On 22/05/12 13:24, Simon Riggs wrote:

On 22 May 2012 12:05, José Luis Tallón jltal...@nosys.es wrote:


IMVHO:  s/database/schema/g does resolve many of the problems that you were
referring to... and 'dblink' should solve the rest, right?
Please, feel free to point out what I am (most probably) not considering --
not experienced enough yet :)

The choice of schema/database is an important one. If you get it
wrong, you are in major difficulty. In many cases schemas would be a
better choice, but not in all cases. So I'm interested in solving the
problems for people who have multiple databases on the same server.


Ok. Understood.
Thank you for the clarification


dblink is the only solution, but it's a very poor way to do this when we
have 2 databases on the same server.

My thinking is that reaching out to multiple databases is actually
mostly easy, except in a few places where dbid is hardwired into the
backend.


The only drawback I see is that it might weaken the separation.

Even though arguably a kludge, dblink could have a shortcut added, 
whereby connections to another database within the same cluster would be 
serviced directly within the backend, as opposed to opening a new db 
connection. This is effectively a fastpath within dblink, which 
optimizes a relatively common case while at the same time not losing 
generality.



On the other hand, the separation of databases allows what otherwise would
only be possible by using multiple instances of the database server (à la
Oracle, AFAIK) -- save for resource management, but that is another
question altogether.

Separation of databases is fine. I have no intention to change that,
as long as the user wishes that.


Perfect.

Thanks,

Jose Luis Tallon




Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread José Luis Tallón

On 22/05/12 13:47, Simon Riggs wrote:

On 22 May 2012 12:35, Florian Pflug f...@phlo.org wrote:

* Allow users to access tables in >1 database easily, with appropriate rights.

That one I'm very sceptical about. In the long run, I think we want better
separation of databases, not less, and this requirement carries a huge risk
of standing in the way of that. Also, I think that once we integrate the 
postgres
FDW into core (that's the plan in the long run, right?), we're going to get
a good approximation of that essentially for free.

It's a poor approximation of it, free or not.

If it actually worked well, I'd be happy. It doesn't. No proper
transaction support, no session pool, poor planning, etc. At best
it's a band-aid, not something sufficiently good to recommend for
general production use.

- Transaction support: should be fixed, I guess.

- Session pool: is this really needed? I would handle it externally -- for 
example, by specifying a connection string to a pgPool/pgBouncer as 
opposed to directly to the db server.


- Planning: add a tunable specifying a higher cost (with an exception for 
cluster-local dblinks, which would have a lower cost), and the rest 
should be straightforward. Of course, planning wouldn't be as accurate 
(we can't access the other db's statistics in order to potentially 
rewrite conditions), but I don't think that would be a proper approach 
(separation of concerns, again).



FDWs are very good but aren't good enough for intra-database usage.


The optimization I just proposed (plus the required fixes to FDW) might 
very well solve this, while providing useful enhancements for all users, 
whether they are accessing cluster-local databases or not (or even truly 
foreign datasources)



Regards,

Jose Luis Tallon




Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Florian Pflug
On May22, 2012, at 13:47 , Simon Riggs wrote:
 On 22 May 2012 12:35, Florian Pflug f...@phlo.org wrote:
 * Allow users to access tables in >1 database easily, with appropriate
 rights.
 
 That one I'm very sceptical about. In the long run, I think we want better
 separation of databases, not less, and this requirement carries a huge risk
 of standing in the way of that. Also, I think that once we integrate the 
 postgres
 FDW into core (that's the plan in the long run, right?), we're going to get
 a good approximation of that essentially for free.
 
 It's a poor approximation of it, free or not.
 
 If it actually worked well, I'd be happy. It doesn't. No proper
 transaction support, no session pool, poor planning, etc. At best
 it's a band-aid, not something sufficiently good to recommend for
 general production use.

These all sound fixable, though. 

 FDWs are very good but aren't good enough for intra-database usage.

OTOH, cross-database queries would by design be limited to databases
within one cluster, whereas an FDW-based solution would not. I don't really see
the difference between telling people "put all your tables into one database
if you need to access them from within one session" and "put all your databases
on one server if you need to access them from within one session".

Plus, the more tightly different databases in the same cluster are coupled,
the more people will set up one cluster per database for performance reasons.
Then, when they discover they need inter-database queries after all, we'll
again have to tell them "well, then don't set things up the way you have".

If we want to make it easier for people to migrate from multiple databases
to a single database with multiple schemas, maybe we should look into allowing
nested schemas? AFAIK, the main reason not to do that is the ambiguity in the
meaning of identifiers, which cross-database queries would also have to deal
with.

best regards,
Florian Pflug




Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 1:50 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Currently, the planner keeps paths that appear to win on the grounds of
 either cheapest startup cost or cheapest total cost.  It suddenly struck
 me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor
 fast-start preference, etc) we could know a-priori that cheapest startup
 cost is not going to be interesting, and hence immediately discard any
 path that doesn't win on total cost.

 This would require some additional logic to detect whether the case
 applies, as well as extra complexity in add_path.  So it's possible
 that it wouldn't be worthwhile overall.  Still, it seems like it might
 be a useful idea to investigate.

 Thoughts?

Yeah, I think we should investigate that.  Presumably you could easily
have a situation where one part of the tree is under a LIMIT or EXISTS
and therefore needs to preserve fast-start plans but the rest of the
(potentially large) tree isn't, so we need something fairly
fine-grained, I think.  Maybe we could add a flag to each RelOptInfo
indicating whether fast-start plans should be kept, or something like
that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Andrew Dunstan



On 05/22/2012 07:56 AM, Robert Haas wrote:

On Tue, May 22, 2012 at 7:35 AM, Florian Pflug f...@phlo.org wrote:

* Allow users to access tables in >1 database easily, with appropriate rights.

That one I'm very sceptical about. In the long run, I think we want better
separation of databases, not less, and this requirement carries a huge risk
of standing in the way of that. Also, I think that once we integrate the 
postgres
FDW into core (that's the plan in the long run, right?), we're going to get
a good approximation of that essentially for free.

+1.




That seems to be leaving aside the fact that we don't currently have any 
notion of how to allow FDWs to write the foreign tables.


What is more, isn't the postgres FDW about talking to any postgres 
source? If so, does it have special smarts for when we are talking to 
ourselves? And if it doesn't then it seems unlikely to be an acceptable 
substitute for allowing talking direct to a sibling database.


I'm not at all yet sold on Simon's plan, but I'm skeptical that an FDW 
would adequately meet the case if we wanted to go that way.


cheers

andrew



Re: [HACKERS] heap metapages

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 4:52 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Based upon all you've said, I'd suggest that we make a new kind of
 fork, in a separate file for this, .meta. But we also optimise the VM
 and FSM in the way you suggest so that we can replace .vm and .fsm
 with just .meta in most cases. Big tables would get a .vm and .fsm
 appearing when they get big enough, but that won't challenge the inode
 limits. When .vm and .fsm do appear, we remove that info from the
 metapage - that means we keep all code as it is currently, except for
 an optimisation of .vm and .fsm when those are small enough to do so.

Well, let's see.  That would mean that a small heap relation has 2
forks instead of 3, and a large relation has 4 forks instead of 3.  In
my proposal, a small relation has 1 fork instead of 3, and a large
relation still has 3 forks.  So I like mine better.

Also, I think that we need a good chunk of the metadata here for both
tables and indexes.  For example, if we use the metapage to store
information about whether a relation is logged, unlogged, being
converted from logged to unlogged, or being converted from unlogged to
logged, we need that information both for tables and for indexes.
Now, there's no absolute reason why those cases have to be handled
symmetrically, but I think things will be a lot simpler if they are.
If we settle on the rule that block 0 of every relation contains a
certain chunk of metadata at a certain byte offset, then the code to
retrieve that data when needed is pretty darn simple.  If tables put
it in a separate fork and indexes put it in the main fork inside the
metablock somewhere, then things are not so simple.  And I sure don't
want to add a separate fork for every index just to hold the metadata:
that would be a huge hit in terms of total inode consumption.
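
To sketch what "a certain chunk of metadata at a certain byte offset"
might look like (hypothetical layout and names, not a committed design):

#include <stdint.h>
#include <stdio.h>

#define META_OFFSET 64				/* assumed fixed byte offset in block 0 */

typedef struct RelMetaData
{
	uint32_t	meta_magic;			/* identifies a valid metapage */
	uint32_t	meta_version;
	uint8_t		meta_persistence;	/* logged / unlogged / in transition */
} RelMetaData;

int
main(void)
{
	char		block0[8192] = {0};	/* stand-in for a raw 8K block 0 */
	RelMetaData *meta = (RelMetaData *) (block0 + META_OFFSET);

	/* Retrieval reduces to: read block 0, point at the fixed offset,
	 * check the magic -- the same code path for tables and indexes. */
	meta->meta_magic = 0x4d455441;	/* "META" */
	printf("magic: 0x%08x\n", meta->meta_magic);
	return 0;
}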

 We can watermark data files using special space on block zero using
 some code to sneak that in when the page is next written, but that is
 regarded as optional, rather than an essential aspect of an
 upgrade/normal operation.

 Having pg_upgrade touch data files is both dangerous and difficult to
 back out in case of mistake, so I am wary of putting the metapage at
 block 0. Doing it the way I suggest means the .meta files would be
 wholly new and can be deleted as a back-out. We can also clean away
 any unnecessary .vm/.fsm files as a later step.

It seems pretty clear to me that making pg_upgrade responsible for
emptying block zero is a non-starter.  But I don't think that's a
reason to throw out the design; I think it's a problem we can work
around.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] Per-Database Roles

2012-05-22 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
 * Ability to have a Role that can only access one Database

Alright, I'd like to think about this one specifically and solicit
feedback on the idea that we keep the existing shared role tables but
add on additional tables for per-database roles.

In the past, I feel like we've been focused on the idea of moving all
roles to be per-database instead of per-cluster, which certainly has a
lot of problems associated with it, but in the end, I think people would
be really happy with some shared roles and some per-DB roles.

What would the semantics of that look like though?  Which is preferred
when you do a 'grant select' or 'grant role'?  Or do we just disallow
overlaps between per-DB roles and global roles?  If we don't allow
duplicates, I suspect a lot of the other questions suddenly become a lot
easier to deal with, but would that be too much of a restriction?  How
would you handle migrating an existing global role to a per-database
role?

Thanks,

Stephen




Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 8:40 AM, Andrew Dunstan and...@dunslane.net wrote:
 That seems to be leaving aside the fact that we don't currently have any
 notion of how to allow FDWs to write the foreign tables.

 What is more, isn't the postgres FDW about talking to any postgres source?
 If so, does it have special smarts for when we are talking to ourselves? And
 if it doesn't then it seems unlikely to be an acceptable substitute for
 allowing talking direct to a sibling database.

 I'm not at all yet sold on Simon's plan, but I'm skeptical that an FDW would
 adequately meet the case if we wanted to go that way.

Well, I don't think anyone is claiming that FDWs as they exist today
solve all of the problems in this area.  But I think that improving
FDWs is a more promising line of attack than trying to make backends
talk to multiple databases.  Doing the latter will require massive
surgery on the relcache, the catcache, most of the secondary catalog
caches, the ProcArray, and every portion of the backend that thinks an
OID uniquely identifies an SQL object.  Basically, they'd all need
database OID as an additional key field, which is undesirable for
performance reasons even if there were no issue of code churn.

So I'd rather see us put the effort into pgsql_fdw, which, as Florian
says, will also let us talk to a completely separate server.  If
you've got multiple databases in the same cluster and really need to
be doing queries across all of them, that's what schemas are
supposedly for.  Now I know that people feel that doesn't work as well
as it needs to, but there again I think it would be easier to fix
schemas than to make cross-database queries work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] hot standby PSQL 9.1 Windows 2008 Servers

2012-05-22 Thread Robert Haas
On Mon, May 14, 2012 at 8:18 AM, chinnaobi chinna...@gmail.com wrote:
 I do base backup only first time on standby when it is going to be
 replicated. when ever primary goes down,  standby becomes primary and
 primary becomes standby when primary comes up. When primary becomes standby
 I am restoring data from WAL archive and start postgres service streaming
 replication to connect to primary.

 This setup is working.

I don't think this is safe.  The primary might have WAL that never
made it to the standby, in which case the two machines will be out of
sync with each other and all sorts of bad stuff could happen.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] ERROR: catalog is missing 2 attribute(s) for relid 16584

2012-05-22 Thread Robert Haas
On Mon, May 14, 2012 at 1:01 AM, Prakash Itnal prakash...@gmail.com wrote:
 Recently we faced an issue with the postgres server, which is throwing the error:

 ERROR:  catalog is missing 2 attribute(s) for relid 16584
 CONTEXT:  automatic analyze of table DBRNW.public.act_wsta

 I checked in the database and found that this table is not present, but the
 entry for it is still present in 'pg_tables'. This error occurs when
 auto-analyze is executed.

 I searched for the issue in postgres blogs and found that there is no way to
 recover from this situation except restoring from the last backup!! I wanted to
 confirm: is there any way to recover at run time? Because if we restore from
 the last backup then we might lose some data that got updated after this error
 occurred.

 Any help would be appreciated.

It sounds like you are missing a couple of pg_attribute entries.  You
might be able to figure out what's missing and put them back, but of
course whatever got your system into this state might have also caused
other corruption...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Albe Laurenz
Simon Riggs wrote:
 On 21 May 2012 20:40, Stephen Frost sfr...@snowman.net wrote:
 
 This is important. I like the idea of breaking down the barriers
 between databases to allow it to be an option for one backend to
 access tables in multiple databases.

 So collecting a few requirements from various places:
[...]
 * Allow users to access tables in >1 database easily, with appropriate
rights.

 The main objectives are to make a Database a more easily used
 administrative grouping. At present, people who use multiple Databases
 face many problems - they aren't as separate as you'd like, but
 neither can they be ignored when required.
 
 The idea of one main database per session is fine, but wiring it so
 closely into the backend has a few disadvantages, many of them weird
 internal things.
 
 Are there arguments against those requirements before we spend time on
 design/thinking?

From my perspective it is a great advantage that a user connected
to one database cannot access objects from a different database
without additional software, no matter what permissions he or she has
(short of superuser, who can do anything).

This enables us to have many different databases in one cluster
without having to worry that they can affect each other.

If you need different applications to routinely access each other's
tables, why not assign them to different schemas in one database?

For those cases where you absolutely need access to a different
database, you can use dblink or a foreign data wrapper (hopefully
in 9.3).

So -1 on that particular suggestion.

Yours,
Laurenz Albe



Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Thom Brown
On 22 May 2012 14:04, Stephen Frost sfr...@snowman.net wrote:

 What would the semantics of that look like though?  Which is preferred
 when you do a 'grant select' or 'grant role'?  Or do we just disallow
 overlaps between per-DB roles and global roles?  If we don't allow
 duplicates, I suspect a lot of the other questions suddenly become a lot
 easier to deal with, but would that be too much of a restriction?  How
 would you handle migrating an existing global role to a per-database
 role?

Perhaps:

CREATE [ GLOBAL | LOCAL ] ROLE name [ LIKE role_name ] [ [ WITH ]
option [ ... ] ]

Then:

CREATE LOCAL ROLE localrolename LIKE globalrolename;

REASSIGN OWNED BY globalrolename TO localrolename;

Conflicts would occur where localrolename matches an existing local
role name within the same database, or a global role name, but not a
local role name within another database.  The problem with this,
however, is that creating global roles would need conflict checks
against local roles in every database, unless a manifest of all local
roles were registered globally.

-- 
Thom



Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence

2012-05-22 Thread Robert Haas
On Tue, May 8, 2012 at 9:37 PM, Amit Kapila amit.kap...@huawei.com wrote:
 I have checked the code and logic according to which usage counter is
 increased when the buffer is pinned.

Fixed, thanks for the report.

 Another doubt: why, in function BufferAlloc, does it need to hold the
 BufFreelistLock until it pins the buffer, which increases its reference count?

Well, I think the problem is that, if we didn't do that, then, in
theory, the strategy point could wrap all the way around
shared_buffers and someone else could pin the buffer, and then we'd be
hosed.

Mind you, I think this whole area of the code needs some reengineering
for better performance, but I'm not sure this is the right place to
start.  What I think is really bad is that we're forcing every
BufferAlloc() to iterate over buffers checking whether each one is
evictable.  I think we ought to put only those buffers that we think
are likely to be evictable on the freelist, and then the actual buffer
eviction code would only need to recheck that nothing had changed,
instead of needing to scan over a potentially quite large number of
buffers that never had any chance of being selected in the first
place.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] enhanced error fields

2012-05-22 Thread Robert Haas
On Wed, May 9, 2012 at 9:33 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 here is a patch enhancing the ErrorData structure. Now constraint
 errors and RI use these fields

Please add this to https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] could not open relation with OID errors after promoting the standby to master

2012-05-22 Thread Robert Haas
On Tue, May 15, 2012 at 10:37 PM, Joachim Wieland j...@mcknight.de wrote:
 If it matters, I have not promoted the master with a trigger file but
 restarted it after deleting recovery.conf.

Hmm.  I think that if you do it this way, the minimum recovery point
won't be respected, which could leave you with a corrupted database.
Now, if all the WAL files that you need are present in pg_xlog anyway,
then they ought to get replayed anyway, but I think that if you are
using restore_command (as opposed to streaming replication) we restore
WAL segments under a different file name, which might cause this
problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence

2012-05-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Mind you, I think this whole area of the code needs some reengineering
 for better performance, but I'm not sure this is the right place to
 start.  What I think is really bad is that we're forcing every
 BufferAlloc() to iterate over buffers checking whether each one is
 evictable.

Well, keep in mind that that action is not merely there to obtain a
victim buffer; it is also maintaining the global LRU state (by
decrementing the usage counts of buffers it passes over).  I don't think
you can change it to simply look only at a predetermined freelist
without seriously compromising the overall quality of our buffer
replacement decisions.

regards, tom lane



Re: [HACKERS] incorrect handling of the timeout in pg_receivexlog

2012-05-22 Thread Robert Haas
On Mon, May 14, 2012 at 2:24 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Fri, May 11, 2012 at 11:43 PM, Magnus Hagander mag...@hagander.net wrote:
 Should we go down the easy way and just reject connections when the flag is
 mismatching between the client and the server (trivial to do - see the
 attached patch)?

 +       char       *tmpparam;

 You forgot to add const before char, which causes a compile-time warning.

I went ahead and committed this, with this fix and a slight change to
the message text.

Hope that's OK with everyone...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 10:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Mind you, I think this whole area of the code needs some reengineering
 for better performance, but I'm not sure this is the right place to
 start.  What I think is really bad is that we're forcing every
 BufferAlloc() to iterate over buffers checking whether each one is
 evictable.

 Well, keep in mind that that action is not merely there to obtain a
 victim buffer; it is also maintaining the global LRU state (by
 decrementing the usage counts of buffers it passes over).  I don't think
 you can change it to simply look only at a predetermined freelist
 without seriously compromising the overall quality of our buffer
 replacement decisions.

The idea would be to have a background process (like bgwriter)
maintain the global LRU state and push candidate buffers onto the
freelist.  Then foreground processes can just pop them off the list
and recheck that they haven't been pinned meanwhile.  As long as we
don't let the background sweep get too far ahead of actual allocation
needs, I don't think this would change the quality of buffer
allocation much at all.
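
A self-contained toy model of that split (hypothetical and single-threaded,
not PostgreSQL code) to show the intended division of labor:

#include <stdio.h>

#define NBUF	8
#define TARGET	4

static int	usage[NBUF];			/* clock-sweep usage counts */
static int	pinned[NBUF];
static int	hand;					/* clock hand */
static int	list[NBUF], nlist;		/* the shared freelist */

/* "Background" side: sweep, decrement usage counts, queue candidates. */
static void
refill(void)
{
	int		scanned = 0;

	while (nlist < TARGET && scanned++ < 2 * NBUF)
	{
		int		b = hand;

		hand = (hand + 1) % NBUF;
		if (!pinned[b] && usage[b] == 0)
			list[nlist++] = b;
		else if (usage[b] > 0)
			usage[b]--;				/* maintain the LRU approximation */
	}
}

/* "Foreground" side: pop a candidate and recheck it, since it may have
 * been pinned or touched after it was queued. */
static int
victim(void)
{
	while (nlist > 0)
	{
		int		b = list[--nlist];

		if (!pinned[b] && usage[b] == 0)
			return b;
	}
	return -1;						/* would fall back to a local sweep */
}

int
main(void)
{
	usage[0] = 2;					/* recently used buffer */
	pinned[1] = 1;					/* pinned by someone */
	refill();
	printf("victim = %d\n", victim());
	return 0;
}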

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Tom Lane
Thom Brown t...@linux.com writes:
 Conflicts would occur where localrolename matches an existing local
 role name within the same database, or a global role name, but not a
 local role name within another database.  The problem with this,
 however, is that creating global roles would need conflict checks
 against local roles in every database, unless a manifest of all local
 roles were registered globally.

Yeah.  The same type of issue arises for the roles' OIDs.  You'd really
want local and global roles to have nonconflicting OIDs, else it's
necessary to carry around an indication of which type each role is;
which would be more or less a show-stopper in terms of the number of
catalogs and internal APIs affected.  But I don't currently see any
nice way to guarantee that if each database has a private table of
local roles.

You could possibly make it work if all roles, local and global, are
stored in a single shared catalog.  But that seems pretty ugly.

BTW, I wonder whether this type of problem isn't also pretty fatal for
the sorts of hierarchical catalog structures we were speculating about
at PGCon.  When we were talking about that I was sort of assuming that
the more-closely-nested levels could just hide conflicting objects at
outer levels, but on reflection that seems a bit scary.

regards, tom lane



Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 9:37 AM, Thom Brown t...@linux.com wrote:
 On 22 May 2012 14:04, Stephen Frost sfr...@snowman.net wrote:
 What would the semantics of that look like though?  Which is preferred
 when you do a 'grant select' or 'grant role'?  Or do we just disallow
 overlaps between per-DB roles and global roles?  If we don't allow
 duplicates, I suspect a lot of the other questions suddenly become a lot
 easier to deal with, but would that be too much of a restriction?  How
 would you handle migrating an existing global role to a per-database
 role?

 Perhaps:

 CREATE [ GLOBAL | LOCAL ] ROLE name [ LIKE role_name ] [ [ WITH ]
 option [ ... ] ]

 Then:

 CREATE LOCAL ROLE localrolename LIKE globalrolename;

 REASSIGN OWNED BY globalrolename TO localrolename;

 Conflicts would occur where localrolename matches an existing local
 role name within the same database, or a global role name, but not a
 local role name within another database.  The problem with this,
 however, is that creating global roles would need conflict checks
 against local roles in every database, unless a manifest of all local
 roles were registered globally.

There are race conditions to worry about, too.  In most cases, we rely
on the btree index machinery as a final backstop against duplicate
catalog entries.  But that doesn't work unless everything's in one
catalog, nor for anything more complicated than "this set of columns
taken together should be unique over every role".  Even if we were OK
with incurring the ugliness of storing per-database roles in a shared
catalog, the uniqueness constraint you'd want is something like "no
two roles can share the same name unless they have unequal database
OIDs, neither of which is zero", which I don't believe we can enforce
via the btree machinery, at least not without an expression index that
won't work in a system catalog anyway.

In retrospect, I think the idea of shared catalogs was probably a bad
idea.  I think we should have made roles and tablespaces database
objects rather than shared objects, and come up with some ad-hoc
method of representing the set of available databases.  But that
decision seems to have been made sometime pre-1996, so the thought of
changing it now is pretty painful, but I can dream...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Stephen Frost
* Thom Brown (t...@linux.com) wrote:
 Conflicts would occur where localrolename matches an existing local
 role name within the same database, or a global role name, but not a
 local role name within another database.  The problem with this,
 however, is that creating global roles would need conflict checks
 against local roles in every database, unless a manifest of all local
 roles were registered globally.

Hmm, right, that's a bit of a sticky point.  wrt your suggestion - it
works great if we don't allow duplicates, but then people have to accept
their role name is getting changed.  That said, perhaps that's not that
big of a deal, since you could rename it afterwards.

The issue with the constraints on other databases might actually be
cause enough to allow duplicates, just to avoid that issue..  We could
mirror the per-database roles into a shared space that isn't externally
visible, but at that point, maybe we should try to get RLS for the
catalogs instead, or just modify the views to only show roles which can
connect to this database.  That's not going to make them completely
transparent, but it might be enough for some use cases.

Thanks,

Stephen




Re: [HACKERS] Draft release notes complete

2012-05-22 Thread Robert Haas
On Mon, May 21, 2012 at 9:54 PM, Noah Misch n...@leadboat.com wrote:
 On Wed, May 09, 2012 at 11:11:02PM -0400, Bruce Momjian wrote:
 I have completed my draft of the 9.2 release notes, and committed it to
 git.

 Concerning "Have psql \copy use libpq's SendQuery()", SendQuery() is a
 psql-internal interface, not a libpq interface.

 The array statistics patch added new columns to the pg_stats view, and it
 moved existing tsvector most-common-element statistics to those new columns.
 Let's mention that as a (minor) incompatibility.

 Proposed changes attached.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence

2012-05-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, May 22, 2012 at 10:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, keep in mind that that action is not merely there to obtain a
 victim buffer; it is also maintaining the global LRU state (by
 decrementing the usage counts of buffers it passes over).  I don't think
 you can change it to simply look only at a predetermined freelist
 without seriously compromising the overall quality of our buffer
 replacement decisions.

 The idea would be to have a background process (like bgwriter)
 maintain the global LRU state and push candidate buffers onto the
 freelist.

Amit was trying to convince me of the same idea at PGCon, but I don't
buy it.  bgwriter doesn't scan the buffer array nearly fast enough to
provide useful adjustment of the usage counts under load.  And besides
if the decrements are decoupled from the allocation requests it's no
longer obvious that the algorithm is even an approximation of LRU.

But the larger issue here is that if that processing is a bottleneck
(which I agree it is), how does it help to force a single process to
be responsible for it?  Any real improvement in scalability here will
need to decentralize the operation more, not less.

My own thoughts about this had pointed in the direction of getting rid
of the central freelist entirely, instead letting each backend run its
own independent clock sweep as needed.  The main problem with that is
that if there's no longer any globally-visible clock sweep state, it's
pretty hard to figure out what the control logic for the bgwriter should
look like.  Maybe it would be all right to have global variables that
are just statistics counters for allocations and buffers swept over,
which backends would need to spinlock for just long enough to increment
the counters at the end of each buffer allocation.
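
For illustration, here is a toy, standalone version of a usage-count
clock sweep of the sort being discussed (all names are made up for the
example; real buffer headers, pins, and locking are elided):

#include <stdio.h>

#define NBUFFERS 16

typedef struct { int usage_count; int pinned; } buf_t;

static buf_t buffers[NBUFFERS];
static int   hand = 0;              /* would be backend-local in this scheme */

static int
clock_sweep(void)
{
    for (;;)
    {
        buf_t *b = &buffers[hand];
        int    victim = hand;

        hand = (hand + 1) % NBUFFERS;
        if (b->pinned)
            continue;               /* cannot evict a pinned buffer */
        if (b->usage_count > 0)
        {
            b->usage_count--;       /* give it another lap to prove itself */
            continue;
        }
        return victim;              /* usage count reached zero: evict */
    }
}

int
main(void)
{
    buffers[0].usage_count = 2;     /* recently used */
    buffers[1].pinned = 1;          /* currently in use */
    printf("victim: %d\n", clock_sweep());   /* prints 2 */
    return 0;
}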

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] weird error message in sepgsql

2012-05-22 Thread Robert Haas
On Sun, May 20, 2012 at 11:28 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Eisentraut pete...@gmx.net writes:
 I found this in contrib/sepgsql/expected/label.out:
 SECURITY LABEL ON COLUMN t2
     IS 'system_u:object_r:sepgsql_ro_table_t:s0';   -- be failed
 ERROR:  improper relation name (too many dotted names): nothing

 Contrast with:

 SECURITY LABEL ON COLUMN t2.b
     IS 'system_u:object_r:sepgsql_ro_table_t:s0';   -- ok

 I guess what's happening is that it's calling makeRangeVarFromNameList()
 with a list of length zero.

 We should either fix the SECURITY LABEL command to catch that case
 beforehand, or fix makeRangeVarFromNameList() to give a proper error
 message, or both.

 I think the appropriate error message is probably along the lines of
 "column name must be qualified", and it's hard to justify having
 makeRangeVarFromNameList emit such a thing.  So this is the fault
 of the calling code.

Fixed and back-patched to 9.1.  Thanks for the report.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Susanne Ebrecht

On 22.05.2012 15:27, Albe Laurenz wrote:

If you need different applications to routinely access each other's
tables, why not assign them to different schemas in one database?


The use case in my mind for accessing more databases is when you want to
access stuff in different languages.

You only can set encoding / LC_Collate per database not per schema.

So for different languages you might need different databases to do 
correct sorting / indexing.


Susanne

--
Dipl. Inf. Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Tom Lane
Susanne Ebrecht susa...@2ndquadrant.com writes:
 The use case in my mind for accessing more databases is when you want to
 access stuff in different languages.

 You only can set encoding / LC_Collate per database not per schema.

 So for different languages you might need different databases to do 
 correct sorting / indexing.

Encoding yes, but since 9.1 we have pretty fine-grained control of
collation.  So I think this argument is a lot weaker than it used
to be.  It would only really apply if you have one of the corner
cases where utf8 doesn't work for you.
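
For reference, the 9.1-era controls let you attach a collation per
column or per expression; the collation names below assume a glibc
system with those locales installed:

CREATE TABLE messages (
    body_en text COLLATE "en_GB.utf8",
    body_sv text COLLATE "sv_SE.utf8"
);

-- or override the collation for a single sort:
SELECT body_en FROM messages ORDER BY body_en COLLATE "sv_SE.utf8";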

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Florian Pflug
On May22, 2012, at 16:09 , Tom Lane wrote:
 Thom Brown t...@linux.com writes:
 Conflicts would occur where localrolename matches an existing local
 role name within the same database, or a global role name, but not a
 local role name within another database.  The problem with this,
 however, is that creating global roles would need conflict checks
 against local roles in every database, unless a manifest of all local
 roles were registered globally.
 
 Yeah.  The same type of issue arises for the roles' OIDs.  You'd really
 want local and global roles to have nonconflicting OIDs, else it's
 necessary to carry around an indication of which type each role is;
 which would be more or less a show-stopper in terms of the number of
 catalogs and internal APIs affected.  But I don't currently see any
 nice way to guarantee that if each database has a private table of
 local roles.

Maybe we could simply make all global roles' OIDs even, and all local ones
odd, or something like that.
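
A toy version of that parity rule, ignoring OID wraparound and the
check against pre-existing rows (all names invented):

#include <stdio.h>

typedef unsigned int Oid;

static Oid next_oid = 16384;        /* first "normal" OID */

/* even OIDs for global roles, odd for local ones */
static Oid
alloc_role_oid(int is_local)
{
    Oid oid = next_oid++;

    if ((oid & 1) != (is_local ? 1u : 0u))
        oid = next_oid++;           /* skip one to land on the right parity */
    return oid;
}

int
main(void)
{
    printf("local:  %u\n", alloc_role_oid(1));   /* 16385 */
    printf("global: %u\n", alloc_role_oid(0));   /* 16386 */
    return 0;
}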

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 10:25 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The idea would be to have a background process (like bgwriter)
 maintain the global LRU state and push candidate buffers onto the
 freelist.

 Amit was trying to convince me of the same idea at PGCon, but I don't
 buy it.  bgwriter doesn't scan the buffer array nearly fast enough to
 provide useful adjustment of the usage counts under load.  And besides
 if the decrements are decoupled from the allocation requests it's no
 longer obvious that the algorithm is even an approximation of LRU.

Well, bgwriter is *supposed* to anticipate which buffers are about to
be allocated and clean any of those that are dirty.  Having it
decrement the usage counts and stuff the resulting list of buffers
into a linked list seems like a pretty reasonable extension of that,
assuming that it works in the first place.  If it doesn't, then we
need a rethink.

 But the larger issue here is that if that processing is a bottleneck
 (which I agree it is), how does it help to force a single process to
 be responsible for it?  Any real improvement in scalability here will
 need to decentralize the operation more, not less.

Sure.  I think we could have the freelist and the clock sweep
protected by different locks.  The background writer would lock out
other people running the clock sweep, but the freelist could be
protected by a spinlock which no one would ever need to take for more
than a few cycles.  Right there, you should get a significant
scalability improvement, since the critical section would be so much
shorter than it is now.  If that's not enough, you could have several
freelists protected by different spinlocks; the bgwriter would put
1/Nth of the reusable buffers on each freelist, and backends would
pick a freelist at random to pull buffers off of.
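
A standalone toy of that partitioned-freelist shape (pthread mutexes
stand in for spinlocks; nothing here is actual backend code):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NFREELISTS 4

typedef struct node { int buf_id; struct node *next; } node_t;

typedef struct
{
    pthread_mutex_t lock;           /* stand-in for a spinlock */
    node_t *head;
} freelist_t;

static freelist_t freelists[NFREELISTS] = {
    {PTHREAD_MUTEX_INITIALIZER}, {PTHREAD_MUTEX_INITIALIZER},
    {PTHREAD_MUTEX_INITIALIZER}, {PTHREAD_MUTEX_INITIALIZER}
};

/* bgwriter side: spread cleaned buffers across the lists round-robin */
static void
push_free_buffer(int buf_id)
{
    static int  next = 0;
    freelist_t *fl = &freelists[next++ % NFREELISTS];
    node_t     *n = malloc(sizeof(node_t));

    n->buf_id = buf_id;
    pthread_mutex_lock(&fl->lock);
    n->next = fl->head;
    fl->head = n;
    pthread_mutex_unlock(&fl->lock);
}

/* backend side: pick a list at random; -1 means fall back to the sweep */
static int
pop_free_buffer(void)
{
    freelist_t *fl = &freelists[rand() % NFREELISTS];
    int         buf_id = -1;

    pthread_mutex_lock(&fl->lock);
    if (fl->head)
    {
        node_t *n = fl->head;

        fl->head = n->next;
        buf_id = n->buf_id;
        free(n);
    }
    pthread_mutex_unlock(&fl->lock);
    return buf_id;
}

int
main(void)
{
    for (int i = 0; i < 8; i++)
        push_free_buffer(i);
    printf("got buffer %d\n", pop_free_buffer());
    return 0;
}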

 My own thoughts about this had pointed in the direction of getting rid
 of the central freelist entirely, instead letting each backend run its
 own independent clock sweep as needed.  The main problem with that is
 that if there's no longer any globally-visible clock sweep state, it's
 pretty hard to figure out what the control logic for the bgwriter should
 look like.  Maybe it would be all right to have global variables that
 are just statistics counters for allocations and buffers swept over,
 which backends would need to spinlock for just long enough to increment
 the counters at the end of each buffer allocation.

Hmm, that's certainly an interesting idea.  I fear that if the clock
sweeps from the different backends ended up too closely synchronized,
you would end up evicting whatever was in the way, be it hot or cold.
It might almost be better to have individual backends choose buffers
to evict at random; if the chosen buffer isn't evictable, we decrement
its usage count and pick another one, also at random.

With respect to the control logic for the background writer, one idea
I had was to get rid of the idea that the background writer's job is
to write in advance of the strategy point.  Instead, every time the
clock sweep passes over a dirty buffer that is otherwise evictable, we
add it to a queue of things that the bgwriter should clean.  Those
buffers, once cleaned, go on the free list.  Maybe some variant of
that could work with your idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Susanne Ebrecht

On 22.05.2012 17:42, Tom Lane wrote:

Encoding yes, but since 9.1 we have pretty fine-grained control of
collation.  So I think this argument is a lot weaker than it used
to be.  It would only really apply if you have one of the corner
cases where utf8 doesn't work for you.


Yeah it got better - but it isn't perfect yet.

Maybe I am blind or the 9.1 documentation has a bug - but according to the
documentation you can't change the default collation per schema or per table.
You can set collation per column - but do you really want to set collation
for every single column of every single supported language in your 200+
table web tool?

That is a huge effort and a huge maintenance effort.

Usually you want to set the collation once per language schema. E.g. schema
russian gets Russian collation and schema british gets British collation and
so on.

CREATE SCHEMA foo LC_COLLATE bar isn't supported, so you go up a level and
do it by creating a database.

I would like to get default collation per schema / table in 9.2 or 9.3,
but that is my personal wish,

Susanne


--
Dipl. Inf. Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Thom Brown
On 22 May 2012 16:57, Florian Pflug f...@phlo.org wrote:
 On May22, 2012, at 16:09 , Tom Lane wrote:
 Thom Brown t...@linux.com writes:
 Conflicts would occur where localrolename matches an existing local
 role name within the same database, or a global role name, but not a
 local role name within another database.  The problem with this,
 however, is that creating global roles would need conflict checks
 against local roles in every database, unless a manifest of all local
 roles were registered globally.

 Yeah.  The same type of issue arises for the roles' OIDs.  You'd really
 want local and global roles to have nonconflicting OIDs, else it's
 necessary to carry around an indication of which type each role is;
 which would be more or less a show-stopper in terms of the number of
 catalogs and internal APIs affected.  But I don't currently see any
 nice way to guarantee that if each database has a private table of
 local roles.

 Maybe we could simply make all global roles' OIDs even, and all local ones
 odd, or something like that.

Wouldn't that instantly make all previous versions of database
clusters un-upgradable?

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 12:00 PM, Susanne Ebrecht
susa...@2ndquadrant.com wrote:
 Usually you want to set the collation once per language schema. E.g. schema
 russian gets Russian collation and schema British gets British collation and
 so on.

 CREATE SCHEMA foo LC_COLLATE bar isn't supported so you went up a level and
 do it by creating a database.

 I would like to get default collation per schema / table in 9.2 or 9.3 but
 that is my personal
 wish,

Interesting idea.  Sort of like ALTER DEFAULT PRIVILEGES.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence

2012-05-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, May 22, 2012 at 10:25 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 My own thoughts about this had pointed in the direction of getting rid
 of the central freelist entirely, instead letting each backend run its
 own independent clock sweep as needed.

 Hmm, that's certainly an interesting idea.  I fear that if the clock
 sweeps from the different backends ended up too closely synchronized,
 you would end up evicting whatever was in the way, be it hot or cold.
 It might almost be better to have individual backends choose buffers
 to evict at random; if the chosen buffer isn't evictable, we decrement
 its usage count and pick another one, also at random.

Hmm ... yeah, in principle that could be better.  It's still a clock
sweep but following a hard-to-predict ordering of the buffers.  Being
hard to predict doesn't necessarily guarantee no bad behavior though.
Maybe we could do something a bit cheaper than random() and more
amenable to analysis, that would still ensure that backends couldn't end
up with closely sync'd scans.  I'm imagining advancing not 1 buffer
each time, but K buffers where each backend will use a different value
of K.  Perhaps backend with backendID N could use the N'th prime number,
say.  You'd want something relatively prime to shared_buffers to
guarantee that the backend can visit all the buffers, and just making it
prime would do fine.
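
As a toy version of that stride scheme (names invented; strictly the
stride must share no factor with the buffer count, which a prime stride
guarantees unless it happens to divide it):

#include <stdio.h>

#define NBUFFERS 1000

/* n-th prime by trial division; fine, since each backend would compute
 * its stride only once */
static int
nth_prime(int n)
{
    int count = 0;

    for (int p = 2;; p++)
    {
        int is_prime = 1;

        for (int d = 2; d * d <= p; d++)
            if (p % d == 0) { is_prime = 0; break; }
        if (is_prime && ++count == n)
            return p;
    }
}

int
main(void)
{
    int backend_id = 7;
    int stride = nth_prime(backend_id);     /* 17 */
    int hand = 0;

    /* each allocation advances this backend's hand by its own stride */
    for (int i = 0; i < 5; i++)
    {
        hand = (hand + stride) % NBUFFERS;
        printf("visit buffer %d\n", hand);
    }
    return 0;
}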

 With respect to the control logic for the background writer, one idea
 I had was to get rid of the idea that the background writer's job is
 to write in advance of the strategy point.  Instead, every time the
 clock sweep passes over a dirty buffer that is otherwise evictable, we
 add it to a queue of things that the bgwriter should clean.  Those
 buffers, once cleaned, go on the free list.  Maybe some variant of
 that could work with your idea.

Both ends of that imply centralized data structures that could become
contention bottlenecks, so it doesn't sound tremendously appealing to
me.  Maybe it can work as long as nobody has to lock the lists for long,
but I'd rather think of approaches with no central point of contention.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Florian Pflug
On May22, 2012, at 18:00 , Susanne Ebrecht wrote:
 On 22.05.2012 17:42, Tom Lane wrote:
 Encoding yes, but since 9.1 we have pretty fine-grained control of
 collation.  So I think this argument is a lot weaker than it used
 to be.  It would only really apply if you have one of the corner
 cases where utf8 doesn't work for you.
 
 Yeah it got better - but it isn't perfect yet.

Still, the whole reason that the encoding is a per-database property is
that we don't tag each string with its encoding. To support cross-database
queries, we'd have to do that, and could then just as well make the encoding
a per-column property, or at least so I think.

 Maybe I am blind or 9.1 documentation has a bug - but according to the
 documentation you can't change default collation per schema or per table.
 You can set collation per column - but do you really want to set collation for
 every single column of every single supported language in your 200+ tables
 web tool?
 
 That is a huge effort and a huge maintenance effort.

You could always write a pl/pgsql function which iterates over all columns
of type text or varchar within a schema and sets the desired collation
(a rough sketch follows below), but

 I would like to get default collation per schema / table in 9.2 or 9.3 but
 that is my personal wish,

yeah, that'd definitely be nice.
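
The sketch mentioned above might look like this (schema and collation
names are placeholders; only plain text columns are handled, since
varchar would need its typmod preserved):

DO $$
DECLARE
    col record;
BEGIN
    FOR col IN
        SELECT table_schema, table_name, column_name
        FROM information_schema.columns
        WHERE table_schema = 'russian'
          AND data_type = 'text'
    LOOP
        EXECUTE format(
            'ALTER TABLE %I.%I ALTER COLUMN %I TYPE text COLLATE "ru_RU.utf8"',
            col.table_schema, col.table_name, col.column_name);
    END LOOP;
END
$$;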

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hot standby PSQL 9.1 Windows 2008 Servers

2012-05-22 Thread chinnaobi

Dear Robert,

Thank you very much for the reply. 

You mean that the primary which is going to switch its role to standby might
not have sent all its WAL records to the standby, so that after the switch it
has more WAL records than the standby which now serves as primary. Is that
it?

And it is actually the standby server which has to be restored from archive
when it is switching to primary, right? Not the primary which is switching to
standby?

Regards,
Reddy.

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/hot-standby-PSQL-9-1-Windows-2008-Servers-tp5708637p5709495.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Tom Lane
Thom Brown t...@linux.com writes:
 On 22 May 2012 16:57, Florian Pflug f...@phlo.org wrote:
 Maybe we could simply make all global roles' OIDs even, and all local ones
 odd, or something like that.

 Wouldn't that instantly make all previous versions of database
 clusters un-upgradable?

IIRC, pg_upgrade doesn't need to force role OIDs to be the same in the
new cluster, so we could get away with this trick for the specific case
of roles.  It wouldn't work for introducing local/global versions of
some other types of objects though.

Another objection is that it wouldn't scale up nicely to multiple levels
of catalog hierarchy.  But maybe local/global is enough.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 12:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 With respect to the control logic for the background writer, one idea
 I had was to get rid of the idea that the background writer's job is
 to write in advance of the strategy point.  Instead, every time the
 clock sweep passes over a dirty buffer that is otherwise evictable, we
 add it to a queue of things that the bgwriter should clean.  Those
 buffers, once cleaned, go on the free list.  Maybe some variant of
 that could work with your idea.

 Both ends of that imply centralized data structures that could become
 contention bottlenecks, so it doesn't sound tremendously appealing to
 me.  Maybe it can work as long as nobody has to lock the lists for long,
 but I'd rather think of approaches with no central point of contention.

If we're going to throw our current algorithm over wholesale, I'd
rather use some approach that has been demonstrated to work well in
other systems.  Buffer eviction is a problem that's been around since
the 1970s, and our algorithm is just about that old.  I realize that
there are (legitimate) concerns about what might be patented, but
hopefully that doesn't mean we're not allowed to consult the
literature in any way.  If that were the case, we wouldn't have SSI.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Florian Pflug
On May22, 2012, at 18:03 , Thom Brown wrote:
 On 22 May 2012 16:57, Florian Pflug f...@phlo.org wrote:
 On May22, 2012, at 16:09 , Tom Lane wrote:
 Thom Brown t...@linux.com writes:
 Conflicts would occur where localrolename matches an existing local
 role name within the same database, or a global role name, but not a
 local role name within another database.  The problem with this,
 however, is that creating global roles would need conflict checks
 against local roles in every database, unless a manifest of all local
 roles were registered globally.
 
 Yeah.  The same type of issue arises for the roles' OIDs.  You'd really
 want local and global roles to have nonconflicting OIDs, else it's
 necessary to carry around an indication of which type each role is;
 which would be more or less a show-stopper in terms of the number of
 catalogs and internal APIs affected.  But I don't currently see any
 nice way to guarantee that if each database has a private table of
 local roles.
 
 Maybe we could simply make all global roles' OIDs even, and all local ones
 odd, or something like that.
 
 Wouldn't that instantly make all previous versions of database
 clusters un-upgradable?

Only if pg_upgrade needs to preserve the OIDs of roles. I kinda hoped it
wouldn't, because role OIDs aren't usually stored in non-system tables.

Hm… thinking about this further, it'd actually be sufficient for all newly
allocated role OIDs to follow the odd/even rule, if we additionally check
for conflicts with existing global role OIDs when allocating the OID of a new
local role. Which is much, much easier than checking for conflicts when
allocating a global OID, because for that you'd have to check against the
local role OIDs within *all* databases, not just against one shared table.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exclusion Constraints on Arrays?

2012-05-22 Thread David E. Wheeler
On May 13, 2012, at 3:45 PM, Robert Haas wrote:

 It seems like maybe we could work around this by remembering the
 contents of the pending list throughout the scan.  Every time we hit a
 TID while scanning the main index, we check whether we already
 returned it from the pending list; if so, we skip it, but if not, we
 return it.

Should this go onto the To-Do list, then?

Best,

David

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why is indexonlyscan so darned slow?

2012-05-22 Thread Jeff Janes
On Mon, May 21, 2012 at 2:29 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Mon, May 21, 2012 at 4:17 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 For vaguely real life, take your example of pgbench -i -s200 -F 50,
 and I have 2Gig RAM, which seems to be the same as you do.

 With select only work load (pgbench -S -M prepared -T 30), I get

 tps = 193

 But now enable index-only scans:

 psql -c 'create index on pgbench_accounts(aid, abalance);'

 and it goes up to.

 tps = 10137

 Right -- the main driver here is that your index fits neatly in ram
 and the heap does not -- so you're effectively measuring the
 difference between a buffered and non-buffered access.  That's highly
 contrived as you noted and unlikely to come up all *that* often in the
 real world.

I don't think this one is highly contrived.  With the index being 10-fold
smaller than the table, there is plenty of window for one to fit in RAM
and the other one not to in a variety of real-world situations.
 (Although I only get the 10-fold window because of -F 50.  I don't know
why Josh had that big of a difference in his original, unless he also used
a nondefault fillfactor setting.  But of course many real-world
tables will be wider than pgbench_accounts is.)

The highly contrived example useful for dissecting the implementation
would be to do:

set enable_seqscan=off;
set enable_indexonlyscan=off;
select count(*) from pgbench_accounts where aid is not null;

The key that I keep forgetting is the 'where aid is not null'.
Without that it uses the full table scan, even with enable_seqscan
off, rather than doing an ordinary index scan.


 Generally though the real-world wins (although the gains will be
 generally less spectacular) are heavily i/o-bound queries where the
 indexed subset of data you want is nicely packed and the (non
 clustered) heap records are all over the place.  By skipping the semi
 random heap lookups you can see enormous speedups.  I figure 50-90%
 improvement would be the norm there, but this is against queries that
 are taking forever, being i/o bound.

 See here: 
 http://www.devheads.net/database/postgresql/performance/index-all-necessary-columns-postgres-vs-mssql.htm
 for a 'in the wild' gripe about about not having index scans.

But without scripts to recreate the data with the right selectivities
and correlations, and to generate a long stream of appropriate query
parameterizations so that they don't become cached, that is just a
gripe and not an example.

I tried to reproduce the problem as stated, and couldn't make IOS be
useful because I couldn't make the query slow even without them.
Presumably I'm doing something wrong, but how could I tell what?  Have
we heard back on whether IOS was tried and proved useful to the
originator of that thread?

If you want an example where neither index nor table fit in memory,
then just bump up the scale to 2000---and get a machine with only 2G
of memory if you don't already have one :)

With the extra index in place:

with enable_indexonlyscan=off
tps = 155.1
with enable_indexonlyscan=on
tps = 453.7

It seems like that should have only doubled; I'm not sure why it did
more than double.  Maybe the index became better cached when the table
stopped competing with it for buffers.

If you leave the select-only world and go back to doing updates, then
that extra index is doing to hurt you somewhat, but if the dominant
bottleneck is rpm of your WAL drive, it might not be noticeable.

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence

2012-05-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 If we're going to throw our current algorithm over wholesale, I'd
 rather use some approach that has been demonstrated to work well in
 other systems.  Buffer eviction is a problem that's been around since
 the 1970s, and our algorithm is just about that old.

Um, if you're claiming that that code dates from Berkeley days, you're
quite mistaken.  We adopted the clock sweep in 2005, after trying a few
other things whose performance was worse.  I've not seen any argument in
this thread that suggests we should abandon clock sweep + usage counts
entirely.  Rather, to me the issue is that we haven't completely gotten
rid of the last vestiges of the old global freelist approach.

BTW, it might be worth pointing out something I was trying to explain
to Amit at PGCon: the key reason that we went with usage counters rather
than something like a global LRU chain is that in the fast path where
ReadBuffer finds the requested block already in a buffer, it does not
have to contend for any global data structure to update the buffer's
usage information.  It just has to bump the usage count in the buffer's
header.  The free list, and the contention for BufFreelistLock, only
matter in the slow path where you're going to have to incur I/O anyway
(or at least a visit to the kernel).  That seemed plenty good enough
in 2005.  Our ambitions have now advanced further, so I'm on board with
trying to reduce contention here too, but I think it would be a mistake
to make the fast case any slower.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exclusion Constraints on Arrays?

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 12:28 PM, David E. Wheeler
da...@justatheory.com wrote:
 On May 13, 2012, at 3:45 PM, Robert Haas wrote:

 It seems like maybe we could work around this by remembering the
 contents of the pending list throughout the scan.  Every time we hit a
 TID while scanning the main index, we check whether we already
 returned it from the pending list; if so, we skip it, but if not, we
 return it.

 Should this go onto the To-Do list, then?

If someone other than me can confirm that it's not a stupid approach,
I would say yes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exclusion Constraints on Arrays?

2012-05-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, May 22, 2012 at 12:28 PM, David E. Wheeler
 da...@justatheory.com wrote:
 On May 13, 2012, at 3:45 PM, Robert Haas wrote:
 It seems like maybe we could work around this by remembering the
 contents of the pending list throughout the scan.  Every time we hit a
 TID while scanning the main index, we check whether we already
 returned it from the pending list; if so, we skip it, but if not, we
 return it.

 Should this go onto the To-Do list, then?

 If someone other than me can confirm that it's not a stupid approach,
 I would say yes.

It seems probably workable given that we expect the pending list to be
of fairly constrained size.  However, the commit message referenced
upthread also muttered darkly about GIN's partial match logic not working
in amgettuple.  I do not recall the details of that issue, but unless we
can solve that one too, there's not much use in fixing this one.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap metapages

2012-05-22 Thread Simon Riggs
On 22 May 2012 13:52, Robert Haas robertmh...@gmail.com wrote:

 It seems pretty clear to me that making pg_upgrade responsible for
 emptying block zero is a non-starter.  But I don't think that's a
 reason to throw out the design; I think it's a problem we can work
 around.

I like your design better as well *if* you can explain how we can get
to it. My proposal was a practical alternative that would allow the
idea to proceed.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How could we make it simple to access the log as a table?

2012-05-22 Thread David Fetter
On Mon, May 21, 2012 at 05:53:55PM -0700, Josh Berkus wrote:
 Hackers,
 
 There's a lot of great information in the postgres logs.  While we
 eventually want to get more sophisticated about providing users with
 status and history information, for 9.3 it would be really nice to
 just offer the current logs in system view format.
 
 Certainly this can be done ad-hoc using CSV format and csv_fdw.
 However, such setups are fragile due to log rotation and other
 issues.  It seems like we could come up with a better way.  Ideas?

Would it help to automate this stuff with that CSV format?  The trick,
as I see it, is to make those on-disk log files be partitions of a
table.

Stephen Frost's point is well-taken, but I'm not sure we need to make
it a blocker for this.
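
For the record, the ad-hoc route Josh mentions looks roughly like this,
following the file_fdw documentation (the filename is an assumption,
and the column list must match the csvlog format of your server
version):

CREATE EXTENSION file_fdw;
CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE pglog (
    log_time               timestamp(3) with time zone,
    user_name              text,
    database_name          text,
    process_id             integer,
    connection_from        text,
    session_id             text,
    session_line_num       bigint,
    command_tag            text,
    session_start_time     timestamp with time zone,
    virtual_transaction_id text,
    transaction_id         bigint,
    error_severity         text,
    sql_state_code         text,
    message                text,
    detail                 text,
    hint                   text,
    internal_query         text,
    internal_query_pos     integer,
    context                text,
    query                  text,
    query_pos              integer,
    location               text,
    application_name       text
) SERVER pglog
  OPTIONS (filename 'pg_log/postgresql.csv', format 'csv');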

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Josh Berkus

 1. Ability to have a Role that can only access one Database
 
 2. Allow user info to be dumped with a database, to make a db
 completely self-consistent
 
 3. Allow databases to be transportable
 
 4. Allow users to access tables in >1 database easily, with appropriate 
 rights.

The last requirement seems completely contradictory to the other three.
 Either we're trying to make databases even more isolated as
multi-tenant Catalogs, or we're not.  Trying to do both at the same time
is failure-by-design.

Given that we offer schema as an alternative to multiple databases, and
users are starting to get used to them, I think that requirement (4) is
just a bad idea, and not worth pursuing, except in the context of pgsql_fdw.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Add primary key/unique constraint using prefix columns of an index

2012-05-22 Thread Jeff Janes
Now that there are index only scans, there is a use case for having a
composite index which has the primary key or a unique key as the
prefix column(s) but with extra columns after that.  Currently you
would also need another index with exactly the primary/unique key,
which seems like a waste of storage and maintenance.

Should there be a way to declare a unique index with the unique
property applying to a prefix of the indexed columns/expression?  And
having that, a way to turn that prefix into a primary key constraint?

Of course this is easier said than done, but is there some reason for
it not to be a to-do item?
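
Concretely, what you have to do today, next to a purely hypothetical
syntax for the proposal (pgbench-style names used for illustration):

-- today: two indexes, one for the constraint, one for index-only scans
CREATE UNIQUE INDEX ON pgbench_accounts (aid);
CREATE INDEX ON pgbench_accounts (aid, abalance);

-- hypothetical: one index, unique on the prefix column only
-- CREATE UNIQUE INDEX ON pgbench_accounts (aid) INCLUDE (abalance);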

Thanks,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Simon Riggs
On 22 May 2012 14:05, Robert Haas robertmh...@gmail.com wrote:
 On Tue, May 22, 2012 at 8:40 AM, Andrew Dunstan and...@dunslane.net wrote:
 That seems to be leaving aside the fact that we don't currently have any
 notion of how to allow FDWs to write the foreign tables.

 What is more, isn't the postgres FDW about talking to any postgres source?
 If so, does it have special smarts for when we are talking to ourselves? And
 if it doesn't then it seems unlikely to be an acceptable substitute for
 allowing talking direct to a sibling database.

 I'm not at all yet sold on Simon's plan, but I'm skeptical that an FDW would
 adequately meet the case if we wanted to go that way.

 Well, I don't think anyone is claiming that FDWs as they exist today
 solve all of the problems in this area.  But I think that improving
 FDWs is a more promising line of attack than trying to make backends
 talk to multiple databases.  Doing the latter will require massive
 surgery on the relcache, the catcache, most of the secondary catalog
 caches, the ProcArray, and every portion of the backend that thinks an
 OID uniquely identifies an SQL object.  Basically, they'd all need
 database OID as an additional key field, which is undesirable for
 performance reasons even if there were no issue of code churn.

Ack, apart from the bit about OIDs no longer being unique. That might
be an upgrade issue but it's obviously something we wouldn't allow if
we did that.

I'm not sure I see changing the caches as being massive surgery.
Perhaps we could just bypass them altogether.

 So I'd rather see us put the effort into pgsql_fdw, which, as Florian
 says, will also let us talk to a completely separate server.  If
 you've got multiple databases in the same cluster and really need to
 be doing queries across all of them, that's what schemas are
 supposedly for.  Now I know that people feel that doesn't work as well
 as it needs to, but there again I think it would be easier to fix
 schemas than to make cross-database queries work.

We're a very long way from making that work well. IMHO easily much
further than direct access.

If I have a customer with 1 database per user, how do they run a query
against 100 user tables? It would require 100 connections to the
database. Doing that would require roughly x100 the planning time and
x100 the connection delay. Larger SQL statements pass their results
between executor steps using libpq rather than direct calls.

I find it hard to believe FDWs will ever work sufficiently well to fix
those problems.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 12:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 If we're going to throw our current algorithm over wholesale, I'd
 rather use some approach that has been demonstrated to work well in
 other systems.  Buffer eviction is a problem that's been around since
 the 1970s, and our algorithm is just about that old.

 Um, if you're claiming that that code dates from Berkeley days, you're
 quite mistaken.

Not the code: the algorithm.  I believe that what we have implemented
is GCLOCK, which is very old.

In the interest of full disclosure, descriptions of exactly what
GCLOCK is seem to vary a bit from paper to paper, but at least some
papers describe the exact algorithm we use.

 We adopted the clock sweep in 2005, after trying a few
 other things whose performance was worse.  I've not seen any argument in
 this thread that suggests we should abandon clock sweep + usage counts
 entirely.  Rather, to me the issue is that we haven't completely gotten
 rid of the last vestiges of the old global freelist approach.

Well, I think that switching from one clock sweep to a clock sweep per
backend would be basically an abandonment of the current approach.
The results might be better or worse, but they'd surely be different.

 BTW, it might be worth pointing out something I was trying to explain
 to Amit at PGCon: the key reason that we went with usage counters rather
 than something like a global LRU chain is that in the fast path where
 ReadBuffer finds the requested block already in a buffer, it does not
 have to contend for any global data structure to update the buffer's
 usage information.  It just has to bump the usage count in the buffer's
 header.  The free list, and the contention for BufFreelistLock, only
 matter in the slow path where you're going to have to incur I/O anyway
 (or at least a visit to the kernel).  That seemed plenty good enough
 in 2005.  Our ambitions have now advanced further, so I'm on board with
 trying to reduce contention here too, but I think it would be a mistake
 to make the fast case any slower.

Totally agreed.  We're not the first people to think of this, either:
CLOCK and GCLOCK have been extensively studied and found to be almost
as good as LRU in selecting good victim pages, but with less
contention.  That's why people are using them.

Here's a paper that defines GCLOCK to be the algorithm we use (page 2,
second column, second paragraph from the bottom), and furthermore
mentions PostgreSQL (top of page 3):

http://staff.aist.go.jp/m.yui/publications/ICDE10_conf_full_409.pdf

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Josh Berkus
Stephen,

 Which is preferred
 when you do a 'grant select' or 'grant role'?  

The local role is preferred, the same way we allow objects in the local
schema to overshadow objects in the global schema.

 Or do we just disallow
 overlaps between per-DB roles and global roles?  If we don't allow
 duplicates, I suspect a lot of the other questions suddenly become a lot
 easier to deal with, but would that be too much of a restriction?

The feature wouldn't be useful if we didn't allow conflicts between two
local role names.  However, we could prohibit conflicts between a local
role name and a global role name if it made the feature considerably
easier.  Users would find workarounds which weren't too arduous.

 How
 would you handle migrating an existing global role to a per-database
 role?

Again, I think it would be OK not handling it.  i.e., the user needs to
do the following:

1. create a new local role
2. reassign all the objects belonging to the global role to the local role
3. drop the global role
4. rename the local role

It'd be somewhat of a PITA, but I suspect that most people using the
local roles feature would recreate their databases from scratch
anyway.  And we could offer some sample scripts for the above on the
wiki and elsewhere.  Obviously, a more elegant migration command would
be ideal, but that could wait for the following PG release; we usually
follow the "make things possible first, and easy later" plan anyway.

Given that I'd love to have this feature, I'm trying to pare down its
requirements to a managable size.  Trying to do everything at once will
only result in the feature stalling until 10.5.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Simon Riggs
On 22 May 2012 18:18, Josh Berkus j...@agliodbs.com wrote:

 1. Ability to have a Role that can only access one Database

 2. Allow user info to be dumped with a database, to make a db
 completely self-consistent

 3. Allow databases to be transportable

 4. Allow users to access tables in >1 database easily, with appropriate 
 rights.

 The last requirement seems completely contradictory to the other three.
  Either we're trying to make databases even more isolated as
 multi-tenant Catalogs, or we're not.  Trying to do both at the same time
 is failure-by-design.

Why is it OK to allow somebody to access multiple schemas in one query,
but not multiple databases? Are you arguing that schemas are also
broken?

I see no failure by design. I see an idea for greater ease of use
being discussed.

 Given that we offer schema as an alternative to multiple databases, and
 users are starting to get used to them, I think that requirement (4) is
 just a bad idea, and not worth pursuing,

Personally, I have long recommended that people use schemas. But
people do use databases and when they do they are pretty much screwed.
I brought this up as a way of improving our ease of use.

 except in the context of pgsql_fdw.

That is not a realistic option.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Josh Berkus

 If I have a customer with 1 database per user, how do they run a query
 against 100 user tables? It would require 100 connections to the
 database. Doing that would require roughly x100 the planning time and
 x100 the connection delay. Larger SQL statements pass their results
 between executor steps using libpq rather than direct calls.

Why is this hypothetical customer using separate databases?  This really
seems like a case of doctor, it hurts when I do this.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index

2012-05-22 Thread Simon Riggs
On 22 May 2012 18:24, Jeff Janes jeff.ja...@gmail.com wrote:
 Now that there are index only scans, there is a use case for having a
 composite index which has the primary key or a unique key as the
 prefix column(s) but with extra columns after that.  Currently you
 would also need another index with exactly the primary/unique key,
 which seems like a waste of storage and maintenance.

 Should there be a way to declare a unique index with the unique
 property applying to a prefix of the indexed columns/expression?  And
 having that, a way to turn that prefix into a primary key constraint?

 Of course this is easier said than done, but is there some reason for
 it not to be a to-do item?

+1

Very useful

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index

2012-05-22 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 Now that there are index only scans, there is a use case for having a
 composite index which has the primary key or a unique key as the
 prefix column(s) but with extra columns after that.  Currently you
 would also need another index with exactly the primary/unique key,
 which seems like a waste of storage and maintenance.

 Should there be a way to declare a unique index with the unique
 property applying to a prefix of the indexed columns/expression?  And
 having that, a way to turn that prefix into a primary key constraint?

 Of course this is easier said than done, but is there some reason for
 it not to be a to-do item?

Um ... other than it being ugly as sin?  I can't say that I can get
excited about this concept.  It'd be better to work on index-organized
tables, which is really more or less what you're wishing for here.
Duplicating most of a table into an index is always going to be a loser
in the end because of the redundant storage.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Simon Riggs
On 22 May 2012 18:35, Josh Berkus j...@agliodbs.com wrote:

 If I have a customer with 1 database per user, how do they run a query
 against 100 user tables? It would require 100 connections to the
 database. Doing that would require roughly x100 the planning time and
 x100 the connection delay. Larger SQL statements pass their results
 between executor steps using libpq rather than direct calls.

 Why is this hypothetical customer using separate databases?  This really
 seems like a case of doctor, it hurts when I do this.

Databases are great for separating things, but sometimes you want to
un-separate them in a practical way.

I'm surprised that you're so negative about an ease of use feature. I
had understood you cared about fixing problems experienced by our
developers.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 1:27 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Ack, part from the bit about OIDs no longer being unique. That might
 be an upgrade issue but its obviously something we wouldn't allow if
 we did that.

And how exactly are you going to disallow that?  We currently enforce
the uniqueness of OIDs within a database by putting a unique index on
the relevant system catalog columns, but that clearly won't work as a
means of guaranteeing cross-database uniqueness.  Unless of course you
put all the catalog entries from all the databases into a single set
of catalogs; but then they're pretty much the same as the schemas we
already have.

 I'm not sure I see changing the caches as being massive surgery.
 Perhaps we could just bypass them altogether.

You're entitled to your opinion on this one, but we have those caches
for a very good reason: the system runs about 20,000 times slower
without them.

 So I'd rather see us put the effort into pgsql_fdw, which, as Florian
 says, will also let us talk to a completely separate server.  If
 you've got multiple databases in the same cluster and really need to
 be doing queries across all of them, that's what schemas are
 supposedly for.  Now I know that people feel that doesn't work as well
 as it needs to, but there again I think it would be easier to fix
 schemas than to make cross-database queries work.

 We're a very long way from making that work well. IMHO easily much
 further than direct access.

 If I have a customer with 1 database per user, how do they run a query
 against 100 user tables? It would require 100 connections to the
 database. Doing that would require roughly x100 the planning time and
 x100 the connection delay. Larger SQL statements pass their results
 between executor steps using libpq rather than direct calls.

 I find it hard to believe FDWs will ever work sufficiently well to fix
 those problems.

If you have got a customer with one database per user, and yet you
want to run queries across all those tables, then you're using the
database system for something for which it is not designed, and it's
probably not going to work very well.  That sounds like a use case for
schemas, or maybe better, some kind of row-level security we don't
have yet - but hopefully will in 9.3, since KaiGai intends to work on
it.  Databases are designed to act as a firewall, so that somebody
using one database isn't affected by what happen in some other
database.  Unfortunately, because of shared catalogs, that's not
completely true, but that's the idea, and if anything we need to
isolate them more, not less.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Daniel Farina
On Tue, May 22, 2012 at 10:43 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 22 May 2012 18:35, Josh Berkus j...@agliodbs.com wrote:

 If I have a customer with 1 database per user, how do they run a query
 against 100 user tables? It would require 100 connections to the
 database. Doing that would require roughly x100 the planning time and
 x100 the connection delay. Larger SQL statements pass their results
 between executor steps using libpq rather than direct calls.

 Why is this hypothetical customer using separate databases?  This really
 seems like a case of doctor, it hurts when I do this.

 Databases are great for separating things, but sometimes you want to
 un-separate them in a practical way.

In my experience, these un-separations are (thankfully) relieved of
the requirement of consistency between databases, and so the contract
is much more favorable.

The planning time problem is quite hard.

However, I think the connection-delay one is not as hard a one to
answer: I think multiplexed protocols are going to become the norm in
the near future (they have been a pretty uncontested part of the SPDY
protocol, for example, after flow control was added) and have a number
of useful properties, and it may be time to consider how we're going
to divorce the notion of one socket implies exactly one backend.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Problem with error response message

2012-05-22 Thread Simon Riggs
On 22 May 2012 12:59, Tatsuo Ishii is...@postgresql.org wrote:

 Also I wonder why conflict with recovery generates the same error code
 as serialization error. For me it seems not a very consistent
 choice. IMO, we should assign a different error code for at least the
 case that brings a totally different effect to the frontend: aborting
 transactions or sessions.

Some conflicts cause ERROR, some cause FATAL. So there is not a
distinct difference.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Josh Berkus

 Why is it OK to allow somebody to access multiple schema in one query,
 but not multiple databases? Are you arguing that schemas are also
 broken?

Because the purpose of a database is to be a Catalog, i.e. an isolated
container, which is not the purpose of schemas.  To the extent to which
we can make the isolation a reality (instead of lossy isolation the
way it is now), we can enable many multitenant hosting designs which
aren't currently possible.  However, pursuing interdatabase queries at
the same time we try to add new isolation features is a doomed effort.

For example, if we created local database users (see other thread), then
what would happen if a local user tries to execute a cross-database
query?  If we enable physical migration of a database to a new Postgres
instance, what happens to standing multi-database views?  If
interdatabase queries are allowed, how do I, as a hosting operator, make
sure that users can't even see the other databases on the system?

 I see no failure by design. I see an idea for greater ease of use
 being discussed.

You can't attempt mutually contradictory requirements and expect to
succeed, or to improve ease of use.  You can't ride two horses,
especially if they're going in opposite directions.

 Personally, I have long recommended that people use schemas. But
 people do use databases and when they do they are pretty much screwed.
 I brought this up as a way of improving our ease of use.

I'm not arguing that we don't have users who would like interdatabase
queries, especially when they port applications from MySQL or MSSQL.  We
have a lot of such users.  However, we *also* have a lot of users who
would like to treat separate databases as virtual private instances of
Postgres, and there's no way to satisfy both goals. We have to choose
one route or the other.

I personally see the isolation case as the more necessary because there
are several workarounds for the inter-database queries issue, but
nothing for the multitenant catalog case.  Also, treating databases as
catalogs is more consistent with our historical approach, and will thus
break less backwards compatibility.

Maybe interdatabase queries are more useful/desired than catalog
features.  If that's the case, then we need to pursue them and abandon
efforts like per-database users. But we have to make a choice.

An alternative idea -- and one which could be deployed a lot faster --
is to come up with a tool which makes it easy to migrate an entire
database into a separate schema or set of schemas in an existing
database.   And improvements to manage schema visibility/path better, I
suppose.
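
For the simple case where everything lives in the public schema, a minimal
manual sketch of that migration (tenant1, sourcedb, and targetdb are
hypothetical names) could be as little as:

-- in the source database, move its objects out of public:
ALTER SCHEMA public RENAME TO tenant1;
CREATE SCHEMA public;
-- then, from a shell:
--   pg_dump --schema=tenant1 sourcedb | psql targetdb

A real tool would of course also have to handle extensions, global objects,
and name collisions, which is where the actual work lies.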

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index

2012-05-22 Thread Robert Haas
On Tue, May 22, 2012 at 1:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 Now that there are index only scans, there is a use case for having a
 composite index which has the primary key or a unique key as the
 prefix column(s) but with extra columns after that.  Currently you
 would also need another index with exactly the primary/unique key,
 which seems like a waste of storage and maintenance.

 Should there be a way to declare a unique index with the unique
 property applying to a prefix of the indexed columns/expression?  And
 having that, a way to turn that prefix into a primary key constraint?

 Of course this is easier said than done, but is there some reason for
 it not to be a to-do item?

 Um ... other than it being ugly as sin?  I can't say that I can get
 excited about this concept.  It'd be better to work on index-organized
 tables, which is really more or less what you're wishing for here.
 Duplicating most of a table into an index is always going to be a loser
 in the end because of the redundant storage.

An index on pgbench_accounts (aid, abalance) is the same size as an
index on pgbench_accounts (aid), but even if it were larger, there's
no theoretical reason it couldn't have enough utility to justify its
existence.   A bigger problem is that creating such an index turns all
of pgbench's write traffic from HOT updates into non-HOT updates,
which means this is probably only going to be a win if the write
volume is minuscule.
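
To make the trade-off concrete, here is a sketch against the stock pgbench
schema (the index name is arbitrary):

-- one index can serve both aid lookups and index-only scans needing abalance:
CREATE INDEX pgbench_accounts_aid_abalance
    ON pgbench_accounts (aid, abalance);

-- but abalance is now an indexed column, so pgbench's bread-and-butter
-- update can no longer be a HOT update:
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = 42;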

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Exclusion Constraints on Arrays?

2012-05-22 Thread David E. Wheeler
On May 22, 2012, at 9:56 AM, Tom Lane wrote:

 It seems probably workable given that we expect the pending list to be
 of fairly constrained size.  However, the commit message referenced
 upthread also muttered darkly about GIN's partial match logic not working
 in amgettuple.  I do not recall the details of that issue, but unless we
 can solve that one too, there's not much use in fixing this one.

Well, what about a GiST operator family/class for arrays?
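
For integer arrays specifically, something along these lines should already
be possible with the intarray contrib module (untested sketch; a
general-purpose opclass covering arbitrary array types is what is missing):

CREATE EXTENSION intarray;  -- provides the GiST opclass gist__int_ops for int[]

CREATE TABLE bookings (
    room_ids int[],
    -- reject any row whose array overlaps an existing row's array:
    EXCLUDE USING gist (room_ids gist__int_ops WITH &&)
);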

Best,

David




Re: [HACKERS] WIP: parameterized function scan

2012-05-22 Thread Robert Haas
On Fri, May 11, 2012 at 5:52 PM, Antonin Houska
antonin.hou...@gmail.com wrote:
 Hello,
 following this short discussion
 http://archives.postgresql.org/message-id/4f5aa202.9020...@gmail.com
 I gave it one more try and hacked the optimizer so that a function can become
 an inner relation in a NL join, parameterized with values from the outer
 relation.

 I tried to explain my thoughts in comments. Other than that:

 1. I haven't tried to use SpecialJoinInfo to constrain join order. Even if
 the order matters in a query like
 SELECT * from a, func(a.i)
 it's still an inner join by nature. SpecialJoinInfo is rarely used for INNER
 joins, and never stored in PlannerInfo. Doing so only for these lateral
 functions would be rather disruptive.

 2. A simple SQL function (i.e. one that gets pulled up into the main query) is
 a special case. The query that results from such a pull-up no longer
 contains any function (and thus is not affected by this patch) but such
 cases need to be newly taken into account and examined / tested (the patch
 unblocks them at parsing stage too).

 3. There might be some open questions about SQL conformance.

 I've spent quite a while looking into the optimizer code and after all I was
 surprised that it didn't require that many changes, at least to make a few
 simple examples work. Am I ignoring any important fact(s)?

This implementation looks different than I'd expect: I would have
thought that it would work by generating paths with param_info set to
the appropriate set of rels to provide the necessary values, rather
than inventing its own mechanism for forcing a nestloop.

Also, I think we will want something that implements the LATERAL()
syntax, rather than just removing the prohibition on lateral
references.
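
For reference, the SQL-standard syntax in question would look roughly like
this (hypothetical table and function names; not accepted by PostgreSQL as of
this thread):

SELECT a.i, f.*
FROM a, LATERAL func(a.i) AS f;  -- func may reference a.i from the outer relation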

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index

2012-05-22 Thread Jeff Janes
On Tue, May 22, 2012 at 10:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 Now that there are index only scans, there is a use case for having a
 composite index which has the primary key or a unique key as the
 prefix column(s) but with extra columns after that.  Currently you
 would also need another index with exactly the primary/unique key,
 which seems like a waste of storage and maintenance.

 Should there be a way to declare a unique index with the unique
 property applying to a prefix of the indexed columns/expression?  And
 having that, a way to turn that prefix into a primary key constraint?

 Of course this is easier said than done, but is there some reason for
 it not to be a to-do item?

 Um ... other than it being ugly as sin?  I can't say that I can get
 excited about this concept.  It'd be better to work on index-organized
 tables, which is really more or less what you're wishing for here.

IOTs would probably be a nice feature too, but adding one more
strategically chosen column to an index is quite different from adding
every column into the index.  At least in the general case.

Cheers,

Jeff



Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index

2012-05-22 Thread Vik Reykja
On Tue, May 22, 2012 at 1:36 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 22 May 2012 18:24, Jeff Janes jeff.ja...@gmail.com wrote:
  Now that there are index only scans, there is a use case for having a
  composite index which has the primary key or a unique key as the
  prefix column(s) but with extra columns after that.  Currently you
  would also need another index with exactly the primary/unique key,
  which seems like a waste of storage and maintenance.
 
  Should there be a way to declare a unique index with the unique
  property applying to a prefix of the indexed columns/expression?  And
  having that, a way to turn that prefix into a primary key constraint?
 
  Of course this is easier said than done, but is there some reason for
  it not to be a to-do item?

 +1

 Very useful


Semi-private note to Simon: isn't this pretty much what I was advocating at
the London meetup last month?


Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Daniel Farina
On Tue, May 22, 2012 at 10:56 AM, Josh Berkus j...@agliodbs.com wrote:
 I'm not arguing that we don't have users who would like interdatabase
 queries, especially when they port applications from MySQL or MSSQL.  We
 have a lot of such users.  However, we *also* have a lot of users who
 would like to treat separate databases as virtual private instances of
 Postgres, and there's no way to satisfy both goals. We have to choose
 one route or the other.

I think the idea of tying features to a physical machine where catalogs are
physically (shared-everything) co-located is one that will not stand for long
as part of a useful contract between a user and the database.  I'd really
like to avoid an extra tier of functionality that exists only for
databases that happen to land on the same physical machine.

I think any inter-database feature should work identically for two
databases across a network and for two databases on one machine/cluster.

Transparent optimizations to deal with the special case of physical
co-location are not contrary to that contract, but I don't have a
sense of how important those optimizations would be before getting a
lot of the usability issues figured out.  Right now, it seems to me
that getting interdatabase usability feeling better is already pretty
hard.

-- 
fdr



Re: [HACKERS] Schema version management

2012-05-22 Thread Daniel Farina
On Mon, May 21, 2012 at 5:25 PM, Joel Jacobson j...@trustly.com wrote:
 If one wants to reuse the splitting-to-files code of the directory
 format, maybe the existing option -F d could be tweaked to output
 both a machine-readable format (current way) and a
 human-friendly tree of files and content (as suggested by my patch).

 I wonder what the option would be called then; having two-char
 options is not an option I guess. Maybe -F t for tree instead of
 directory, as the -F d option only dumps to a single directory and
 not a tree structure?

Is there a reason why the current directory format could not be
adjusted to become more human-readable, beyond its current
machine-friendly purpose?  I realize there is a backwards-compatibility
problem, but it may be better than bloating pg_dump with a new option.

Andrew's approach of reading the TOC would also be good, as then pg_dump
can avoid serving the additional masters of schema versioning and
development usability in addition to dumping.  The TOC is the closest
thing we have to the library-ification of pg_dump in the near-term.
But I don't see how making the directory output format more
human-friendly could be seen as a bad thing overall, except in the
notable axis of implementation complexity.  Silly issues like naming
files on different platforms, case sensitivity, and file length
restrictions may rear their ugly head.

I think about this because in addition to the data types and operators
defined in the development process, there are often small tables that
need to be loaded with content and version controlled as well, rather
like userland equivalents of pg_enum entries.

-- 
fdr



Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Stephen Frost
* Josh Berkus (j...@agliodbs.com) wrote:
 The local role is preferred, the same way we allow objects in the local
 schema to overshadow objects in the global schema.

I would think we'd want the exact opposite.  I don't want my global
'postgres' user to be overwritten by some local one that the admin of
this particular DB created..  

 The feature wouldn't be useful if we didn't allow conflicts between two
 local role names.  However, we could prohibit conflicts between a local
 role name and a global role name if it made the feature considerably
 easier.  Users would find workarounds which weren't too arduous.

Sorry, I meant between the global space and the local spaces.  Clearly we
must allow, and handle cleanly, overlaps between local spaces.

The issue with not allowing global spaces to overlap local ones is that
we'd have to check every local list when creating a global account;
that doesn't seem very easy to do.  On the flip side, allowing
duplicates between global and local would remove the need to check local
lists when creating global accounts, but would add complexity and could
lead to odd semantics when there is a duplicate.

 1. create a new local role
 2. reassign all the objects belonging to the global role to the local role
 3. drop the global role
 4. rename the local role

Right, that seems like it would work fine.

 It'd be somewhat of a PITA, but I suspect that most people using the
 local roles feature would recreate their databases from scratch
 anyway.  And we could offer some sample scripts for the above on the
 wiki and elsewhere.  Obviously, a more elegant migration command would
 be ideal, but that could wait for the following PG release; we usually
 follow the "make things possible first, and easy later" plan anyway.


Sure.

 Given that I'd love to have this feature, I'm trying to pare down its
 requirements to a manageable size.  Trying to do everything at once will
 only result in the feature stalling until 10.5.

If you could help me work out the semantics and the high-level issues,
I'd love to spend time on this for 9.3...

Thanks,

Stephen




Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence

2012-05-22 Thread Ants Aasma
On Tue, May 22, 2012 at 8:28 PM, Robert Haas robertmh...@gmail.com wrote:
 Totally agreed.  We're not the first people to think of this, either:
 CLOCK and GLOCK have been extensively studied and found to be almost
 as good as LRU in selecting good victim pages, but with less
 contention.  That's why people are using them.

 Here's a paper that defines GLOCK to be the algorithm we use (page 2,
 second column, second paragraph from the bottom), and furthermore
 mentions PostgreSQL (top of page 3):

 http://staff.aist.go.jp/m.yui/publications/ICDE10_conf_full_409.pdf

Interesting paper; they make the whole buffer management lock-free,
getting rid of not only the BufFreelistLock but also the BufMappingLocks
and buffer header spinlocks. Now AFAIK BufMappingLocks and buffer
header spinlocks aren't really a big issue for us, and at least the latter
looks like turning it lock-free would entail lots of pretty hairy and
bug-prone code. On the other hand, making the clock sweep lock-free
looks relatively easy.

As far as I can see there is no reason why nextVictimBuffer couldn't
be read, incremented and atomically cmpxchg'ed to let others continue
before checking the usage count. Likewise completePasses and
numBufferAllocs can easily be atomically incremented; bgwriter
shouldn't mind if the values are slightly out of date. The free list
itself is a bit trickier, but if it's still necessary/useful then
SC->firstFreeBuffer and buf->freeNext are in effect a linked-list
stack, and there should be plenty of tested lock-free algorithms floating
around for that. (btw. lastFreeBuffer looks like dead code, is that
correct?)

As for better buffer management algorithms, have you read about
CLOCK-Pro? [1] It looks like it's an adaptive variant of LIRS, the
algorithm MySQL uses (or at least used some time ago). Looks like
Linux kernel devs also at least thought about implementing it [2] [3]
(hard to tell exactly, their docs are pretty chaotic compared to pg's).
According to [4], LIRS is almost as good as or better than LRU; by
extension I'd expect CLOCK-Pro to be better than GLOCK. It still has a
single clock mechanism, so better replacement policies won't help a
whole lot with lock contention on buffer allocation.

[1] http://www.cse.ohio-state.edu/~fchen/paper/papers/usenix05.pdf
[2] http://linux-mm.org/ClockProApproximation
[3] http://linux-mm.org/PageReplacementDesign
[4] http://www.almaden.ibm.com/cs/people/dmodha/ARC.pdf

Cheers,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de



Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Simon Riggs
On 22 May 2012 18:56, Josh Berkus j...@agliodbs.com wrote:

 I'm not arguing that we don't have users who would like interdatabase
 queries, especially when they port applications from MySQL or MSSQL.  We
 have a lot of such users.

Lots and lots, yes.

 However, we *also* have a lot of users who
 would like to treat separate databases as virtual private instances of
 Postgres, and there's no way to satisfy both goals. We have to choose
 one route or the other.

That's only true if you try to satisfy both goals at once, which I'm
not suggesting. So I believe that proposition to be false.

However, given sufficient people speaking against it, I'll leave this idea.

Though I'd suggest that people on this thread spend a little quality
time with FDWs. It's a great direction but there's a long way to go
yet. Sorry to Laurenz, who's done a great job so far on the Oracle
FDW.

 I personally see the isolation case as the more necessary because there
 are several workarounds for the inter-database queries issue

I also want that, per my original post.

 An alternative idea -- and one which could be deployed a lot faster --
 is to come up with a tool which makes it easy to migrate an entire
 database into a separate schema or set of schemas in an existing
 database.   And improvements to manage schema visibility/path better, I
 suppose.

Yes, it is possible to improve things there also.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Draft release notes complete

2012-05-22 Thread Bruce Momjian
On Wed, May 16, 2012 at 05:30:27PM -0400, Bruce Momjian wrote:
 
 I will make the adjustments outlined below as soon as I can.

Done and committed.

---

 
 On Sun, May 13, 2012 at 12:37:52AM -0400, Robert Haas wrote:
  On Sat, May 12, 2012 at 8:11 PM, Euler Taveira eu...@timbira.com wrote:
   On 12-05-2012 10:27, Bruce Momjian wrote:
   How many names on a single item is ideal?  The activity of reviewers and
   their names on commit messages has greatly expanded the number of
   potential names per item.
  
   Main authors only. Reviewers should be mentioned only in the commit log. 
   If I
   coded a feature and Bruce got the idea worked in another patch (that is 
   better
   than mine), I think only Bruce should be credited in release notes (but I
   could be mentioned in the commit log as the feature designer). However, 
   if I
   posted a patch and Robert improved that patch using only 30% of my work, I
   should be credited (as coauthor) because he used a considerable part of 
   my work.
  
  Completely agreed.  If we're going to include names in the release
  notes, I agree that this is the way to do it, and I think it's what we
  have done in prior releases.
  
  I tend to err on the side of crediting people in the commit message
  (of course, occasionally I forget someone who should have been
  included), but I also try to make it clear by the phrasing whose code
  got included and who contributed in some other way - e.g. by reporting
  the problem, coming up with the original idea, or reviewing.  I do
  this in part because I assumed that we'd use that as the criteria for
  including names in the release notes, as we have done in prior
  releases.  So if I write:
  
  Euler Taveira, reviewed by Bruce Momjian, substantially rewritten by me
  
  ...then I expect that to turn up in the release notes as (Euler
  Taveira, Robert Haas).  If I write:
  
  Euler Taveira, reviewed by Bruce Momjian, with minor cleanup by me
  
  ...then I expect that to turn up as (Euler Taveira).  And if I write
  something like:
  
  Inspired by a patch from Euler Taveira.  Review (in earlier versions)
  by Bruce Momjian.
  
  ...then I expect that to turn up as (Robert Haas) or (Robert Haas,
  Euler Taveira).
  
  In doubtful cases, I think it's generally appropriate to err on the
  side of crediting the person who was the original driving force behind
  the patch, and also to err on the side of not crediting the committer.
   But if the committer chopped up the patch and committed something
  significantly different from the original, then they should be
  credited - or blamed - for the result.
  
  -- 
  Robert Haas
  EnterpriseDB: http://www.enterprisedb.com
  The Enterprise PostgreSQL Company
 
 -- 
   Bruce Momjian  br...@momjian.ushttp://momjian.us
   EnterpriseDB http://enterprisedb.com
 
   + It's impossible for everything to be true. +
 

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Florian Pflug
On May22, 2012, at 22:35 , Stephen Frost wrote:
 * Josh Berkus (j...@agliodbs.com) wrote:
 The local role is preferred, the same way we allow objects in the local
 schema to overshadow objects in the global schema.
 
 I would think we'd want the exact opposite.  I don't want my global
 'postgres' user to be overwritten by some local one that the admin of
 this particular DB created..  

From a security POV, yup, you'd want global roles to take precedence.
But OTOH, you wouldn't want your application to suddenly break because
someone created a global role which shadows the local role you've created
a year ago.

Hm… thinking about this… No matter which takes precedence, we'd need some
way to explicitly specify global or local scope anyway. And we'd have to
prevent roles from being named in a way that conflicts with whatever
explicit specification we come up with, even if that causes pain for some
unlucky existing users. Avoiding these conflicts entirely is going to be
impossible, I fear, since we don't currently restrict role names in any way
AFAIK, and we store them in GUCs without any quoting. 

So maybe we should just pick some qualification like prefixing local roles
with 'local.', forbid global roles from starting with 'local.', and be done
with it? Not the most elegant solution maybe, but it avoids surprises...
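
Under that convention the qualification would be explicit in every statement.
A sketch with hypothetical semantics (the statements themselves are valid SQL
today; the local/global distinction is the proposed part):

CREATE ROLE "local.app_reader";  -- the local. prefix would mark a database-local role
CREATE ROLE app_admin;           -- ordinary cluster-wide role; may not start with local.
SET ROLE "local.app_reader";     -- unambiguous: always resolves to the local role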

 The issue with not allowing global spaces to overlap local ones is that
 we'd have to check every local list when creating a global account;
 that doesn't seem very easy to do. 

"Not very easy" is quite an understatement, I fear.  "Very nearly impossible"
is more like it IMHO.

best regards,
Florian Pflug




Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index

2012-05-22 Thread Simon Riggs
On 22 May 2012 19:01, Robert Haas robertmh...@gmail.com wrote:
 On Tue, May 22, 2012 at 1:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 Now that there are index only scans, there is a use case for having a
 composite index which has the primary key or a unique key as the
 prefix column(s) but with extra columns after that.  Currently you
 would also need another index with exactly the primary/unique key,
 which seems like a waste of storage and maintenance.

 Should there be a way to declare a unique index with the unique
 property applying to a prefix of the indexed columns/expression?  And
 having that, a way to turn that prefix into a primary key constraint?

 Of course this is easier said than done, but is there some reason for
 it not to be a to-do item?

 Um ... other than it being ugly as sin?  I can't say that I can get
 excited about this concept.  It'd be better to work on index-organized
 tables, which is really more or less what you're wishing for here.
 Duplicating most of a table into an index is always going to be a loser
 in the end because of the redundant storage.

 An index on pgbench_accounts (aid, abalance) is the same size as an
 index on pgbench_accounts (aid), but even if it were larger, there's
 no theoretical reason it couldn't have enough utility to justify its
 existence.

Agreed

  A bigger problem is that creating such an index turns all
 of pgbench's write traffic from HOT updates into non-HOT updates,
 which means this is probably only going to be a win if the write
 volume is minuscule.

Not sure whether you see that as an argument against the proposal.
This argument applies to any index. In particular, covered indexes are
specifically encouraged by index-only scans, so this is not a reason to
avoid implementing the feature as Jeff describes.

The main reason for the feature as described by Jeff is that it avoids
having 2 indexes when only one is required. In 9.2, with index only
scans, Jeff showed elsewhere that we can get an amazing speed up by
having a covered index. However, what Jeff is noticing is that he
needs 2 indexes on the table: 1 PK on (aid) and another index on (aid,
abalance). The first index can be avoided altogether, allowing a good
improvement in cache efficiency and general performance.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Draft release notes complete

2012-05-22 Thread Bruce Momjian
On Thu, May 10, 2012 at 10:22:58PM +0400, Alexander Korotkov wrote:
 Improve GiST box and point index performance by producing better trees with
 less memory allocation overhead (Alexander Korotkov, Heikki Linnakangas, Kevin
 Grittner)
 Is this note about following two commits?
 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=
 7f3bd86843e5aad84585a57d3f6b80db3c609916
 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=
 d50e1251946a6e59092f0a84fc903532eb599a4f
 These improvements influence not only boxes and points but all geometrical
 datatypes.

OK, new wording:

Improve GiST geometric type index performance by producing better
trees with less memory allocation overhead (Alexander Korotkov)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Draft release notes complete

2012-05-22 Thread Bruce Momjian
On Wed, May 16, 2012 at 10:49:25PM +0300, Heikki Linnakangas wrote:
 On 16.05.2012 22:38, Jeff Janes wrote:
 For item:
 Improve COPY performance by adding tuples to the heap in batches
 (Heikki Linnakangas)
 
 I think we should point out that the batching only applies for COPY
 into unindexed tables.  Nice as the feature is, that is pretty big
 limitation not to mention.
 
 No, it applies to indexed tables too. However, if there are indexes
 on the table, the overhead of updating the indexes will probably
 make any speedup in the heap insertions look tiny in comparison.
 
 The optimization doesn't apply if the table has BEFORE or INSTEAD OF
 triggers, or volatile DEFAULT expressions need to be evaluated.

So, I will assume the existing wording is fine.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index

2012-05-22 Thread Simon Riggs
On 22 May 2012 18:41, Tom Lane t...@sss.pgh.pa.us wrote:

 It'd be better to work on index-organized tables

My earlier analysis showed that IOTs are essentially the same thing as
block-level indexes, referred to as GITs by Heikki. (Robert referred
to these as Lossy Indexes recently, which was not the case - that
aspect was exactly the reason for rejection previously, so we should
not retread that path - indexes can operate at block level without
being lossy).

The number of index pointers is identical in each case, so IOTs are
not any more efficient in terms of space usage or I/O.

IOTs are much more difficult to implement, so I can't see any reason
to work on them. For example, having heap rows migrate on a block
split will cause havoc with our index implementation. We haven't
worked out how to re-join blocks that have split while maintaining
concurrency, so IOTs would require some pretty drastic repacking with
a severe lock type. Please let's avoid IOTs.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Draft release notes complete

2012-05-22 Thread Alexander Korotkov
On Wed, May 23, 2012 at 1:26 AM, Bruce Momjian br...@momjian.us wrote:

 On Thu, May 10, 2012 at 10:22:58PM +0400, Alexander Korotkov wrote:
  Improve GiST box and point index performance by producing better trees
 with
  less memory allocation overhead (Alexander Korotkov, Heikki Linnakangas,
 Kevin
  Grittner)
  Is this note about following two commits?
  http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=
  7f3bd86843e5aad84585a57d3f6b80db3c609916
  http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=
  d50e1251946a6e59092f0a84fc903532eb599a4f
  These improvements influence not only boxes and points but all
 geometrical
  datatypes.

 OK, new wording:

Improve GiST geometric type index performance by producing better
trees with less memory allocation overhead (Alexander Korotkov)


Thanks!

Also, I have some notes about removing reviewers.
For "Improve GiST index build times (Alexander Korotkov)",
I think Heikki Linnakangas should also be listed as an author of that patch
because he didn't only review and commit, but actually put his hands on
the code.

Isn't my authorship of this patch lost now?
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=80da9e68fdd70b796b3a7de3821589513596c0f7
I think earlier this patch was taken into account in the entry "Add support for
range data types". Probably we need a separate entry for this patch?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Josh Berkus

 The issue with not allowing global spaces to overlap local ones is that
 we'd have to check every local list when creating a global account;
 that doesn't seem very easy to do.  On the flip side, allowing
 duplicates between global and local would remove the need to check local
 lists when creating global accounts, but would add complexity and could
 lead to odd semantics when there is a duplicate.

On the other hand, keep in mind that creating a global account can be
slow.  For anyone who has a huge multi-tenant setup with 200 databases,
each with their own local users, creating a new global account will be
an event which occurs once or twice a year.  The main thing is that people
who don't use local accounts shouldn't pay the same check cost.

 If you could help me work out the semantics and the high-level issues,
 I'd love to spend time on this for 9.3...

Syntax seems simple: CREATE LOCAL ROLE ...

For that matter, let's keep other things simple:

1. local roles can inherit only from other local roles
2. global roles can inherit only from other global roles
3. only a global role can be a database owner
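
A sketch of how that might look (hypothetical syntax, following the three
rules above):

CREATE LOCAL ROLE app_owner LOGIN;           -- visible only in the current database
CREATE LOCAL ROLE app_reader;
GRANT app_reader TO app_owner;               -- ok: local role inheriting from a local role
-- GRANT some_global_role TO app_owner;      -- would be rejected under rule 1
-- ALTER DATABASE appdb OWNER TO app_owner;  -- would be rejected under rule 3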


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Changing the concept of a DATABASE

2012-05-22 Thread Josh Berkus

 That's only true if you try to satisfy both goals at once, which I'm
 not suggesting. So I believe that proposition to be false.

Oh, ok.  Per your original email and follow-up arguments, you seemed to
be doing just that.

 An alternative idea -- and one which could be deployed a lot faster --
 is to come up with a tool which makes it easy to migrate an entire
 database into a separate schema or set of schemas in an existing
 database.   And improvements to manage schema visility/path better, I
 suppose.
 
 Yes, it is possible to improve things there also.

Feh, and nested schema, for that matter.  So, there's a fair bit of work
wherever we bite it off.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



[HACKERS] Proposal: add new field to ErrorResponse and NoticeResponse

2012-05-22 Thread Tatsuo Ishii
I described the problem with the possibly localized S field of
ErrorResponse (and NoticeResponse) in the Frontend/Backend protocol.
http://archives.postgresql.org/pgsql-hackers/2012-05/msg00912.php

So I would like to propose a solution for this (for 9.3): add a new
field to ErrorResponse and NoticeResponse. The new field will have the
same value as S except that it's never localized. This will not only
solve the problem I described but possibly also reduce the cost of analyzing
the error/notice messages in frontend programs.

Adding a new field is possible without breaking the current version of
the Frontend/Backend protocol, since the protocol is extensible in
this area: "Since more field types might be added in future, frontends
should silently ignore fields of unrecognized type" (from "46.5. Message
Formats" of the PostgreSQL 9.2 documentation).

BTW, changing the existing S field not to be localized would work, but
I'm afraid it would break backward compatibility.

Comments?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [HACKERS] Per-Database Roles

2012-05-22 Thread Christopher Browne
On Tue, May 22, 2012 at 4:35 PM, Stephen Frost sfr...@snowman.net wrote:
 * Josh Berkus (j...@agliodbs.com) wrote:
 The local role is preferred, the same way we allow objects in the local
 schema to overshadow objects in the global schema.

 I would think we'd want the exact opposite.  I don't want my global
 'postgres' user to be overwritten by some local one that the admin of
 this particular DB created..

In object-orientedness, the usual behaviour is for more specific
methods to override the more generic ones, which is reasonable.  I'm
not certain which direction is more to be preferred, whether:
a) To consider the global user as a default, to be overridden if possible, or
b) To consider the local user as the default, to be overridden if possible.
They're both tenable positions.

But I think I agree with Stephen, that what's desirable, with global
users, is to use them as the override.  They're gonna be expensive,
you should get something for the price :-).
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"



Re: [HACKERS] Proposal: add new field to ErrorResponse and NoticeResponse

2012-05-22 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes:
 I described the problem with the possibly localized S field of
 ErrorResponse (and NoticeResponse) in the Frontend/Backend protocol.
 http://archives.postgresql.org/pgsql-hackers/2012-05/msg00912.php

 So I would like to propose a solution for this (for 9.3): add a new
 field to ErrorResponse and NoticeResponse. The new field will have the
 same value as S except that it's never localized. This will not only
 solve the problem I described but possibly also reduce the cost of analyzing
 the error/notice messages in frontend programs.

This seems like a rather expensive solution to a problem that I'm not
really convinced is real.  Why should a client program care about the
severity level to a greater extent than whether the message is
ErrorResponse or NoticeResponse?  In particular, I'm entirely
unconvinced by your claim that pgpool ought to treat ERROR and FATAL
cases differently.  Whatever it does about session termination ought to
be driven by the connection closure, not by the content of the message.
(As a counterexample, what if the backend crashes without delivering any
message at all?)  Moreover, if we did add this starting in 9.3, it would
still be many years before clients could rely on it being provided,
which means you'd need some other solution anyway.

Another issue is that if we do this, we're essentially (IMO) promising
that the set of severity codes won't change in the future, which may
not be a good thing to promise.

 BTW, changing existing S field not to be localized would work but
 I'm afraid it breaks backward compatibility.

We made it localized intentionally, on the grounds that its principal
and probably sole use was for presentation to human users.  I've not
heard previous complaints about that decision.

regards, tom lane



Re: [HACKERS] Schema version management

2012-05-22 Thread Joel Jacobson
On Wed, May 23, 2012 at 3:24 AM, Daniel Farina dan...@heroku.com wrote:
 Is there a reason why the current directory format could not be
 adjusted to become more human-readable, beyond its current
 machine-friendly purpose?  I realize there is a backwards-compatibility
 problem, but it may be better than bloating pg_dump with a new option.

I like your idea; the format would then be "directory", while the option
would be something like --human-friendly?

Currently, the directory format only dumps the data of tables
into separate files. Everything else goes into the toc.dat file.
To make this work, the stuff written to the toc.dat file must
also be written to separate files.

 But I don't see how making the directory output format more
 human-friendly could be seen as a bad thing overall, except in the
 notable axis of implementation complexity.  Silly issues like naming
 files on different platforms, case sensitivity, and file length
 restrictions may rear their ugly head.

If the function's entire identity arguments were included in the filename,
two dumps of the same schema in two different databases
would be guaranteed to produce the same dump.

This would render some very long filenames for functions with many arguments,
but this problem could at least be reduced by using the shorter aliases for each
data type, such as varchar instead of character varying and timestamptz
instead of timestamp with time zone, etc.

http://www.postgresql.org/docs/devel/static/datatype.html#DATATYPE-TABLE
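
A quick sanity check that the short aliases resolve to exactly the same
types, so nothing is lost by using them in filenames:

SELECT 'varchar'::regtype = 'character varying'::regtype;             -- true
SELECT 'timestamptz'::regtype = 'timestamp with time zone'::regtype;  -- true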

Also, to save even more space, as the name of the function can be long too,
the function name could be made a directory, and the different overloaded
signatures different files, e.g.:

/public/FUNCTION/myfunc/int.sql
/public/FUNCTION/myfunc/int_timestamptz.sql

And functions with no arguments are written to a single file
(suffix .sql to avoid a conflict with a possible directory of the same name):

/public/FUNCTION/myfunc.sql

 I think about this because in addition to the data types and operators
 defined in the development process, there are often small tables that
 need to be loaded with content and version controlled as well, rather
 like userland-equivalents pg_enum entries.

Is there a term for such tables? I use the term "lookup tables", but perhaps
there is a better one?

In my schema, they typically map status IDs, state IDs,
etc. to human-friendly names.

E.g., if Orders is a huge table for all orders, I might have an
OrderStatuses table to
look up all the OrderStatusID columns in Orders:
Orders.OrderStatusID references OrderStatuses.OrderStatusID, and
OrderStatuses.Name is unique and contains the human-friendly name of the status.
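
A minimal sketch of that shape (illustrative names only):

CREATE TABLE OrderStatuses (
    OrderStatusID int  PRIMARY KEY,
    Name          text NOT NULL UNIQUE  -- human-friendly status name
);

CREATE TABLE Orders (
    OrderID       bigserial PRIMARY KEY,
    OrderStatusID int NOT NULL REFERENCES OrderStatuses (OrderStatusID)
);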

These small lookup tables also need to be version controlled, of course.

This is a tricky one though, because you might have small tables with base data,
but with references to other huge tables, which you don't want to
include in your
automatically version controlled schema dump.

I solved this problem by creating a quite complex recursive plpgsql function,
resolving all dependencies and joining only the rows from each table required,
allowing you to specify a regex matching a list of tables, which in
turn resolves
to all tables they have references to, and dumps these tables too, but only the
required rows. The result is a dump of each such table into a separate file,
in a restorable order not causing any dependency problems.
Then I have a similar function to do the restoring.

I use this approach to build a restorable clean test database of any version of
the system, may it be the production or some developer's local version of it.

And also, not to forget: to make it work, all the sequences also need
to be restarted
to the same values as in the original database after the dump is restored.



Re: [HACKERS] Proposal: add new field to ErrorResponse and NoticeResponse

2012-05-22 Thread Tatsuo Ishii
 This seems like a rather expensive solution to a problem that I'm not
 really convinced is real.  Why should a client program care about the
 severity level to a greater extent than whether the message is
 ErrorResponse or NoticeResponse?  In particular, I'm entirely
 unconvinced by your claim that pgpool ought to treat ERROR and FATAL
 cases differently.  Whatever it does about session termination ought to
 be driven by the connection closure, not by the content of the message.
 (As a counterexample, what if the backend crashes without delivering any
 message at all?)

Pgpool already detects session termination and handles it.  BTW,
pgpool uses a blocking socket and issues select(2), watching for
read/exception on the file descriptor before reading from the
socket. It seems that even if the backend terminates the session, select()
does not respond immediately but waits forever. I thought in this case
select() would return and reading the socket would return EOF. So it seems
sometimes it's hard to know if the socket was closed by the backend. This
is one of the reasons why I don't want to rely entirely on the physical
session-termination event.

Moreover, what pgpool would like to do here is provide better service to users,
because pgpool is not just a proxy server to PostgreSQL.

For example, pgpool does failover if one of the PostgreSQL nodes goes
down. Pgpool can detect the postmaster's planned shutdown by watching
the health of PostgreSQL, but it is much better to trigger it by
detecting PostgreSQL's admin shutdown case, because that is quicker
(less cluster downtime) and more reliable (the health check needs to
retry before judging the postmaster down because of a temporary network
error or some such).

 Moreover, if we did add this starting in 9.3, it would
 still be many years before clients could rely on it being provided,
 which means you'd need some other solution anyway.

Pgpool relies on PostgreSQL and cannot provide more services than what
PostgreSQL does. Users can do more if they use a newer version of
PostgreSQL. This applies not only to this particular problem but
to anything else.

 Another issue is that if we do this, we're essentially (IMO) promising
 that the set of severity codes won't change in the future, which may
 not be a good thing to promise.

Why do you care?  The list of severities is clearly stated in the
documentation and you cannot arbitrarily change it without a major version
change anyway. There's nothing different here from any other
feature that PostgreSQL explicitly provides.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [HACKERS] Schema version management

2012-05-22 Thread Tom Lane
Joel Jacobson j...@trustly.com writes:
 If the function's entire identity arguments were included in the filename,
 two dumps of the same schema in two different databases
 would be guaranteed to produce the same dump.

 This would render some very long filenames for functions with many arguments,

Thus, not implausibly, causing the dump to fail entirely on some
filesystems.  Case sensitivity, encoding issues, and special characters
in names (eg slashes or backslashes, depending on platform) are
additional pain points.  This does not sound like a good plan from here.

Taking a step or two back, it seems to me that the thrust of your
proposal is essentially to throw away all dump ordering information,
which does not seem like a particularly good idea either.  It certainly
will not lead to a dump that can be restored reliably.  If the use-case
for this is database comparisons, I think we'd be a lot better off to
write a postprocessing tool for regular dumps to perform such
comparisons, rather than whacking pg_dump around to the point where it's
unable to perform its primary function.

regards, tom lane



Re: [HACKERS] Draft release notes complete

2012-05-22 Thread Bruce Momjian
On Wed, May 23, 2012 at 01:38:06AM +0400, Alexander Korotkov wrote:
 On Wed, May 23, 2012 at 1:26 AM, Bruce Momjian br...@momjian.us wrote:
 
 On Thu, May 10, 2012 at 10:22:58PM +0400, Alexander Korotkov wrote:
  Improve GiST box and point index performance by producing better trees
 with
  less memory allocation overhead (Alexander Korotkov, Heikki Linnakangas,
 Kevin
  Grittner)
  Is this note about following two commits?
  http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=
  7f3bd86843e5aad84585a57d3f6b80db3c609916
  http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=
  d50e1251946a6e59092f0a84fc903532eb599a4f
  These improvements influence not only boxes and points but all
 geometrical
  datatypes.
 
 OK, new wording:
 
Improve GiST geometric type index performance by producing better
trees with less memory allocation overhead (Alexander Korotkov)
 
 
 Thanks!
 
 Also, I've some notes about removing reviewers.
 Improve GiST index build times (Alexander Korotkov)
 I think Heikki Linnakangas should be also listed as author of that patch
 because he didn't only review and commit, but actually put his hands on code.

OK, Heikki added.

 Isn't my authorship of this patch lost now?
 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=
 80da9e68fdd70b796b3a7de3821589513596c0f7
 I think earlier this patch was taken into account in entry Add support for
 range data types. Probably, we need separate entry for this patch?

I thought that was more of a GiST index improvement than a range type
improvement, but I have added your name to the range type item.

Thanks.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Schema version management

2012-05-22 Thread Joel Jacobson
On Wed, May 23, 2012 at 9:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Thus, not implausibly, causing the dump to fail entirely on some
 filesystems.  Case sensitivity, encoding issues, and special characters
 in names (eg slashes or backslashes, depending on platform) are
 additional pain points.  This does not sound like a good plan from here.

This is true, which means some users won't be able to use the feature,
because they are using an ancient OS or have function names with slashes.
Hm, is it even possible to have function names with slashes?

The maximum length of names of tables, functions, etc. in postgres is 63 characters.
A function in postgres can have at most 100 arguments.
The absolute majority of users run filesystems allowing
at least 255-character filenames: http://en.wikipedia.org/wiki/Comparison_of_file_systems
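
The 63-character limit is easy to verify (NAMEDATALEN defaults to 64, and
longer identifiers are truncated with a notice):

CREATE TABLE t (
    a_column_name_that_is_deliberately_longer_than_sixty_three_characters_x int
);
-- NOTICE: identifier will be truncated to its first 63 bytes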

I suppose you have a lot more experience of what postgres installations exist
in the world. Do you think it's common for databases to have problematic
non-ASCII characters in object names?

Is it project policy that all features of all standard tools must be
useful for all users
on all platforms on all databases? Or is it acceptable if some features are only
usable for, say, 90% of the users?

 Taking a step or two back, it seems to me that the thrust of your
 proposal is essentially to throw away all dump ordering information,
 which does not seem like a particularly good idea either.  It certainly
 will not lead to a dump that can be restored reliably.  If the use-case
 for this is database comparisons, I think we'd be a lot better off to
 write a postprocessing tool for regular dumps to perform such
 comparisons, rather than whacking pg_dump around to the point where it's
 unable to perform its primary function.

Not at all; the ordering information is not thrown away. It is preserved
in the dump file specified by the -f option, from which each split file
is included using \i.


Example, this is an extract of the -f dump file in my database:

--
-- PostgreSQL database dump
--

SET statement_timeout = 0;
SET client_encoding = 'SQL_ASCII';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;

--
-- Name: pgx_diag; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA pgx_diag;


ALTER SCHEMA pgx_diag OWNER TO postgres;

-- ... some more schemas, languages etc ...
-- ... and then all the included files:

\i /home/postgres/database/gluepay-split/public/TYPE/dblink_pkey_results.sql
\i /home/postgres/database/gluepay-split/public/TYPE/r_matchedwithdrawal.sql
\i /home/postgres/database/gluepay-split/public/TYPE/r_unapprovedwithdrawal.sql
\i /home/postgres/database/gluepay-split/public/TYPE/ukaccountvalidationchecktype.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/check_name.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/describe_entityid.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/get_linkid.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/set_address.sql
-- ... all the objects ..
\i /home/postgres/database/gluepay-split/public/FK_CONSTRAINT/workershistory_workerid_fkey.sql
\i /home/postgres/database/gluepay-split/public/FK_CONSTRAINT/workershistory_workerstatusid_fkey.sql
\i /home/postgres/database/gluepay-split/public/FK_CONSTRAINT/workershistory_workertypeid_fkey.sql


-- .. and after all the included files comes permissions and stuff:

--
-- Name: public; Type: ACL; Schema: -; Owner: postgres
--

REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM postgres;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO PUBLIC;



Re: [HACKERS] Schema version management

2012-05-22 Thread Joel Jacobson
On the topic of fixing pg_dump to dump in a predictable order, can
someone please update me on the current state of the problem?

I've read through pg_dump_sort.c, and note objects are first sorted in
a type/name-based ordering, then topologically sorted in a way which
minimizes unnecessary rearrangement.

How come this doesn't always generate a predictable order? Any ideas on
how to fix the problem? If someone gives me a hint I might make an
effort trying to implement the idea.

If pg_dump would dump in a predictable order, it would make sense to
dump all overloaded versions of functions sharing the same name in the
same file.

Then it would be _guaranteed_ that two different databases committing their
schemas to a shared VCS would commit exactly the same files if the schemas
are the same, which is not guaranteed unless the dump order is
predictable.

Having thought about it, I agree the idea of putting arguments in filenames
is probably possible, but suboptimal.
It would be much better to write all overloaded functions to the same file
and fix the predictable-dump-order problem.


