date:20100722

Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-22 Thread Markus Wanner


Hi,

On 07/22/2010 12:11 AM, Robert Haas wrote:

I'm not sure why merging the SLRU pools with shared_buffers would
benefit from dynamically allocated shared memory.


Well, I'm not sure how you'd merge SLRU pools with shared_buffers. IMO 
that inherently leads to the problem of allocating memory dynamically.


With such an allocator, I'd say you just port one module after another 
to use that, instead of pre-allocated, fixed portions of shared memory.



I might be at (or possibly beyond) the limit of my ability to comment
intelligently on this without looking more at what you want to use
these imessages for, but I'm still pretty skeptical about the idea of
storing them directly in shared memory.  It's possible, though, that I
am all wet.


Imessages are meant to be a replacement for unix pipes. (To my 
knowledge, those don't spill to disk either, but are blocking as soon as 
Linux considers the pipe to be 'full'. Whenever that is. Or am I wrong 
here?)


The reasons for replacing them were: they consume lots of file 
descriptors, they can only be established between the parent and its 
child process (at least for anonymous pipes that's the case) and last 
but not least, I got told they still aren't fully portable. Another nice 
thing about imessages compared to unix pipes is, that it's a zero-copy 
approach.


Hope that makes my opinions and decisions clearer. Thank you for sharing 
your concerns and for explaining SLRU to me.


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_config problem on Solaris 10u7 X64

2010-07-22 Thread Bjorn Munch

On 21/07 06.57, Robert Haas wrote:
 On Tue, Jul 20, 2010 at 10:52 PM, Amber guxiaobo1...@gmail.com wrote:
   I am trying to build RPostgreSQL on Solaris 10u7 X64, but have problems
  with pg_config, the configure script of RPostgreSQL checks for pg_config and
  got ?checking for pg_config... /usr/bin/pg_config?. In Solaris 10u7 X64,
  three versions of PostgreSQL are installed, there are in
  /usr/postgres/8.2(8.2.9) and /usr/postgres/8.3(8.3.3), the corresponding bin
  files are in /usr/postgres/version/bin and
  /usr/postgres/version/bin/amd64, and the libraries in /usr/bin is 8.1.11
  and it seems a 32bit, and I can?t find the 64bit version bins for 8.1.11.
  My question is how to let RPostgreSQL configure script find the 64bit
  pg_config.
 
 My first guess would be to try changing your PATH before running configure.

Yes, you should set your PATH to have /usr/postgres/version/bin/64
before /usr/bin, if you want to use the 64 bit pg_config for 8.2 or
8.3.

I think 8.1 was only included as 32 bit in Solaris.

-- 
Bjorn Munch, Release Engineer
Oracle Corp. (formerly at Sun)
Trondheim, Norway

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] accentuated letters in text-search

2010-07-22 Thread Andreas Joseph Krogh


On 07/22/2010 07:42 AM, Guillaume Lelarge wrote:

Le 21/07/2010 23:23, Andreas Joseph Krogh a écrit :
   

[...]
I was googling for how to create a text-seach-config with the following
properties:
- Map unicode accentuated letters to an un-accentuated equivalent
- No stop-words
- Lowercase all words

And came over this from -general:
http://www.techienuggets.com/Comments?tx=106813

Then after some more googling I found this:
http://www.sai.msu.su/~megera/wiki/unaccent

Any reason the unaccent dict. and function did not make it in 9.0?

 

Well, AFAICT, it's available in 9.0:

   http://www.postgresql.org/docs/9.0/static/unaccent.html
   


My contrib-foo was pretty low last night it seems, sorry for the noise...

--
Andreas Joseph Kroghandr...@officenet.no
Senior Software Developer / CTO
+-+
OfficeNet AS| The most difficult thing in the world is to |
Rosenholmveien 25   | know how to do a thing and to watch |
1414 Trollåsen  | somebody else doing it wrong, without   |
NORWAY  | comment.|
| |
Tlf:+47 24 15 38 90 | |
Fax:+47 24 15 38 91 | |
Mobile: +47 909  56 963 | |
+-+


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] git config user.email

2010-07-22 Thread Dave Page

On Wed, Jul 21, 2010 at 5:54 PM, Robert Haas robertmh...@gmail.com wrote:
 We need to decide what email addresses committers will use on the new
 git repository when they commit.

Are you are aware that we already have a list of approved addresses
for the committers?

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] git config user.email

2010-07-22 Thread Magnus Hagander

On Thu, Jul 22, 2010 at 10:04, Dave Page dp...@pgadmin.org wrote:
 On Wed, Jul 21, 2010 at 5:54 PM, Robert Haas robertmh...@gmail.com wrote:
 We need to decide what email addresses committers will use on the new
 git repository when they commit.

 Are you are aware that we already have a list of approved addresses
 for the committers?

Are you referring to the mapping list for the git mirror, or something else?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] git config user.email

2010-07-22 Thread Dave Page

On Thu, Jul 22, 2010 at 9:11 AM, Magnus Hagander mag...@hagander.net wrote:
 On Thu, Jul 22, 2010 at 10:04, Dave Page dp...@pgadmin.org wrote:
 On Wed, Jul 21, 2010 at 5:54 PM, Robert Haas robertmh...@gmail.com wrote:
 We need to decide what email addresses committers will use on the new
 git repository when they commit.

 Are you are aware that we already have a list of approved addresses
 for the committers?

 Are you referring to the mapping list for the git mirror, or something else?

Yes, the mapping list.


-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Synchronous replication

2010-07-22 Thread Yeb Havinga


Fujii Masao wrote:

How should the synchronous replication behave when the number of connected
standby servers is less than quorum?

1. Ignore quorum. The current patch adopts this. If the ACKs from all
   connected standbys have arrived, transaction commit is successful
   even if the number of standbys is less than quorum. If there is no
   connected standby, transaction commit always is successful without
   regard to quorum.

2. Observe quorum. Aidan wants this. Until the number of connected
   standbys has become more than or equal to quorum, transaction commit
   waits.

Which is the right behavior of quorum commit? Or we should add new
parameter specifying the behavior of quorum commit?
  
Initially I also expected the quorum to behave like described by 
Aidan/option 2. Also, IMHO the name quorom is a bit short, like having 
maximum but not saying a max_something.


quorum_min_sync_standbys
quorum_max_sync_standbys

The question remains what are the sync standbys? Does it mean not-async? 
Intuitively by looking at the enumeration of replication_mode I'd think 
that the sync standbys are all standby's that operate in a not async 
mode. That would be clearer with a boolean sync (or not) and for sync 
standbys the replication_mode specified.


regards,
Yeb Havinga




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] git config user.email

2010-07-22 Thread Peter Eisentraut

On tor, 2010-07-22 at 09:18 +0100, Dave Page wrote:
 On Thu, Jul 22, 2010 at 9:11 AM, Magnus Hagander mag...@hagander.net wrote:
  On Thu, Jul 22, 2010 at 10:04, Dave Page dp...@pgadmin.org wrote:
  On Wed, Jul 21, 2010 at 5:54 PM, Robert Haas robertmh...@gmail.com wrote:
  We need to decide what email addresses committers will use on the new
  git repository when they commit.
 
  Are you are aware that we already have a list of approved addresses
  for the committers?
 
  Are you referring to the mapping list for the git mirror, or something else?
 
 Yes, the mapping list.

The mapping list was originally composed by me on a whim based on what I
thought people's email addresses tended to be.  It wouldn't hurt to
ponder Robert's points at this time.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] managing git disk space usage

2010-07-22 Thread Peter Eisentraut

On ons, 2010-07-21 at 23:06 +0200, Dimitri Fontaine wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  This does not work as cleanly as you suppose, because some build
  objects are stored in the source tree.  configure being one of them.
  So if you switch branches, configure is rerun even in a VPATH build,
  which is undesirable.
 
 Ouch. Reading -hackers led me to thinking this had received a cleaning
 effort in the Makefiles, so that any generated file appears in the build
 directory. Sorry to learn that's not (yet?) the case.

It is, but not in the back branches.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] git config user.email

2010-07-22 Thread Magnus Hagander

On Thu, Jul 22, 2010 at 11:33, Peter Eisentraut pete...@gmx.net wrote:
 On tor, 2010-07-22 at 09:18 +0100, Dave Page wrote:
 On Thu, Jul 22, 2010 at 9:11 AM, Magnus Hagander mag...@hagander.net wrote:
  On Thu, Jul 22, 2010 at 10:04, Dave Page dp...@pgadmin.org wrote:
  On Wed, Jul 21, 2010 at 5:54 PM, Robert Haas robertmh...@gmail.com 
  wrote:
  We need to decide what email addresses committers will use on the new
  git repository when they commit.
 
  Are you are aware that we already have a list of approved addresses
  for the committers?
 
  Are you referring to the mapping list for the git mirror, or something 
  else?

 Yes, the mapping list.

 The mapping list was originally composed by me on a whim based on what I
 thought people's email addresses tended to be.  It wouldn't hurt to
 ponder Robert's points at this time.

Agreed. And per the discussion at the developer meeting, even if we
don't limit what can be used, we should at least give committters a
chance to pick a different address from the one they are on that list
with today.


*Personally*, I'd prefer to keep using my main email address for
commits. This is what I use for all other projects (postgresql or
others) that I commit or contribute to. It's an address on a domain I
own, and fully control. It's a pretty clear indication of my
identity in the opensource world, whereas close to nobody would know
who m...@postgresql.org is. Plus, email to it tends to be delivered
much quicker and more reliably than the @postgresql.org one - though
that has improvied significantly lately.

But I can also see Roberts point. If a committer doesn't have a
*stable* address, we won't be able to track this committer through
time. Say if he changes job and gets a new address, we can start using
that one for new commits, but not for old ones. And since we grant
commit status to the *person* and not the representative of a company,
using a company email address doesn't quite match up there.

When it comes to using generic @gmail.com or whatever addresses,
that's somewhere in between. For a lot of people, those can definitely
be considered stable, because a change in employment, a move to a
different country, things like that, won't affect the email address
(which it would be if it was an ISP-specific one for example - that
might not transfer to a new country or even a new city).


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] git config user.email

2010-07-22 Thread Robert Haas

On Thu, Jul 22, 2010 at 5:41 AM, Magnus Hagander mag...@hagander.net wrote:
 On Thu, Jul 22, 2010 at 11:33, Peter Eisentraut pete...@gmx.net wrote:
 On tor, 2010-07-22 at 09:18 +0100, Dave Page wrote:
 On Thu, Jul 22, 2010 at 9:11 AM, Magnus Hagander mag...@hagander.net 
 wrote:
  On Thu, Jul 22, 2010 at 10:04, Dave Page dp...@pgadmin.org wrote:
  On Wed, Jul 21, 2010 at 5:54 PM, Robert Haas robertmh...@gmail.com 
  wrote:
  We need to decide what email addresses committers will use on the new
  git repository when they commit.
 
  Are you are aware that we already have a list of approved addresses
  for the committers?
 
  Are you referring to the mapping list for the git mirror, or something 
  else?

 Yes, the mapping list.

 The mapping list was originally composed by me on a whim based on what I
 thought people's email addresses tended to be.  It wouldn't hurt to
 ponder Robert's points at this time.

 Agreed. And per the discussion at the developer meeting, even if we
 don't limit what can be used, we should at least give committters a
 chance to pick a different address from the one they are on that list
 with today.


 *Personally*, I'd prefer to keep using my main email address for
 commits. This is what I use for all other projects (postgresql or
 others) that I commit or contribute to. It's an address on a domain I
 own, and fully control. It's a pretty clear indication of my
 identity in the opensource world, whereas close to nobody would know
 who m...@postgresql.org is. Plus, email to it tends to be delivered
 much quicker and more reliably than the @postgresql.org one - though
 that has improvied significantly lately.

 But I can also see Roberts point. If a committer doesn't have a
 *stable* address, we won't be able to track this committer through
 time. Say if he changes job and gets a new address, we can start using
 that one for new commits, but not for old ones. And since we grant
 commit status to the *person* and not the representative of a company,
 using a company email address doesn't quite match up there.

 When it comes to using generic @gmail.com or whatever addresses,
 that's somewhere in between. For a lot of people, those can definitely
 be considered stable, because a change in employment, a move to a
 different country, things like that, won't affect the email address
 (which it would be if it was an ISP-specific one for example - that
 might not transfer to a new country or even a new city).

As for me, I'd much prefer to be rh...@postgresql.org than
robertmh...@gmail.com.  While it's true that I'm unlikely to lose
control of robertmh...@gmail.com, I might decide I'm no longer happy
with their service, or whatever.  Assuming I stay on the sysadmin
team's good side, rh...@postgresql.org can always be repointed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-22 Thread Robert Haas

On Thu, Jul 22, 2010 at 3:01 AM, Markus Wanner mar...@bluegap.ch wrote:
 On 07/22/2010 12:11 AM, Robert Haas wrote:

 I'm not sure why merging the SLRU pools with shared_buffers would
 benefit from dynamically allocated shared memory.

 Well, I'm not sure how you'd merge SLRU pools with shared_buffers. IMO that
 inherently leads to the problem of allocating memory dynamically.

 With such an allocator, I'd say you just port one module after another to
 use that, instead of pre-allocated, fixed portions of shared memory.

Well, shared_buffers has to be allocated as one contiguous slab
because we index into it that way.  So I don't really see how
dynamically allocating memory could help.  What you'd need is a
different system for assigning buffer tags, so that a particular tag
could refer to a buffer with either kind of contents.

 I might be at (or possibly beyond) the limit of my ability to comment
 intelligently on this without looking more at what you want to use
 these imessages for, but I'm still pretty skeptical about the idea of
 storing them directly in shared memory.  It's possible, though, that I
 am all wet.

 Imessages are meant to be a replacement for unix pipes. (To my knowledge,
 those don't spill to disk either, but are blocking as soon as Linux
 considers the pipe to be 'full'. Whenever that is. Or am I wrong here?)

I think you're right about that.

 The reasons for replacing them were: they consume lots of file descriptors,
 they can only be established between the parent and its child process (at
 least for anonymous pipes that's the case) and last but not least, I got
 told they still aren't fully portable. Another nice thing about imessages
 compared to unix pipes is, that it's a zero-copy approach.

That's sort of approaching the question from the opposite end from
what I was concerned about - I was wondering why you need a unicast
message-passing system.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-22 Thread Markus Wanner


Hi,

On 07/22/2010 01:04 PM, Robert Haas wrote:

Well, shared_buffers has to be allocated as one contiguous slab
because we index into it that way.  So I don't really see how
dynamically allocating memory could help.  What you'd need is a
different system for assigning buffer tags, so that a particular tag
could refer to a buffer with either kind of contents.


Hm.. okay, then it might not be that easy. Thanks for pointing that out.


That's sort of approaching the question from the opposite end from
what I was concerned about - I was wondering why you need a unicast
message-passing system.


Well, the initial Postgres-R approach, being based on Postgres 
6.4.something used unix pipes. I coded imessages as a replacement.


Postgres-R basically uses imessages to pass around change sets and other 
information required to keep replicas in sync. The thinking in terms of 
message passing seems to originate from the GCS, which in itself is a 
message passing system (with some nice extras and varying delivery 
guarantees).


In Postgres-R the coordinator process receives messages from the GCS, 
does some minor controlling and book-keeping, but basically passes on 
the data via imessages to a backrgound worker.


Of course, as mentioned in the bgworker patch, this could be done 
differently. Using solely shared memory, or maybe SLRU to store change 
sets. However, I certainly like the abstraction and guarantees such a 
message passing system provides. It makes things easier to reason about, 
IMO.


For another example, see the bgworker patches, steps 1 and 2, where I've 
changed the current autovacuum infrastructure to use imessages (between 
launcher and worker).


[ And I've heard saying that current multi-core CPU designs tend to like 
message passing systems. Not sure how much that applies to imessages 
and/or how it's used in bgworkers or Postgres-R, though. ]


That much about why using a unicast message-passing system.

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] little mistakes in HS/SR

2010-07-22 Thread Robert Haas

On Thu, Jul 22, 2010 at 1:41 AM, Fujii Masao masao.fu...@gmail.com wrote:
 We should enclose -1 with literal tag.

A quick survey of the documentation as a whole suggests that we
enclose -1 with literal in a few places but more commonly we don't.
I have no position on whether we should do it or not, but maybe we
should try to be consistent throughout the docs?  Or at least have a
consistent rule for deciding what to do in a particular case?

 We seems to have forgotten to add the declaration of WalSndLastCycleHandler().

I've committed this part.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explicit psqlrc

2010-07-22 Thread Robert Haas

On Wed, Jul 21, 2010 at 11:06 PM, David Christensen da...@endpoint.com wrote:

 On Jul 21, 2010, at 12:31 PM, Alvaro Herrera wrote:

 Excerpts from Peter Eisentraut's message of mié jul 21 10:24:26 -0400 2010:
 On tis, 2010-07-20 at 11:48 -0400, Robert Haas wrote:
 It's tempting to propose making .psqlrc apply only in interactive
 mode, period.  But that would be an incompatibility with previous
 releases, and I'm not sure it's the behavior we want, either.

 What is a use case for having .psqlrc be read in noninteractive use?

 Even if there weren't one, why does it get applied to -f but not -c?
 They're both noninteractive.


 So not to let the thread drop, it appears that we're faced with the following 
 situation:

Hmm.  I thought we almost had consensus on changing the historical
behavior of -c.  If we do that, this all gets much simpler.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-22 Thread Robert Haas

On Thu, Jul 22, 2010 at 3:21 AM, Alexander Korotkov
aekorot...@gmail.com wrote:
 On Thu, Jul 22, 2010 at 1:59 AM, Robert Haas robertmh...@gmail.com wrote:

 Ah, I see.  That's pretty compelling, I guess.  Although it still
 seems like a lot of code...

 I think there is a way to merge single-byte and multi-byte versions of
 functions without loss in performance using macros and includes (like in
 'backend/utils/adt/like.c'). Do you think it is acceptable in this case?

I'm not sure.  I'd like to get a second opinion from someone else on
which way to go with this, but unfortunately 2 or 3 of the main
committers are on vacation at the moment.

Does anyone else have an opinion on what to do about this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-22 Thread Greg Smith


Markus Wanner wrote:

On 07/20/2010 09:05 PM, Alvaro Herrera wrote:

Hmm, deriving code from a paper published by IBM sounds like bad news --
who knows what patents they hold on the techniques there?


Yeah, that might be an issue. Note, however, that the lock-based 
variant differs substantially from what's been published. And I sort 
of doubt their patents covers a lot of stuff that's not lock-free-ish.


There's a fairly good mapping of what techniques are patented and which 
were only mentioned in research in the Sun dynamic memory patent at 
http://www.freepatentsonline.com/7328316.html ; that mentions an earlier 
paper by the author of the technique Markus is using, but this was from 
before that one was written.  It looks like Sun has a large portion of 
the patent portfolio in this area, which is particularly troublesome now.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-22 Thread Alexander Korotkov

Such version with macros and includes can look like this:

#ifdef MULTIBYTE
#define NEXT_X (x+= char_lens[i-1])
#define NEXT_Y (y+= y_char_len)
#define CMP (char_cmp(x, char_lens[i-1], y, y_char_len))
#else
#define NEXT_X (x++)
#define NEXT_Y (y++)
#define CMP (*x == *y)
#endif

static int
levenshtein_internal(text *s, text *t, int ins_c, int del_c, int sub_c)
{
intm,
n,
d;
int   *prev;
int   *curr;
inti,
j;
const char *x;
const char *y;
#ifdef MULTIBYTE
int   *char_lens;
inty_char_len;
#endif

/*
 * We should calculate number of characters for multibyte encodings
 */
m = pg_mbstrlen_with_len(VARDATA_ANY(s), VARSIZE_ANY_EXHDR(s));
n = pg_mbstrlen_with_len(VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t));

/*
 * We can transform an empty s into t with n insertions, or a non-empty
t
 * into an empty s with m deletions.
 */
if (m == 0)
return n * ins_c;
if (n == 0)
return m * del_c;

/*
 * For security concerns, restrict excessive CPU+RAM usage. (This
 * implementation uses O(m) memory and has O(mn) complexity.)
 */
if (m  MAX_LEVENSHTEIN_STRLEN ||
n  MAX_LEVENSHTEIN_STRLEN)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 errmsg(argument exceeds the maximum length of %d bytes,
MAX_LEVENSHTEIN_STRLEN)));

/* One more cell for initialization column and row. */
++m;
++n;

/*
 * Instead of building an (m+1)x(n+1) array, we'll use two different
 * arrays of size m+1 for storing accumulated values. At each step one
 * represents the previous row and one is the current row of the
 * notional large array.
 * For multibyte encoding we'll also store array of lengths of
 * characters of first string in order to avoid great number of
  * pg_mblen calls.
 */
#ifdef MULTIBYTE
prev = (int *) palloc(3 * m * sizeof(int));
curr = prev + m;
char_lens = prev + 2 * m;
for (i = 0, x = VARDATA_ANY(s); i  m - 1; i++)
{
char_lens[i] = pg_mblen(x);
x += char_lens[i];
}
char_lens[i] = 0;
#else
prev = (int *) palloc(2 * m * sizeof(int));
curr = prev + m;
#endif

/* Initialize the previous row to 0..cols */
for (i = 0; i  m; i++)
prev[i] = i * del_c;

/*
 * For multibyte encoding we should increase x and y pointers by the
 * results of pg_mblen function. Also we should use CHAR_CMP macros
 * for character comparison.
 */
for (y = VARDATA_ANY(t), j = 1; j  n; NEXT_Y, j++)
{
int   *temp;
#ifdef MULTIBYTE
y_char_len = pg_mblen(y);
#endif

curr[0] = j * ins_c;

for (x = VARDATA_ANY(s), i = 1; i  m; NEXT_X, i++)
{
intins;
intdel;
intsub;

ins = prev[i] + ins_c;
del = curr[i - 1] + del_c;
sub = prev[i - 1] + (CMP ? 0 : sub_c);
curr[i] = Min(ins, del);
curr[i] = Min(curr[i], sub);
}

temp = curr;
curr = prev;
prev = temp;
}

d = prev[m - 1];

/*
 * Because the final value was swapped from the previous row to the
 * current row, that's where we'll find it.
 */
return d;
}

What do you thing about it?


With best regards,
Alexander Korotkov.

Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-22 Thread Alexander Korotkov

On Thu, Jul 22, 2010 at 1:59 AM, Robert Haas robertmh...@gmail.com wrote:

 Ah, I see.  That's pretty compelling, I guess.  Although it still
 seems like a lot of code...

I think there is a way to merge single-byte and multi-byte versions of
functions without loss in performance using macros and includes (like in
'backend/utils/adt/like.c'). Do you think it is acceptable in this case?


With best regards,
Alexander Korotkov.

Re: [HACKERS] Query optimization problem

2010-07-22 Thread Zotov


 20.07.2010 18:31, Robert Haas:

According to the EXPLAIN ANALYZE output, your slow query is
executing in 0.007 ms, and your fast query is executing in 0.026 ms
(i.e. not as quickly as the slow query).  Since you mention that it
takes 7 s further down, I suspect this is not the real EXPLAIN ANALYZE
output on the real data that you're having a problem with.  You might
have better luck if you post the actual EXPLAIN ANALYZE output here.
Incidentally, sorry for not responding sooner to your private email -
I was on vacation last week.  But please do keep all replies on-list
so that everyone can comment.

All that having been said, I think the issue here is that the query
planner isn't inferring that d1.ID=some constant  implies d2.ID=some
constant, even though there's a join clause d1.ID=d2.ID.  I'm not
really sure why it isn't doing that...  I suspect Tom Lane is the only
person who can comment intelligently on that, and he's away this week
(but if anyone else has an idea, feel free to jump in...).
   
Yes, I have a mistake when EXPLAIN ANALYZE without data.. It create 
another plan, because seq scan were faster. Now I send results on real 
data (1 million rows)


*Slow Query:*
-
test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off )SELECT 
d1.ID, d2.ID

test-# FROM DocPrimary d1
test-#  JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
test-# WHERE (d1.ID=234409763
test(# ) OR (d2.ID=234409763);

QUERY PLAN

--
 Merge Join  (cost=.72..53967.30 rows=1 width=8) (actual 
time=6697.847..6697.847 rows=0 loops=1)

   Output: d1.id, d2.id
   Merge Cond: (d1.id = d2.basedon)
   Join Filter: ((d1.id = 234409763) OR (d2.id = 234409763))
   -  Index Scan using id_pk on public.docprimary d1  
(cost=0.00..37224.48 rows=1076842 width=4) (actual time=0.016..3184.474 
rows=1076795 loops=1)

 Output: d1.id, d1.basedon
   -  Index Scan using basedon_idx on public.docprimary d2  
(cost=0.00..46245.14 rows=1076842 width=8) (actual time=0.011..1861.570 
rows=235362 loops=1)

 Output: d2.id, d2.basedon
 Total runtime: 6697.968 ms
(9 rows)
---

*Fast Query:*
--
test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off )SELECT 
d1.ID, d2.ID

test-# FROM DocPrimary d1
test-#  JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
test-# WHERE (d1.ID=234409763 and d2.BasedOn=234409763
test(# ) OR (d2.ID=234409763);
   QUERY PLAN
-
 Nested Loop  (cost=9.01..422.70 rows=1 width=8) (actual 
time=0.145..0.145 rows=0 loops=1)

   Output: d1.id, d2.id
   Join Filter: (((d1.id = 234409763) AND (d2.basedon = 234409763)) OR 
(d2.id = 234409763))
   -  Bitmap Heap Scan on public.docprimary d2  (cost=9.01..136.90 
rows=34 width=8) (actual time=0.141..0.141 rows=0 loops=1)

 Output: d2.id, d2.basedon
 Recheck Cond: ((d2.basedon = 234409763) OR (d2.id = 234409763))
 -  BitmapOr  (cost=9.01..9.01 rows=34 width=0) (actual 
time=0.136..0.136 rows=0 loops=1)
   -  Bitmap Index Scan on basedon_idx  (cost=0.00..4.62 
rows=33 width=0) (actual time=0.078..0.078 rows=0 loops=1)

 Index Cond: (d2.basedon = 234409763)
   -  Bitmap Index Scan on id_pk  (cost=0.00..4.38 rows=1 
width=0) (actual time=0.051..0.051 rows=0 loops=1)

 Index Cond: (d2.id = 234409763)
   -  Index Scan using id_pk on public.docprimary d1  (cost=0.00..8.39 
rows=1 width=4) (never executed)

 Output: d1.id, d1.basedon
 Index Cond: (d1.id = d2.basedon)
 Total runtime: 0.233 ms
(15 rows)
--

I use another fast query:
SELECT d1.ID, d2.ID
FROM DocPrimary d1
 JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
WHERE (*d1.ID=234409763 and d2.BasedOn=234409763*) OR (d2.ID=234409763)

Bolded part of query was d2.BasedOn=234409763 I replace it because it 
can help find way to optimize it automaticaly


So sorry, but i can`t give programmer to do something in Postgres, because
we don`t use it now as supported DB, we think about it and do some tests.
It`s very hard and slow task (support another DB, now we use FireBird, 
and plan use another DB, and look for Postgres and MSSQL, maybe support 
it both as free and commercial DB solution)
And in our department only 4 (with me) programmers who can programm on 
System Level, and only one of us (doesn`t me) know C/C++
We all programming on Delphi... If we choose Postgres as free DB 
platform then I can think about give programmers for Postgress development.


In so large letters my

[HACKERS] SQL/MED security

2010-07-22 Thread Pavel Stehule

Hello

I see a SQL/MED security very unclean - it have to be very vell documented :(

I have a database on port 5401. With user Tom - it require a password

[pa...@nemesis pgsql]$ psql-dev1 postgres -U tom
Password for user tom:
Timing is on.
psql-dev1 (9.0devel)
Type help for help.

postgres=

I can't to do selects to table from this database as some non superuser

create foreign table test_table(id numeric) server dev1;

postgres=# create user mapping for public server dev1;
CREATE USER MAPPING
Time: 2,507 ms
postgres=# select count(*) from test_table;
  count
-
 102
(1 row)

postgres=# create user mapping for tom server dev1 options (user
'tom', password 'tom');
CREATE USER MAPPING
Time: 32,929 ms
postgres=# set role to tom;
SET
Time: 0,450 ms
postgres= select * from test_table;
ERROR:  password is required
DETAIL:  Non-superuser cannot connect if the server does not request a password.
HINT:  Target server's authentication method must be changed.

What I doing wrong?

Regards
Pavel Stehule

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-22 Thread Markus Wanner


Greg,

On 07/22/2010 03:59 PM, Greg Smith wrote:

There's a fairly good mapping of what techniques are patented and which
were only mentioned in research in the Sun dynamic memory patent at
http://www.freepatentsonline.com/7328316.html ; that mentions an earlier
paper by the author of the technique Markus is using, but this was from
before that one was written. It looks like Sun has a large portion of
the patent portfolio in this area, which is particularly troublesome now.


Thanks for the pointer, very helpful.

Anybody ever checked jemalloc, or any other OSS allocator out there 
against these patents?


Remembering similar patent-discussions, it might be better to not bother 
too much and just go with something widely used, based on the assumption 
that such a thing is going to enjoy broad support in case of an attack 
from a patent troll.


What do you think? What'd be your favorite allocator?

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] knngist - 0.8

2010-07-22 Thread Teodor Sigaev


http://www.sigaev.ru/misc/builtin_knngist_core-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_itself-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_proc-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_contrib_pg_trgm-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_contrib_btree_gist-0.8.gz

Changes:
* pg_amop has boolean column amoporder which points to clause's usage of
  operator
* Syntax of CREATE OPERATOR CLASS
 CREATE OPERATOR CLASS ...
 [ORDER] OPERATOR 
  ORDER OPERATOR is marked with pg_amop.amoporder = true
* Bool-returning operator could not be used as ORDER OPERATOR, but type of
  returning value still should have a default Btree operator class.
* Added flag SK_ORDER to SkanKey flag to indicate order operation, so access
  methods (only GiST right now) should check this flag (in previous versions of
  patch GiST checked returned value of operator's function)

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, part.2

2010-07-22 Thread Robert Haas

2010/7/14 KaiGai Kohei kai...@ak.jp.nec.com:
 The attached patch is a part of efforts to support security label
 on database objects.

This is similar to what I had in mind as a design for this feature,
but I have some gripes:

1. I am inclined to suggest the syntax SECURITY LABEL ON ... IS ...,
following COMMENT ON (it's also somewhat similar to the GRANT syntax).
 While the hook in ExecCheckRTPerms() will only allow meaningful
permissions checks on the use of relations, there will presumably be
ongoing demand to add additional hooks to cover other types of
objects, and I'd rather add label support all at once rather than
bit-by-bit.  I also think that the SECURITY LABEL syntax is more
future-proof; we don't need to worry about conflicts in other parts of
the grammar.

2. Similarly, the design of the hook in secabel.h is way too
short-sighted and won't scale to any other object type.  We don't need
or want one hook per object type here.  Use an ObjectAddress.

3. I am firmly of the view that we want to allow multiple security
providers.  I think the way this should work here is that we should
keep a list somewhere of security providers known to the system, which
loadable modules should add themselves to.  Each such security
provider should be represented by a struct containing, at least, a
name and a function that gets called on relabel.  The labels should be
stored in the catalog.  That way there is never any possibility of one
security provider getting a label that was originally applied by some
other security provider.  If we were storing these labels in pg_class
or pg_attribute or similar, the storage cost for this might be worth
worrying about, but as this is a separate catalog, it's not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] bitmap indexes - performance

2010-07-22 Thread Simon Riggs

On Thu, 2010-07-01 at 16:30 +, Leonardo F wrote:

 To sum up: IMHO nor improvements in memory usage nor 
 in startup time would be good reasons to switch to bitmap
 indexes... but bulk index creation time (10 minutes to index
 what it takes 60 minutes with btree... and maybe more if tables
 are bigger) and (maybe) index disk space might be...
 but I'm not 100% convinced...

Are you intending to work on this for 9.1? 

-- 
 Simon Riggs   www.2ndQuadrant.com




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Copy path in Dynamic programming

2010-07-22 Thread vamsi krishna

Hi everybody

I am doing a modification to Dynamic programming algorithm in postgreSQL. I
want to re-use a plan of one  joinrels for another of the same level.
For example,

if lev=5 , and let's say there are two combinations setA = {1,2,3,4,5} and
set B={6,7,8,9,10}.

I want to reuse the plan of {1.2,3,4,5} for {6,7,8,9,10}.

But for this I need to create a relation that holds all the paths for
{6,7,8,9,10}.

I cannot just copy the relation holding join paths from setA. I need to
create a new join relations using build_join_rel(). But I cannot do
add_paths_to_joinrel() because that is the default path creation procedure.
Instead I must be copying the paths of joinrel fo setA.

So what I am doing now is :
1) Access the join relation for setA = {1,2,3,4,5}.

2) Pathlist = pathlist of joinrel(A).

3) foreach path p in pathlist of joinrel(A){
 create new path new_p from path p.
 left_child(new_p) = recursively copy from left child(p)
 right_child(new_p) = recursively copy from right child(p).
 }


4)But to do step 3 I need to know how to create pathkeys and I need to do
the copy recursively, as I am trying to copy the path tree from top to
bottom.

Errors I get in doing this:
   ERROR:  variable not found in subplan target lists

Please give some insight. And please discuss with me on how to go about
doing the path copy.

Thanks
Vamsi

[HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE

2010-07-22 Thread Marko Tiikkaja


Hi,

From the code I understood that when executing a query normally, in 
READ COMMITTED mode, we take a new snapshot for every query that comes 
out of rewrite.  But in an EXPLAIN ANALYZE, we only update the CID of 
the snapshot taken when the EXPLAIN started.


Did I misunderstand the code?  And if I didn't, why do we do this 
differently?



Regards,
Marko Tiikkaja

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] CommitFest 2010-07 week one progress report

2010-07-22 Thread Kevin Grittner

New numbers on where we are with this CommitFest:
 
71 patches were submitted
 3 patches were withdrawn (deleted) by their authors
--
68 total patches currently in the application
--
 3 committed to 9.0
--
65 9.1 patches
--
 1 rejected
 5 returned with feedback
11 committed for 9.1
--
17 9.1 patches disposed
--
48 pending
 8 ready for committer
--
40 will still need reviewer attention
 9 waiting on author to respond to review
--
31 need review before further action
13 Needs Review patches don't have a reviewer assigned
--
18 patches have reviews due within four days or less
 

I had particular concerns about the synchronous replication patches,
because I've seen competing patches like that lead to significant
wheel-spinning and duplication of effort.  To try to minimize that,
I asked Yeb to do a preliminary review of both.  He has not
completed reporting on that, but expects to finish within a day or
two.  My hope is that all parties who want to move this forward will
join efforts and bring their ideas to bear on a single patch.
 
I see that nobody has signed up for the GSoC patch regarding
materialized views.  Do we have an official mentor for this
effort?  Would it be a good or bad idea for that person to do the
review?
 
Although we've had some discussion around Markus Wanner's WIP
refactoring patches and the prerequisite miscellaneous patches,
there's nobody down as a Reviewer for any of them.  I understand
that the six WIP patches are there for feedback, not with
expectation of a commit in this CF, but I'm less clear about the two
prerequisite patches.
 
The WIP patch for serializable transactions with predicate locking
is big, but there's no expectation of a commit this CF, and Joe
assures me he's working on it every day; so nobody should be
concerned about the lack of a review post on that so far.
 
I am hopeful that in the next week we can clear a lot of the pending
patches which have already had some review, and get someone on every
unclaimed patch.
 
-Kevin
 
 
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 68 patches were submitted
  3 patches were withdrawn (deleted) by their authors
 --
 65 total patches currently in the application
 --
  3 committed to 9.0
 --
 62 9.1 patches
 --
  1 rejected
  3 returned with feedback
  1 committed for 9.1
 --
 57 pending
 10 ready for committer
 --
 47 will still need reviewer attention
  6 waiting on author to respond to review
 --
 41 need review before further action
 23 patches Needs Review patches don't have a reviewer assigned
 --
 18 patches have reviews due within four days


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-22 Thread Alvaro Herrera

Excerpts from Markus Wanner's message of jue jul 22 08:49:29 -0400 2010:

 Of course, as mentioned in the bgworker patch, this could be done 
 differently. Using solely shared memory, or maybe SLRU to store change 
 sets. However, I certainly like the abstraction and guarantees such a 
 message passing system provides. It makes things easier to reason about, 
 IMO.

FWIW I don't think you should be thinking in replacing imessages with
SLRU.  I rather think you should be thinking in how can you implement
the imessages API on top of SLRU.  So as far as the coordinator and
background worker are concerned, there wouldn't be any difference --
they keep using the same API they are using today.

Also let me repeat my earlier comment about imessages being more similar
to multixact than to notify.  The content of each multixact entry is
just an arbitrary amount of bytes.  If imessages are numbered from a
monotonically increasing sequence, it should be possible to use a very
similar technique, and perhaps you should be able to reduce locking
requirements as well (write messages with only a shared lock, after
you've determined and reserved the area you're going to write).

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RRR] CommitFest 2010-07 week one progress report

2010-07-22 Thread Robert Haas

On Thu, Jul 22, 2010 at 2:09 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 48 pending
  8 ready for committer

Note that all of the patches except one which are marked as Ready for
Committer were either submitted by a committer, or the reviewer is a
committer.  Of those, 3 are mine.  Two of those are patches I'm
postponing committing at the request of Tom Lane to avoid making the
9.1 and 9.0 trees drift too much before 9.0 is out.  However, given
the rapidly decreasing frequency of commits to the 9.0 branch, I'm not
sure how much longer it makes sense to hold off: I'm currently
thinking I'll commit those two after beta4 wraps.  The last of those
is the 5-key syscaches patch, which only makes sense if knngist needs
it, so it may get bumped to the next CF, as knngist was not submitted
in time for this CF.  The other 4 patches were either submitted or
reviewed by Simon Riggs or Itagaki Takahiro, and I am presuming they
will commit them themselves unless I hear otherwise (in which case I'm
happy to pick them up).  That leaves just one patch that's actually
been reviewed and is ready to be picked up by a committer, so we
actually have a bit of a pipelines stall here.

 18 patches have reviews due within four days or less

This is a very big number... I hope some of these reviews start to
come in soon.  I think this is where our bottleneck is at present.

 Although we've had some discussion around Markus Wanner's WIP
 refactoring patches and the prerequisite miscellaneous patches,
 there's nobody down as a Reviewer for any of them.  I understand
 that the six WIP patches are there for feedback, not with
 expectation of a commit in this CF, but I'm less clear about the two
 prerequisite patches.

It seems to me that the discussion is Alvaro and I are having with
Markus is tilted toward having Markus rewrite the imessages interface
to use an SLRU, in which case neither of them will go in this CF.  I'm
hopeful that Heikki or Tom will comment on this also when they get
back from their vacations.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Copy path in Dynamic programming

2010-07-22 Thread Robert Haas

On Thu, Jul 22, 2010 at 12:38 PM, vamsi krishna
vamsikrishna1...@gmail.com wrote:
 if lev=5 , and let's say there are two combinations setA = {1,2,3,4,5} and
 set B={6,7,8,9,10}.

 I want to reuse the plan of {1.2,3,4,5} for {6,7,8,9,10}.

I don't think that makes any sense.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-22 Thread Markus Wanner


Hi,

On 07/22/2010 08:31 PM, Alvaro Herrera wrote:

FWIW I don't think you should be thinking in replacing imessages with
SLRU.  I rather think you should be thinking in how can you implement
the imessages API on top of SLRU.


Well, I'm rather comparing SLRU with the dynamic allocator. So far I'm 
unconvinced that SLRU would be a better base for imessages than a 
dynamic allocator. (And I'm arguing that SLRU should use a dynamic 
allocator underneath).



So as far as the coordinator and
background worker are concerned, there wouldn't be any difference --
they keep using the same API they are using today.


Agreed, the imessages API to the upper layer doesn't need to care about 
the underlying stuff.



Also let me repeat my earlier comment about imessages being more similar
to multixact than to notify.  The content of each multixact entry is
just an arbitrary amount of bytes.  If imessages are numbered from a
monotonically increasing sequence,


Well, there's absolutely no need to serialize imessages. So they don't 
currently carry any such number. And opposed to multixact entries, they 
are clearly directed at exactly one single consumer. Every consumer has 
its own receive queue. Sending messages concurrently to different 
recipients may happen completely parallelized, without any (b)locking in 
between.


The dynamic allocator is the only part of the chain which might need to 
do some locking to protect the shared resource (memory) against 
concurrent access. Note, however, that wamalloc (as any modern dynamic 
allocator) is parallelized to some extent, i.e. concurrent malloc/free 
calls don't necessarily need to block each other.



it should be possible to use a very
similar technique, and perhaps you should be able to reduce locking
requirements as well (write messages with only a shared lock, after
you've determined and reserved the area you're going to write).


Writing to the message is currently (i.e. imessages-on-dynshmem) done 
without *any* kind of lock held. So that would rather increase locking 
requirements and lower parallelism, I fear.


Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: to_string, to_array functions

2010-07-22 Thread David Fetter

On Wed, Jul 21, 2010 at 02:38:17PM -0400, Merlin Moncure wrote:
 On Wed, Jul 21, 2010 at 2:28 PM, Robert Haas robertmh...@gmail.com wrote:
  Yeah, I'd like some more votes, too.  Aside from what I suggested
  (array_join/array_split), I think my favorite is your #5.
 
 -1 for me for any name that is of the form of:
 type_operation();
 
 we don't have bytea_encode, array_unnest(), date_to_char(), etc.  the
 non-internal ones that we do have (mostly array funcs), are improperly
 named imo.  this is sql, not c.  suppose we want to extend string
 serialization to row types?
 
 why not serialize/unserialize?

Because that's not what the function actually does.

FWIW, I'm for (im|ex)plode, as join()/split(), which I'd otherwise
like, would run into too many issues with JOIN.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [RRR] CommitFest 2010-07 week one progress report

2010-07-22 Thread Kevin Grittner

Robert Haas robertmh...@gmail.com wrote:
 
  18 patches have reviews due within four days or less
 
 This is a very big number... I hope some of these reviews start to
 come in soon.  I think this is where our bottleneck is at present.
 
Based on off-list emails, I expect most of these to clear by this
weekend.  Part of this was caused by people reserving several
patches up front, and posting reviews on some but just now getting
to others; part has been caused by people traveling, and not being
at home base to work on things; part has been due to high priority
non-PostgreSQL issues taking people away from reviewing for a few
days; and the predicate locking/serializable patch is just too big
to review in a few days and I didn't bother taking it out of the
count.
 
You'd probably feel better about things if you had read all the
off-list emails.
 
Not that we couldn't use another reviewer or two.  I'm still
welcoming volunteers!
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] need more ALTER TABLE guards for typed tables

2010-07-22 Thread Peter Eisentraut

On ons, 2010-07-21 at 15:48 -0400, Alvaro Herrera wrote:
 Excerpts from Peter Eisentraut's message of mié jul 21 15:18:58 -0400 2010:
  After some investigation I figured that I need to add two more checks
  into the ALTER TABLE code to prevent certain types of direct changes to
  typed tables (see attached patch).
  
  But it's not clear to me whether such checks should go into the Prep
  or the Exec phases.  Prep seems more plausible to me, but some
  commands such as DropColumn don't have a Prep handler.  A clarification
  would be helpful.
 
 I think if there's no Prep phase, you should add it.  I don't think it
 makes sense to have this kind of check in Exec.

OK, here is my patch for this.  (should go into 9.0 and 9.1)

Index: src/backend/commands/tablecmds.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/tablecmds.c,v
retrieving revision 1.332
diff -u -3 -p -r1.332 tablecmds.c
--- src/backend/commands/tablecmds.c	6 Jul 2010 19:18:56 -	1.332
+++ src/backend/commands/tablecmds.c	22 Jul 2010 14:53:24 -
@@ -288,6 +288,7 @@ static void ATExecSetOptions(Relation re
  Node *options, bool isReset);
 static void ATExecSetStorage(Relation rel, const char *colName,
  Node *newValue);
+static void ATPrepDropColumn(Relation rel, bool recurse, AlterTableCmd *cmd);
 static void ATExecDropColumn(List **wqueue, Relation rel, const char *colName,
  DropBehavior behavior,
  bool recurse, bool recursing,
@@ -327,7 +328,8 @@ static void ATExecEnableDisableTrigger(R
 		   char fires_when, bool skip_system);
 static void ATExecEnableDisableRule(Relation rel, char *rulename,
 		char fires_when);
-static void ATExecAddInherit(Relation rel, RangeVar *parent);
+static void ATPrepAddInherit(Relation child_rel);
+static void ATExecAddInherit(Relation child_rel, RangeVar *parent);
 static void ATExecDropInherit(Relation rel, RangeVar *parent);
 static void copy_relation_data(SMgrRelation rel, SMgrRelation dst,
    ForkNumber forkNum, bool istemp);
@@ -2499,10 +2501,8 @@ ATPrepCmd(List **wqueue, Relation rel, A
 			break;
 		case AT_DropColumn:		/* DROP COLUMN */
 			ATSimplePermissions(rel, false);
+			ATPrepDropColumn(rel, recurse, cmd);
 			/* Recursion occurs during execution phase */
-			/* No command-specific prep needed except saving recurse flag */
-			if (recurse)
-cmd-subtype = AT_DropColumnRecurse;
 			pass = AT_PASS_DROP;
 			break;
 		case AT_AddIndex:		/* ADD INDEX */
@@ -2579,6 +2579,12 @@ ATPrepCmd(List **wqueue, Relation rel, A
 			/* No command-specific prep needed */
 			pass = AT_PASS_MISC;
 			break;
+		case AT_AddInherit:		/* INHERIT */
+			ATSimplePermissions(rel, false);
+			/* This command never recurses */
+			ATPrepAddInherit(rel);
+			pass = AT_PASS_MISC;
+			break;
 		case AT_EnableTrig:		/* ENABLE TRIGGER variants */
 		case AT_EnableAlwaysTrig:
 		case AT_EnableReplicaTrig:
@@ -2591,8 +2597,7 @@ ATPrepCmd(List **wqueue, Relation rel, A
 		case AT_EnableAlwaysRule:
 		case AT_EnableReplicaRule:
 		case AT_DisableRule:
-		case AT_AddInherit:		/* INHERIT / NO INHERIT */
-		case AT_DropInherit:
+		case AT_DropInherit:	/* NO INHERIT */
 			ATSimplePermissions(rel, false);
 			/* These commands never recurse */
 			/* No command-specific prep needed */
@@ -3568,6 +3573,11 @@ static void
 ATPrepAddColumn(List **wqueue, Relation rel, bool recurse,
 AlterTableCmd *cmd)
 {
+	if (rel-rd_rel-reloftype)
+		ereport(ERROR,
+(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg(cannot add column to typed table)));
+
 	/*
 	 * Recurse to add the column to child classes, if requested.
 	 *
@@ -3616,11 +3626,6 @@ ATExecAddColumn(AlteredTableInfo *tab, R
 	Form_pg_type tform;
 	Expr	   *defval;
 
-	if (rel-rd_rel-reloftype)
-		ereport(ERROR,
-(errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg(cannot add column to typed table)));
-
 	attrdesc = heap_open(AttributeRelationId, RowExclusiveLock);
 
 	/*
@@ -4326,6 +4331,19 @@ ATExecSetStorage(Relation rel, const cha
  * correctly.)
  */
 static void
+ATPrepDropColumn(Relation rel, bool recurse, AlterTableCmd *cmd)
+{
+	if (rel-rd_rel-reloftype)
+		ereport(ERROR,
+(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg(cannot drop column from typed table)));
+
+	/* No command-specific prep needed except saving recurse flag */
+	if (recurse)
+		cmd-subtype = AT_DropColumnRecurse;
+}
+
+static void
 ATExecDropColumn(List **wqueue, Relation rel, const char *colName,
  DropBehavior behavior,
  bool recurse, bool recursing,
@@ -4337,11 +4355,6 @@ ATExecDropColumn(List **wqueue, Relation
 	List	   *children;
 	ObjectAddress object;
 
-	if (rel-rd_rel-reloftype)
-		ereport(ERROR,
-(errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg(cannot drop column from typed table)));
-
 	/* At top level, permission check was done in ATPrepCmd, else do it */
 	if (recursing)
 		ATSimplePermissions(rel, false);
@@ -5788,6 +5801,11 @@ ATPrepAlterColumnType(List **wqueue,

Re: [HACKERS] [9.1] pg_stat_get_backend_server_addr

2010-07-22 Thread Peter Eisentraut

On ons, 2010-07-21 at 22:12 -0700, Jeff Davis wrote:
 The two functions aren't perfectly symmetric, because 
 pg_stat_get_backend_server_port() returns -1 if it's a unix socket,
 and
 pg_stat_get_backend_server_addr() returns NULL (which is also
 overloaded
 to mean that you don't have permissions). So, perhaps it's better to
 just have pg_stat_get_backend_server_addr(), which is the one you
 want,
 anyway. 

This mirrors exactly the pg_stat_get_backend_client_* behaviors.  I
don't much like them either, but I think it'd be worse to make it
inconsistent.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [JDBC] Trouble with COPY IN

2010-07-22 Thread Kris Jurka



Per discussion and investigation on the -jdbc list, the server appears to 
violate the frontend/backend protocol when binary copy data is sent to the 
server.  Upon receiving the binary copy end of data marker (a -1 field 
count), the server immediately responds with CommandComplete and 
ReadyForQuery without waiting for the frontend to issue CopyDone or 
CopyFail.  This confuses the JDBC driver as it doesn't think the command 
sequence should have finished yet.


Attached is a patch to make the server continue to consume protocol data 
until instructed to stop by the client in the same way as copying text 
data to the server currently works.


http://www.postgresql.org/docs/8.4/static/protocol-flow.html#PROTOCOL-COPY
http://www.postgresql.org/docs/8.4/static/sql-copy.html

Kris Jurka*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
***
*** 2058,2069  CopyFrom(CopyState cstate)
int16   fld_count;
ListCell   *cur;
  
!   if (!CopyGetInt16(cstate, fld_count) ||
!   fld_count == -1)
{
done = true;
break;
}
  
if (fld_count != attr_count)
ereport(ERROR,
--- 2058,2087 
int16   fld_count;
ListCell   *cur;
  
!   if (!CopyGetInt16(cstate, fld_count))
{
done = true;
break;
}
+   
+   if (fld_count == -1)
+   {
+   /*
+* Reached EOF.  In protocol version 3, we 
should ignore
+* anything after the end of copy data marker 
up to the
+* protocol end of copy data (CopyDone/Fail).
+*/
+   if (cstate-copy_dest == COPY_NEW_FE)
+   {
+   do
+   {
+   cstate-raw_buf_index = 
cstate-raw_buf_len;
+   } while (CopyLoadRawBuf(cstate));
+   continue;
+   }
+   done = true;
+   break;
+   }
  
if (fld_count != attr_count)
ereport(ERROR,

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Add column if not exists (CINE)

2010-07-22 Thread Kjell Rune Skaaraas

Hello,

At least from a performance point of view CINE should never cause a table 
rewrite, it should either execute as a plain CREATE or as nothing. I don't 
mind if the CINE fails if the column already exists but with a different 
definition, so maybe it could be worded differently to make it clearer what you 
get? How about something like:

ALTER TABLE foo ADD OR MATCH COLUMN bar INTEGER
a) doesn't exist = create
b) exists and matches = nothing
c) exists and doesn't match = error

if COR semantics should ever be implmented they could be

ALTER TABLE foo ADD OR REPLACE COLUMN bar INTEGER
a) doesn't exist = create
b) exists and matches = nothing
c) exists and doesn't match = replace

However, I don't want it to fail unless there's an explicit conflict, because I 
tend to modify the columns later:
ALTER TABLE foo ADD COLUMN bar INTEGER
ALTER TABLE foo ALTER COLUMN bar SET DEFAULT 0
ALTER TABLE foo ALTER COLUMN bar SET NOT NULL
ALTER TABLE foo ADD OR MATCH COLUMN bar INTEGER -- succeed or fail?

Personally, I'm only interested to match on TYPE so possibly:
ALTER TABLE foo ADD OR MATCH TYPE COLUMN bar INTEGER -- succeed
ALTER TABLE foo ADD OR MATCH [ALL] COLUMN bar INTEGER -- fail

To be honest, I think this becomes more complicated than a CINE, but as I felt 
that got a rather lukewarm reception maybe this sounds better. The syntax 
leaves it open for COR later, and the matching code should be useful to 
determine if the COR actually needs to do a REPLACE. Opinions?

Regards,
Kjell Rune

--- Den tor 2010-07-22 skrev Simon Riggs si...@2ndquadrant.com:

 Fra: Simon Riggs si...@2ndquadrant.com
 Emne: Re: [HACKERS] Add column if not exists (CINE)
 Til: Tom Lane t...@sss.pgh.pa.us
 Kopi: Robert Haas robertmh...@gmail.com, Heikki Linnakangas 
 heikki.linnakan...@enterprisedb.com, Andrew Dunstan 
 and...@dunslane.net, Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp, 
 Kjell Rune Skaaraas kjell...@yahoo.no, pgsql-hackers@postgresql.org
 Dato: Torsdag 22. juli 2010 02.43
 On Wed, 2010-04-28 at 21:15 -0400,
 Tom Lane wrote:
 
  I still say
  that COR rather than CINE semantics would be
 appropriate for columns.
 
 Viewed from a locking perspective, I would disagree.
 
 COR semantics force a table rewrite, in certain cases. That
 makes it
 hard to predict externally how long the command will run
 for.
 
 As a DBA, I would want a command that executes without
 rewrite (if
 appropriate) or does nothing.
 
 Predictable behaviour is the most important concern.
 
 That isn't necessarily an argument in favour of CINE, which
 seems
 slightly less clear about what we might expect from that,
 in my reading
 at least.
 
 -- 
  Simon Riggs       
    www.2ndQuadrant.com
  PostgreSQL Development, 24x7 Support, Training and
 Services
 
 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] review: psql: edit function, show function commands patch

2010-07-22 Thread Jan Urbański

On 21/07/10 14:43, Pavel Stehule wrote:
 Hello
 
 I am sending a actualised patch.

Hi, thanks!

 I understand to your criticism about line numbering. I have to
 agree. With line numbering the patch is longer. I have a one
 significant reason for it.

   CREATE OR REPLACE FUNCTION public.foo()    RETURNS integer 
    LANGUAGE plpgsql   AS $function$ 1  begin 2return 
 10/0; 3  end;   $function$
 
 postgres=# select foo(); ERROR:  division by zero CONTEXT:  SQL 
 statement SELECT 10/0 PL/pgSQL function foo line 2 at RETURN 
 postgres=#

OK, that convinced me, and also others seem to like the feature. So I'll
just make code remarks this time.


In the \e handling code, if the file name was given and there is no line
number, an uninitialised variable will be passed to do_edit. I see that
you want to avoid passing a magic number to do_edit, but I think you
should just treat anything 0 as that magic value, initialise lineno
with -1 and simplify all these nested ifs.

Typo in a comment:
when wirst row of function - when first row of function

I think the #define for DEFAULT_BODY_SEPARATOR should either be at the
beginning of the file (and then also used in \ef handling) or just
hardcoded in both places.

Any reason why DEFAULT_NAVIGATION_COMMAND has a space in front of it?

Some lines have 80 characters.

If you agree that a -1 parameter do do_edit will mean no lineno, then
I think you can change get_lineno_for_navigation to not take a
backslashResult argument and signal errors by returning -1.

In get_lineno_for_navigation you will have to protect the large comment
block with /*-- otherwise pgindent will reflow it.

PSQL_NAVIGATION_COMMAND needs to be documented in the SGML docs.


Uff, that's all from me, sorry for bringing up all these small issues, I
hope this will go in soon!

I'm going to change it to waiting on author again, mainly because of the
uninitialised variable in \d handling, the rest are just stylistic nitpicks.

Cheers,
Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] review: psql: edit function, show function commands patch

2010-07-22 Thread Robert Haas

On Thu, Jul 22, 2010 at 6:56 PM, Jan Urbański wulc...@wulczer.org wrote:
 the rest are just stylistic nitpicks.

But, if the patch author doesn't fix them, the committer has to, so
your nitpicking is much appreciated, at least by me!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [JDBC] Trouble with COPY IN

2010-07-22 Thread Robert Haas

On Thu, Jul 22, 2010 at 5:34 PM, Kris Jurka bo...@ejurka.com wrote:
 Per discussion and investigation on the -jdbc list, the server appears to
 violate the frontend/backend protocol when binary copy data is sent to the
 server.  Upon receiving the binary copy end of data marker (a -1 field
 count), the server immediately responds with CommandComplete and
 ReadyForQuery without waiting for the frontend to issue CopyDone or
 CopyFail.  This confuses the JDBC driver as it doesn't think the command
 sequence should have finished yet.

 Attached is a patch to make the server continue to consume protocol data
 until instructed to stop by the client in the same way as copying text data
 to the server currently works.

 http://www.postgresql.org/docs/8.4/static/protocol-flow.html#PROTOCOL-COPY
 http://www.postgresql.org/docs/8.4/static/sql-copy.html

 Kris Jurka

I guess the obvious question is whether we shouldn't instead change
the docs to match the behavior.  I suspect there's almost no chance
we'd consider back-patching a change of this type, since it is a clear
behavior change.  And even if we did, there would still be people
running servers with the old behavior with which JDBC and other
drivers would have to cope.  Having two different behaviors might be
worse than the status quo.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, part.2

2010-07-22 Thread KaiGai Kohei

Thanks for your reviewing.

(2010/07/23 0:54), Robert Haas wrote:
 2010/7/14 KaiGai Koheikai...@ak.jp.nec.com:
 The attached patch is a part of efforts to support security label
 on database objects.
 
 This is similar to what I had in mind as a design for this feature,
 but I have some gripes:
 
 1. I am inclined to suggest the syntax SECURITY LABEL ON ... IS ...,
 following COMMENT ON (it's also somewhat similar to the GRANT syntax).
   While the hook in ExecCheckRTPerms() will only allow meaningful
 permissions checks on the use of relations, there will presumably be
 ongoing demand to add additional hooks to cover other types of
 objects, and I'd rather add label support all at once rather than
 bit-by-bit.  I also think that the SECURITY LABEL syntax is more
 future-proof; we don't need to worry about conflicts in other parts of
 the grammar.
 
Hmm. Indeed, we cannot deny the possibility to conflict with other part
in the future, if we use ALTER xxx statement here.

But, right now, we have no statement that starts in noun, rather than verb.
The comment is a noun, but comment on is a phrasal-verb, isn't it?

How about RELABEL object TO label, instead?

The relabel is a transitive-verb, we don't need ON between RELABEL
and object. And, it tries to change a property of the object, so
it seems to me TO is more appropriate than IS.

 2. Similarly, the design of the hook in secabel.h is way too
 short-sighted and won't scale to any other object type.  We don't need
 or want one hook per object type here.  Use an ObjectAddress.
 
I think the relation type is an exceptional object class, because of
the recursion due to the table inheritances.
For other object classes, I also think one security hook which takes
ObjectAddress as an argument is enough to implement.

So, I expect we need two hooks on relabeling eventually.
(One for relation, one for other object classes)

 3. I am firmly of the view that we want to allow multiple security
 providers.  I think the way this should work here is that we should
 keep a list somewhere of security providers known to the system, which
 loadable modules should add themselves to.  Each such security
 provider should be represented by a struct containing, at least, a
 name and a function that gets called on relabel.  The labels should be
 stored in the catalog.  That way there is never any possibility of one
 security provider getting a label that was originally applied by some
 other security provider.  If we were storing these labels in pg_class
 or pg_attribute or similar, the storage cost for this might be worth
 worrying about, but as this is a separate catalog, it's not.
 
What I'm worrying about is that we cannot estimate amount of works when
we expand the concept to row-level security. We will need to revise the
implementation, if individual user tuples have its security label in the
future version.
If we don't support multiple labels right now, it will not be feature
degradation, even if it will be hard to implement multiple label support
for each user tuples. :(

I don't deny worth of multiple security providers concurrently, however,
I doubt whether it should be supported from the beginning, or not.
It seems to me it is not too late after we can find out the way to label
individual user tuples.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, part.2

2010-07-22 Thread Robert Haas

2010/7/22 KaiGai Kohei kai...@ak.jp.nec.com:
 Thanks for your reviewing.
 1. I am inclined to suggest the syntax SECURITY LABEL ON ... IS ...,
 following COMMENT ON (it's also somewhat similar to the GRANT syntax).
   While the hook in ExecCheckRTPerms() will only allow meaningful
 permissions checks on the use of relations, there will presumably be
 ongoing demand to add additional hooks to cover other types of
 objects, and I'd rather add label support all at once rather than
 bit-by-bit.  I also think that the SECURITY LABEL syntax is more
 future-proof; we don't need to worry about conflicts in other parts of
 the grammar.

 Hmm. Indeed, we cannot deny the possibility to conflict with other part
 in the future, if we use ALTER xxx statement here.

 But, right now, we have no statement that starts in noun, rather than verb.
 The comment is a noun, but comment on is a phrasal-verb, isn't it?

 How about RELABEL object TO label, instead?

Well, I like SECURITY LABEL better because it's more clear about what
kind of label we're talking about, but if there's consensus on some
other option it's OK with me.  Actually, we need to work the security
provider name in there too, I think, so perhaps SECURITY LABEL FOR
provider ON object IS labeltext.  I realize it's slightly odd
grammatically, but it's no worse than the COMMENT syntax.

 2. Similarly, the design of the hook in secabel.h is way too
 short-sighted and won't scale to any other object type.  We don't need
 or want one hook per object type here.  Use an ObjectAddress.

 I think the relation type is an exceptional object class, because of
 the recursion due to the table inheritances.
 For other object classes, I also think one security hook which takes
 ObjectAddress as an argument is enough to implement.

 So, I expect we need two hooks on relabeling eventually.
 (One for relation, one for other object classes)

Please explain in more detail.

 3. I am firmly of the view that we want to allow multiple security
 providers.  I think the way this should work here is that we should
 keep a list somewhere of security providers known to the system, which
 loadable modules should add themselves to.  Each such security
 provider should be represented by a struct containing, at least, a
 name and a function that gets called on relabel.  The labels should be
 stored in the catalog.  That way there is never any possibility of one
 security provider getting a label that was originally applied by some
 other security provider.  If we were storing these labels in pg_class
 or pg_attribute or similar, the storage cost for this might be worth
 worrying about, but as this is a separate catalog, it's not.

 What I'm worrying about is that we cannot estimate amount of works when
 we expand the concept to row-level security. We will need to revise the
 implementation, if individual user tuples have its security label in the
 future version.
 If we don't support multiple labels right now, it will not be feature
 degradation, even if it will be hard to implement multiple label support
 for each user tuples. :(

I think it is 100% clear that row-level security will require
completely separate infrastructure, and therefore I'm not even a tiny
bit worried about this.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, part.2

2010-07-22 Thread KaiGai Kohei

(2010/07/23 10:05), Robert Haas wrote:
 2010/7/22 KaiGai Koheikai...@ak.jp.nec.com:
 Thanks for your reviewing.
 1. I am inclined to suggest the syntax SECURITY LABEL ON ... IS ...,
 following COMMENT ON (it's also somewhat similar to the GRANT syntax).
While the hook in ExecCheckRTPerms() will only allow meaningful
 permissions checks on the use of relations, there will presumably be
 ongoing demand to add additional hooks to cover other types of
 objects, and I'd rather add label support all at once rather than
 bit-by-bit.  I also think that the SECURITY LABEL syntax is more
 future-proof; we don't need to worry about conflicts in other parts of
 the grammar.

 Hmm. Indeed, we cannot deny the possibility to conflict with other part
 in the future, if we use ALTER xxx statement here.

 But, right now, we have no statement that starts in noun, rather than verb.
 The comment is a noun, but comment on is a phrasal-verb, isn't it?

 How about RELABELobject  TOlabel, instead?
 
 Well, I like SECURITY LABEL better because it's more clear about what
 kind of label we're talking about, but if there's consensus on some
 other option it's OK with me.  Actually, we need to work the security
 provider name in there too, I think, so perhaps SECURITY LABEL FOR
 provider ON object IS labeltext.  I realize it's slightly odd
 grammatically, but it's no worse than the COMMENT syntax.
 
The FOR provider clause should be optional. I expect most use
cases installs only one security provider, rather than multiple.

If no explicit provider is specified, all the security providers
check the given security label. If two or more providers are here,
of course, either of them will raise an error, because they have
different label formats. It is right.

Anyway, I'd like to implement according to the idea.

  SECURITY LABEL [FOR provider] ON object IS label;

 2. Similarly, the design of the hook in secabel.h is way too
 short-sighted and won't scale to any other object type.  We don't need
 or want one hook per object type here.  Use an ObjectAddress.

 I think the relation type is an exceptional object class, because of
 the recursion due to the table inheritances.
 For other object classes, I also think one security hook which takes
 ObjectAddress as an argument is enough to implement.

 So, I expect we need two hooks on relabeling eventually.
 (One for relation, one for other object classes)
 
 Please explain in more detail.
 
For relations, one SECURITY LABEL statement may relabel multiple tables
when it has child tables, if ONLY clause was not given.
So, we need to obtain oids to be relabeled using find_all_inheritors(),
and need to ask providers whether it allows, or not.
But, obviously, it is specific for relations.

For other object class, the target object to be relabeled is identified
by object in SECURITY LABEL statement. It will be parsed by the upcoming
parse_object.c feature, then it solves the object name to ObjectAddress.
So, we can apply access controls after setting up the ObjectAddress with
common hooks for object classes except for relations, like:

  void check_object_relabel(ObjectAddress object, const char *new_label);

 3. I am firmly of the view that we want to allow multiple security
 providers.  I think the way this should work here is that we should
 keep a list somewhere of security providers known to the system, which
 loadable modules should add themselves to.  Each such security
 provider should be represented by a struct containing, at least, a
 name and a function that gets called on relabel.  The labels should be
 stored in the catalog.  That way there is never any possibility of one
 security provider getting a label that was originally applied by some
 other security provider.  If we were storing these labels in pg_class
 or pg_attribute or similar, the storage cost for this might be worth
 worrying about, but as this is a separate catalog, it's not.

 What I'm worrying about is that we cannot estimate amount of works when
 we expand the concept to row-level security. We will need to revise the
 implementation, if individual user tuples have its security label in the
 future version.
 If we don't support multiple labels right now, it will not be feature
 degradation, even if it will be hard to implement multiple label support
 for each user tuples. :(
 
 I think it is 100% clear that row-level security will require
 completely separate infrastructure, and therefore I'm not even a tiny
 bit worried about this.  :-)
 
Hmm. Are you saying we may degrade the feature when we switch to the
completely separate infrastructure? Is it preferable??

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SQL/MED security

2010-07-22 Thread Itagaki Takahiro

2010/7/22 Pavel Stehule pavel.steh...@gmail.com:
 I see a SQL/MED security very unclean - it have to be very vell documented :(
 ERROR:  password is required
 DETAIL:  Non-superuser cannot connect if the server does not request a 
 password.

The security model of current FDW heavily depends on implementation of
existing COPY FROM and dblink. It should be improved in the future.

Postgresql_fdw is based on dblink, that always requires password
for non-superuser connections. File_fdw is based on COPY FROM,
that only allows superusers to read local files. So, present FDWs
also require password or superuser privileges to read foreign tables.

I think the ideal design is creating foreign tables are restricted
only for superusers, or requiring password on the creation time,
but don't require such privileges on SELECT time. But I didn't
develop such feature at the moment.

There are some security issues here - ALTERing generic options.
Superusers can define file_fdw on a local file, and can grant the
ownership to non-superusers. But the non-superusers should
not modify 'filename' option with ALTER OPTION. If allowed, they
can read another local file on the server.

There is another security issue: password can be retrieved by all users to
query system catalogs. The password is stored in generic options,
that are visible for all users and not encrypted. We can allow users to read
other normal options, but should not password and local filesystem path.
However, we don't have such equipments for now.

Such security issues are ones of the most complex problems in FDW.
Comments and ideas welcome.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch (for 9.1) string functions

2010-07-22 Thread Itagaki Takahiro

2010/7/21 Pavel Stehule pavel.steh...@gmail.com:
 It is about 2% slower for UTF8 encoding. So it isn't significant for me.
 I agree with your changes. Thank You very much

Thanks. The core-part is almost ready to commit.
I'll continue to review the contrib part.

But I found there is a design issue in format() :
Appending a '%' is common use-case, but format() cannot append
% char without any spaces between placeholder and the raw % char.

itagaki=# SELECT format('%%%', 10), format('% %%', 10);
 format | format
+
 %10| 10 %
(1 row)

It is a design issue, and RAISE in PL/pgSQL has the same issue, too.
Do we accept the restriction? Or should we add another escape
syntax and/or placeholder pattern?

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch for 9.1: initdb -C option

2010-07-22 Thread KaiGai Kohei

David,

I checked your patch. Then, there are a few comments in the code.
Apart from the discussion in this thread, would you fix them please.

| *** a/src/bin/initdb/initdb.c
| --- b/src/bin/initdb/initdb.c
| *** static char infoversion[100];
| *** 111,116 
| --- 111,117 
|   static bool caught_signal = false;
|   static bool output_failed = false;
|   static intoutput_errno = 0;
| + static char **append_config_buffer;
|
|   /* defaults */
|   static intn_connections = 10;

It should be initialized by NULL.

| + static char **
| + append_line(char **src, char *line)
| + {
| + char **buf;
| + int i;
| + int src_lines = 0;
| +
| + if (!line)
| + return src;
| +
| + for (i=0; src[i]; i++)
| + src_lines++;
| +
| + /* we assume that anything existing in the buffer already has been
| +  * xstrdup'd, and so only xstrdup new lines
| +  */
| + buf = (char**) pg_malloc((src_lines + 2) * (sizeof(char *)));
| + memcpy(buf,src,(sizeof(char *))*src_lines);
| +
| + buf[src_lines] = xstrdupln(line);
| + buf[src_lines+1] = 0;
| +
| + return buf;
| + }

The append_line() is just append one lines on the tail of append_config_buffer
array. Why is it needed to compute number of elements for each invocations?
If we have a static variable to hold number of elements in append_config_buffer,
we can just implement it as follows:
  append_config_buffer[append_config_buffer_index++] = xstrdupln(line);
  append_config_buffer[append_config_buffer_index] = NULL;

And, it calls pg_malloc() for each invocations, but it may be costful.
Could you allocate 100 elements at first, then reallocate it only when
append_config_buffer_index reaches the boundary of the array?

And, don't use 0 for pointers, instead of NULL.


Anyway, it is an obvious feature, and seems to me works fine.

However, it is not clear for me how do we make progress this feature.
If we support a command to include other configuration, it also needs
to patch on the postgresql backend, not only initdb.
In my personal opinion, as long as you don't need any special configuration
under the single user mode or bootstraping mode launched by initdb,
we can modify it using shell script or others later.

It seems to me we need to make clear where is our consensus at first. :(

Thanks,

(2010/07/15 9:16), KaiGai Kohei wrote:
 David,
 
 I'd like to volunteer reviewing your patch at first in this commit fest.
 
 We already had a few comments on the list before. I want to see your
 opinion for the suggestions prior to code reviews.
 
 Itagaki-san suggested:
 |  Enclosed is a patch to add a -C option to initdb to allow you to easily
 |  append configuration directives to the generated postgresql.conf file
 | Why don't you use just echo 'options'  $PGDATA/postgresql.conf ?
 | Could you explain where the -C options is better than initdb + echo?
 
 Greg suggested:
 | We had a patch not quite make it for 9.0 that switched over the 
 postgresql.conf
 | file to make it easy to scan a whole directory looking for configuration 
 files:
 | 
 http://archives.postgresql.org/message-id/9837222c0910240641p7d75e2a4u2cfa6c1b5e603...@mail.gmail.com
 |
 | The idea there was to eventually reduce the amount of postgresql.conf 
 hacking
 | that initdb and other tools have to do.  Your patch would add more code into
 | a path that I'd like to see reduced significantly.
 |
 | That implementation would make something easy enough for your use case too
 | (below untested but show the general idea):
 |
 | $ for cluster in 1 2 3 4 5 6;
 |  do initdb -D data$cluster
 |  (
 |  catEOF
 |  port = 1234$cluster;
 |  max_connections = 10;
 |  shared_buffers=1M;
 |  EOF
 |  )  data$cluster/conf.d/99clustersetup
 | done
 |
 | This would actually work just fine for what you're doing right now if you 
 used
 |   data$cluster/postgresql.conf for that next to last line there.
 | There would be duplicates, which I'm guessing is what you wanted to avoid 
 with
 | this patch, but the later values set for the parameters added to the end 
 would
 | win and be the active ones.
 
 Peter suggested:
 |  Enclosed is a patch to add a -C option to initdb to allow you to easily
 |  append configuration directives to the generated postgresql.conf file
 |  for use in programmatic generation.
 | I like this idea, but please use small -c for consistency with the
 | postgres program.
 
 It seems to me what Greg suggested is a recent trend. Additional 
 configurations
 within separated files enables to install/uninstall third-party plugins easily
 from the perspective of packagers, rather than consolidated configuration.
 
 However, $PGDATA/postgresql.conf is created on initdb, so it does not belong
 to a certain package. I don't have certainty whether the recent trend is also
 suitable for us, or not.
 
 Thanks,
 
 (2010/03/29 14:04), David Christensen wrote:
 Hackers,

 Enclosed is a patch to add a -C option to initdb to allow you to easily

Re: [HACKERS] security label support, part.2

2010-07-22 Thread Robert Haas

2010/7/22 KaiGai Kohei kai...@ak.jp.nec.com:
 Well, I like SECURITY LABEL better because it's more clear about what
 kind of label we're talking about, but if there's consensus on some
 other option it's OK with me.  Actually, we need to work the security
 provider name in there too, I think, so perhaps SECURITY LABEL FOR
 provider ON object IS labeltext.  I realize it's slightly odd
 grammatically, but it's no worse than the COMMENT syntax.

 The FOR provider clause should be optional. I expect most use
 cases installs only one security provider, rather than multiple.

 If no explicit provider is specified, all the security providers
 check the given security label. If two or more providers are here,
 of course, either of them will raise an error, because they have
 different label formats. It is right.

Hmm.  How about if there's just one provider loaded, you can omit it,
but if you fail to specify it and there's 1 loaded, we just throw an
error saying you didn't specify whose label it is.

 So, I expect we need two hooks on relabeling eventually.
 (One for relation, one for other object classes)

 Please explain in more detail.

 For relations, one SECURITY LABEL statement may relabel multiple tables
 when it has child tables, if ONLY clause was not given.
 So, we need to obtain oids to be relabeled using find_all_inheritors(),
 and need to ask providers whether it allows, or not.
 But, obviously, it is specific for relations.

 For other object class, the target object to be relabeled is identified
 by object in SECURITY LABEL statement. It will be parsed by the upcoming
 parse_object.c feature, then it solves the object name to ObjectAddress.
 So, we can apply access controls after setting up the ObjectAddress with
 common hooks for object classes except for relations, like:

  void check_object_relabel(ObjectAddress object, const char *new_label);

So just construct an ObjectAddress for each relation and call the
check function once for each.

 I think it is 100% clear that row-level security will require
 completely separate infrastructure, and therefore I'm not even a tiny
 bit worried about this.  :-)

 Hmm. Are you saying we may degrade the feature when we switch to the
 completely separate infrastructure? Is it preferable??

Uh... no, not really.  I'm saying that I don't think we're backing
ourselves into a corner.  What makes you think we are?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch for 9.1: initdb -C option

2010-07-22 Thread Robert Haas

2010/7/22 KaiGai Kohei kai...@ak.jp.nec.com:
 Anyway, it is an obvious feature, and seems to me works fine.

So this makes it sound like you like the feature.

 However, it is not clear for me how do we make progress this feature.
 If we support a command to include other configuration, it also needs
 to patch on the postgresql backend, not only initdb.

I don't know what this means.

 In my personal opinion, as long as you don't need any special configuration
 under the single user mode or bootstraping mode launched by initdb,
 we can modify it using shell script or others later.

But here it sounds like you're saying we don't need the feature
because may as well just edit postgresql.conf by hand.

So I'm confused.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch for 9.1: initdb -C option

2010-07-22 Thread KaiGai Kohei

(2010/07/23 13:00), Robert Haas wrote:
 2010/7/22 KaiGai Koheikai...@ak.jp.nec.com:
 Anyway, it is an obvious feature, and seems to me works fine.
 
 So this makes it sound like you like the feature.
 
 However, it is not clear for me how do we make progress this feature.
 If we support a command to include other configuration, it also needs
 to patch on the postgresql backend, not only initdb.
 
 I don't know what this means.
 
 In my personal opinion, as long as you don't need any special configuration
 under the single user mode or bootstraping mode launched by initdb,
 we can modify it using shell script or others later.
 
 But here it sounds like you're saying we don't need the feature
 because may as well just edit postgresql.conf by hand.
 
 So I'm confused.
 
Sorry for the confusion.

What I wanted to say is the patch itself is fine but we need to make consensus
before the detailed code reviewing.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security label support, part.2

2010-07-22 Thread KaiGai Kohei

(2010/07/23 12:56), Robert Haas wrote:
 2010/7/22 KaiGai Koheikai...@ak.jp.nec.com:
 Well, I like SECURITY LABEL better because it's more clear about what
 kind of label we're talking about, but if there's consensus on some
 other option it's OK with me.  Actually, we need to work the security
 provider name in there too, I think, so perhaps SECURITY LABEL FOR
 provider ON object IS labeltext.  I realize it's slightly odd
 grammatically, but it's no worse than the COMMENT syntax.

 The FORprovider clause should be optional. I expect most use
 cases installs only one security provider, rather than multiple.

 If no explicitprovider  is specified, all the security providers
 check the given security label. If two or more providers are here,
 of course, either of them will raise an error, because they have
 different label formats. It is right.
 
 Hmm.  How about if there's just one provider loaded, you can omit it,
 but if you fail to specify it and there's1 loaded, we just throw an
 error saying you didn't specify whose label it is.
 
Perhaps, we need to return the caller a state whether one provider checked
the given label at least, or not.

If invalid provider was specified so nobody checked it, nobody returns
the caller a state of checked, then it raises an error to notice invalid
security provider.

If valid provider was specified, only specified provider checks the given
label, and returns the caller a state of it was checked by .

If it was omitted, all the providers try to check the given label, but it
has mutually different format, so one of providers will raise an error at
least.

It means we have to specify the provider when two or more providers are
loaded, but not necessary when just one provider.

 So, I expect we need two hooks on relabeling eventually.
 (One for relation, one for other object classes)

 Please explain in more detail.

 For relations, one SECURITY LABEL statement may relabel multiple tables
 when it has child tables, if ONLY clause was not given.
 So, we need to obtain oids to be relabeled using find_all_inheritors(),
 and need to ask providers whether it allows, or not.
 But, obviously, it is specific for relations.

 For other object class, the target object to be relabeled is identified
 byobject  in SECURITY LABEL statement. It will be parsed by the upcoming
 parse_object.c feature, then it solves the object name to ObjectAddress.
 So, we can apply access controls after setting up the ObjectAddress with
 common hooks for object classes except for relations, like:

   void check_object_relabel(ObjectAddress object, const char *new_label);
 
 So just construct an ObjectAddress for each relation and call the
 check function once for each.
 
OK, I'll revise it.

 I think it is 100% clear that row-level security will require
 completely separate infrastructure, and therefore I'm not even a tiny
 bit worried about this.  :-)

 Hmm. Are you saying we may degrade the feature when we switch to the
 completely separate infrastructure? Is it preferable??
 
 Uh... no, not really.  I'm saying that I don't think we're backing
 ourselves into a corner.  What makes you think we are?
 
Sorry, meaning of the last question was unclear for me Is it a idiom?

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch (for 9.1) string functions

2010-07-22 Thread Pavel Stehule

2010/7/23 Itagaki Takahiro itagaki.takah...@gmail.com:
 2010/7/21 Pavel Stehule pavel.steh...@gmail.com:
 It is about 2% slower for UTF8 encoding. So it isn't significant for me.
 I agree with your changes. Thank You very much

 Thanks. The core-part is almost ready to commit.
 I'll continue to review the contrib part.

 But I found there is a design issue in format() :
 Appending a '%' is common use-case, but format() cannot append
 % char without any spaces between placeholder and the raw % char.

 itagaki=# SELECT format('%%%', 10), format('% %%', 10);
  format | format
 +
  %10    | 10 %
 (1 row)

 It is a design issue, and RAISE in PL/pgSQL has the same issue, too.
 Do we accept the restriction? Or should we add another escape
 syntax and/or placeholder pattern?


I prefer a current behave - RAISE statement uses same and it is not
reported as bug for ten years, what I read a mailing lists. I would to
have a FORMAT implementation simple as possible.

and there is simple workaround:

postgres=# create or replace function fx()
returns void as $$
begin
  raise notice '%', '%';
end;
$$ language plpgsql;
CREATE FUNCTION
Time: 560.063 ms
postgres=# select fx();
NOTICE:  %
 fx


(1 row)

Regards

Pavel Stehule

 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

52 matches

Mail list logo