Re: [HACKERS] WIP patch for parallel pg_dump
> On 12/07/2010 01:22 AM, Tom Lane wrote:
>> Josh Berkus writes:
>>>> However, if you were doing something like parallel pg_dump you could
>>>> just run the parent and child instances all against the slave, so the
>>>> pg_dump scenario doesn't seem to offer much of a supporting use-case for
>>>> worrying about this. When would you really need to be able to do it?
>>>
>>> If you had several standbys, you could distribute the work of the
>>> pg_dump among them. This would be a huge speedup for a large database,
>>> potentially, thanks to parallelization of I/O and network. Imagine
>>> doing a pg_dump of a 300GB database in 10min.
>>
>> That does sound kind of attractive. But to do that I think we'd have to
>> go with the pass-the-snapshot-through-the-client approach. Shipping
>> internal snapshot files through the WAL stream doesn't seem attractive
>> to me.
>
> this kind of functionality would also be very useful/interesting for
> connection poolers/loadbalancers that are trying to distribute load
> across multiple hosts and could use that to at least give some sort of
> consistency guarantee.

In addition to this, that will greatly help query based replication tools such as pgpool-II. Sounds great.

-- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] To Signal The postmaster
Hi, I want to signal the postmaster (with a user-defined signal) once I have created the trigger file (for switchover). I want to reduce my switchover time further by signaling the postmaster as soon as the trigger file is created. Is this provision already there in the code? SIGUSR1 and SIGUSR2 are already handled in the postmaster; are they used for this purpose? If I handle my own signal using SIGUSR1/SIGUSR2, is that possible without affecting the other functionality of the existing code? -- Thanks & Regards, Aaliya Zarrin (+91)-9160665888
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/07/2010 01:22 AM, Tom Lane wrote: > Josh Berkus writes: >>> However, if you were doing something like parallel pg_dump you could >>> just run the parent and child instances all against the slave, so the >>> pg_dump scenario doesn't seem to offer much of a supporting use-case for >>> worrying about this. When would you really need to be able to do it? > >> If you had several standbys, you could distribute the work of the >> pg_dump among them. This would be a huge speedup for a large database, >> potentially, thanks to parallelization of I/O and network. Imagine >> doing a pg_dump of a 300GB database in 10min. > > That does sound kind of attractive. But to do that I think we'd have to > go with the pass-the-snapshot-through-the-client approach. Shipping > internal snapshot files through the WAL stream doesn't seem attractive > to me. this kind of functionality would also be very useful/interesting for connection poolers/loadbalancers that are trying to distribute load across multiple hosts and could use that to at least give some sort of consistency guarantee. Stefan
Re: [HACKERS] profiling connection overhead
On 07/12/10 10:48, Josh Berkus wrote: > >> It seems plausible to fix the first one, but how would you fix the >> second one? You either allow SET ROLE (which you need, to support the >> pooler changing authorization), or you don't. There doesn't seem to be >> a usable middleground. > > Well, this is why such a pooler would *have* to be built into the > backend. It would need to be able to SET ROLE even though SET ROLE > would not be accepted over the client connection. There's actually another way to do that that could be retrofitted onto an existing external pooler. It's not lovely, but if the approach above proved too hard... SET ROLE could accept a cookie / one-time password that had to be passed to RESET ROLE in order for RESET ROLE to accept the command. SET ROLE fred WITH COOKIE 'goqu8Mi6choht8ie'; -- hand to the user -- blah blah user work blah -- returned by the user RESET ROLE WITH COOKIE 'goqu8Mi6choht8ie'; The tricky bit might be that the user should still be permitted to SET ROLE, but only to roles that the role the pooler switched them to ("fred") has rights to SET ROLE to, not to roles that the pooler user its self has rights to switch to. > We'd also need > bookkeeping to track the ROLE (and other GUCs) of each client connection > and reset them whenever that client connection switches back. I'm really interested in this direction. Taken just a little further, it could bring Pg to the point where query executors (backends) are separated from connection state, so a given backend could pick up and work on queries by several different connections in rapid succession. The advantage there is that idle connections would become cheap, low-overhead affairs. As I (poorly) understand how Pg is designed it'd only be possible for a backend to work on queries that act on the same database, it couldn't really switch databases. That'd still be a real bonus especially for newer users who don't realize they *need* a connection pool. 
-- System & Network Administrator POST Newspapers
Re: [HACKERS] wal_sender_delay is still required?
Alvaro Herrera writes: > Maybe we should have a single tunable for processes that just sleep > waiting for events or postmaster death. For example pgstats has a > hardcoded 2 seconds, and the archiver process has a hardcoded value too > AFAICS. That would make sense once we get to the point where for all of those processes, the sleep delay *only* affects the time to notice postmaster death. Right now I think there are still several other behaviors mixed in with that, and not all of them necessarily want the same response time. regards, tom lane
Re: [HACKERS] Re: Rethinking hint bits WAS: Protecting against unexpected zero-pages: proposal
On Nov 14, 2010, at 3:40 PM, Greg Stark wrote:
> On Sun, Nov 14, 2010 at 8:52 PM, Josh Berkus wrote:
>> For example, imagine if the hint bits were moved to a separate per-table bitmap outside the table instead of being stored with each row, as the current FSM is.
>
> How many times do we have to keep going around the same block? We *already* have a separate bitmap outside the table for transaction commit bits. It's the clog. The only reason the hint bits exist is to cache that so we don't need to do extra I/O to check tuple visibility. If the hint bits are moved outside the table then they serve no purpose whatsoever. Then you have an additional I/O to attempt to save an additional I/O.

Are you sure hint bits are only for I/O savings? Calculating visibility from CLOG involves a hell of a lot more CPU than checking a hint bit. It would be extremely interesting if the CPU overhead wasn't very noticeable, however. That would mean we *only* have to worry about CLOG I/O, and there's probably a lot of ways around that (memory mapping CLOG is one possibility), especially considering that 4G isn't exactly a large amount of memory these days.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell)   http://jim.nasby.net
Re: [HACKERS] wal_sender_delay is still required?
On Tue, Dec 7, 2010 at 12:22 PM, Robert Haas wrote: >> Fair enough. How about increasing the default to 10 seconds? >> Since bgwriter already uses 10s as a nap time when there is no >> configured activity, I think that 10s is a no-nonsense default value. > > What do we get out of making this non-configurable? It would make configuring replication simpler, I think. But I agree to just increase the default value of wal_sender_delay rather than dropping it. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] wal_sender_delay is still required?
On Mon, Dec 6, 2010 at 10:07 PM, Fujii Masao wrote: > On Tue, Dec 7, 2010 at 11:49 AM, Tom Lane wrote: >> Fujii Masao writes: >>> One problem with the patch is that it takes longer (at most 10s) to >>> detect the unexpected death of postmaster (by calling PostmasterIsAlive()). >>> This is OK for me. But does anyone want to specify the delay to detect >>> that within a short time? >> >> Oh. Hm. I'm hesitant to remove the setting if there's still some >> behavior that it would control. Maybe we should just crank up the >> default value instead. > > Fair enough. How about increasing the default to 10 seconds? > Since bgwriter already uses 10s as a nap time when there is no > configured activity, I think that 10s is a no-nonsense default value. What do we get out of making this non-configurable? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP patch for parallel pg_dump
We may need other means to ensure that the snapshot is available on the slave. It could be a bit too early to use the snapshot on the slave depending upon the delay of WAL replay. -- Koichi Suzuki 2010/12/7 Tom Lane : > marcin mank writes: >> On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane wrote: >>> IIRC, in old discussions of this problem we first considered allowing >>> clients to pull down an explicit representation of their snapshot (which >>> actually is an existing feature now, txid_current_snapshot()) and then >>> upload that again to become the active snapshot in another connection. > >> Could a hot standby use such a snapshot representation? I.e. same >> snapshot on the master and the standby? > > Hm, that's a good question. It seems like it's at least possibly > workable, but I'm not sure if there are any showstoppers. The other > proposal of publish-a-snapshot would presumably NOT support this, since > we'd not want to ship the snapshot temp files down the WAL stream. > > However, if you were doing something like parallel pg_dump you could > just run the parent and child instances all against the slave, so the > pg_dump scenario doesn't seem to offer much of a supporting use-case for > worrying about this. When would you really need to be able to do it? > > regards, tom lane
Re: [HACKERS] wal_sender_delay is still required?
On Tue, Dec 7, 2010 at 11:49 AM, Tom Lane wrote: > Fujii Masao writes: >> One problem with the patch is that it takes longer (at most 10s) to >> detect the unexpected death of postmaster (by calling PostmasterIsAlive()). >> This is OK for me. But does anyone want to specify the delay to detect >> that within a short time? > > Oh. Hm. I'm hesitant to remove the setting if there's still some > behavior that it would control. Maybe we should just crank up the > default value instead. Fair enough. How about increasing the default to 10 seconds? Since bgwriter already uses 10s as a nap time when there is no configured activity, I think that 10s is a no-nonsense default value. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] wal_sender_delay is still required?
Excerpts from Tom Lane's message of Mon Dec 06 23:49:52 -0300 2010: > Fujii Masao writes: > > One problem with the patch is that it takes longer (at most 10s) to > > detect the unexpected death of postmaster (by calling PostmasterIsAlive()). > > This is OK for me. But does anyone want to specify the delay to detect > > that within a short time? > > Oh. Hm. I'm hesitant to remove the setting if there's still some > behavior that it would control. Maybe we should just crank up the > default value instead. Maybe we should have a single tunable for processes that just sleep waiting for events or postmaster death. For example pgstats has a hardcoded 2 seconds, and the archiver process has a hardcoded value too AFAICS. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] profiling connection overhead
On Mon, Dec 6, 2010 at 9:37 PM, Alvaro Herrera wrote: > Excerpts from Robert Haas's message of Mon Dec 06 23:09:56 -0300 2010: >> On Mon, Dec 6, 2010 at 2:47 PM, Josh Berkus wrote: >> > >> >> Please explain more precisely what is wrong with SET SESSION >> >> AUTHORIZATION / SET ROLE. >> > >> > 1) Session GUCS do not change with a SET ROLE (this is a TODO I haven't >> > had any time to work on) >> > >> > 2) Users can always issue their own SET ROLE and then "hack into" other >> > users' data. >> >> Makes sense. It would be nice to fix those issues, independent of >> anything else. > > It seems plausible to fix the first one, but how would you fix the > second one? You either allow SET ROLE (which you need, to support the > pooler changing authorization), or you don't. There doesn't seem to be > a usable middleground. You could add a protocol message that does a "permanent" role switch in a way that can't be undone except by another such protocol message. Then connection poolers could simply refuse to proxy that particular message. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] wal_sender_delay is still required?
Fujii Masao writes: > One problem with the patch is that it takes longer (at most 10s) to > detect the unexpected death of postmaster (by calling PostmasterIsAlive()). > This is OK for me. But does anyone want to specify the delay to detect > that within a short time? Oh. Hm. I'm hesitant to remove the setting if there's still some behavior that it would control. Maybe we should just crank up the default value instead. regards, tom lane
Re: [HACKERS] profiling connection overhead
> It seems plausible to fix the first one, but how would you fix the > second one? You either allow SET ROLE (which you need, to support the > pooler changing authorization), or you don't. There doesn't seem to be > a usable middleground. Well, this is why such a pooler would *have* to be built into the backend. It would need to be able to SET ROLE even though SET ROLE would not be accepted over the client connection. We'd also need bookkeeping to track the ROLE (and other GUCs) of each client connection and reset them whenever that client connection switches back. Mind you, I'm not entirely convinced that the end result of this would be performant. And it would certainly be complicated. I think that we should start by dealing with the simplest situation, ignoring SET ROLE and GUC issues for now. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Per-column collation
On Sun, Dec 5, 2010 at 01:04, Peter Eisentraut wrote: > Here is an updated patch to address the issues discussed during this > commitfest. I found another issue in the patch; ILIKE in a WHERE clause doesn't work. It was surprising because LIKE in a WHERE clause and ILIKE in a SELECT list work as expected: - SELECT * FROM pg_class WHERE relname LIKE 'pg%' - SELECT relname ILIKE 'pg%' FROM pg_class;

postgres=# SELECT name, setting FROM pg_settings WHERE name IN ('lc_ctype', 'lc_collate', 'server_encoding');
      name       | setting
-----------------+---------
 lc_collate      | C
 lc_ctype        | C
 server_encoding | UTF8
(3 rows)

postgres=# SELECT * FROM pg_class WHERE relname ILIKE 'pg%';
ERROR:  no collation was derived

-- Itagaki Takahiro
Re: [HACKERS] wal_sender_delay is still required?
On Tue, Dec 7, 2010 at 12:08 AM, Tom Lane wrote: > Fujii Masao writes: >> Walsender doesn't need the periodic wakeups anymore, thanks to >> the latch feature. So wal_sender_delay is basically useless now. >> How about dropping wal_sender_delay or increasing the default >> value? > > If we don't need it, we should remove it. The attached patch removes wal_sender_delay and uses hard-coded 10 seconds instead of wal_sender_delay as the delay between activity rounds for walsender. One problem with the patch is that it takes longer (at most 10s) to detect the unexpected death of postmaster (by calling PostmasterIsAlive()). This is OK for me. But does anyone want to specify the delay to detect that within a short time? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center drop_wal_sender_delay_v1.patch Description: Binary data
Re: [HACKERS] profiling connection overhead
Excerpts from Robert Haas's message of Mon Dec 06 23:09:56 -0300 2010: > On Mon, Dec 6, 2010 at 2:47 PM, Josh Berkus wrote: > > > >> Please explain more precisely what is wrong with SET SESSION > >> AUTHORIZATION / SET ROLE. > > > > 1) Session GUCS do not change with a SET ROLE (this is a TODO I haven't > > had any time to work on) > > > > 2) Users can always issue their own SET ROLE and then "hack into" other > > users' data. > > Makes sense. It would be nice to fix those issues, independent of > anything else. It seems plausible to fix the first one, but how would you fix the second one? You either allow SET ROLE (which you need, to support the pooler changing authorization), or you don't. There doesn't seem to be a usable middleground. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
All, Geirth's results from his FreeBSD 7.1 server using 8.4's test_fsync:

Simple write timing:
        write                         0.007081

Compare fsync times on write() and non-write() descriptor:
If the times are similar, fsync() can sync data written on a different descriptor.
        write, fsync, close           5.937933
        write, close, fsync           8.056394

Compare one o_sync write to two:
        one 16k o_sync write          7.366927
        two 8k o_sync writes         15.299300

Compare file sync methods with one 8k write:
        (o_dsync unavailable)
        open o_sync, write            7.512682
        (fdatasync unavailable)
        write, fsync                  5.856480

Compare file sync methods with two 8k writes:
        (o_dsync unavailable)
        open o_sync, write           15.472910
        (fdatasync unavailable)
        write, fsync                  5.880319

... again, open_sync does not look very impressive. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Spread checkpoint sync
Excerpts from Greg Smith's message of Sun Dec 05 20:02:48 -0300 2010: > What ends up happening if you push toward fully sync I/O is the design > you see in some other databases, where you need multiple writer > processes. Then requests for new pages can continue to allocate as > needed, while keeping any one write from blocking things. That's one > sort of a way to simulate asynchronous I/O, and you can substitute true > async I/O instead in many of those implementations. We didn't have much > luck with portability on async I/O when that was last experimented with, > and having multiple background writer processes seems like overkill; > that whole direction worries me. Why would multiple bgwriter processes worry you? Of course, it wouldn't work to have multiple processes trying to execute a checkpoint simultaneously, but what if we separated the tasks so that one process is in charge of checkpoints, and another one is in charge of the LRU scan? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
Josh Berkus writes: > OK, patch coming then. Right now test_fsync aborts when O_DIRECT fails. > What should I have it do instead? Report that it fails, and keep testing the other methods. regards, tom lane
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
On 12/6/10 6:10 PM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Dec 6, 2010 at 9:04 PM, Josh Berkus wrote: >>> Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available. >>> From my run, it looks like even so regular fsync might be better than >>> open_sync. > >> But I think you need to use fsync_writethrough if you actually want >> durability. > > Yeah. Unless your laptop contains an SSD, those numbers are garbage on > their face. So that's another problem with test_fsync: it omits > fsync_writethrough. Yeah, the issue with test_fsync appears to be that it's designed to work without OS-specific switches no matter what, not to accurately reflect how we access WAL. I'll see if I can do better. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
Robert Haas writes: > On Mon, Dec 6, 2010 at 9:04 PM, Josh Berkus wrote: >> Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available. >> From my run, it looks like even so regular fsync might be better than >> open_sync. > But I think you need to use fsync_writethrough if you actually want > durability. Yeah. Unless your laptop contains an SSD, those numbers are garbage on their face. So that's another problem with test_fsync: it omits fsync_writethrough. regards, tom lane
Re: [HACKERS] profiling connection overhead
On Mon, Dec 6, 2010 at 2:47 PM, Josh Berkus wrote: > >> Please explain more precisely what is wrong with SET SESSION >> AUTHORIZATION / SET ROLE. > > 1) Session GUCS do not change with a SET ROLE (this is a TODO I haven't > had any time to work on) > > 2) Users can always issue their own SET ROLE and then "hack into" other > users' data. Makes sense. It would be nice to fix those issues, independent of anything else. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
On Mon, Dec 6, 2010 at 9:04 PM, Josh Berkus wrote: > >> Mac OS X: Like Solaris, there's a similar mechanism but it's not >> O_DIRECT; see >> http://stackoverflow.com/questions/2299402/how-does-one-do-raw-io-on-mac-os-x-ie-equivalent-to-linuxs-o-direct-flag >> for notes about the F_NOCACHE feature used. Same basic situation as >> Solaris; there's an API, but PostgreSQL doesn't use it yet. > > Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available. > From my run, it looks like even so regular fsync might be better than > open_sync. But I think you need to use fsync_writethrough if you actually want durability. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
> Whether it's complex or not isn't really the issue. The issue is that > what test_fsync is testing had better match what the backend does, or > people will be making choices based on not-comparable test results. > I think we should have test_fsync just automatically fold in O_DIRECT > the same way the backend does. OK, patch coming then. Right now test_fsync aborts when O_DIRECT fails. What should I have it do instead? -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
> Mac OS X: Like Solaris, there's a similar mechanism but it's not > O_DIRECT; see > http://stackoverflow.com/questions/2299402/how-does-one-do-raw-io-on-mac-os-x-ie-equivalent-to-linuxs-o-direct-flag > for notes about the F_NOCACHE feature used. Same basic situation as > Solaris; there's an API, but PostgreSQL doesn't use it yet. Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available. From my run, it looks like even so regular fsync might be better than open_sync. Results from a MacBook:

Sidney-Stratton:fsync josh$ ./test_fsync
Loops = 1

Simple write:
        8k write                      2121.004/second

Compare file sync methods using one write:
        (open_datasync unavailable)
        open_sync 8k write            1993.833/second
        (fdatasync unavailable)
        8k write, fsync               1878.154/second

Compare file sync methods using two writes:
        (open_datasync unavailable)
        2 open_sync 8k writes         1005.009/second
        (fdatasync unavailable)
        8k write, 8k write, fsync     1709.862/second

Compare open_sync with different sizes:
        open_sync 16k write           1728.803/second
        2 open_sync 8k writes          969.416/second

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different descriptor.)
        8k write, fsync, close        1772.572/second
        8k write, close, fsync        1939.897/second

-- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Timeout for asynchronous replication Re: Timeout and wait-forever in sync rep
On Mon, Dec 6, 2010 at 11:54 PM, Heikki Linnakangas wrote: > Hmm, that's actually a quite different timeout than what's required for > synchronous replication. In synchronous replication, you need to get an > acknowledgment within a timeout. This patch only puts a timeout on how long > we wait to have enough room in the TCP send buffer. That doesn't seem all > that useful. Yeah, I'm planning to implement that timeout for synchronous replication later. Since I thought that we should implement the timeout for *asynchronous* replication first and then extend it for synchronous replication, I created this patch. This kind of timeout is required for asynchronous replication since there is no acknowledgment from the standby. Most of the patch implements the non-blocking send function and changes walsender so that it uses that function instead of the existing blocking one. This will be infrastructure for the timeout for synchronous replication. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
Steve, > If you tell me which options to pgbench and which .conf file settings > you'd like to see I can probably arrange to run some tests on AIX. Compile and run test_fsync in PGSRC/src/tools/fsync. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
On 12/06/2010 08:38 PM, Tom Lane wrote:
> Josh Berkus writes:
>> Making it support O_DIRECT would be possible but more complex; I don't
>> see the point unless we think we're going to have open_sync_with_odirect
>> as a separate option.
>
> Whether it's complex or not isn't really the issue. The issue is that
> what test_fsync is testing had better match what the backend does, or
> people will be making choices based on not-comparable test results.
> I think we should have test_fsync just automatically fold in O_DIRECT
> the same way the backend does.

Indeed. We were quite confused for a while when we were dealing with this about a week ago, and my handwritten test program failed as expected but test_fsync didn't. Anything other than behaving just as the backend does violates POLA, in my view.

cheers

andrew
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
Josh Berkus writes: > Making it support O_DIRECT would be possible but more complex; I don't > see the point unless we think we're going to have open_sync_with_odirect > as a separate option. Whether it's complex or not isn't really the issue. The issue is that what test_fsync is testing had better match what the backend does, or people will be making choices based on not-comparable test results. I think we should have test_fsync just automatically fold in O_DIRECT the same way the backend does. regards, tom lane
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
Greg Smith writes: > So my guess is that some small percentage of Windows users might notice > a change here, and some testing on FreeBSD would be useful too. That's > about it for platforms that I think anybody needs to worry about. To my mind, O_DIRECT is not really the key issue here, it's whether to prefer O_DSYNC or fdatasync. I looked back in the archives, and I think that the main reason we prefer O_DSYNC when available is the results I got here: http://archives.postgresql.org/pgsql-hackers/2001-03/msg00381.php which demonstrated a performance benefit on HPUX 10.20, though with a test tool much more primitive than test_fsync. I still have that machine, although the disk that was in it at the time died awhile back. What's in there now is a Seagate ST336607LW spinning at 10000 RPM (166 rev/sec) and today I get numbers like this from test_fsync:

Simple write:
        8k write                      28331.020/second

Compare file sync methods using one write:
        open_datasync 8k write          161.190/second
        open_sync 8k write              156.478/second
        8k write, fdatasync              54.302/second
        8k write, fsync                  51.810/second

Compare file sync methods using two writes:
        2 open_datasync 8k writes        81.702/second
        2 open_sync 8k writes            80.172/second
        8k write, 8k write, fdatasync    40.829/second
        8k write, 8k write, fsync        39.836/second

Compare open_sync with different sizes:
        open_sync 16k write              80.192/second
        2 open_sync 8k writes            78.018/second

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different descriptor.)
        8k write, fsync, close           52.527/second
        8k write, close, fsync           54.092/second

So *on that rather ancient platform* there's a measurable performance benefit to O_DSYNC, but this seems to be largely because fdatasync is stubbed to fsync in userspace rather than because fdatasync wouldn't be a better idea in the abstract.
Also, a lot of the argument against fsync at the time was that it forced the kernel to iterate through all the buffers for the WAL file to see if any were dirty. I would imagine that modern kernels are a tad smarter about that; and even if they aren't, the CPU speed versus disk speed tradeoff has changed enough since 2001 that iterating through 16MB of buffers isn't as interesting as it was then. So to my mind, switching to the preference order fdatasync, fsync_writethrough, fsync seems like the thing to do. Since we assume fsync is always available, that means that O_DSYNC/O_SYNC will not be the defaults on any platform. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
On 10-12-06 06:56 PM, Greg Smith wrote:
> Tom Lane wrote:
>> The various testing that's been reported so far is all for Linux and
>> thus doesn't directly address the question of whether other kernels
>> will have similar performance properties.
>
> Survey of some popular platforms: ...
>
> So my guess is that some small percentage of Windows users might notice
> a change here, and some testing on FreeBSD would be useful too. That's
> about it for platforms that I think anybody needs to worry about.

If you tell me which options to pgbench and which .conf file settings you'd like to see, I can probably arrange to run some tests on AIX.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support   www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
On 12/5/10 2:12 PM, Greg Smith wrote:
> Josh Berkus wrote:
>> I modified test_fsync in two ways to run this; first, to make it support
>> O_DIRECT, and second to make it run in the *current* directory.
>
> Patch please? I agree with the latter change; what test_fsync does is
> surprising.

Attached. Making it support O_DIRECT would be possible but more complex; I don't see the point unless we think we're going to have open_sync_with_odirect as a separate option.

> I suggested a while ago that we refactor test_fsync to use a common set
> of source code as the database itself for detecting things related to
> wal_sync_method, perhaps just extract that whole set of DEFINE macro
> logic to somewhere else. That happened at a bad time in the development
> cycle (right before a freeze) and nobody ever got back to the idea
> afterwards. If this code is getting touched, and it's clear it is in
> some direction, I'd like to see things change so it's not possible for
> the two to diverge again afterwards.

I don't quite follow you. Maybe nobody else did last time, either.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

diff --git a/src/tools/fsync/test_fsync.c b/src/tools/fsync/test_fsync.c
index 28c2119..12a83e1 100644
*** a/src/tools/fsync/test_fsync.c
--- b/src/tools/fsync/test_fsync.c
***************
*** 23,34 ****
  #include

! #ifdef WIN32
  #define FSYNC_FILENAME	"./test_fsync.out"
- #else
- /* /tmp might be a memory file system */
- #define FSYNC_FILENAME	"/var/tmp/test_fsync.out"
- #endif

  #define WRITE_SIZE	(8 * 1024)	/* 8k */
--- 23,32 ----
  #include

! /* put the temp files in the local directory
!    this is a change from older versions which used
!    /var/tmp */
  #define FSYNC_FILENAME	"./test_fsync.out"

  #define WRITE_SIZE	(8 * 1024)	/* 8k */

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Josh Berkus writes: >> However, if you were doing something like parallel pg_dump you could >> just run the parent and child instances all against the slave, so the >> pg_dump scenario doesn't seem to offer much of a supporting use-case for >> worrying about this. When would you really need to be able to do it? > If you had several standbys, you could distribute the work of the > pg_dump among them. This would be a huge speedup for a large database, > potentially, thanks to parallelization of I/O and network. Imagine > doing a pg_dump of a 300GB database in 10min. That does sound kind of attractive. But to do that I think we'd have to go with the pass-the-snapshot-through-the-client approach. Shipping internal snapshot files through the WAL stream doesn't seem attractive to me. While I see Robert's point about preferring not to expose the snapshot contents to clients, I don't think it outweighs all other considerations here; and every other one is pointing to doing it the other way. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] the number of file descriptors when using POSIX semaphore
Hi, folks,

In src/template/darwin:

# Select appropriate semaphore support. Darwin 6.0 (Mac OS X 10.2) and up
# support System V semaphores; before that we have to use POSIX semaphores,
# which are less good for our purposes because they eat a file descriptor
# per backend per max_connection slot.

To my understanding, the number of descriptors created by POSIX semaphores would be the number of actual clients times max_connections. However, I monitored the number of open files using sysctl, and I found that kern.num_files doesn't match the result calculated by the formula I inferred... So, what would the number of file descriptors be when using POSIX semaphores? Thanks!
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
Tom Lane wrote:
> The various testing that's been reported so far is all for Linux and
> thus doesn't directly address the question of whether other kernels
> will have similar performance properties.

Survey of some popular platforms:

Linux: don't want O_DIRECT by default for reliability reasons, and there's no clear performance win in the default config with small wal_buffers

Solaris: O_DIRECT doesn't work; there's another API that support has never been added for; see http://blogs.sun.com/jkshah/entry/postgresql_wal_sync_method_and

Windows: Small reported gains for O_DIRECT, i.e. 10%, at http://archives.postgresql.org/pgsql-hackers/2007-03/msg01615.php

FreeBSD: It probably works there, but I've never seen good performance tests of it on this platform.

Mac OS X: Like Solaris, there's a similar mechanism but it's not O_DIRECT; see http://stackoverflow.com/questions/2299402/how-does-one-do-raw-io-on-mac-os-x-ie-equivalent-to-linuxs-o-direct-flag for notes about the F_NOCACHE feature used. Same basic situation as Solaris: there's an API, but PostgreSQL doesn't use it yet.

So my guess is that some small percentage of Windows users might notice a change here, and some testing on FreeBSD would be useful too. That's about it for platforms that I think anybody needs to worry about.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support   www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
Re: [HACKERS] WIP patch for parallel pg_dump
> However, if you were doing something like parallel pg_dump you could > just run the parent and child instances all against the slave, so the > pg_dump scenario doesn't seem to offer much of a supporting use-case for > worrying about this. When would you really need to be able to do it? If you had several standbys, you could distribute the work of the pg_dump among them. This would be a huge speedup for a large database, potentially, thanks to parallelization of I/O and network. Imagine doing a pg_dump of a 300GB database in 10min. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Heikki Linnakangas wrote: > On 06.12.2010 22:53, Kevin Grittner wrote: >> What would be the correct way for a process to put itself to >> sleep, and for another process to later wake it up? > > See ProcWaitSignal/ProcSendSignal. Or the new 'latch' code. Is there a reason to prefer one over the other? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
On 06.12.2010 22:53, Kevin Grittner wrote: What would be the correct way for a process to put itself to sleep, and for another process to later wake it up? See ProcWaitSignal/ProcSendSignal. Or the new 'latch' code. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Florian Pflug wrote:
> On Dec5, 2010, at 16:11 , Kevin Grittner wrote:
>> The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE
>> under SSI would be to have each non-read-only serializable
>> transaction acquire a heavyweight lock which can coexist with
>> other locks at the same level (SHARE looks good) on some common
>> object and hold that for the duration of the transaction, while a
>> SERIALIZABLE READ ONLY DEFERRABLE transaction would need to
>> acquire a conflicting lock (EXCLUSIVE looks good) before it could
>> acquire a snapshot, and release the lock immediately after
>> acquiring the snapshot.
>
> Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE is waiting to
> acquire the lock, no other transaction would be allowed to start
> until the SERIALIZABLE READ ONLY DEFERRABLE transaction has been
> able to acquire its snapshot. For pg_dump's purposes at least,
> that seems undesirable, since a single long-running transaction at
> the time you start pg_dump would effectively DoS your system until
> the long-running transaction finishes.

Well, when you put it that way, it sounds pretty grim. :-( Since one of the bragging points of SSI is that it doesn't introduce any blocking beyond current snapshot isolation, I don't want to do something here which blocks anything except the transaction which has explicitly requested the DEFERRABLE property. I guess that, simple as that technique might be, it just isn't a good idea.

> The alternative seems to be to drop the guarantee that a
> SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a
> stream of overlapping non-READ ONLY transactions. Then a flag in
> the proc array that marks non-READ ONLY transactions should be
> sufficient, plus a wait-and-retry loop to take snapshots for
> SERIALIZABLE READ ONLY DEFERRABLE transactions.
If I can find a way to pause an active process, then -- since I already have functions which maintain the count of active SERIALIZABLE READ WRITE transactions as they begin and end -- I could release pending DEFERRABLE transactions when the count hits zero, without any separate loop. That has the added attraction of being a path to the more complex checking which could allow the deferrable process to start sooner in some circumstances. The "simple" solution with the heavyweight lock would not have been a good path to that.

What would be the correct way for a process to put itself to sleep, and for another process to later wake it up?

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
marcin mank writes: > On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane wrote: >> IIRC, in old discussions of this problem we first considered allowing >> clients to pull down an explicit representation of their snapshot (which >> actually is an existing feature now, txid_current_snapshot()) and then >> upload that again to become the active snapshot in another connection. > Could a hot standby use such a snapshot representation? I.e. same > snapshot on the master and the standby? Hm, that's a good question. It seems like it's at least possibly workable, but I'm not sure if there are any showstoppers. The other proposal of publish-a-snapshot would presumably NOT support this, since we'd not want to ship the snapshot temp files down the WAL stream. However, if you were doing something like parallel pg_dump you could just run the parent and child instances all against the slave, so the pg_dump scenario doesn't seem to offer much of a supporting use-case for worrying about this. When would you really need to be able to do it? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
On 06.12.2010 21:48, marcin mank wrote: On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane wrote: IIRC, in old discussions of this problem we first considered allowing clients to pull down an explicit representation of their snapshot (which actually is an existing feature now, txid_current_snapshot()) and then upload that again to become the active snapshot in another connection. Could a hot standby use such a snapshot representation? I.e. same snapshot on the master and the standby? Hmm, I suppose it could. That's an interesting idea, you could run parallel pg_dump or something else against master and/or multiple hot standby servers, all working on the same snapshot. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
Greg Smith writes: > Regardless, I'm now leaning heavily toward the idea of avoiding > open_datasync by default given this bug, and backpatching that change to > at least 8.4. I'll do some more database-level performance tests here > just as a final sanity check on that. My gut feel is now that we'll > eventually be taking something like Marti's patch, adding some more > documentation around it, and applying that to HEAD as well as some > number of back branches. I think we have got consensus that (1) open_datasync should not be the default on Linux, and (2) this change needs to be back-patched. What is not clear to me is whether we have consensus to change the option preference order globally, or restrict the change to just be effective on Linux. The various testing that's been reported so far is all for Linux and thus doesn't directly address the question of whether other kernels will have similar performance properties. However, it seems reasonable to me to suppose that open_datasync could only be a win in very restricted scenarios and thus shouldn't be a preferred default. Also, I dread trying to document the behavior if the preference order becomes platform-dependent. With the holidays fast approaching, our window to do something about this in a timely fashion grows short. If we don't schedule update releases to be made this week, I think we're looking at not getting the updates out till after New Year's. Do we want to wait that long? Is anyone actually planning to do performance testing that would prove anything about non-Linux platforms? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane wrote: > IIRC, in old discussions of this problem we first considered allowing > clients to pull down an explicit representation of their snapshot (which > actually is an existing feature now, txid_current_snapshot()) and then > upload that again to become the active snapshot in another connection. Could a hot standby use such a snapshot representation? I.e. same snapshot on the master and the standby? Greetings Marcin Mańk -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
On Dec5, 2010, at 16:11 , Kevin Grittner wrote:
> The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE under
> SSI would be to have each non-read-only serializable transaction
> acquire a heavyweight lock which can coexist with other locks at the
> same level (SHARE looks good) on some common object and hold that for
> the duration of the transaction, while a SERIALIZABLE READ ONLY
> DEFERRABLE transaction would need to acquire a conflicting lock
> (EXCLUSIVE looks good) before it could acquire a snapshot, and
> release the lock immediately after acquiring the snapshot.

Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE is waiting to acquire the lock, no other transaction would be allowed to start until the SERIALIZABLE READ ONLY DEFERRABLE transaction has been able to acquire its snapshot. For pg_dump's purposes at least, that seems undesirable, since a single long-running transaction at the time you start pg_dump would effectively DoS your system until the long-running transaction finishes.

The alternative seems to be to drop the guarantee that a SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a stream of overlapping non-READ ONLY transactions. Then a flag in the proc array that marks non-READ ONLY transactions should be sufficient, plus a wait-and-retry loop to take snapshots for SERIALIZABLE READ ONLY DEFERRABLE transactions.

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
On Dec 6, 2010, at 11:36 AM, Tom Lane wrote:
> There's a difference between whether an extension as such is considered
> to belong to a schema and whether its contained objects do. We can't
> really avoid the fact that functions, operators, etc must be assigned to
> some particular schema.

Right, of course.

> It seems not particularly important that
> extension names be schema-qualified, though --- the use-case for having
> two different extensions named "foo" installed simultaneously seems
> pretty darn small. On the other hand, if we were enforcing that all
> objects contained in an extension belong to the same schema, it'd make
> logistical sense to consider that the extension itself belongs to that
> schema as well. But last I heard we didn't want to enforce such a
> restriction.

Okay.

> I believe what the search_path substitution is actually about is to
> provide a convenient shorthand for the case that all the contained
> objects do indeed live in one schema, and you'd like to be able to
> select that schema at CREATE EXTENSION time. Which seems like a useful
> feature for a common case. We've certainly heard multiple complaints
> about the fact that you can't do that easily now.

Yes, it *is* useful. But what happens if I have

    SET search_path = whatever;

in my extension install script, and someone executes

    CREATE EXTENSION FOO WITH SCHEMA bar;

Surprise! Everything is in whatever, not in bar.

> BTW, I did think of a case where substitution solves a problem we don't
> presently have any other solution for: referring to the target schema
> within the definition of a contained object. As an example, you might
> wish to attach "SET search_path = @target_schema@" to the definition of
> a SQL function in an extension, to prevent search-path-related security
> issues in the use of the function. Without substitution you'll be
> reduced to hard-wiring the name of the target schema.

You lost me.
:-( David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
Please, it would be very important to us that LIKE worked with the Brazilian collation, and if possible case-insensitive and accent-insensitive.

Thanks!

Alexandre Riveira
Brazil

Peter Eisentraut escreveu:
> On mån, 2010-12-06 at 10:01 -0800, David E. Wheeler wrote:
>> I've been wondering if this patch will support case-insensitive
>> collations. If so, then citext should probably be revised to use one.
>
> This has been touched upon several times during the discussions on past
> patches.
>
> Essentially, the current patch only arranges that you can specify a sort
> order for data. The system always breaks ties using a binary
> comparison. This could conceivably be changed, but it's a separate
> problem. Some of the necessary investigation work has presumably
> already been done in the context of citext.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
> Please explain more precisely what is wrong with SET SESSION > AUTHORIZATION / SET ROLE. 1) Session GUCS do not change with a SET ROLE (this is a TODO I haven't had any time to work on) 2) Users can always issue their own SET ROLE and then "hack into" other users' data. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Label switcher function
2010/11/25 KaiGai Kohei:
> The attached patch is a revised one.
>
> It provides two hooks; the one informs core PG whether the supplied
> function needs to be hooked, or not. The other is an actual hook on
> prepare, start, end and abort of function invocations.
>
>   typedef bool (*needs_function_call_type)(Oid fn_oid);
>
>   typedef void (*function_call_type)(FunctionCallEventType event,
>                                      FmgrInfo *flinfo, Datum *private);
>
> The hook prototype was a bit modified since the suggestion from
> Robert. Because the FmgrInfo structure contains the OID of the function,
> it might be redundant to deliver the OID of the function individually.
>
> The rest of the parts are revised according to the comment.
>
> I also fixed up source code comments which might become incorrect.

FCET_PREPARE looks completely unnecessary to me. Any necessary one-time work can easily be done at FCET_START time, assuming that the private-data field is initialized to (Datum) 0.

I'm fairly certain that the following is not portable:

+       ObjectAddress object = { .classId = ProcedureRelationId,
+                                .objectId = fn_oid,
+                                .objectSubId = 0 };

I'd suggest renaming needs_function_call_type and function_call_type to needs_fmgr_hook_type and fmgr_hook_type.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
On Dec 6, 2010, at 11:29 AM, Peter Eisentraut wrote: > This has been touch upon several times during the discussions on past > patches. > > Essentially, the current patch only arranges that you can specify a sort > order for data. The system always breaks ties using a binary > comparison. This could conceivably be changed, but it's a separate > problem. Some of the necessary investigation work has presumably > already been done in the context of citext. Okay, thanks, good to know. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
"David E. Wheeler" writes: > The other question I have, though, is how important is it to have extensions > live in a particular schema since there seems to be no advantage to doing so. > With the current patch, I can put extension "foo" in schema "bar", but I > can't put any other extension named "foo" in any other schema. It's in schema > "bar" but is at the same time global. That doesn't make much sense to me. There's a difference between whether an extension as such is considered to belong to a schema and whether its contained objects do. We can't really avoid the fact that functions, operators, etc must be assigned to some particular schema. It seems not particularly important that extension names be schema-qualified, though --- the use-case for having two different extensions named "foo" installed simultaneously seems pretty darn small. On the other hand, if we were enforcing that all objects contained in an extension belong to the same schema, it'd make logistical sense to consider that the extension itself belongs to that schema as well. But last I heard we didn't want to enforce such a restriction. I believe what the search_path substitution is actually about is to provide a convenient shorthand for the case that all the contained objects do indeed live in one schema, and you'd like to be able to select that schema at CREATE EXTENSION time. Which seems like a useful feature for a common case. We've certainly heard multiple complaints about the fact that you can't do that easily now. BTW, I did think of a case where substitution solves a problem we don't presently have any other solution for: referring to the target schema within the definition of a contained object. As an example, you might wish to attach "SET search_path = @target_schema@" to the definition of a SQL function in an extension, to prevent search-path-related security issues in the use of the function. Without substitution you'll be reduced to hard-wiring the name of the target schema. 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
On mån, 2010-12-06 at 10:01 -0800, David E. Wheeler wrote:
> I've been wondering if this patch will support case-insensitive
> collations. If so, then citext should probably be revised to use one.

This has been touched upon several times during the discussions on past patches.

Essentially, the current patch only arranges that you can specify a sort order for data. The system always breaks ties using a binary comparison. This could conceivably be changed, but it's a separate problem. Some of the necessary investigation work has presumably already been done in the context of citext.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
On mån, 2010-12-06 at 21:06 +0900, Itagaki Takahiro wrote:
> On Sun, Dec 5, 2010 at 01:04, Peter Eisentraut wrote:
>> Here is an updated patch to address the issues discussed during this
>> commitfest.
>
> Here are comments and questions after I tested the latest patch:
>
> Issues
> * initdb itself seems to succeed, but it says "could not determine
> encoding for locale" messages for any combination of encoding=utf8/eucjp
> and locale=ja_JP.utf8/ja_JP.eucjp/C. Is this expected behavior?
>
> creating collations ...initdb: locale name has non-ASCII characters, skipped: bokmål
> initdb: locale name has non-ASCII characters, skipped: français
> could not determine encoding for locale "hy_AM.armscii8": codeset is "ARMSCII-8"
> ... (a dozen of lines) ...
> could not determine encoding for locale "vi_VN.tcvn": codeset is "TCVN5712-1"
> ok

What this does is take the output of locale -a and populate the pg_collation catalog with the locales it finds. When it finds an operating system locale that uses an encoding that is not recognized, you will see this warning. I understand that that would probably annoy users. We could hide the warning and silently skip those locales. But then could that hide genuine configuration problems?

> * contrib/citext raises an encoding error when COLLATE is specified
> even if it is the same collation as the database default.
> We might need some special treatment for C locale.
> =# SHOW lc_collate; ==> C
> =# SELECT ('A'::citext) = ('a'::citext); ==> false
> =# SELECT ('A'::citext) = ('a'::citext) COLLATE "C";
> ERROR: invalid multibyte character for locale
> HINT: The server's LC_CTYPE locale is probably incompatible with the
> database encoding.

OK, I can reproduce that. That's fallout from the lc_ctype_is_c() optimization that I removed, as explained in another email. I'll have to think about that again.
> * pg_dump would generate unportable files for different platforms
> because collation names

pg_dump can already produce unportable files for a number of other reasons, including per-database locale, tablespaces, and OS-dependent configuration settings. The way I imagine this working is that someone who wants to design a genuinely portable application using this feature would create their own collation based on the existing, OS-specific collation (using a to-be-added CREATE COLLATION command). As mentioned earlier, however, we can't actually solve the problem that the OS locales may not behave the same across systems.

> Source codes
> * PG_GETARG_COLLATION might be a better name rather than PG_GET_COLLATION.

It's not the collation of a function argument, it's the collation of a function call. (You could conceivably also fetch the collation of a function argument, but that isn't used in any way.)

> * What is the difference between InvalidOid and DEFAULT_COLLATION_OID
> for collation oids? The patch replaces DirectFunctionCall with
> DirectFunctionCallC in some places, but we could shrink the diff size
> if we could use InvalidOid instead of DEFAULT_COLLATION_OID.

Think of DEFAULT_COLLATION_OID as analogous to UNKNOWNOID. A long time ago we used InvalidOid for all kinds of types, including unknown, pseudotypes, cstring, and no type at all. The reason we changed this was that this masked errors and made processing of the unknown type difficult/impossible. I know this makes the code bigger, but it's necessary. I originally coded the patch using InvalidOid for everything, but that wasn't very robust. This also ties into the next question ...

> * I still think explicit passing of collations from function to function
> is horrible because we might forget it in some places, and almost no existing
> third party module would work. Is it possible to make it a global variable,
> and push/pop the state when changed? Sorry, I'm missing something, but
> I think we could treat the collation setting like GUC settings.

A collation is a property of a datum or an expression. You might as well argue that we don't keep track of types of expressions and instead store it globally. Doesn't make sense.

Extensions are not required to support collations. Those that might want to will usually end up calling one of the locale-enabled functions such as varstr_cmp(), and there the function prototype will ensure that specifying a collation cannot be missed.

Additionally, the distinction between InvalidOid and DEFAULT_COLLATION_OID does a great deal to ensure that if a collation is unspecified or missing in some new code, you will get a proper error message instead of unspecified behavior.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
On Dec 6, 2010, at 11:12 AM, Tom Lane wrote: > Well, I don't put any stock in the idea that it's important for existing > module .sql files to be usable as-is as extension definition files. If > it happens to fall out that way, fine, but we shouldn't give up anything > else to get that. I agree, but I don't think we have to lose anything. > Letting extension files be directly sourceable in > psql is probably worth a bit more, but I'm not sure how much. The > argument that forgetting to include a magic source_path command would > make CREATE EXTENSION behave surprisingly seems to have a good deal of > merit though, certainly enough to justify having CREATE EXTENSION take > care of that internally if at all possible. Yes. The other question I have, though, is how important is it to have extensions live in a particular schema since there seems to be no advantage to doing so. With the current patch, I can put extension "foo" in schema "bar", but I can't put any other extension named "foo" in any other schema. It's in schema "bar" but is at the same time global. That doesn't make much sense to me. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
"David E. Wheeler" writes: > On Dec 6, 2010, at 10:43 AM, Tom Lane wrote: >> (On the other hand, if we *could* avoid using any explicit >> substitutions, it would certainly ease testing of extension files no? >> They'd be sourceable into psql then.) > Yes. And extension authors would not have to remember to include the magic > line (which at any rate would break extensions for earlier versions of > PostgreSQL). Well, I don't put any stock in the idea that it's important for existing module .sql files to be usable as-is as extension definition files. If it happens to fall out that way, fine, but we shouldn't give up anything else to get that. Letting extension files be directly sourceable in psql is probably worth a bit more, but I'm not sure how much. The argument that forgetting to include a magic source_path command would make CREATE EXTENSION behave surprisingly seems to have a good deal of merit though, certainly enough to justify having CREATE EXTENSION take care of that internally if at all possible. The real question in my mind is whether there are any other known or foreseeable cases where we would need to have substitution capability and there's not another good way to handle it. I haven't been paying real close attention to the threads about this patch --- do we have any specific use-cases in mind for substitution, besides this one? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Tom Lane wrote: > "Kevin Grittner" writes: >> Tom Lane wrote: >>> No. See subtransactions. > >> Subtransactions are included in snapshots? > > Sure, see GetSnapshotData(). You could avoid it by setting > suboverflowed, but that comes at a nontrivial performance cost. Yeah, sorry for blurting like that before I checked. I was somewhat panicked that I'd missed something important for SSI, because my XidIsConcurrent check just uses xmin, xmax, and xip; I was afraid what I have would fall down in the face of subtransactions. But on review I found that I'd thought that through (discussion is in the archives): I always wanted to associate the locks and conflicts with the top-level transaction, so that was already identified before checking for overlap, and it was therefore more efficient to just check that. Sorry for the "senior moment". :-/ Perhaps a line or two of comments about that in the SSI patch would be a good idea, and maybe some tests involving subtransactions. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
On Dec 6, 2010, at 10:43 AM, Tom Lane wrote: > That's an interesting idea, but I'm not sure it's wise to design around > the assumption that we won't need substitutions ever. What I was > thinking was that we should try to limit knowledge of the substitution > behavior to the extension definition files and the implementation of > CREATE EXTENSION itself. I don't agree with exposing that information > at the SQL level. > > (On the other hand, if we *could* avoid using any explicit > substitutions, it would certainly ease testing of extension files no? > They'd be sourceable into psql then.) Yes. And extension authors would not have to remember to include the magic line (which at any rate would break extensions for earlier versions of PostgreSQL). Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
"Kevin Grittner" writes: > Tom Lane wrote: >> No. See subtransactions. > Subtransactions are included in snapshots? Sure, see GetSnapshotData(). You could avoid it by setting suboverflowed, but that comes at a nontrivial performance cost. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
"David E. Wheeler" writes: > On Dec 6, 2010, at 7:19 AM, Tom Lane wrote: >> On the whole I'd prefer not to have any substitution functionality >> hard-wired into pg_execute_file either, though I can see the argument >> that it's necessary for practical use. Basically I'm concerned that >> replace-equivalent behavior is not going to be satisfactory over the >> long run: I think eventually we're going to need to think about >> quoting/escaping behavior. So I think it's a bad idea to expose the >> assumption that it'll be done that way at the SQL level. > +1 > I suspect that, for the purposes of the extensions patch, if CREATE EXTENSION > could be modified to handle setting the schema itself, without requiring that > the file have this magic line: >SET search_path = @extschema@; > Then there would be no need for substitutions at all. That's an interesting idea, but I'm not sure it's wise to design around the assumption that we won't need substitutions ever. What I was thinking was that we should try to limit knowledge of the substitution behavior to the extension definition files and the implementation of CREATE EXTENSION itself. I don't agree with exposing that information at the SQL level. (On the other hand, if we *could* avoid using any explicit substitutions, it would certainly ease testing of extension files no? They'd be sourceable into psql then.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields
On 12/06/2010 01:23 PM, Tom Lane wrote: Andrew Dunstan writes: On 12/06/2010 12:11 PM, Tom Lane wrote: Also, why did you change the setup code to not compute nfields in binary mode? That seems at best an unnecessary change, and at worst a breakage of the binary path --- did you test it? AFAICT it's not used in binary mode at all. But I will double check. Well, even if it is not used at the moment, it seems potentially of use in that path. So I'd vote for continuing to set it correctly, rather than making it deliberately incorrect as this patch is going out of its way to do. Ok. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Tom Lane wrote: > "Kevin Grittner" writes: >> Surely you can predict that any snapshot is no larger than a fairly >> small fixed portion plus sizeof(TransactionId) * MaxBackends? > > No. See subtransactions. Subtransactions are included in snapshots? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
"Kevin Grittner" writes: > Tom Lane wrote: >>> I'm still not convinced that using shared memory is a bad way to >>> pass these around. Surely we're not talking about large numbers >>> of them. What am I missing here? >> >> They're not of a very predictable size. > Surely you can predict that any snapshot is no larger than a fairly > small fixed portion plus sizeof(TransactionId) * MaxBackends? No. See subtransactions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields
Andrew Dunstan writes: > On 12/06/2010 12:11 PM, Tom Lane wrote: >> Also, why did you change the setup code to >> not compute nfields in binary mode? That seems at best an unnecessary >> change, and at worst a breakage of the binary path --- did you test it? > AFAICT it's not used in binary mode at all. But I will double check. Well, even if it is not used at the moment, it seems potentially of use in that path. So I'd vote for continuing to set it correctly, rather than making it deliberately incorrect as this patch is going out of its way to do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
"Kevin Grittner" writes: > I reviewed the documentation and played around with this a bit and > can't find any areas where the current PostgreSQL implementation of > READ ONLY is incompatible with what is needed for the SSI > optimizations where it is used. There are a large number of tests > which exercise this, and they're all passing. > Did you have something in particular in mind which I should check? I did not, just thought it was a point that merited examination. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Tom Lane wrote: >> I'm still not convinced that using shared memory is a bad way to >> pass these around. Surely we're not talking about large numbers >> of them. What am I missing here? > > They're not of a very predictable size. Surely you can predict that any snapshot is no larger than a fairly small fixed portion plus sizeof(TransactionId) * MaxBackends? So, for example, if you're configured for 100 connections, you'd be limited to something under 1kB, maximum? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
2010/12/6 David E. Wheeler : > On Dec 6, 2010, at 4:06 AM, Itagaki Takahiro wrote: > >> * contrib/citext raises an encoding error when COLLATE is specified >> even if it is the same collation as the database default. >> We might need some special treatment for C locale. > > I've been wondering if this patch will support case-insensitive collations. If > so, then citext should probably be revised to use one. As far as I know, no. It supports only system-based collations. Pavel > > Best, > > David > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields
On 12/06/2010 12:11 PM, Tom Lane wrote: Andrew Dunstan writes: Attached is a patch that allows CopyReadAttributesText() and CopyReadAttributesCSV() to read arbitrary numbers of attributes. Underflowing attributes are recorded as null, and space is made for overflowing attributes on a line. Why are you still passing nfields as a separate parameter instead of relying on the value you added to the struct? That can't do anything except cause confusion, especially once the two values diverge due to a previous array-expansion. Good point. Will fix. Also, why did you change the setup code to not compute nfields in binary mode? That seems at best an unnecessary change, and at worst a breakage of the binary path --- did you test it? AFAICT it's not used in binary mode at all. But I will double check. Also please be a little more careful with the formatting. Ok, will fix also. Thanks for the comments. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
On Mon, Dec 6, 2010 at 12:41 PM, Tom Lane wrote: > Robert Haas writes: >> On Sun, Dec 5, 2010 at 6:01 PM, Tom Lane wrote: >>> Why is there a variadic replace() in this patch at all? It seems just >>> about entirely unrelated to the stated purpose of the patch, as well >>> as being of dubious usefulness. When would it be superior to >>> replace(replace(orig, from1, to1), from2, to2), ... > >> An iterated replacement has different semantics from a simultaneous >> replace - replacing N placeholders with values simultaneously means >> you don't need to worry about the case where one of the replacement >> strings contains something that looks like a placeholder. > > Good point, but what the patch implements is in fact iterated > replacement ... or at least it looked that way in a quick once-over. Oh. Well, -1 from me for including that. >> I actually >> think a simultaneous replacement feature would be quite handy but I >> make no comment on whether it belongs as part of this patch. > > My point is that the replacement stuff really really needs to be > factored out of the string-execution stuff, precisely because the > desired behavior is debatable. +1 for committing the uncontroversial parts separately. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Mon, Dec 6, 2010 at 12:57 PM, Josh Berkus wrote: >>> At some point Hackers should look at pg vs MySQL multi tenantry but it >>> is way tangential today. >> >> My understanding is that our schemas work like MySQL databases; and >> our databases are an even higher level of isolation. No? > > That's correct. Drizzle is looking at implementing a feature like our > databases called "catalogs" (per the SQL spec). > > Let me stress that not everyone is happy with the MySQL multi-tenantry > approach. But it does make multi-tenancy on a scale which you seldom see > with PG possible, even if it has problems. It's worth seeing whether we can > steal any of their optimization ideas without breaking PG. Please make sure to articulate what you think is wrong with our existing model. > I was specifically looking at the login model, which works around the issue > that we have: namely that different login ROLEs can't share a connection > pool. In MySQL, they can share the built-in connection "pool" because > role-switching effectively is a session variable. AFAICT, anyway. Please explain more precisely what is wrong with SET SESSION AUTHORIZATION / SET ROLE. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
On Dec 6, 2010, at 7:19 AM, Tom Lane wrote: > On the whole I'd prefer not to have any substitution functionality > hard-wired into pg_execute_file either, though I can see the argument > that it's necessary for practical use. Basically I'm concerned that > replace-equivalent behavior is not going to be satisfactory over the > long run: I think eventually we're going to need to think about > quoting/escaping behavior. So I think it's a bad idea to expose the > assumption that it'll be done that way at the SQL level. +1 I suspect that, for the purposes of the extensions patch, if CREATE EXTENSION could be modified to handle setting the schema itself, without requiring that the file have this magic line: SET search_path = @extschema@; Then there would be no need for substitutions at all. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
On Dec 6, 2010, at 4:06 AM, Itagaki Takahiro wrote: > * contrib/citext raises an encoding error when COLLATE is specified > even if it is the same collation as the database default. > We might need some special treatment for C locale. I've been wondering if this patch will support case-insensitive collations. If so, then citext should probably be revised to use one. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
At some point Hackers should look at pg vs MySQL multi tenantry but it is way tangential today. My understanding is that our schemas work like MySQL databases; and our databases are an even higher level of isolation. No? That's correct. Drizzle is looking at implementing a feature like our databases called "catalogs" (per the SQL spec). Let me stress that not everyone is happy with the MySQL multi-tenantry approach. But it does make multi-tenancy on a scale which you seldom see with PG possible, even if it has problems. It's worth seeing whether we can steal any of their optimization ideas without breaking PG. I was specifically looking at the login model, which works around the issue that we have: namely that different login ROLEs can't share a connection pool. In MySQL, they can share the built-in connection "pool" because role-switching effectively is a session variable. AFAICT, anyway. For that matter, if anyone knows any other DB which does multi-tenant well/better, we should be looking at them too. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Mon, Dec 6, 2010 at 12:38 PM, Tom Lane wrote: > Robert Haas writes: >> One possible way to do make an improvement in this area would be to >> move the responsibility for accepting connections out of the >> postmaster. Instead, you'd have a group of children that would all >> call accept() on the socket, and the OS would arbitrarily pick one to >> receive each new incoming connection. The postmaster would just be >> responsible for making sure that there were enough children hanging >> around. You could in fact make this change without doing anything >> else, in which case it wouldn't save any work but would possibly >> reduce connection latency a bit since more of the work could be done >> before the connection actually arrived. > > This seems like potentially a good idea independent of anything else, > just to reduce connection latency: fork() (not to mention exec() on > Windows) now happens before not after receipt of the connection request. > However, I see a couple of stumbling blocks: > > 1. Does accept() work that way everywhere (Windows, I'm looking at you) Not sure. It might be useful to look at what Apache does, but I don't have time to do that ATM. > 2. What do you do when max_connections is exceeded, and you don't have > anybody at all listening on the socket? Right now we are at least able > to send back an error message explaining the problem. Sending back an error message explaining the problem seems like a non-negotiable requirement. I'm not quite sure how to dance around this. Perhaps if max_connections is exhausted, the postmaster itself joins the accept() queue and launches a dead-end backend for each new connection. Or perhaps we reserve one extra backend slot for a probably-dead-end backend that will just sit there and mail rejection notices; except that if it sees that a regular backend slot has opened up it grabs it and turns itself into a regular backend. 
> Another issue that would require some thought is what algorithm the > postmaster uses for deciding to spawn new children. But that doesn't > sound like a potential showstopper. The obvious algorithm would be to try to keep N spare workers around. Any time the number of unconnected backends drops below N the postmaster starts spawning new ones until it gets back up to N. I think the trick may not be the algorithm so much as finding a way to make the signaling sufficiently robust and lightweight. For example, I bet having each child that gets a new connection signal() the postmaster is a bad plan. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/06/2010 12:28 PM, Tom Lane wrote: Andrew Dunstan writes: Yeah. I'm still not convinced that using shared memory is a bad way to pass these around. Surely we're not talking about large numbers of them. What am I missing here? They're not of a very predictable size. Ah. Ok. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On 12/06/2010 09:38 AM, Tom Lane wrote: Another issue that would require some thought is what algorithm the postmaster uses for deciding to spawn new children. But that doesn't sound like a potential showstopper. We'd probably want a couple of different ones, optimized for different connection patterns. Realistically. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
Robert Haas writes: > On Sun, Dec 5, 2010 at 6:01 PM, Tom Lane wrote: >> Why is there a variadic replace() in this patch at all? It seems just >> about entirely unrelated to the stated purpose of the patch, as well >> as being of dubious usefulness. When would it be superior to >>replace(replace(orig, from1, to1), from2, to2), ... > An iterated replacement has different semantics from a simultaneous > replace - replacing N placeholders with values simultaneously means > you don't need to worry about the case where one of the replacement > strings contains something that looks like a placeholder. Good point, but what the patch implements is in fact iterated replacement ... or at least it looked that way in a quick once-over. > I actually > think a simultaneous replacement feature would be quite handy but I > make no comment on whether it belongs as part of this patch. My point is that the replacement stuff really really needs to be factored out of the string-execution stuff, precisely because the desired behavior is debatable. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
Robert Haas writes: > One possible way to make an improvement in this area would be to > move the responsibility for accepting connections out of the > postmaster. Instead, you'd have a group of children that would all > call accept() on the socket, and the OS would arbitrarily pick one to > receive each new incoming connection. The postmaster would just be > responsible for making sure that there were enough children hanging > around. You could in fact make this change without doing anything > else, in which case it wouldn't save any work but would possibly > reduce connection latency a bit since more of the work could be done > before the connection actually arrived. This seems like potentially a good idea independent of anything else, just to reduce connection latency: fork() (not to mention exec() on Windows) now happens before, not after, receipt of the connection request. However, I see a couple of stumbling blocks: 1. Does accept() work that way everywhere (Windows, I'm looking at you)? 2. What do you do when max_connections is exceeded, and you don't have anybody at all listening on the socket? Right now we are at least able to send back an error message explaining the problem. Another issue that would require some thought is what algorithm the postmaster uses for deciding to spawn new children. But that doesn't sound like a potential showstopper. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Andrew Dunstan writes: > Why not just say give me the snapshot currently held by process ? There's not a unique snapshot held by a particular process. Also, we don't want to expend the overhead to fully publish every snapshot. I think it's really necessary that the "sending" process take some deliberate action to publish a snapshot. > And please, not temp files if possible. Barring the cleanup issue, I don't see why not. This is a relatively low-usage feature, I think, so I wouldn't be much in favor of dedicating shmem to it even if the space requirement were predictable. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Andrew Dunstan writes: > Yeah. I'm still not convinced that using shared memory is a bad way to > pass these around. Surely we're not talking about large numbers of them. > What am I missing here? They're not of a very predictable size. Robert's idea of publish() returning a temp file identifier, which then gets removed at transaction end, might work all right. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FK's to refer to rows in inheritance child
Robert Haas writes: > On Sun, Dec 5, 2010 at 12:41 PM, Andrew Dunstan wrote: >> Well, ISTM that amounts to not having "official topic branches" :-) I agree >> that this is supposed to be one of git's strengths (or more exactly a >> strength of distributed SCM's generally). I don't really see any great >> value in sanctifying a particular topic branch with some official status. > I think the value in an official topic branch would be to allow formal > incremental commit of large patches. In other words, we could decide > that a commit to the official topic branch must meet the same > standards of quality normally applied to a commit to the master > branch, and must go through the same process. It would be understood > that the topic branch would eventually be merged (with or without > squash) back into the master branch, but that objections were to be > raised as pieces were committed to the topic branch, not at merge > time. The merge itself would require consensus as to timing, but we'd > agree to take a dim view of "I haven't reviewed anything that's been > going on here for the last six months but now hate all of it". Topic branches defined that way seem like a pretty bad idea from here. They would save no effort at all for committers, because if you're not allowed to object to something after it's gone into a topic branch, then it's just like master in terms of having to keep up with changes as they happen. Moreover, we'd have to keep them in pretty close sync with master --- otherwise what happens when you discover that some long-ago change on master breaks the topic branch? So AFAICS this would just increase the amount of keeping-branches-in-sync dogwork without any offsetting advantage. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
I wrote: > Tom Lane wrote: >> I assume this would have to be a "hard" definition of READ ONLY, >> not the rather squishy definition we use now? > I'm excluding temporary tables from SSI on the grounds that they > are only read and written by a single transaction and therefore > can't be a source of rw-dependencies, and I'm excluding system > tables on the grounds that they don't follow normal snapshot > isolation rules. Hint bit rewrites are not an issue for SSI. Are > there any other squishy aspects I might need to consider? I reviewed the documentation and played around with this a bit and can't find any areas where the current PostgreSQL implementation of READ ONLY is incompatible with what is needed for the SSI optimizations where it is used. There are a large number of tests which exercise this, and they're all passing. Did you have something in particular in mind which I should check? An example of code you think might break would be ideal, but anything more concrete than the word "squishy" would be welcome. Any thoughts on the original question about what to use as a heavyweight lock to support the subject feature? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields
Andrew Dunstan writes: > Attached is a patch that allows CopyReadAttributesText() and > CopyReadAttributesCSV() to read arbitrary numbers of attributes. > Underflowing attributes are recorded as null, and space is made for > overflowing attributes on a line. Why are you still passing nfields as a separate parameter instead of relying on the value you added to the struct? That can't do anything except cause confusion, especially once the two values diverge due to a previous array-expansion. Also, why did you change the setup code to not compute nfields in binary mode? That seems at best an unnecessary change, and at worst a breakage of the binary path --- did you test it? Also please be a little more careful with the formatting. This for instance is pretty sloppy: ! * strings. cstate->raw_fields[k] is set to point to the k'th attribute ! * string, * or NULL when the input matches the null marker string. and there seem to be some gratuitous whitespace changes as well. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Suggesting a libpq addition
2010/12/6 Andrew Chernow > On 12/6/2010 11:40 AM, Dmitriy Igrishin wrote: > >> IMO, it would be better to implement some utility functions to >> make it easy to construct arrays dynamically for PQexecParams >> and PQexecPrepared. This seems to me a more universal solution >> and it is useful for both -- high level libpq-libraries authors and for >> those who like to use libpq directly. >> >> > Hmm, your idea isn't better, it is identical to what libpqtypes already > does :) > http://libpqtypes.esilo.com/browse_source.html?file=exec.c > Actually, I don't need this functionality :-). I've implemented a C++ library which makes many things easy, including automatic memory management, type conversion and binary transfers... But I believe that the proposed utility functions would be better than a printf-like addition... although both of these may be excessive. > We wrap PQexecParams and friends. You are coding libpq. We expended much > effort to provide the same result interface (PGresult), including handling > composites and arrays. You getf composites and arrays as PGresults; where a > composite is a single tuple multiple field result, an array is a multiple > tuple single field result and composite arrays are multiple tuples and > multiple fields. We've just made a more formal set of utility functions, > typically called an API, in an attempt to match the coding standards of the > postgresql project. > There is no libpq param interface like results, so we added PGparam stuff. > This allows you to pack parameters (PQputf) and then execute it. So let's not bloat libpq. Let libpq remain a low-level library for projects like libpqtypes, pqxx and so on (my library too) ;-) > > > -- > Andrew Chernow > eSilo, LLC > every bit counts > http://www.esilo.com/ > -- // Dmitriy.
Re: [HACKERS] Suggesting a libpq addition
On 12/6/2010 11:40 AM, Dmitriy Igrishin wrote: IMO, it would be better to implement some utility functions to make it easy to construct arrays dynamically for PQexecParams and PQexecPrepared. This seems to me a more universal solution and it is useful for both -- high level libpq-libraries authors and for those who like to use libpq directly. Hmm, your idea isn't better, it is identical to what libpqtypes already does :) http://libpqtypes.esilo.com/browse_source.html?file=exec.c We wrap PQexecParams and friends. You are coding libpq. We expended much effort to provide the same result interface (PGresult), including handling composites and arrays. You getf composites and arrays as PGresults; where a composite is a single tuple multiple field result, an array is a multiple tuple single field result and composite arrays are multiple tuples and multiple fields. We've just made a more formal set of utility functions, typically called an API, in an attempt to match the coding standards of the postgresql project. There is no libpq param interface like results, so we added PGparam stuff. This allows you to pack parameters (PQputf) and then execute it. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/
Re: [HACKERS] Suggesting a libpq addition
IMO, it would be better to implement some utility functions to make it easy to construct arrays dynamically for PQexecParams and PQexecPrepared. This seems to me a more universal solution and it is useful for both -- high level libpq-libraries authors and for those who like to use libpq directly. 2010/12/6 Andrew Chernow > > >> That would be a *HUGE* piece of software compared to the relatively small >> thing I am suggesting... >> >> > Sometimes complex and large solutions are required for the simplest of > ideas. I believe this is one of those cases. You can't solve the "printf > style PQexec" properly by merely implementing a sprintf wrapper. > > > As for escaping (or not escaping) of string arguments, that can be seen >> as a bug or a feature. I do not want automatic escaping of string >> arguments in all cases, e.g. I might want to construct an SQL statement with >> dynamic parts "WHERE xy" or "AND a = b". >> >> hypothetical example: >> >> filter = "WHERE name like 'Balmer%'"; >> if (sort == SORT_DESC) >> sort = " ORDER BY name DESCENDING"; >> >> PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort); >> >> So what I am aiming at right now is a PQvexec() function that basically >> has printf() like semantics, but adds an additional token to the format >> string (printf uses %s and %b to produce strings.) I am thinking of >> adding %S and %B, which produce strings that are escaped. >> >> > This suffers from becoming cryptic over time, see Tom Lane's comments back > in 2007 on this ( > http://archives.postgresql.org/pgsql-hackers/2007-12/msg00362.php). > libpqtypes uses the human readable %schema.typename (schema is optional) to > specify format specifiers. There is no learning curve or ambiguity, if you > want a point then use "%point", or "%my_type" libpqtypes allows you to > register aliases (PQregisterSubClasses) so that you can map %text to %s to > make it feel more like C.
> > > -- > Andrew Chernow > eSilo, LLC > every bit counts > http://www.esilo.com/ -- // Dmitriy.
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/06/2010 10:40 AM, Tom Lane wrote: Robert Haas writes: On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas wrote: Well, then you need some sort of cross-backend communication, which is always a bit clumsy. A temp file seems quite sufficient, and not at all difficult. "Not at all difficult" is nonsense. To do that, you need to invent some mechanism for sender and receivers to identify which temp file they want to use, and you need to think of some way to clean up the files when the client forgets to tell you to do so. That's going to be at least as ugly as anything else. And I think it's unproven that this approach would be security-hole-free either. For instance, what about some other session overwriting pg_dump's snapshot temp file? Yeah. I'm still not convinced that using shared memory is a bad way to pass these around. Surely we're not talking about large numbers of them. What am I missing here? cheers andrew
Re: [HACKERS] WIP patch for parallel pg_dump
On Mon, Dec 6, 2010 at 10:40 AM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas >> wrote: >>> Well, then you need some sort of cross-backend communication, which is >>> always a bit clumsy. > >> A temp file seems quite sufficient, and not at all difficult. > > "Not at all difficult" is nonsense. To do that, you need to invent some > mechanism for sender and receivers to identify which temp file they want > to use, Why is this even remotely hard? That's the whole point of having the "publish" operation return a token. The token either is, or uniquely identifies, the file name. > and you need to think of some way to clean up the files when the > client forgets to tell you to do so. That's going to be at least as > ugly as anything else. Backends don't forget to call their end-of-transaction hooks, do they? They might crash, but we already have code to remove temp files on server restart. At most it would need minor adjustment. > And I think it's unproven that this approach > would be security-hole-free either. For instance, what about some other > session overwriting pg_dump's snapshot temp file? Why would this be any different from any other temp file? We surely must have a mechanism in place to ensure that the temporary files used by sorts or hash joins don't get overwritten by some other session, or the system would be totally unstable. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP patch for parallel pg_dump
Robert Haas writes: > On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas > wrote: >> Well, then you need some sort of cross-backend communication, which is >> always a bit clumsy. > A temp file seems quite sufficient, and not at all difficult. "Not at all difficult" is nonsense. To do that, you need to invent some mechanism for sender and receivers to identify which temp file they want to use, and you need to think of some way to clean up the files when the client forgets to tell you to do so. That's going to be at least as ugly as anything else. And I think it's unproven that this approach would be security-hole-free either. For instance, what about some other session overwriting pg_dump's snapshot temp file? regards, tom lane
Re: [HACKERS] WIP patch for parallel pg_dump
On Mon, Dec 6, 2010 at 10:35 AM, Andrew Dunstan wrote: > On 12/06/2010 10:22 AM, Robert Haas wrote: >> >> On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas >> wrote: >>> >>> On 06.12.2010 15:53, Robert Haas wrote: I guess. It still seems far too much like exposing the server's guts for my taste. It might not be as bad as the expression tree stuff, but there's nothing particularly good about it either. >>> >>> Note that we already have txid_current_snapshot() function, which exposes >>> all that. >> >> Fair enough, and I think that's actually useful for Slony&c. But I >> don't think we should shy away from providing a cleaner API here. >> > > Just don't let the perfect get in the way of the good :P I'll keep that in mind. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/06/2010 10:22 AM, Robert Haas wrote: On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas wrote: On 06.12.2010 15:53, Robert Haas wrote: I guess. It still seems far too much like exposing the server's guts for my taste. It might not be as bad as the expression tree stuff, but there's nothing particularly good about it either. Note that we already have txid_current_snapshot() function, which exposes all that. Fair enough, and I think that's actually useful for Slony&c. But I don't think we should shy away from providing a cleaner API here. Just don't let the perfect get in the way of the good :P cheers andrew
Re: [HACKERS] Suggesting a libpq addition
That would be a *HUGE* piece of software compared to the relatively small thing I am suggesting... Sometimes complex and large solutions are required for the simplest of ideas. I believe this is one of those cases. You can't solve the "printf style PQexec" properly by merely implementing a sprintf wrapper. As for escaping (or not escaping) of string arguments, that can be seen as a bug or a feature. I do not want automatic escaping of string arguments in all cases, e.g. I might want to construct an SQL statement with dynamic parts "WHERE xy" or "AND a = b". hypothetical example: filter = "WHERE name like 'Balmer%'"; if (sort == SORT_DESC) sort = " ORDER BY name DESCENDING"; PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort); So what I am aiming at right now is a PQvexec() function that basically has printf() like semantics, but adds an additional token to the format string (printf uses %s and %b to produce strings.) I am thinking of adding %S and %B, which produce strings that are escaped. This suffers from becoming cryptic over time, see Tom Lane's comments back in 2007 on this (http://archives.postgresql.org/pgsql-hackers/2007-12/msg00362.php). libpqtypes uses the human readable %schema.typename (schema is optional) to specify format specifiers. There is no learning curve or ambiguity, if you want a point then use "%point", or "%my_type" libpqtypes allows you to register aliases (PQregisterSubClasses) so that you can map %text to %s to make it feel more like C. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/
Re: [HACKERS] Suggesting a libpq addition
On Mon, Dec 06, 2010 at 10:14:55AM -0500, Merlin Moncure wrote: > On Mon, Dec 6, 2010 at 9:55 AM, Marc Balmer wrote: > > On 06.12.10 15:37, Merlin Moncure wrote: > >> On Sun, Dec 5, 2010 at 5:10 AM, Magnus Hagander > >> wrote: > >>> On Sun, Dec 5, 2010 at 10:22, Marc Balmer wrote: > I am suggesting adding a function to libpq: > > PGresult *PQvexec(PGconn *conn, const char *fmt, ...); > > It behaves similarly to PQexec, but it allows for printf style varargs and > >>> > >>> How is that not a horrible idea, compared to using PQexecParams()? You > >>> have to remember to do all your escaping and things manually, whereas > >>> PQexecParams() does it automatically. > >> > >> It's only horrible if you stick to printf style formatting and you are > >> using string techniques to inject parameters into the query. Non > >> parameterized queries should obviously be discouraged. However, it's > >> entirely possible to wrap the parameterized interfaces with vararg > >> interface (I should know, because we did exactly that) :-). This > >> gives you the best of both worlds, easy coding without sacrificing > >> safety. You might not remember the libpqtypes proposal, but libpq was > >> specifically extended with callbacks so that libpqtypes could exist > >> after the community determined that libpqtypes was too big of a change > >> to the libpq library. I think ultimately this should be revisited, > >> with libpqtypes going in core or something even richer...I've been > >> thinking for a while that postgres types should be abstracted out of > >> the backend into a library that both client and server depend on. > >> > >> With libpqtypes, we decided to use postgres style format markers: > >> select PQexecf(conn, "select %int4 + %int8", an_int, a_bigint); > >> > >> Everything is schema qualified, so that user types are supported (of > >> course, this requires implementing handling on the client). 
> >> > >> Data routed through the binary protocol, with all the byte swapping > >> etc handled by the library. No escaping necessary. We also added > >> full support for arrays and composites, which are a nightmare to deal > >> with over straight libpq, and various other niceties like thread safe > >> error handling. > > > > That would be a *HUGE* piece of software compared to the relatively small > > thing I am suggesting... > > well, it's already written. All you would have to do is compile it. > > > As for escaping (or not escaping) of string arguments, that can be seen > > as a bug or a feature. I do not want automatic escaping of string > > arguments in all cases, e.g. I might want to construct an SQL statement with > > dynamic parts "WHERE xy" or "AND a = b". > > libpqtypes doesn't escape at all. It uses the internal parameterized > interfaces that don't require it. For particular types, like bytea > and timestamps, this is much faster because we use the binary wire > format. Less load on the client and the server. > > > hypothetical example: > > > > filter = "WHERE name like 'Balmer%'"; > > if (sort == SORT_DESC) > > sort = " ORDER BY name DESCENDING"; > > > > PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort); > > > > So what I am aiming at right now is a PQvexec() function that basically > > has printf() like semantics, but adds an additional token to the format > > string (printf uses %s and %b to produce strings.) I am thinking of > > adding %S and %B, which produce strings that are escaped. > > > > That would be a small function, and reasonably safe. Or rather, the > > safety is in the hands of the programmer. > > What you are suggesting doesn't provide a lot of value over sprintf > the query first, then exec it. You can do what you are suggesting > yourself, wrapping PQexec: > > A hypothetical wrapper would be implemented something like: > va_list ap; > char buf[BUFSZ]; > va_start(ap, query); > vsnprintf(buf, BUFSZ, 
query, ap); > va_end(ap); > return PQexec(conn, buf); > > This is a bad idea (security, escaping, performance)...we wrote a > faster, safer way to do it, with richer type support. Or you can do > it yourself. > > merlin > I have used the libpqtypes library and it is very easy to use. +1 for adding it or something like it to the PostgreSQL core. I have people who will try and roll their own because it does not come with the core. While it is a hoot to see what reinventing the wheel produces, it is also prone to mistakes. Regards, Ken
Re: [HACKERS] FK's to refer to rows in inheritance child
Excerpts from Andrew Dunstan's message of dom dic 05 14:41:20 -0300 2010: > What I would like to see is people publishing the location of > development repos so that they can be pulled from or merged, especially > for any large patch. Yes, this is pretty useful. Dimitri published his repos for the extension stuff which I followed for a while, made some smallish changes and sent them back, etc. Very time-saving. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] WIP patch for parallel pg_dump
On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas wrote: > On 06.12.2010 15:53, Robert Haas wrote: >> >> I guess. It still seems far too much like exposing the server's guts >> for my taste. It might not be as bad as the expression tree stuff, >> but there's nothing particularly good about it either. > > Note that we already have txid_current_snapshot() function, which exposes > all that. Fair enough, and I think that's actually useful for Slony &c. But I don't think we should shy away from providing a cleaner API here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Timeout for asynchronous replication Re: Timeout and wait-forever in sync rep
On Mon, Dec 6, 2010 at 9:54 AM, Heikki Linnakangas wrote: >>> It occurred to me that the timeout would be required even for >>> asynchronous streaming replication. So, how about implementing the >>> replication timeout feature before synchronous replication itself? >> >> Here is the patch. This is one of the features required for synchronous >> replication, so I added this into the current CF as a part of synchronous >> replication. > > Hmm, that's actually a quite different timeout than what's required for > synchronous replication. In synchronous replication, you need to get an > acknowledgment within a timeout. This patch only puts a timeout on how long > we wait to have enough room in the TCP send buffer. That doesn't seem all > that useful. Yeah. If we rely on the TCP send buffer filling up, then the amount of time the master takes to notice a dead standby is going to be hard for the user to predict. I think the standby ought to send some sort of heartbeat and the master should declare the standby dead if it doesn't see a heartbeat soon enough. Maybe the heartbeat could even include the receive/fsync/replay LSNs, so that sync rep can use the same machinery but with more aggressive policies about when they must be sent. I also can't help noticing that this approach requires drilling a hole through the abstraction stack. We just invented latches; if the API is going to have to change every time someone wants to implement a feature, we've built ourselves an awfully porous abstraction layer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_execute_from_file review
Dimitri Fontaine writes: > Tom Lane writes: >> Why is there a variadic replace() in this patch at all? It seems just >> about entirely unrelated to the stated purpose of the patch, as well >> as being of dubious usefulness. > It used not to be exposed at the SQL level, but just an internal loop > in pg_execute_sql_file() when using the placeholders enabled > variant. Then Itagaki wanted me to expose internals so that he basically > can implement the logics in SQL directly. It seems like we went a step > too far in exposing this facility too. Agreed in removing it at the SQL > level. Well, actually, my next question was going to be about removing the variadic substitution in pg_execute_string too. It's not apparent to me that that function should have a rather lame substitution mechanism hard-wired into it, when you can do the same thing with replace() in front of it. On the whole I'd prefer not to have any substitution functionality hard-wired into pg_execute_file either, though I can see the argument that it's necessary for practical use. Basically I'm concerned that replace-equivalent behavior is not going to be satisfactory over the long run: I think eventually we're going to need to think about quoting/escaping behavior. So I think it's a bad idea to expose the assumption that it'll be done that way at the SQL level. regards, tom lane
Re: [HACKERS] Suggesting a libpq addition
On Mon, Dec 6, 2010 at 9:55 AM, Marc Balmer wrote: > On 06.12.10 15:37, Merlin Moncure wrote: >> On Sun, Dec 5, 2010 at 5:10 AM, Magnus Hagander wrote: >>> On Sun, Dec 5, 2010 at 10:22, Marc Balmer wrote: I am suggesting adding a function to libpq: PGresult *PQvexec(PGconn *conn, const char *fmt, ...); It behaves similarly to PQexec, but it allows for printf style varargs and >>> >>> How is that not a horrible idea, compared to using PQexecParams()? You >>> have to remember to do all your escaping and things manually, whereas >>> PQexecParams() does it automatically. >> >> It's only horrible if you stick to printf style formatting and you are >> using string techniques to inject parameters into the query. Non >> parameterized queries should obviously be discouraged. However, it's >> entirely possible to wrap the parameterized interfaces with vararg >> interface (I should know, because we did exactly that) :-). This >> gives you the best of both worlds, easy coding without sacrificing >> safety. You might not remember the libpqtypes proposal, but libpq was >> specifically extended with callbacks so that libpqtypes could exist >> after the community determined that libpqtypes was too big of a change >> to the libpq library. I think ultimately this should be revisited, >> with libpqtypes going in core or something even richer...I've been >> thinking for a while that postgres types should be abstracted out of >> the backend into a library that both client and server depend on. >> >> With libpqtypes, we decided to use postgres style format markers: >> select PQexecf(conn, "select %int4 + %int8", an_int, a_bigint); >> >> Everything is schema qualified, so that user types are supported (of >> course, this requires implementing handling on the client). >> >> Data routed through the binary protocol, with all the byte swapping >> etc handled by the library. No escaping necessary. 
We also added >> full support for arrays and composites, which are a nightmare to deal >> with over straight libpq, and various other niceties like thread safe >> error handling. > > That would be a *HUGE* piece of software compared to the relatively small > thing I am suggesting... well, it's already written. All you would have to do is compile it. > As for escaping (or not escaping) of string arguments, that can be seen > as a bug or a feature. I do not want automatic escaping of string > arguments in all cases, e.g. I might want to construct an SQL statement with > dynamic parts "WHERE xy" or "AND a = b". libpqtypes doesn't escape at all. It uses the internal parameterized interfaces that don't require it. For particular types, like bytea and timestamps, this is much faster because we use the binary wire format. Less load on the client and the server. > hypothetical example: > > filter = "WHERE name like 'Balmer%'"; > if (sort == SORT_DESC) > sort = " ORDER BY name DESCENDING"; > > PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort); > > So what I am aiming at right now is a PQvexec() function that basically > has printf() like semantics, but adds an additional token to the format > string (printf uses %s and %b to produce strings.) I am thinking of > adding %S and %B, which produce strings that are escaped. > > That would be a small function, and reasonably safe. Or rather, the > safety is in the hands of the programmer. What you are suggesting doesn't provide a lot of value over sprintf the query first, then exec it. You can do what you are suggesting yourself, wrapping PQexec: A hypothetical wrapper would be implemented something like: va_list ap; char buf[BUFSZ]; va_start(ap, query); vsnprintf(buf, BUFSZ, query, ap); va_end(ap); return PQexec(conn, buf); This is a bad idea (security, escaping, performance)...we wrote a faster, safer way to do it, with richer type support. Or you can do it yourself.
merlin
Re: [HACKERS] wal_sender_delay is still required?
Fujii Masao writes: > Walsender doesn't need the periodic wakeups anymore, thanks to > the latch feature. So wal_sender_delay is basically useless now. > How about dropping wal_sender_delay or increasing the default > value? If we don't need it, we should remove it. regards, tom lane
Re: [HACKERS] WIP patch for parallel pg_dump
On 06.12.2010 15:53, Robert Haas wrote: I guess. It still seems far too much like exposing the server's guts for my taste. It might not be as bad as the expression tree stuff, but there's nothing particularly good about it either. Note that we already have txid_current_snapshot() function, which exposes all that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Suggesting a libpq addition
On 06.12.10 15:37, Merlin Moncure wrote: > On Sun, Dec 5, 2010 at 5:10 AM, Magnus Hagander wrote: >> On Sun, Dec 5, 2010 at 10:22, Marc Balmer wrote: >>> I am suggesting adding a function to libpq: >>> >>> PGresult *PQvexec(PGconn *conn, const char *fmt, ...); >>> >>> It behaves similarly to PQexec, but it allows for printf style varargs and >> >> How is that not a horrible idea, compared to using PQexecParams()? You >> have to remember to do all your escaping and things manually, whereas >> PQexecParams() does it automatically. > > It's only horrible if you stick to printf style formatting and you are > using string techniques to inject parameters into the query. Non > parameterized queries should obviously be discouraged. However, it's > entirely possible to wrap the parameterized interfaces with vararg > interface (I should know, because we did exactly that) :-). This > gives you the best of both worlds, easy coding without sacrificing > safety. You might not remember the libpqtypes proposal, but libpq was > specifically extended with callbacks so that libpqtypes could exist > after the community determined that libpqtypes was too big of a change > to the libpq library. I think ultimately this should be revisited, > with libpqtypes going in core or something even richer...I've been > thinking for a while that postgres types should be abstracted out of > the backend into a library that both client and server depend on. > > With libpqtypes, we decided to use postgres style format markers: > select PQexecf(conn, "select %int4 + %int8", an_int, a_bigint); > > Everything is schema qualified, so that user types are supported (of > course, this requires implementing handling on the client). > > Data routed through the binary protocol, with all the byte swapping > etc handled by the library. No escaping necessary. 
We also added > full support for arrays and composites, which are a nightmare to deal > with over straight libpq, and various other niceties like thread safe > error handling. That would be a *HUGE* piece of software compared to the relatively small thing I am suggesting... As for escaping (or not escaping) of string arguments, that can be seen as a bug or a feature. I do not want automatic escaping of string arguments in all cases, e.g. I might want to construct an SQL statement with dynamic parts "WHERE xy" or "AND a = b". hypothetical example: filter = "WHERE name like 'Balmer%'"; if (sort == SORT_DESC) sort = " ORDER BY name DESCENDING"; PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort); So what I am aiming at right now is a PQvexec() function that basically has printf() like semantics, but adds an additional token to the format string (printf uses %s and %b to produce strings.) I am thinking of adding %S and %B, which produce strings that are escaped. That would be a small function, and reasonably safe. Or rather, the safety is in the hands of the programmer.