Re: [HACKERS] WIP patch for parallel pg_dump
> On 12/07/2010 01:22 AM, Tom Lane wrote:
>> Josh Berkus writes:
>>>> However, if you were doing something like parallel pg_dump you could
>>>> just run the parent and child instances all against the slave, so the
>>>> pg_dump scenario doesn't seem to offer much of a supporting use-case for
>>>> worrying about this. When would you really need to be able to do it?
>>>
>>> If you had several standbys, you could distribute the work of the
>>> pg_dump among them. This would be a huge speedup for a large database,
>>> potentially, thanks to parallelization of I/O and network. Imagine
>>> doing a pg_dump of a 300GB database in 10min.
>>
>> That does sound kind of attractive. But to do that I think we'd have to
>> go with the pass-the-snapshot-through-the-client approach. Shipping
>> internal snapshot files through the WAL stream doesn't seem attractive
>> to me.
>
> this kind of functionality would also be very useful/interesting for
> connection poolers/loadbalancers that are trying to distribute load
> across multiple hosts and could use that to at least give some sort of
> consistency guarantee.

In addition to this, that will greatly help query based replication tools such as pgpool-II. Sounds great.

-- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] To Signal The postmaster
Hi, I want to signal the postmaster (with a user-defined signal) once I have created the trigger file (for switchover). I want to reduce my switchover time further by signaling the postmaster as soon as the trigger file is created. Is this provision already there in the code? SIGUSR1 and SIGUSR2 are already handled in the postmaster; are they used for this purpose? If I handle my own signal using SIGUSR1/SIGUSR2, is that possible without affecting the other functionality of the existing code? -- Thanks & Regards, Aaliya Zarrin (+91)-9160665888
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/07/2010 01:22 AM, Tom Lane wrote: > Josh Berkus writes: >>> However, if you were doing something like parallel pg_dump you could >>> just run the parent and child instances all against the slave, so the >>> pg_dump scenario doesn't seem to offer much of a supporting use-case for >>> worrying about this. When would you really need to be able to do it? > >> If you had several standbys, you could distribute the work of the >> pg_dump among them. This would be a huge speedup for a large database, >> potentially, thanks to parallelization of I/O and network. Imagine >> doing a pg_dump of a 300GB database in 10min. > > That does sound kind of attractive. But to do that I think we'd have to > go with the pass-the-snapshot-through-the-client approach. Shipping > internal snapshot files through the WAL stream doesn't seem attractive > to me. this kind of functionality would also be very useful/interesting for connection poolers/loadbalancers that are trying to distribute load across multiple hosts and could use that to at least give some sort of consistency guarantee. Stefan
Re: [HACKERS] profiling connection overhead
On 07/12/10 10:48, Josh Berkus wrote: > >> It seems plausible to fix the first one, but how would you fix the >> second one? You either allow SET ROLE (which you need, to support the >> pooler changing authorization), or you don't. There doesn't seem to be >> a usable middleground. > > Well, this is why such a pooler would *have* to be built into the > backend. It would need to be able to SET ROLE even though SET ROLE > would not be accepted over the client connection. There's actually another way to do that that could be retrofitted onto an existing external pooler. It's not lovely, but if the approach above proved too hard... SET ROLE could accept a cookie / one-time password that had to be passed to RESET ROLE in order for RESET ROLE to accept the command. SET ROLE fred WITH COOKIE 'goqu8Mi6choht8ie'; -- hand to the user -- blah blah user work blah -- returned by the user RESET ROLE WITH COOKIE 'goqu8Mi6choht8ie'; The tricky bit might be that the user should still be permitted to SET ROLE, but only to roles that the role the pooler switched them to ("fred") has rights to SET ROLE to, not to roles that the pooler user its self has rights to switch to. > We'd also need > bookkeeping to track the ROLE (and other GUCs) of each client connection > and reset them whenever that client connection switches back. I'm really interested in this direction. Taken just a little further, it could bring Pg to the point where query executors (backends) are separated from connection state, so a given backend could pick up and work on queries by several different connections in rapid succession. The advantage there is that idle connections would become cheap, low-overhead affairs. As I (poorly) understand how Pg is designed it'd only be possible for a backend to work on queries that act on the same database, it couldn't really switch databases. That'd still be a real bonus especially for newer users who don't realize they *need* a connection pool. 
-- System & Network Administrator POST Newspapers
Re: [HACKERS] wal_sender_delay is still required?
Alvaro Herrera writes: > Maybe we should have a single tunable for processes that just sleep > waiting for events or postmaster death. For example pgstats has a > hardcoded 2 seconds, and the archiver process has a hardcoded value too > AFAICS. That would make sense once we get to the point where for all of those processes, the sleep delay *only* affects the time to notice postmaster death. Right now I think there are still several other behaviors mixed in with that, and not all of them necessarily want the same response time. regards, tom lane
Re: [HACKERS] Re: Rethinking hint bits WAS: Protecting against unexpected zero-pages: proposal
On Nov 14, 2010, at 3:40 PM, Greg Stark wrote:
> On Sun, Nov 14, 2010 at 8:52 PM, Josh Berkus wrote:
>> For example, imagine if the hint bits were moved to a separate per-table bitmap outside the table instead of being stored with each row, as the current FSM is.
>
> How many times do we have to keep going around the same block? We *already* have a separate bitmap outside the table for transaction commit bits. It's the clog. The only reason the hint bits exist is to cache that so we don't need to do extra I/O to check tuple visibility. If the hint bits are moved outside the table then they serve no purpose whatsoever. Then you have an additional I/O to attempt to save an additional I/O.

Are you sure hint bits are only for I/O savings? Calculating visibility from CLOG involves a hell of a lot more CPU than checking a hint bit. It would be extremely interesting if the CPU overhead wasn't very noticeable, however. That would mean we *only* have to worry about CLOG I/O, and there's probably a lot of ways around that (memory mapping CLOG is one possibility), especially considering that 4G isn't exactly a large amount of memory these days.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell)   http://jim.nasby.net
Re: [HACKERS] wal_sender_delay is still required?
On Tue, Dec 7, 2010 at 12:22 PM, Robert Haas wrote: >> Fair enough. How about increasing the default to 10 seconds? >> Since bgwriter already uses 10s as a nap time when there is no >> configured activity, I think that 10s is a no-nonsense default value. > > What do we get out of making this non-configurable? It would make configuring replication simpler, I think. But I agree to just increase the default value of wal_sender_delay rather than dropping it. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] wal_sender_delay is still required?
On Mon, Dec 6, 2010 at 10:07 PM, Fujii Masao wrote: > On Tue, Dec 7, 2010 at 11:49 AM, Tom Lane wrote: >> Fujii Masao writes: >>> One problem with the patch is that it takes longer (at most 10s) to >>> detect the unexpected death of postmaster (by calling PostmasterIsAlive()). >>> This is OK for me. But does anyone want to specify the delay to detect >>> that within a short time? >> >> Oh. Hm. I'm hesitant to remove the setting if there's still some >> behavior that it would control. Maybe we should just crank up the >> default value instead. > > Fair enough. How about increasing the default to 10 seconds? > Since bgwriter already uses 10s as a nap time when there is no > configured activity, I think that 10s is a no-nonsense default value. What do we get out of making this non-configurable? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP patch for parallel pg_dump
We may need other means to ensure that the snapshot is available on the slave. It could be a bit too early to use the snapshot on the slave depending upon the delay of WAL replay. -- Koichi Suzuki 2010/12/7 Tom Lane : > marcin mank writes: >> On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane wrote: >>> IIRC, in old discussions of this problem we first considered allowing >>> clients to pull down an explicit representation of their snapshot (which >>> actually is an existing feature now, txid_current_snapshot()) and then >>> upload that again to become the active snapshot in another connection. > >> Could a hot standby use such a snapshot representation? I.e. same >> snapshot on the master and the standby? > > Hm, that's a good question. It seems like it's at least possibly > workable, but I'm not sure if there are any showstoppers. The other > proposal of publish-a-snapshot would presumably NOT support this, since > we'd not want to ship the snapshot temp files down the WAL stream. > > However, if you were doing something like parallel pg_dump you could > just run the parent and child instances all against the slave, so the > pg_dump scenario doesn't seem to offer much of a supporting use-case for > worrying about this. When would you really need to be able to do it? > > regards, tom lane
Re: [HACKERS] wal_sender_delay is still required?
On Tue, Dec 7, 2010 at 11:49 AM, Tom Lane wrote: > Fujii Masao writes: >> One problem with the patch is that it takes longer (at most 10s) to >> detect the unexpected death of postmaster (by calling PostmasterIsAlive()). >> This is OK for me. But does anyone want to specify the delay to detect >> that within a short time? > > Oh. Hm. I'm hesitant to remove the setting if there's still some > behavior that it would control. Maybe we should just crank up the > default value instead. Fair enough. How about increasing the default to 10 seconds? Since bgwriter already uses 10s as a nap time when there is no configured activity, I think that 10s is a no-nonsense default value. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] wal_sender_delay is still required?
Excerpts from Tom Lane's message of Mon Dec 06 23:49:52 -0300 2010: > Fujii Masao writes: > > One problem with the patch is that it takes longer (at most 10s) to > > detect the unexpected death of postmaster (by calling PostmasterIsAlive()). > > This is OK for me. But does anyone want to specify the delay to detect > > that within a short time? > > Oh. Hm. I'm hesitant to remove the setting if there's still some > behavior that it would control. Maybe we should just crank up the > default value instead. Maybe we should have a single tunable for processes that just sleep waiting for events or postmaster death. For example pgstats has a hardcoded 2 seconds, and the archiver process has a hardcoded value too AFAICS. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] profiling connection overhead
On Mon, Dec 6, 2010 at 9:37 PM, Alvaro Herrera wrote: > Excerpts from Robert Haas's message of Mon Dec 06 23:09:56 -0300 2010: >> On Mon, Dec 6, 2010 at 2:47 PM, Josh Berkus wrote: >> > >> >> Please explain more precisely what is wrong with SET SESSION >> >> AUTHORIZATION / SET ROLE. >> > >> > 1) Session GUCS do not change with a SET ROLE (this is a TODO I haven't >> > had any time to work on) >> > >> > 2) Users can always issue their own SET ROLE and then "hack into" other >> > users' data. >> >> Makes sense. It would be nice to fix those issues, independent of >> anything else. > > It seems plausible to fix the first one, but how would you fix the > second one? You either allow SET ROLE (which you need, to support the > pooler changing authorization), or you don't. There doesn't seem to be > a usable middleground. You could add a protocol message that does a "permanent" role switch in a way that can't be undone except by another such protocol message. Then connection poolers could simply refuse to proxy that particular message. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] wal_sender_delay is still required?
Fujii Masao writes: > One problem with the patch is that it takes longer (at most 10s) to > detect the unexpected death of postmaster (by calling PostmasterIsAlive()). > This is OK for me. But does anyone want to specify the delay to detect > that within a short time? Oh. Hm. I'm hesitant to remove the setting if there's still some behavior that it would control. Maybe we should just crank up the default value instead. regards, tom lane
Re: [HACKERS] profiling connection overhead
> It seems plausible to fix the first one, but how would you fix the > second one? You either allow SET ROLE (which you need, to support the > pooler changing authorization), or you don't. There doesn't seem to be > a usable middleground. Well, this is why such a pooler would *have* to be built into the backend. It would need to be able to SET ROLE even though SET ROLE would not be accepted over the client connection. We'd also need bookkeeping to track the ROLE (and other GUCs) of each client connection and reset them whenever that client connection switches back. Mind you, I'm not entirely convinced that the end result of this would be performant. And it would certainly be complicated. I think that we should start by dealing with the simplest situation, ignoring SET ROLE and GUC issues for now. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Per-column collation
On Sun, Dec 5, 2010 at 01:04, Peter Eisentraut wrote: > Here is an updated patch to address the issues discussed during this > commitfest. I found another issue in the patch; ILIKE in a WHERE clause doesn't work. It was surprising because LIKE in a WHERE clause and ILIKE in a SELECT list work as expected: - SELECT * FROM pg_class WHERE relname LIKE 'pg%' - SELECT relname ILIKE 'pg%' FROM pg_class;

postgres=# SELECT name, setting FROM pg_settings WHERE name IN ('lc_ctype', 'lc_collate', 'server_encoding');
      name       | setting
-----------------+---------
 lc_collate      | C
 lc_ctype        | C
 server_encoding | UTF8
(3 rows)

postgres=# SELECT * FROM pg_class WHERE relname ILIKE 'pg%';
ERROR:  no collation was derived

-- Itagaki Takahiro
Re: [HACKERS] wal_sender_delay is still required?
On Tue, Dec 7, 2010 at 12:08 AM, Tom Lane wrote: > Fujii Masao writes: >> Walsender doesn't need the periodic wakeups anymore, thanks to >> the latch feature. So wal_sender_delay is basically useless now. >> How about dropping wal_sender_delay or increasing the default >> value? > > If we don't need it, we should remove it. The attached patch removes wal_sender_delay and uses hard-coded 10 seconds instead of wal_sender_delay as the delay between activity rounds for walsender. One problem with the patch is that it takes longer (at most 10s) to detect the unexpected death of postmaster (by calling PostmasterIsAlive()). This is OK for me. But does anyone want to specify the delay to detect that within a short time? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center drop_wal_sender_delay_v1.patch Description: Binary data
Re: [HACKERS] profiling connection overhead
Excerpts from Robert Haas's message of Mon Dec 06 23:09:56 -0300 2010: > On Mon, Dec 6, 2010 at 2:47 PM, Josh Berkus wrote: > > > >> Please explain more precisely what is wrong with SET SESSION > >> AUTHORIZATION / SET ROLE. > > > > 1) Session GUCS do not change with a SET ROLE (this is a TODO I haven't > > had any time to work on) > > > > 2) Users can always issue their own SET ROLE and then "hack into" other > > users' data. > > Makes sense. It would be nice to fix those issues, independent of > anything else. It seems plausible to fix the first one, but how would you fix the second one? You either allow SET ROLE (which you need, to support the pooler changing authorization), or you don't. There doesn't seem to be a usable middleground. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
All, Geirth's results from his FreeBSD 7.1 server using 8.4's test_fsync:

Simple write timing:
        write                         0.007081

Compare fsync times on write() and non-write() descriptor:
If the times are similar, fsync() can sync data written on a different descriptor.
        write, fsync, close           5.937933
        write, close, fsync           8.056394

Compare one o_sync write to two:
        one 16k o_sync write          7.366927
        two 8k o_sync writes         15.299300

Compare file sync methods with one 8k write:
        (o_dsync unavailable)
        open o_sync, write            7.512682
        (fdatasync unavailable)
        write, fsync                  5.856480

Compare file sync methods with two 8k writes:
        (o_dsync unavailable)
        open o_sync, write           15.472910
        (fdatasync unavailable)
        write, fsync                  5.880319

... again, open_sync does not look very impressive. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Spread checkpoint sync
Excerpts from Greg Smith's message of Sun Dec 05 20:02:48 -0300 2010: > What ends up happening if you push toward fully sync I/O is the design > you see in some other databases, where you need multiple writer > processes. Then requests for new pages can continue to allocate as > needed, while keeping any one write from blocking things. That's one > sort of a way to simulate asynchronous I/O, and you can substitute true > async I/O instead in many of those implementations. We didn't have much > luck with portability on async I/O when that was last experimented with, > and having multiple background writer processes seems like overkill; > that whole direction worries me. Why would multiple bgwriter processes worry you? Of course, it wouldn't work to have multiple processes trying to execute a checkpoint simultaneously, but what if we separated the tasks so that one process is in charge of checkpoints, and another one is in charge of the LRU scan? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
Josh Berkus writes: > OK, patch coming then. Right now test_fsync aborts when O_DIRECT fails. > What should I have it do instead? Report that it fails, and keep testing the other methods. regards, tom lane
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
On 12/6/10 6:10 PM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Dec 6, 2010 at 9:04 PM, Josh Berkus wrote: >>> Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available. >>> From my run, it looks like even so regular fsync might be better than >>> open_sync. > >> But I think you need to use fsync_writethrough if you actually want >> durability. > > Yeah. Unless your laptop contains an SSD, those numbers are garbage on > their face. So that's another problem with test_fsync: it omits > fsync_writethrough. Yeah, the issue with test_fsync appears to be that it's designed to work without OS-specific switches no matter what, not to accurately reflect how we access WAL. I'll see if I can do better. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
Robert Haas writes: > On Mon, Dec 6, 2010 at 9:04 PM, Josh Berkus wrote: >> Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available. >> From my run, it looks like even so regular fsync might be better than >> open_sync. > But I think you need to use fsync_writethrough if you actually want > durability. Yeah. Unless your laptop contains an SSD, those numbers are garbage on their face. So that's another problem with test_fsync: it omits fsync_writethrough. regards, tom lane
Re: [HACKERS] profiling connection overhead
On Mon, Dec 6, 2010 at 2:47 PM, Josh Berkus wrote: > >> Please explain more precisely what is wrong with SET SESSION >> AUTHORIZATION / SET ROLE. > > 1) Session GUCS do not change with a SET ROLE (this is a TODO I haven't > had any time to work on) > > 2) Users can always issue their own SET ROLE and then "hack into" other > users' data. Makes sense. It would be nice to fix those issues, independent of anything else. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
On Mon, Dec 6, 2010 at 9:04 PM, Josh Berkus wrote: > >> Mac OS X: Like Solaris, there's a similar mechanism but it's not >> O_DIRECT; see >> http://stackoverflow.com/questions/2299402/how-does-one-do-raw-io-on-mac-os-x-ie-equivalent-to-linuxs-o-direct-flag >> for notes about the F_NOCACHE feature used. Same basic situation as >> Solaris; there's an API, but PostgreSQL doesn't use it yet. > > Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available. > From my run, it looks like even so regular fsync might be better than > open_sync. But I think you need to use fsync_writethrough if you actually want durability. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
> Whether it's complex or not isn't really the issue. The issue is that > what test_fsync is testing had better match what the backend does, or > people will be making choices based on not-comparable test results. > I think we should have test_fsync just automatically fold in O_DIRECT > the same way the backend does. OK, patch coming then. Right now test_fsync aborts when O_DIRECT fails. What should I have it do instead? -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
> Mac OS X: Like Solaris, there's a similar mechanism but it's not > O_DIRECT; see > http://stackoverflow.com/questions/2299402/how-does-one-do-raw-io-on-mac-os-x-ie-equivalent-to-linuxs-o-direct-flag > for notes about the F_NOCACHE feature used. Same basic situation as > Solaris; there's an API, but PostgreSQL doesn't use it yet. Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available. From my run, it looks like even so regular fsync might be better than open_sync. Results from a MacBook:

Sidney-Stratton:fsync josh$ ./test_fsync
Loops = 1

Simple write:
        8k write                      2121.004/second

Compare file sync methods using one write:
        (open_datasync unavailable)
        open_sync 8k write            1993.833/second
        (fdatasync unavailable)
        8k write, fsync               1878.154/second

Compare file sync methods using two writes:
        (open_datasync unavailable)
        2 open_sync 8k writes         1005.009/second
        (fdatasync unavailable)
        8k write, 8k write, fsync     1709.862/second

Compare open_sync with different sizes:
        open_sync 16k write           1728.803/second
        2 open_sync 8k writes          969.416/second

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different descriptor.)
        8k write, fsync, close        1772.572/second
        8k write, close, fsync        1939.897/second

-- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Timeout for asynchronous replication Re: Timeout and wait-forever in sync rep
On Mon, Dec 6, 2010 at 11:54 PM, Heikki Linnakangas wrote: > Hmm, that's actually a quite different timeout than what's required for > synchronous replication. In synchronous replication, you need to get an > acknowledgment within a timeout. This patch only puts a timeout on how long > we wait to have enough room in the TCP send buffer. That doesn't seem all > that useful. Yeah, I'm planning to implement that timeout for synchronous replication later. Since I thought that we should implement the timeout for *asynchronous* replication first and then extend it for synchronous replication, I created this patch. This kind of timeout is required for asynchronous replication since there is no acknowledgment from the standby. Most of the patch implements the non-blocking send function and changes walsender so that it uses that function instead of the existing blocking one. This will be infrastructure for the timeout for synchronous replication. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
Steve, > If you tell me which options to pgbench and which .conf file settings > you'd like to see I can probably arrange to run some tests on AIX. Compile and run test_fsync in PGSRC/src/tools/fsync. -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
On 12/06/2010 08:38 PM, Tom Lane wrote:
> Josh Berkus writes:
>> Making it support O_DIRECT would be possible but more complex; I don't
>> see the point unless we think we're going to have open_sync_with_odirect
>> as a separate option.
>
> Whether it's complex or not isn't really the issue. The issue is that
> what test_fsync is testing had better match what the backend does, or
> people will be making choices based on not-comparable test results.
> I think we should have test_fsync just automatically fold in O_DIRECT
> the same way the backend does.

Indeed. We were quite confused for a while when we were dealing with this about a week ago, and my handwritten test program failed as expected but test_fsync didn't. Anything other than behaving just as the backend does violates POLA, in my view.

cheers

andrew
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
Josh Berkus writes: > Making it support O_DIRECT would be possible but more complex; I don't > see the point unless we think we're going to have open_sync_with_odirect > as a separate option. Whether it's complex or not isn't really the issue. The issue is that what test_fsync is testing had better match what the backend does, or people will be making choices based on not-comparable test results. I think we should have test_fsync just automatically fold in O_DIRECT the same way the backend does. regards, tom lane
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
Greg Smith writes: > So my guess is that some small percentage of Windows users might notice > a change here, and some testing on FreeBSD would be useful too. That's > about it for platforms that I think anybody needs to worry about. To my mind, O_DIRECT is not really the key issue here, it's whether to prefer O_DSYNC or fdatasync. I looked back in the archives, and I think that the main reason we prefer O_DSYNC when available is the results I got here: http://archives.postgresql.org/pgsql-hackers/2001-03/msg00381.php which demonstrated a performance benefit on HPUX 10.20, though with a test tool much more primitive than test_fsync. I still have that machine, although the disk that was in it at the time died awhile back. What's in there now is a Seagate ST336607LW spinning at 10000 RPM (166 rev/sec) and today I get numbers like this from test_fsync:

Simple write:
        8k write                      28331.020/second

Compare file sync methods using one write:
        open_datasync 8k write          161.190/second
        open_sync 8k write              156.478/second
        8k write, fdatasync              54.302/second
        8k write, fsync                  51.810/second

Compare file sync methods using two writes:
        2 open_datasync 8k writes        81.702/second
        2 open_sync 8k writes            80.172/second
        8k write, 8k write, fdatasync    40.829/second
        8k write, 8k write, fsync        39.836/second

Compare open_sync with different sizes:
        open_sync 16k write              80.192/second
        2 open_sync 8k writes            78.018/second

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different descriptor.)
        8k write, fsync, close           52.527/second
        8k write, close, fsync           54.092/second

So *on that rather ancient platform* there's a measurable performance benefit to O_DSYNC, but this seems to be largely because fdatasync is stubbed to fsync in userspace rather than because fdatasync wouldn't be a better idea in the abstract.
Also, a lot of the argument against fsync at the time was that it forced the kernel to iterate through all the buffers for the WAL file to see if any were dirty. I would imagine that modern kernels are a tad smarter about that; and even if they aren't, the CPU speed versus disk speed tradeoff has changed enough since 2001 that iterating through 16MB of buffers isn't as interesting as it was then. So to my mind, switching to the preference order fdatasync, fsync_writethrough, fsync seems like the thing to do. Since we assume fsync is always available, that means that O_DSYNC/O_SYNC will not be the defaults on any platform. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
On 10-12-06 06:56 PM, Greg Smith wrote:
> Tom Lane wrote:
>> The various testing that's been reported so far is all for Linux and
>> thus doesn't directly address the question of whether other kernels
>> will have similar performance properties.
>
> Survey of some popular platforms: ...
>
> So my guess is that some small percentage of Windows users might notice
> a change here, and some testing on FreeBSD would be useful too. That's
> about it for platforms that I think anybody needs to worry about.

If you tell me which options to pgbench and which .conf file settings you'd like to see, I can probably arrange to run some tests on AIX.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support   www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
On 12/5/10 2:12 PM, Greg Smith wrote:
> Josh Berkus wrote:
>> I modified test_fsync in two ways to run this; first, to make it support
>> O_DIRECT, and second to make it run in the *current* directory.
>
> Patch please? I agree with the latter change; what test_fsync does is
> surprising.

Attached. Making it support O_DIRECT would be possible but more complex; I don't see the point unless we think we're going to have open_sync_with_odirect as a separate option.

> I suggested a while ago that we refactor test_fsync to use a common set
> of source code as the database itself for detecting things related to
> wal_sync_method, perhaps just extract that whole set of DEFINE macro
> logic to somewhere else. That happened at a bad time in the development
> cycle (right before a freeze) and nobody ever got back to the idea
> afterwards. If this code is getting touched, and it's clear it is in
> some direction, I'd like to see things change so it's not possible for
> the two to diverge again afterwards.

I don't quite follow you. Maybe nobody else did last time, either.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

diff --git a/src/tools/fsync/test_fsync.c b/src/tools/fsync/test_fsync.c
index 28c2119..12a83e1 100644
*** a/src/tools/fsync/test_fsync.c
--- b/src/tools/fsync/test_fsync.c
***************
*** 23,34 ****
  #include

! #ifdef WIN32
  #define FSYNC_FILENAME	"./test_fsync.out"
- #else
- /* /tmp might be a memory file system */
- #define FSYNC_FILENAME	"/var/tmp/test_fsync.out"
- #endif

  #define WRITE_SIZE	(8 * 1024)	/* 8k */
--- 23,32 ----
  #include

! /* put the temp files in the local directory
!    this is a change from older versions which used
!    /var/tmp */
  #define FSYNC_FILENAME	"./test_fsync.out"

  #define WRITE_SIZE	(8 * 1024)	/* 8k */

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Josh Berkus writes: >> However, if you were doing something like parallel pg_dump you could >> just run the parent and child instances all against the slave, so the >> pg_dump scenario doesn't seem to offer much of a supporting use-case for >> worrying about this. When would you really need to be able to do it? > If you had several standbys, you could distribute the work of the > pg_dump among them. This would be a huge speedup for a large database, > potentially, thanks to parallelization of I/O and network. Imagine > doing a pg_dump of a 300GB database in 10min. That does sound kind of attractive. But to do that I think we'd have to go with the pass-the-snapshot-through-the-client approach. Shipping internal snapshot files through the WAL stream doesn't seem attractive to me. While I see Robert's point about preferring not to expose the snapshot contents to clients, I don't think it outweighs all other considerations here; and every other one is pointing to doing it the other way. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] the number of file descriptors when using POSIX semaphore
Hi, folks,

In src/template/darwin:

# Select appropriate semaphore support. Darwin 6.0 (Mac OS X 10.2) and up
# support System V semaphores; before that we have to use POSIX semaphores,
# which are less good for our purposes because they eat a file descriptor
# per backend per max_connection slot.

To my understanding, the number of descriptors created by POSIX semaphores would be the number of actual clients times max_connections. However, I monitored the number of open files using sysctl, and I found that kern.num_files doesn't match the result calculated by the formula I inferred... So, what would the number of file descriptors be when using POSIX semaphores? Thanks!
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
Tom Lane wrote:
> The various testing that's been reported so far is all for Linux and
> thus doesn't directly address the question of whether other kernels
> will have similar performance properties.

Survey of some popular platforms:

Linux: don't want O_DIRECT by default for reliability reasons, and there's no clear performance win in the default config with small wal_buffers

Solaris: O_DIRECT doesn't work; there's another API that support has never been added for; see http://blogs.sun.com/jkshah/entry/postgresql_wal_sync_method_and

Windows: Small reported gains for O_DIRECT, i.e. 10%, at http://archives.postgresql.org/pgsql-hackers/2007-03/msg01615.php

FreeBSD: It probably works there, but I've never seen good performance tests of it on this platform.

Mac OS X: Like Solaris, there's a similar mechanism but it's not O_DIRECT; see http://stackoverflow.com/questions/2299402/how-does-one-do-raw-io-on-mac-os-x-ie-equivalent-to-linuxs-o-direct-flag for notes about the F_NOCACHE feature used. Same basic situation as Solaris: there's an API, but PostgreSQL doesn't use it yet.

So my guess is that some small percentage of Windows users might notice a change here, and some testing on FreeBSD would be useful too. That's about it for platforms that I think anybody needs to worry about.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support   www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
Re: [HACKERS] WIP patch for parallel pg_dump
> However, if you were doing something like parallel pg_dump you could > just run the parent and child instances all against the slave, so the > pg_dump scenario doesn't seem to offer much of a supporting use-case for > worrying about this. When would you really need to be able to do it? If you had several standbys, you could distribute the work of the pg_dump among them. This would be a huge speedup for a large database, potentially, thanks to parallelization of I/O and network. Imagine doing a pg_dump of a 300GB database in 10min. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Heikki Linnakangas wrote: > On 06.12.2010 22:53, Kevin Grittner wrote: >> What would be the correct way for a process to put itself to >> sleep, and for another process to later wake it up? > > See ProcWaitSignal/ProcSendSignal. Or the new 'latch' code. Is there a reason to prefer one over the other? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
On 06.12.2010 22:53, Kevin Grittner wrote: What would be the correct way for a process to put itself to sleep, and for another process to later wake it up? See ProcWaitSignal/ProcSendSignal. Or the new 'latch' code. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
Florian Pflug wrote:
> On Dec5, 2010, at 16:11 , Kevin Grittner wrote:
>> The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE
>> under SSI would be to have each non-read-only serializable
>> transaction acquire a heavyweight lock which can coexist with
>> other locks at the same level (SHARE looks good) on some common
>> object and hold that for the duration of the transaction, while a
>> SERIALIZABLE READ ONLY DEFERRABLE transaction would need to
>> acquire a conflicting lock (EXCLUSIVE looks good) before it could
>> acquire a snapshot, and release the lock immediately after
>> acquiring the snapshot.
>
> Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE is waiting to
> acquire the lock, no other transaction would be allowed to start
> until the SERIALIZABLE READ ONLY DEFERRABLE transaction has been
> able to acquire its snapshot. For pg_dump's purposes at least,
> that seems undesirable, since a single long-running transaction at
> the time you start pg_dump would effectively DoS your system until
> the long-running transaction finishes.

Well, when you put it that way, it sounds pretty grim. :-( Since one of the bragging points of SSI is that it doesn't introduce any blocking beyond current snapshot isolation, I don't want to do something here which blocks anything except the transaction which has explicitly requested the DEFERRABLE property. I guess that, simple as that technique might be, it just isn't a good idea.

> The alternative seems to be to drop the guarantee that a
> SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a
> stream of overlapping non-READ ONLY transactions. Then a flag in
> the proc array that marks non-READ ONLY transactions should be
> sufficient, plus a wait-and-retry loop to take snapshots for
> SERIALIZABLE READ ONLY DEFERRABLE transactions.
If I can find a way to pause an active process, then -- since I already have functions which maintain the count of active SERIALIZABLE READ WRITE transactions as they begin and end -- I could release pending DEFERRABLE transactions when the count hits zero, without any separate loop. That has the added attraction of being a path to the more complex checking which could allow the deferrable process to start sooner in some circumstances. The "simple" solution with the heavyweight lock would not have been a good path to that.

What would be the correct way for a process to put itself to sleep, and for another process to later wake it up?

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
marcin mank writes: > On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane wrote: >> IIRC, in old discussions of this problem we first considered allowing >> clients to pull down an explicit representation of their snapshot (which >> actually is an existing feature now, txid_current_snapshot()) and then >> upload that again to become the active snapshot in another connection. > Could a hot standby use such a snapshot representation? I.e. same > snapshot on the master and the standby? Hm, that's a good question. It seems like it's at least possibly workable, but I'm not sure if there are any showstoppers. The other proposal of publish-a-snapshot would presumably NOT support this, since we'd not want to ship the snapshot temp files down the WAL stream. However, if you were doing something like parallel pg_dump you could just run the parent and child instances all against the slave, so the pg_dump scenario doesn't seem to offer much of a supporting use-case for worrying about this. When would you really need to be able to do it? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
On 06.12.2010 21:48, marcin mank wrote: On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane wrote: IIRC, in old discussions of this problem we first considered allowing clients to pull down an explicit representation of their snapshot (which actually is an existing feature now, txid_current_snapshot()) and then upload that again to become the active snapshot in another connection. Could a hot standby use such a snapshot representation? I.e. same snapshot on the master and the standby? Hmm, I suppose it could. That's an interesting idea, you could run parallel pg_dump or something else against master and/or multiple hot standby servers, all working on the same snapshot. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4
Greg Smith writes: > Regardless, I'm now leaning heavily toward the idea of avoiding > open_datasync by default given this bug, and backpatching that change to > at least 8.4. I'll do some more database-level performance tests here > just as a final sanity check on that. My gut feel is now that we'll > eventually be taking something like Marti's patch, adding some more > documentation around it, and applying that to HEAD as well as some > number of back branches. I think we have got consensus that (1) open_datasync should not be the default on Linux, and (2) this change needs to be back-patched. What is not clear to me is whether we have consensus to change the option preference order globally, or restrict the change to just be effective on Linux. The various testing that's been reported so far is all for Linux and thus doesn't directly address the question of whether other kernels will have similar performance properties. However, it seems reasonable to me to suppose that open_datasync could only be a win in very restricted scenarios and thus shouldn't be a preferred default. Also, I dread trying to document the behavior if the preference order becomes platform-dependent. With the holidays fast approaching, our window to do something about this in a timely fashion grows short. If we don't schedule update releases to be made this week, I think we're looking at not getting the updates out till after New Year's. Do we want to wait that long? Is anyone actually planning to do performance testing that would prove anything about non-Linux platforms? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane wrote: > IIRC, in old discussions of this problem we first considered allowing > clients to pull down an explicit representation of their snapshot (which > actually is an existing feature now, txid_current_snapshot()) and then > upload that again to become the active snapshot in another connection. Could a hot standby use such a snapshot representation? I.e. same snapshot on the master and the standby? Greetings Marcin Mańk -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
On Dec5, 2010, at 16:11 , Kevin Grittner wrote:
> The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE under
> SSI would be to have each non-read-only serializable transaction
> acquire a heavyweight lock which can coexist with other locks at the
> same level (SHARE looks good) on some common object and hold that for
> the duration of the transaction, while a SERIALIZABLE READ ONLY
> DEFERRABLE transaction would need to acquire a conflicting lock
> (EXCLUSIVE looks good) before it could acquire a snapshot, and
> release the lock immediately after acquiring the snapshot.

Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE is waiting to acquire the lock, no other transaction would be allowed to start until the SERIALIZABLE READ ONLY DEFERRABLE transaction has been able to acquire its snapshot. For pg_dump's purposes at least, that seems undesirable, since a single long-running transaction at the time you start pg_dump would effectively DoS your system until the long-running transaction finishes.

The alternative seems to be to drop the guarantee that a SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a stream of overlapping non-READ ONLY transactions. Then a flag in the proc array that marks non-READ ONLY transactions should be sufficient, plus a wait-and-retry loop to take snapshots for SERIALIZABLE READ ONLY DEFERRABLE transactions.

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
On Dec 6, 2010, at 11:36 AM, Tom Lane wrote:
> There's a difference between whether an extension as such is considered
> to belong to a schema and whether its contained objects do. We can't
> really avoid the fact that functions, operators, etc must be assigned to
> some particular schema.

Right, of course.

> It seems not particularly important that
> extension names be schema-qualified, though --- the use-case for having
> two different extensions named "foo" installed simultaneously seems
> pretty darn small. On the other hand, if we were enforcing that all
> objects contained in an extension belong to the same schema, it'd make
> logistical sense to consider that the extension itself belongs to that
> schema as well. But last I heard we didn't want to enforce such a
> restriction.

Okay.

> I believe what the search_path substitution is actually about is to
> provide a convenient shorthand for the case that all the contained
> objects do indeed live in one schema, and you'd like to be able to
> select that schema at CREATE EXTENSION time. Which seems like a useful
> feature for a common case. We've certainly heard multiple complaints
> about the fact that you can't do that easily now.

Yes, it *is* useful. But what happens if I have

    SET search_path = whatever;

in my extension install script, and someone executes

    CREATE EXTENSION FOO WITH SCHEMA bar;

Surprise! Everything is in whatever, not in bar.

> BTW, I did think of a case where substitution solves a problem we don't
> presently have any other solution for: referring to the target schema
> within the definition of a contained object. As an example, you might
> wish to attach "SET search_path = @target_schema@" to the definition of
> a SQL function in an extension, to prevent search-path-related security
> issues in the use of the function. Without substitution you'll be
> reduced to hard-wiring the name of the target schema.

You lost me.
:-( David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
Please, it would be very important to us that LIKE worked with the Brazilian collation, and if possible case-insensitive and accent-insensitive.

Thanks!

Alexandre Riveira
Brazil

Peter Eisentraut escreveu:
> On mån, 2010-12-06 at 10:01 -0800, David E. Wheeler wrote:
>> I've been wondering if this patch will support case-insensitive
>> collations. If so, then citext should probably be revised to use one.
>
> This has been touched upon several times during the discussions on past
> patches.
>
> Essentially, the current patch only arranges that you can specify a sort
> order for data. The system always breaks ties using a binary
> comparison. This could conceivably be changed, but it's a separate
> problem. Some of the necessary investigation work has presumably
> already been done in the context of citext.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
> Please explain more precisely what is wrong with SET SESSION > AUTHORIZATION / SET ROLE. 1) Session GUCS do not change with a SET ROLE (this is a TODO I haven't had any time to work on) 2) Users can always issue their own SET ROLE and then "hack into" other users' data. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Label switcher function
2010/11/25 KaiGai Kohei:
> The attached patch is a revised one.
>
> It provides two hooks; the one informs core PG whether the supplied
> function needs to be hooked, or not. The other is an actual hook on
> prepare, start, end and abort of function invocations.
>
>   typedef bool (*needs_function_call_type)(Oid fn_oid);
>
>   typedef void (*function_call_type)(FunctionCallEventType event,
>                                      FmgrInfo *flinfo, Datum *private);
>
> The hook prototype was a bit modified since the suggestion from
> Robert. Because the FmgrInfo structure contains the OID of the function,
> it might be redundant to deliver the OID of the function individually.
>
> The rest of the parts are revised according to the comment.
>
> I also fixed up source code comments which might become incorrect.

FCET_PREPARE looks completely unnecessary to me. Any necessary one-time work can easily be done at FCET_START time, assuming that the private-data field is initialized to (Datum) 0.

I'm fairly certain that the following is not portable:

+       ObjectAddress object = { .classId = ProcedureRelationId,
+                                .objectId = fn_oid,
+                                .objectSubId = 0 };

I'd suggest renaming needs_function_call_type and function_call_type to needs_fmgr_hook_type and fmgr_hook_type.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
On Dec 6, 2010, at 11:29 AM, Peter Eisentraut wrote: > This has been touch upon several times during the discussions on past > patches. > > Essentially, the current patch only arranges that you can specify a sort > order for data. The system always breaks ties using a binary > comparison. This could conceivably be changed, but it's a separate > problem. Some of the necessary investigation work has presumably > already been done in the context of citext. Okay, thanks, good to know. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
"David E. Wheeler" writes: > The other question I have, though, is how important is it to have extensions > live in a particular schema since there seems to be no advantage to doing so. > With the current patch, I can put extension "foo" in schema "bar", but I > can't put any other extension named "foo" in any other schema. It's in schema > "bar" but is at the same time global. That doesn't make much sense to me. There's a difference between whether an extension as such is considered to belong to a schema and whether its contained objects do. We can't really avoid the fact that functions, operators, etc must be assigned to some particular schema. It seems not particularly important that extension names be schema-qualified, though --- the use-case for having two different extensions named "foo" installed simultaneously seems pretty darn small. On the other hand, if we were enforcing that all objects contained in an extension belong to the same schema, it'd make logistical sense to consider that the extension itself belongs to that schema as well. But last I heard we didn't want to enforce such a restriction. I believe what the search_path substitution is actually about is to provide a convenient shorthand for the case that all the contained objects do indeed live in one schema, and you'd like to be able to select that schema at CREATE EXTENSION time. Which seems like a useful feature for a common case. We've certainly heard multiple complaints about the fact that you can't do that easily now. BTW, I did think of a case where substitution solves a problem we don't presently have any other solution for: referring to the target schema within the definition of a contained object. As an example, you might wish to attach "SET search_path = @target_schema@" to the definition of a SQL function in an extension, to prevent search-path-related security issues in the use of the function. Without substitution you'll be reduced to hard-wiring the name of the target schema. 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
On mån, 2010-12-06 at 10:01 -0800, David E. Wheeler wrote:
> I've been wondering if this patch will support case-insensitive
> collations. If so, then citext should probably be revised to use one.

This has been touched upon several times during the discussions on past patches.

Essentially, the current patch only arranges that you can specify a sort order for data. The system always breaks ties using a binary comparison. This could conceivably be changed, but it's a separate problem. Some of the necessary investigation work has presumably already been done in the context of citext.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
On mån, 2010-12-06 at 21:06 +0900, Itagaki Takahiro wrote:
> On Sun, Dec 5, 2010 at 01:04, Peter Eisentraut wrote:
>> Here is an updated patch to address the issues discussed during this
>> commitfest.
>
> Here are comments and questions after I tested the latest patch:
>
> Issues
> * initdb itself seems to succeed, but it says "could not determine
> encoding for locale" messages for any combination of encoding=utf8/eucjp
> and locale=ja_JP.utf8/ja_JP.eucjp/C. Is this expected behavior?
>
> creating collations ...initdb: locale name has non-ASCII characters, skipped: bokmål
> initdb: locale name has non-ASCII characters, skipped: français
> could not determine encoding for locale "hy_AM.armscii8": codeset is "ARMSCII-8"
> ... (a dozen of lines) ...
> could not determine encoding for locale "vi_VN.tcvn": codeset is "TCVN5712-1"
> ok

What this does is take the output of locale -a and populate the pg_collation catalog with the locales it finds. When it finds an operating system locale that uses an encoding that is not recognized, you will see this warning. I understand that that would probably annoy users. We could hide the warning and silently skip those locales. But then could that hide genuine configuration problems?

> * contrib/citext raises an encoding error when COLLATE is specified
> even if it is the same collation as the database default.
> We might need some special treatment for C locale.
> =# SHOW lc_collate; ==> C
> =# SELECT ('A'::citext) = ('a'::citext); ==> false
> =# SELECT ('A'::citext) = ('a'::citext) COLLATE "C";
> ERROR: invalid multibyte character for locale
> HINT: The server's LC_CTYPE locale is probably incompatible with the
> database encoding.

OK, I can reproduce that. That's fallout from the lc_ctype_is_c() optimization that I removed, as explained in another email. I'll have to think about that again.
> * pg_dump would generate unportable files for different platforms
> because collation names

pg_dump can already produce unportable files for a number of other reasons, including per-database locale, tablespaces, and OS-dependent configuration settings. The way I imagine this working is that someone who wants to design a genuinely portable application using this feature would create their own collation based on the existing, OS-specific collation (using a to-be-added CREATE COLLATION command). As mentioned earlier, however, we can't actually solve the problem that the OS locales may not behave the same across systems.

> Source codes
> * PG_GETARG_COLLATION might be a better name rather than PG_GET_COLLATION.

It's not the collation of a function argument, it's the collation of a function call. (You could conceivably also fetch the collation of a function argument, but that isn't used in any way.)

> * What is the difference between InvalidOid and DEFAULT_COLLATION_OID
> for collation oids? The patch replaces DirectFunctionCall with
> DirectFunctionCallC in some places, but we could shrink the diff size
> if we could use InvalidOid instead of DEFAULT_COLLATION_OID.

Think of DEFAULT_COLLATION_OID as analogous to UNKNOWNOID. A long time ago we used InvalidOid for all kinds of types, including unknown, pseudotypes, cstring, and no type at all. The reason we changed this was that this masked errors and made processing of the unknown type difficult/impossible. I know this makes the code bigger, but it's necessary. I originally coded the patch using InvalidOid for everything, but that wasn't very robust. This also ties into the next question ...

> * I still think explicit passing of collations from function to function
> is horrible because we might forget it in some places, and almost no existing
> third party module would work. Is it possible to make it a global variable,
> and push/pop the state when changed? Sorry, I'm missing something, but
> I think we could treat the collation setting like GUC settings.

A collation is a property of a datum or an expression. You might as well argue that we don't keep track of types of expressions and instead store it globally. Doesn't make sense.

Extensions are not required to support collations. Those that might want to will usually end up calling one of the locale-enabled functions such as varstr_cmp(), and there the function prototype will ensure that specifying a collation cannot be missed.

Additionally, the distinction between InvalidOid and DEFAULT_COLLATION_OID does a great deal to ensure that if a collation is unspecified or missing in some new code, you will get a proper error message instead of unspecified behavior.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
On Dec 6, 2010, at 11:12 AM, Tom Lane wrote: > Well, I don't put any stock in the idea that it's important for existing > module .sql files to be usable as-is as extension definition files. If > it happens to fall out that way, fine, but we shouldn't give up anything > else to get that. I agree, but I don't think we have to lose anything. > Letting extension files be directly sourceable in > psql is probably worth a bit more, but I'm not sure how much. The > argument that forgetting to include a magic source_path command would > make CREATE EXTENSION behave surprisingly seems to have a good deal of > merit though, certainly enough to justify having CREATE EXTENSION take > care of that internally if at all possible. Yes. The other question I have, though, is how important is it to have extensions live in a particular schema since there seems to be no advantage to doing so. With the current patch, I can put extension "foo" in schema "bar", but I can't put any other extension named "foo" in any other schema. It's in schema "bar" but is at the same time global. That doesn't make much sense to me. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
"David E. Wheeler" writes: > On Dec 6, 2010, at 10:43 AM, Tom Lane wrote: >> (On the other hand, if we *could* avoid using any explicit >> substitutions, it would certainly ease testing of extension files no? >> They'd be sourceable into psql then.) > Yes. And extension authors would not have to remember to include the magic > line (which at any rate would break extensions for earlier versions of > PostgreSQL). Well, I don't put any stock in the idea that it's important for existing module .sql files to be usable as-is as extension definition files. If it happens to fall out that way, fine, but we shouldn't give up anything else to get that. Letting extension files be directly sourceable in psql is probably worth a bit more, but I'm not sure how much. The argument that forgetting to include a magic source_path command would make CREATE EXTENSION behave surprisingly seems to have a good deal of merit though, certainly enough to justify having CREATE EXTENSION take care of that internally if at all possible. The real question in my mind is whether there are any other known or foreseeable cases where we would need to have substitution capability and there's not another good way to handle it. I haven't been paying real close attention to the threads about this patch --- do we have any specific use-cases in mind for substitution, besides this one? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Tom Lane wrote: > "Kevin Grittner" writes: >> Tom Lane wrote: >>> No. See subtransactions. > >> Subtransactions are included in snapshots? > > Sure, see GetSnapshotData(). You could avoid it by setting > suboverflowed, but that comes at a nontrivial performance cost. Yeah, sorry for blurting like that before I checked. I was somewhat panicked that I'd missed something important for SSI, because my XidIsConcurrent check just uses xmin, xmax, and xip; I was afraid what I have would fall down in the face of subtransactions. But on review I found that I'd thought that through (discussion is in the archives): I always wanted to associate the locks and conflicts with the top-level transaction, so that was already identified before checking for overlap, and it was therefore more efficient to just check that. Sorry for the "senior moment". :-/ Perhaps a line or two of comments about that in the SSI patch would be a good idea, and maybe some tests involving subtransactions. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
On Dec 6, 2010, at 10:43 AM, Tom Lane wrote: > That's an interesting idea, but I'm not sure it's wise to design around > the assumption that we won't need substitutions ever. What I was > thinking was that we should try to limit knowledge of the substitution > behavior to the extension definition files and the implementation of > CREATE EXTENSION itself. I don't agree with exposing that information > at the SQL level. > > (On the other hand, if we *could* avoid using any explicit > substitutions, it would certainly ease testing of extension files no? > They'd be sourceable into psql then.) Yes. And extension authors would not have to remember to include the magic line (which at any rate would break extensions for earlier versions of PostgreSQL). Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
"Kevin Grittner" writes: > Tom Lane wrote: >> No. See subtransactions. > Subtransactions are included in snapshots? Sure, see GetSnapshotData(). You could avoid it by setting suboverflowed, but that comes at a nontrivial performance cost. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
"David E. Wheeler" writes: > On Dec 6, 2010, at 7:19 AM, Tom Lane wrote: >> On the whole I'd prefer not to have any substitution functionality >> hard-wired into pg_execute_file either, though I can see the argument >> that it's necessary for practical use. Basically I'm concerned that >> replace-equivalent behavior is not going to be satisfactory over the >> long run: I think eventually we're going to need to think about >> quoting/escaping behavior. So I think it's a bad idea to expose the >> assumption that it'll be done that way at the SQL level. > +1 > I suspect that, for the purposes of the extensions patch, if CREATE EXTENSION > could be modified to handle setting the schema itself, without requiring that > the file have this magic line: >SET search_path = @extschema@; > Then there would be no need for substitutions at all. That's an interesting idea, but I'm not sure it's wise to design around the assumption that we won't need substitutions ever. What I was thinking was that we should try to limit knowledge of the substitution behavior to the extension definition files and the implementation of CREATE EXTENSION itself. I don't agree with exposing that information at the SQL level. (On the other hand, if we *could* avoid using any explicit substitutions, it would certainly ease testing of extension files no? They'd be sourceable into psql then.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields
On 12/06/2010 01:23 PM, Tom Lane wrote: Andrew Dunstan writes: On 12/06/2010 12:11 PM, Tom Lane wrote: Also, why did you change the setup code to not compute nfields in binary mode? That seems at best an unnecessary change, and at worst a breakage of the binary path --- did you test it? AFAICT it's not used in binary mode at all. But I will double check. Well, even if it is not used at the moment, it seems potentially of use in that path. So I'd vote for continuing to set it correctly, rather than making it deliberately incorrect as this patch is going out of its way to do. Ok. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Tom Lane wrote: > "Kevin Grittner" writes: >> Surely you can predict that any snapshot is no larger than a fairly >> small fixed portion plus sizeof(TransactionId) * MaxBackends? > > No. See subtransactions. Subtransactions are included in snapshots? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
"Kevin Grittner" writes: > Tom Lane wrote: >>> I'm still not convinced that using shared memory is a bad way to >>> pass these around. Surely we're not talking about large numbers >>> of them. What am I missing here? >> >> They're not of a very predictable size. > Surely you can predict that any snapshot is no larger than a fairly > small fixed portion plus sizeof(TransactionId) * MaxBackends? No. See subtransactions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields
Andrew Dunstan writes: > On 12/06/2010 12:11 PM, Tom Lane wrote: >> Also, why did you change the setup code to >> not compute nfields in binary mode? That seems at best an unnecessary >> change, and at worst a breakage of the binary path --- did you test it? > AFAICT it's not used in binary mode at all. But I will double check. Well, even if it is not used at the moment, it seems potentially of use in that path. So I'd vote for continuing to set it correctly, rather than making it deliberately incorrect as this patch is going out of its way to do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
"Kevin Grittner" writes: > I reviewed the documentation and played around with this a bit and > can't find any areas where the current PostgreSQL implementation of > READ ONLY is incompatible with what is needed for the SSI > optimizations where it is used. There are a large number of tests > which exercise this, and they're all passing. > Did you have something in particular in mind which I should check? I did not, just thought it was a point that merited examination. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Tom Lane wrote: >> I'm still not convinced that using shared memory is a bad way to >> pass these around. Surely we're not talking about large numbers >> of them. What am I missing here? > > They're not of a very predictable size. Surely you can predict that any snapshot is no larger than a fairly small fixed portion plus sizeof(TransactionId) * MaxBackends? So, for example, if you're configured for 100 connections, you'd be limited to something under 1kB, maximum? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
2010/12/6 David E. Wheeler : > On Dec 6, 2010, at 4:06 AM, Itagaki Takahiro wrote: > >> * contrib/citext raises an encoding error when COLLATE is specified >> even if it is the same collation as the database default. >> We might need some special treatment for C locale. > > I've been wondering if this patch will support case-insensitive collations. If > so, then citext should probably be revised to use one. As far as I know, no. It supports only system-based collations. Pavel > > Best, > > David > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields
On 12/06/2010 12:11 PM, Tom Lane wrote: Andrew Dunstan writes: Attached is a patch that allows CopyReadAttributesText() and CopyReadAttributesCSV() to read arbitrary numbers of attributes. Underflowing attributes are recorded as null, and space is made for overflowing attributes on a line. Why are you still passing nfields as a separate parameter instead of relying on the value you added to the struct? That can't do anything except cause confusion, especially once the two values diverge due to a previous array-expansion. Good point. Will fix. Also, why did you change the setup code to not compute nfields in binary mode? That seems at best an unnecessary change, and at worst a breakage of the binary path --- did you test it? AFAICT it's not used in binary mode at all. But I will double check. Also please be a little more careful with the formatting. Ok, will fix also. Thanks for the comments. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
On Mon, Dec 6, 2010 at 12:41 PM, Tom Lane wrote: > Robert Haas writes: >> On Sun, Dec 5, 2010 at 6:01 PM, Tom Lane wrote: >>> Why is there a variadic replace() in this patch at all? It seems just >>> about entirely unrelated to the stated purpose of the patch, as well >>> as being of dubious usefulness. When would it be superior to >>> replace(replace(orig, from1, to1), from2, to2), ... > >> An iterated replacement has different semantics from a simultaneous >> replace - replacing N placeholders with values simultaneously means >> you don't need to worry about the case where one of the replacement >> strings contains something that looks like a placeholder. > > Good point, but what the patch implements is in fact iterated > replacement ... or at least it looked that way in a quick once-over. Oh. Well, -1 from me for including that. >> I actually >> think a simultaneous replacement feature would be quite handy but I >> make no comment on whether it belongs as part of this patch. > > My point is that the replacement stuff really really needs to be > factored out of the string-execution stuff, precisely because the > desired behavior is debatable. +1 for committing the uncontroversial parts separately. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Mon, Dec 6, 2010 at 12:57 PM, Josh Berkus wrote: >>> At some point Hackers should look at pg vs MySQL multi tenantry but it >>> is way tangential today. >> >> My understanding is that our schemas work like MySQL databases; and >> our databases are an even higher level of isolation. No? > > That's correct. Drizzle is looking at implementing a feature like our > databases called "catalogs" (per the SQL spec). > > Let me stress that not everyone is happy with the MySQL multi-tenantry > approach. But it does make multi-tenancy on a scale which you seldom see > with PG possible, even if it has problems. It's worth seeing whether we can > steal any of their optimization ideas without breaking PG. Please make sure to articulate what you think is wrong with our existing model. > I was specifically looking at the login model, which works around the issue > that we have: namely that different login ROLEs can't share a connection > pool. In MySQL, they can share the built-in connection "pool" because > role-switching effectively is a session variable. AFAICT, anyway. Please explain more precisely what is wrong with SET SESSION AUTHORIZATION / SET ROLE. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
On Dec 6, 2010, at 7:19 AM, Tom Lane wrote: > On the whole I'd prefer not to have any substitution functionality > hard-wired into pg_execute_file either, though I can see the argument > that it's necessary for practical use. Basically I'm concerned that > replace-equivalent behavior is not going to be satisfactory over the > long run: I think eventually we're going to need to think about > quoting/escaping behavior. So I think it's a bad idea to expose the > assumption that it'll be done that way at the SQL level. +1 I suspect that, for the purposes of the extensions patch, if CREATE EXTENSION could be modified to handle setting the schema itself, without requiring that the file have this magic line: SET search_path = @extschema@; Then there would be no need for substitutions at all. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-column collation
On Dec 6, 2010, at 4:06 AM, Itagaki Takahiro wrote: > * contrib/citext raises an encoding error when COLLATE is specified > even if it is the same collation as the database default. > We might need some special treatment for C locale. I've been wondering if this patch will support case-insensitive collations. If so, then citext should probably be revised to use one. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
At some point Hackers should look at pg vs MySQL multi tenantry but it is way tangential today. My understanding is that our schemas work like MySQL databases; and our databases are an even higher level of isolation. No? That's correct. Drizzle is looking at implementing a feature like our databases called "catalogs" (per the SQL spec). Let me stress that not everyone is happy with the MySQL multi-tenantry approach. But it does make multi-tenancy on a scale which you seldom see with PG possible, even if it has problems. It's worth seeing whether we can steal any of their optimization ideas without breaking PG. I was specifically looking at the login model, which works around the issue that we have: namely that different login ROLEs can't share a connection pool. In MySQL, they can share the built-in connection "pool" because role-switching effectively is a session variable. AFAICT, anyway. For that matter, if anyone knows any other DB which does multi-tenant well/better, we should be looking at them too. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On Mon, Dec 6, 2010 at 12:38 PM, Tom Lane wrote: > Robert Haas writes: >> One possible way to do make an improvement in this area would be to >> move the responsibility for accepting connections out of the >> postmaster. Instead, you'd have a group of children that would all >> call accept() on the socket, and the OS would arbitrarily pick one to >> receive each new incoming connection. The postmaster would just be >> responsible for making sure that there were enough children hanging >> around. You could in fact make this change without doing anything >> else, in which case it wouldn't save any work but would possibly >> reduce connection latency a bit since more of the work could be done >> before the connection actually arrived. > > This seems like potentially a good idea independent of anything else, > just to reduce connection latency: fork() (not to mention exec() on > Windows) now happens before not after receipt of the connection request. > However, I see a couple of stumbling blocks: > > 1. Does accept() work that way everywhere (Windows, I'm looking at you) Not sure. It might be useful to look at what Apache does, but I don't have time to do that ATM. > 2. What do you do when max_connections is exceeded, and you don't have > anybody at all listening on the socket? Right now we are at least able > to send back an error message explaining the problem. Sending back an error message explaining the problem seems like a non-negotiable requirement. I'm not quite sure how to dance around this. Perhaps if max_connections is exhausted, the postmaster itself joins the accept() queue and launches a dead-end backend for each new connection. Or perhaps we reserve one extra backend slot for a probably-dead-end backend that will just sit there and mail rejection notices; except that if it sees that a regular backend slot has opened up it grabs it and turns itself into a regular backend. 
> Another issue that would require some thought is what algorithm the > postmaster uses for deciding to spawn new children. But that doesn't > sound like a potential showstopper. The obvious algorithm would be to try to keep N spare workers around. Any time the number of unconnected backends drops below N the postmaster starts spawning new ones until it gets back up to N. I think the trick may not be the algorithm so much as finding a way to make the signaling sufficiently robust and lightweight. For example, I bet having each child that gets a new connection signal() the postmaster is a bad plan. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/06/2010 12:28 PM, Tom Lane wrote: Andrew Dunstan writes: Yeah. I'm still not convinced that using shared memory is a bad way to pass these around. Surely we're not talking about large numbers of them. What am I missing here? They're not of a very predictable size. Ah. Ok. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
On 12/06/2010 09:38 AM, Tom Lane wrote: Another issue that would require some thought is what algorithm the postmaster uses for deciding to spawn new children. But that doesn't sound like a potential showstopper. We'd probably want a couple of different ones, optimized for different connection patterns. Realistically. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_execute_from_file review
Robert Haas writes: > On Sun, Dec 5, 2010 at 6:01 PM, Tom Lane wrote: >> Why is there a variadic replace() in this patch at all? It seems just >> about entirely unrelated to the stated purpose of the patch, as well >> as being of dubious usefulness. When would it be superior to >>replace(replace(orig, from1, to1), from2, to2), ... > An iterated replacement has different semantics from a simultaneous > replace - replacing N placeholders with values simultaneously means > you don't need to worry about the case where one of the replacement > strings contains something that looks like a placeholder. Good point, but what the patch implements is in fact iterated replacement ... or at least it looked that way in a quick once-over. > I actually > think a simultaneous replacement feature would be quite handy but I > make no comment on whether it belongs as part of this patch. My point is that the replacement stuff really really needs to be factored out of the string-execution stuff, precisely because the desired behavior is debatable. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] profiling connection overhead
Robert Haas writes: > One possible way to make an improvement in this area would be to > move the responsibility for accepting connections out of the > postmaster. Instead, you'd have a group of children that would all > call accept() on the socket, and the OS would arbitrarily pick one to > receive each new incoming connection. The postmaster would just be > responsible for making sure that there were enough children hanging > around. You could in fact make this change without doing anything > else, in which case it wouldn't save any work but would possibly > reduce connection latency a bit since more of the work could be done > before the connection actually arrived. This seems like potentially a good idea independent of anything else, just to reduce connection latency: fork() (not to mention exec() on Windows) now happens before, not after, receipt of the connection request. However, I see a couple of stumbling blocks: 1. Does accept() work that way everywhere (Windows, I'm looking at you)? 2. What do you do when max_connections is exceeded, and you don't have anybody at all listening on the socket? Right now we are at least able to send back an error message explaining the problem. Another issue that would require some thought is what algorithm the postmaster uses for deciding to spawn new children. But that doesn't sound like a potential showstopper. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Andrew Dunstan writes: > Why not just say give me the snapshot currently held by process ? There's not a unique snapshot held by a particular process. Also, we don't want to expend the overhead to fully publish every snapshot. I think it's really necessary that the "sending" process take some deliberate action to publish a snapshot. > And please, not temp files if possible. Barring the cleanup issue, I don't see why not. This is a relatively low-usage feature, I think, so I wouldn't be much in favor of dedicating shmem to it even if the space requirement were predictable. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for parallel pg_dump
Andrew Dunstan writes: > Yeah. I'm still not convinced that using shared memory is a bad way to > pass these around. Surely we're not talking about large numbers of them. > What am I missing here? They're not of a very predictable size. Robert's idea of publish() returning a temp file identifier, which then gets removed at transaction end, might work all right. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FK's to refer to rows in inheritance child
Robert Haas writes: > On Sun, Dec 5, 2010 at 12:41 PM, Andrew Dunstan wrote: >> Well, ISTM that amounts to not having "official topic branches" :-) I agree >> that this is supposed to be one of git's strengths (or more exactly a >> strength of distributed SCM's generally). I don't really see any great >> value in sanctifying a particular topic branch with some official status. > I think the value in an official topic branch would be to allow formal > incremental commit of large patches. In other words, we could decide > that a commit to the official topic branch must meet the same > standards of quality normally applied to a commit to the master > branch, and must go through the same process. It would be understood > that the topic branch would eventually be merged (with or without > squash) back into the master branch, but that objections were to be > raised as pieces were committed to the topic branch, not at merge > time. The merge itself would require consensus as to timing, but we'd > agree to take a dim view of "I haven't reviewed anything that's been > going on here for the last six months but now hate all of it". Topic branches defined that way seem like a pretty bad idea from here. They would save no effort at all for committers, because if you're not allowed to object to something after it's gone into a topic branch, then it's just like master in terms of having to keep up with changes as they happen. Moreover, we'd have to keep them in pretty close sync with master --- otherwise what happens when you discover that some long-ago change on master breaks the topic branch? So AFAICS this would just increase the amount of keeping-branches-in-sync dogwork without any offsetting advantage. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] serializable read only deferrable
I wrote: > Tom Lane wrote: >> I assume this would have to be a "hard" definition of READ ONLY, >> not the rather squishy definition we use now? > I'm excluding temporary tables from SSI on the grounds that they > are only read and written by a single transaction and therefore > can't be a source of rw-dependencies, and I'm excluding system > tables on the grounds that they don't follow normal snapshot > isolation rules. Hint bit rewrites are not an issue for SSI. Are > there any other squishy aspects I might need to consider? I reviewed the documentation and played around with this a bit and can't find any areas where the current PostgreSQL implementation of READ ONLY is incompatible with what is needed for the SSI optimizations where it is used. There are a large number of tests which exercise this, and they're all passing. Did you have something in particular in mind which I should check? An example of code you think might break would be ideal, but anything more concrete than the word "squishy" would be welcome. Any thoughts on the original question about what to use as a heavyweight lock to support the subject feature? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields
Andrew Dunstan writes: > Attached is a patch that allows CopyReadAttributesText() and > CopyReadAttributesCSV() to read arbitrary numbers of attributes. > Underflowing attributes are recorded as null, and space is made for > overflowing attributes on a line. Why are you still passing nfields as a separate parameter instead of relying on the value you added to the struct? That can't do anything except cause confusion, especially once the two values diverge due to a previous array-expansion. Also, why did you change the setup code to not compute nfields in binary mode? That seems at best an unnecessary change, and at worst a breakage of the binary path --- did you test it? Also please be a little more careful with the formatting. This for instance is pretty sloppy: ! * strings. cstate->raw_fields[k] is set to point to the k'th attribute ! * string, * or NULL when the input matches the null marker string. and there seem to be some gratuitous whitespace changes as well. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Suggesting a libpq addition
2010/12/6 Andrew Chernow > On 12/6/2010 11:40 AM, Dmitriy Igrishin wrote: > >> IMO, it would be better to implement some utility functions to >> make it easy to construct arrays dynamically for PQexecParams >> and PQexecPrepared. This seems to me a more universal solution >> and it is useful for both -- high level libpq-libraries authors and for >> those who like to use libpq directly. >> >> > Hmm, your idea isn't better, it is identical to what libpqtypes already > does :) > http://libpqtypes.esilo.com/browse_source.html?file=exec.c > Actually, I don't need this functionality :-). I've implemented a C++ library which makes many things easy, including automatic memory management, type conversion and binary transfers... But I believe that the proposed utility functions would be better than a printf-like addition... although both of these may be excessive. > We wrap PQexecParams and friends. You are coding libpq. We expended much > effort to provide the same result interface (PGresult), including handling > composites and arrays. You getf composites and arrays as PGresults; where a > composite is a single tuple multiple field result, an array is a multiple > tuple single field result and composite arrays are multiple tuples and > multiple fields. We've just made a more formal set of utility functions, > typically called an API, in an attempt to match the coding standards of the > postgresql project. > There is no libpq param interface like results, so we added PGparam stuff. > This allows you to pack parameters (PQputf) and then execute it. So let's not bloat libpq. Let libpq remain a low-level library for projects like libpqtypes, pqxx and so on (my library too) ;-) > > > -- > Andrew Chernow > eSilo, LLC > every bit counts > http://www.esilo.com/ > -- // Dmitriy.
Re: [HACKERS] Suggesting a libpq addition
On 12/6/2010 11:40 AM, Dmitriy Igrishin wrote: IMO, it would be better to implement some utility functions to make it easy to construct arrays dynamically for PQexecParams and PQexecPrepared. This seems to me a more universal solution and it is useful for both -- high level libpq-libraries authors and for those who like to use libpq directly. Hmm, your idea isn't better, it is identical to what libpqtypes already does :) http://libpqtypes.esilo.com/browse_source.html?file=exec.c We wrap PQexecParams and friends. You are coding libpq. We expended much effort to provide the same result interface (PGresult), including handling composites and arrays. You getf composites and arrays as PGresults; where a composite is a single tuple multiple field result, an array is a multiple tuple single field result and composite arrays are multiple tuples and multiple fields. We've just made a more formal set of utility functions, typically called an API, in an attempt to match the coding standards of the postgresql project. There is no libpq param interface like results, so we added PGparam stuff. This allows you to pack parameters (PQputf) and then execute it. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/
Re: [HACKERS] Suggesting a libpq addition
IMO, it would be better to implement some utility functions to make it easy to construct arrays dynamically for PQexecParams and PQexecPrepared. This seems to me a more universal solution and it is useful for both -- high level libpq-libraries authors and for those who like to use libpq directly. 2010/12/6 Andrew Chernow > > >> That would be a *HUGE* piece of software compared to the relatively small >> thing I am suggesting... >> >> > Sometimes complex and large solutions are required for the simplest of > ideas. I believe this is one of those cases. You can't solve the "printf > style PQexec" properly by merely implementing a sprintf wrapper. > > > As for escaping (or not escaping) of string arguments, that can be seen >> as a bug or a feature. I do not want automatic escaping of string >> arguments in all cases, e.g. I might want to construct an SQL statement with >> dynamic parts "WHERE xy" or "AND a = b". >> >> hypothetical example: >> >> filter = "WHERE name like 'Balmer%'"; >> if (sort == SORT_DESC) >> sort = " ORDER BY name DESCENDING"; >> >> PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort); >> >> So what I am aiming at right now is a PQvexec() function that basically >> has printf() like semantics, but adds an additional token to the format >> string (printf uses %s and %b to produce strings.) I am thinking of >> adding %S and %B, which produce strings that are escaped. >> >> > This suffers from becoming cryptic over time, see Tom Lane's comments back > in 2007 on this ( > http://archives.postgresql.org/pgsql-hackers/2007-12/msg00362.php). > libpqtypes uses the human readable %schema.typename (schema is optional) to > specify format specifiers. There is no learning curve or ambiguity, if you > want a point then use "%point", or "%my_type" libpqtypes allows you to > register aliases (PQregisterSubClasses) so that you can map %text to %s to > make it feel more like C.
> > > -- > Andrew Chernow > eSilo, LLC > every bit counts > http://www.esilo.com/ -- // Dmitriy.
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/06/2010 10:40 AM, Tom Lane wrote: Robert Haas writes: On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas wrote: Well, then you need some sort of cross-backend communication, which is always a bit clumsy. A temp file seems quite sufficient, and not at all difficult. "Not at all difficult" is nonsense. To do that, you need to invent some mechanism for sender and receivers to identify which temp file they want to use, and you need to think of some way to clean up the files when the client forgets to tell you to do so. That's going to be at least as ugly as anything else. And I think it's unproven that this approach would be security-hole-free either. For instance, what about some other session overwriting pg_dump's snapshot temp file? Yeah. I'm still not convinced that using shared memory is a bad way to pass these around. Surely we're not talking about large numbers of them. What am I missing here? cheers andrew
Re: [HACKERS] WIP patch for parallel pg_dump
On Mon, Dec 6, 2010 at 10:40 AM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas >> wrote: >>> Well, then you need some sort of cross-backend communication, which is >>> always a bit clumsy. > >> A temp file seems quite sufficient, and not at all difficult. > > "Not at all difficult" is nonsense. To do that, you need to invent some > mechanism for sender and receivers to identify which temp file they want > to use, Why is this even remotely hard? That's the whole point of having the "publish" operation return a token. The token either is, or uniquely identifies, the file name. > and you need to think of some way to clean up the files when the > client forgets to tell you to do so. That's going to be at least as > ugly as anything else. Backends don't forget to call their end-of-transaction hooks, do they? They might crash, but we already have code to remove temp files on server restart. At most it would need minor adjustment. > And I think it's unproven that this approach > would be security-hole-free either. For instance, what about some other > session overwriting pg_dump's snapshot temp file? Why would this be any different from any other temp file? We surely must have a mechanism in place to ensure that the temporary files used by sorts or hash joins don't get overwritten by some other session, or the system would be totally unstable. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP patch for parallel pg_dump
Robert Haas writes: > On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas > wrote: >> Well, then you need some sort of cross-backend communication, which is >> always a bit clumsy. > A temp file seems quite sufficient, and not at all difficult. "Not at all difficult" is nonsense. To do that, you need to invent some mechanism for sender and receivers to identify which temp file they want to use, and you need to think of some way to clean up the files when the client forgets to tell you to do so. That's going to be at least as ugly as anything else. And I think it's unproven that this approach would be security-hole-free either. For instance, what about some other session overwriting pg_dump's snapshot temp file? regards, tom lane
Re: [HACKERS] WIP patch for parallel pg_dump
On Mon, Dec 6, 2010 at 10:35 AM, Andrew Dunstan wrote: > On 12/06/2010 10:22 AM, Robert Haas wrote: >> >> On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas >> wrote: >>> >>> On 06.12.2010 15:53, Robert Haas wrote: I guess. It still seems far too much like exposing the server's guts for my taste. It might not be as bad as the expression tree stuff, but there's nothing particularly good about it either. >>> >>> Note that we already have txid_current_snapshot() function, which exposes >>> all that. >> >> Fair enough, and I think that's actually useful for Slony&c. But I >> don't think we should shy away from providing a cleaner API here. >> > > Just don't let the perfect get in the way of the good :P I'll keep that in mind. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP patch for parallel pg_dump
On 12/06/2010 10:22 AM, Robert Haas wrote: On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas wrote: On 06.12.2010 15:53, Robert Haas wrote: I guess. It still seems far too much like exposing the server's guts for my taste. It might not be as bad as the expression tree stuff, but there's nothing particularly good about it either. Note that we already have txid_current_snapshot() function, which exposes all that. Fair enough, and I think that's actually useful for Slony&c. But I don't think we should shy away from providing a cleaner API here. Just don't let the perfect get in the way of the good :P cheers andrew
Re: [HACKERS] Suggesting a libpq addition
That would be a *HUGE* piece of software compared to the relatively small thing I am suggesting... Sometimes complex and large solutions are required for the simplest of ideas. I believe this is one of those cases. You can't solve the "printf style PQexec" properly by merely implementing a sprintf wrapper. As for escaping (or not escaping) of string arguments, that can be seen as a bug or a feature. I do not want automatic escaping of string arguments in all cases, e.g. I might want to construct an SQL statement with dynamic parts "WHERE xy" or "AND a = b". hypothetical example: filter = "WHERE name like 'Balmer%'"; if (sort == SORT_DESC) sort = " ORDER BY name DESCENDING"; PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort); So what I am aiming at right now is a PQvexec() function that basically has printf() like semantics, but adds an additional token to the format string (printf uses %s and %b to produce strings.) I am thinking of adding %S and %B, which produce strings that are escaped. This suffers from becoming cryptic over time, see Tom Lane's comments back in 2007 on this (http://archives.postgresql.org/pgsql-hackers/2007-12/msg00362.php). libpqtypes uses the human readable %schema.typename (schema is optional) to specify format specifiers. There is no learning curve or ambiguity, if you want a point then use "%point", or "%my_type" libpqtypes allows you to register aliases (PQregisterSubClasses) so that you can map %text to %s to make it feel more like C. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/
Re: [HACKERS] Suggesting a libpq addition
On Mon, Dec 06, 2010 at 10:14:55AM -0500, Merlin Moncure wrote: > On Mon, Dec 6, 2010 at 9:55 AM, Marc Balmer wrote: > > On 06.12.10 15:37, Merlin Moncure wrote: > >> On Sun, Dec 5, 2010 at 5:10 AM, Magnus Hagander > >> wrote: > >>> On Sun, Dec 5, 2010 at 10:22, Marc Balmer wrote: > I am suggesting adding a function to libpq: > > PGresult *PQvexec(PGconn *conn, const char *fmt, ...); > > It behaves similarly to PQexec, but it allows for printf style varargs and > >>> > >>> How is that not a horrible idea, compared to using PQexecParams()? You > >>> have to remember to do all your escaping and things manually, whereas > >>> PQexecParams() does it automatically. > >> > >> It's only horrible if you stick to printf style formatting and you are > >> using string techniques to inject parameters into the query. Non > >> parameterized queries should obviously be discouraged. However, it's > >> entirely possible to wrap the parameterized interfaces with vararg > >> interface (I should know, because we did exactly that) :-). This > >> gives you the best of both worlds, easy coding without sacrificing > >> safety. You might not remember the libpqtypes proposal, but libpq was > >> specifically extended with callbacks so that libpqtypes could exist > >> after the community determined that libpqtypes was too big of a change > >> to the libpq library. I think ultimately this should be revisited, > >> with libpqtypes going in core or something even richer...I've been > >> thinking for a while that postgres types should be abstracted out of > >> the backend into a library that both client and server depend on. > >> > >> With libpqtypes, we decided to use postgres style format markers: > >> select PQexecf(conn, "select %int4 + %int8", an_int, a_bigint); > >> > >> Everything is schema qualified, so that user types are supported (of > >> course, this requires implementing handling on the client). 
> >> > >> Data routed through the binary protocol, with all the byte swapping > >> etc handled by the library. No escaping necessary. We also added > >> full support for arrays and composites, which are a nightmare to deal > >> with over straight libpq, and various other niceties like thread safe > >> error handling. > > > > That would be a *HUGE* piece of software compared to the relatively small > > thing I am suggesting... > > well, it's already written. All you would have to do is compile it. > > > As for escaping (or not escaping) of string arguments, that can be seen > > as a bug or a feature. I do not want automatic escaping of string > > arguments in all cases, e.g. I might want to construct an SQL statement with > > dynamic parts "WHERE xy" or "AND a = b". > > libpqtypes doesn't escape at all. It uses the internal parameterized > interfaces that don't require it. For particular types, like bytea > and timestamps, this is much faster because we use the binary wire > format. Less load on the client and the server. > > > hypothetical example: > > > > filter = "WHERE name like 'Balmer%'"; > > if (sort == SORT_DESC) > > sort = " ORDER BY name DESCENDING"; > > > > PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort); > > > > So what I am aiming at right now is a PQvexec() function that basically > > has printf() like semantics, but adds an additional token to the format > > string (printf uses %s and %b to produce strings.) I am thinking of > > adding %S and %B, which produce strings that are escaped. > > > > That would be a small function, and reasonably safe. Or rather, the > > safety is in the hands of the programmer. > > What you are suggesting doesn't provide a lot of value over sprintf > the query first, then exec it. You can do what you are suggesting > yourself, wrapping PQexec: > > A hypothetical wrapper would be implemented something like: > va_list ap; > char buf[BUFSZ]; > va_start(ap, query); > vsnprintf(buf, BUFSZ, 
query, ap); > va_end(ap); > return PQexec(conn, buf); > > This is a bad idea (security, escaping, performance)...we wrote a > faster, safer way to do it, with richer type support. Or you can do > it yourself. > > merlin > I have used the libpqtypes library and it is very easy to use. +1 for adding it or something like it to the PostgreSQL core. I have people who will try and roll their own because it does not come with the core. While it is a hoot to see what reinventing the wheel produces, it is also prone to mistakes. Regards, Ken
Re: [HACKERS] FK's to refer to rows in inheritance child
Excerpts from Andrew Dunstan's message of dom dic 05 14:41:20 -0300 2010: > What I would like to see is people publishing the location of > development repos so that they can be pulled from or merged, especially > for any large patch. Yes, this is pretty useful. Dimitri published his repos for the extension stuff which I followed for a while, made some smallish changes and sent them back, etc. Very time-saving. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] WIP patch for parallel pg_dump
On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas wrote: > On 06.12.2010 15:53, Robert Haas wrote: >> >> I guess. It still seems far too much like exposing the server's guts >> for my taste. It might not be as bad as the expression tree stuff, >> but there's nothing particularly good about it either. > > Note that we already have txid_current_snapshot() function, which exposes > all that. Fair enough, and I think that's actually useful for Slony &c. But I don't think we should shy away from providing a cleaner API here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Timeout for asynchronous replication Re: Timeout and wait-forever in sync rep
On Mon, Dec 6, 2010 at 9:54 AM, Heikki Linnakangas wrote: >>> It occurred to me that the timeout would be required even for >>> asynchronous streaming replication. So, how about implementing the >>> replication timeout feature before synchronous replication itself? >> >> Here is the patch. This is one of the features required for synchronous >> replication, so I added this into the current CF as a part of synchronous >> replication. > > Hmm, that's actually a quite different timeout than what's required for > synchronous replication. In synchronous replication, you need to get an > acknowledgment within a timeout. This patch only puts a timeout on how long > we wait to have enough room in the TCP send buffer. That doesn't seem all > that useful. Yeah. If we rely on the TCP send buffer filling up, then the amount of time the master takes to notice a dead standby is going to be hard for the user to predict. I think the standby ought to send some sort of heartbeat and the master should declare the standby dead if it doesn't see a heartbeat soon enough. Maybe the heartbeat could even include the receive/fsync/replay LSNs, so that sync rep can use the same machinery but with more aggressive policies about when they must be sent. I also can't help noticing that this approach requires drilling a hole through the abstraction stack. We just invented latches; if the API is going to have to change every time someone wants to implement a feature, we've built ourselves an awfully porous abstraction layer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_execute_from_file review
Dimitri Fontaine writes: > Tom Lane writes: >> Why is there a variadic replace() in this patch at all? It seems just >> about entirely unrelated to the stated purpose of the patch, as well >> as being of dubious usefulness. > It used not to be exposed at the SQL level, but just an internal loop > in pg_execute_sql_file() when using the placeholders enabled > variant. Then Itagaki wanted me to expose internals so that he basically > can implement the logics in SQL directly. It seems like we went a step > too far in exposing this facility too. Agreed in removing it at the SQL > level. Well, actually, my next question was going to be about removing the variadic substitution in pg_execute_string too. It's not apparent to me that that function should have a rather lame substitution mechanism hard-wired into it, when you can do the same thing with replace() in front of it. On the whole I'd prefer not to have any substitution functionality hard-wired into pg_execute_file either, though I can see the argument that it's necessary for practical use. Basically I'm concerned that replace-equivalent behavior is not going to be satisfactory over the long run: I think eventually we're going to need to think about quoting/escaping behavior. So I think it's a bad idea to expose the assumption that it'll be done that way at the SQL level. regards, tom lane
Re: [HACKERS] Suggesting a libpq addition
On Mon, Dec 6, 2010 at 9:55 AM, Marc Balmer wrote: > On 06.12.10 15:37, Merlin Moncure wrote: >> On Sun, Dec 5, 2010 at 5:10 AM, Magnus Hagander wrote: >>> On Sun, Dec 5, 2010 at 10:22, Marc Balmer wrote: I am suggesting adding a function to libpq: PGresult *PQvexec(PGconn *conn, const char *fmt, ...); It behaves similarly to PQexec, but it allows for printf style varargs and >>> >>> How is that not a horrible idea, compared to using PQexecParams()? You >>> have to remember to do all your escaping and things manually, whereas >>> PQexecParams() does it automatically. >> >> It's only horrible if you stick to printf style formatting and you are >> using string techniques to inject parameters into the query. Non >> parameterized queries should obviously be discouraged. However, it's >> entirely possible to wrap the parameterized interfaces with vararg >> interface (I should know, because we did exactly that) :-). This >> gives you the best of both worlds, easy coding without sacrificing >> safety. You might not remember the libpqtypes proposal, but libpq was >> specifically extended with callbacks so that libpqtypes could exist >> after the community determined that libpqtypes was too big of a change >> to the libpq library. I think ultimately this should be revisited, >> with libpqtypes going in core or something even richer...I've been >> thinking for a while that postgres types should be abstracted out of >> the backend into a library that both client and server depend on. >> >> With libpqtypes, we decided to use postgres style format markers: >> select PQexecf(conn, "select %int4 + %int8", an_int, a_bigint); >> >> Everything is schema qualified, so that user types are supported (of >> course, this requires implementing handling on the client). >> >> Data routed through the binary protocol, with all the byte swapping >> etc handled by the library. No escaping necessary. 
We also added >> full support for arrays and composites, which are a nightmare to deal >> with over straight libpq, and various other niceties like thread safe >> error handling. > > That would be a *HUGE* piece of software compared to the relatively small > thing I am suggesting... well, it's already written. All you would have to do is compile it. > As for escaping (or not escaping) of string arguments, that can be seen > as a bug or a feature. I do not want automatic escaping of string > arguments in all cases, e.g. I might want to construct an SQL statement with > dynamic parts "WHERE xy" or "AND a = b". libpqtypes doesn't escape at all. It uses the internal parameterized interfaces that don't require it. For particular types, like bytea and timestamps, this is much faster because we use the binary wire format. Less load on the client and the server. > hypothetical example: > > filter = "WHERE name like 'Balmer%'"; > if (sort == SORT_DESC) > sort = " ORDER BY name DESCENDING"; > > PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort); > > So what I am aiming at right now is a PQvexec() function that basically > has printf() like semantics, but adds an additional token to the format > string (printf uses %s and %b to produce strings.) I am thinking of > adding %S and %B, which produce strings that are escaped. > > That would be a small function, and reasonably safe. Or rather, the > safety is in the hands of the programmer. What you are suggesting doesn't provide a lot of value over sprintf the query first, then exec it. You can do what you are suggesting yourself, wrapping PQexec: A hypothetical wrapper would be implemented something like: va_list ap; char buf[BUFSZ]; va_start(ap, query); vsnprintf(buf, BUFSZ, query, ap); va_end(ap); return PQexec(conn, buf); This is a bad idea (security, escaping, performance)...we wrote a faster, safer way to do it, with richer type support. Or you can do it yourself.
merlin
Re: [HACKERS] wal_sender_delay is still required?
Fujii Masao writes: > Walsender doesn't need the periodic wakeups anymore, thanks to > the latch feature. So wal_sender_delay is basically useless now. > How about dropping wal_sender_delay or increasing the default > value? If we don't need it, we should remove it. regards, tom lane
Re: [HACKERS] WIP patch for parallel pg_dump
On 06.12.2010 15:53, Robert Haas wrote: I guess. It still seems far too much like exposing the server's guts for my taste. It might not be as bad as the expression tree stuff, but there's nothing particularly good about it either. Note that we already have txid_current_snapshot() function, which exposes all that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Suggesting a libpq addition
On 06.12.10 15:37, Merlin Moncure wrote: > On Sun, Dec 5, 2010 at 5:10 AM, Magnus Hagander wrote: >> On Sun, Dec 5, 2010 at 10:22, Marc Balmer wrote: >>> I am suggesting adding a function to libpq: >>> >>> PGresult *PQvexec(PGconn *conn, const char *fmt, ...); >>> >>> It behaves similarly to PQexec, but it allows for printf style varargs and >> >> How is that not a horrible idea, compared to using PQexecParams()? You >> have to remember to do all your escaping and things manually, whereas >> PQexecParams() does it automatically. > > It's only horrible if you stick to printf style formatting and you are > using string techniques to inject parameters into the query. Non > parameterized queries should obviously be discouraged. However, it's > entirely possible to wrap the parameterized interfaces with vararg > interface (I should know, because we did exactly that) :-). This > gives you the best of both worlds, easy coding without sacrificing > safety. You might not remember the libpqtypes proposal, but libpq was > specifically extended with callbacks so that libpqtypes could exist > after the community determined that libpqtypes was too big of a change > to the libpq library. I think ultimately this should be revisited, > with libpqtypes going in core or something even richer...I've been > thinking for a while that postgres types should be abstracted out of > the backend into a library that both client and server depend on. > > With libpqtypes, we decided to use postgres style format markers: > select PQexecf(conn, "select %int4 + %int8", an_int, a_bigint); > > Everything is schema qualified, so that user types are supported (of > course, this requires implementing handling on the client). > > Data routed through the binary protocol, with all the byte swapping > etc handled by the library. No escaping necessary. 
We also added > full support for arrays and composites, which are a nightmare to deal > with over straight libpq, and various other niceties like thread safe > error handling. That would be a *HUGE* piece of software compared to the relatively small thing I am suggesting... As for escaping (or not escaping) of string arguments, that can be seen as a bug or a feature. I do not want automatic escaping of string arguments in all cases, e.g. I might want to construct an SQL statement with dynamic parts "WHERE xy" or "AND a = b". hypothetical example: filter = "WHERE name like 'Balmer%'"; if (sort == SORT_DESC) sort = " ORDER BY name DESCENDING"; PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort); So what I am aiming at right now is a PQvexec() function that basically has printf() like semantics, but adds an additional token to the format string (printf uses %s and %b to produce strings.) I am thinking of adding %S and %B, which produce strings that are escaped. That would be a small function, and reasonably safe. Or rather, the safety is in the hands of the programmer.