Re: [HACKERS] Add dblink function to check if a named connection exists
Joe Conway wrote:
> Tom Lane wrote:
>> Tommy Gildseth [EMAIL PROTECTED] writes:
>>> One obvious disadvantage of this approach is that I need to connect and disconnect in every function. A possible solution to this would be having a function, f.ex. dblink_exists('connection_name'), that returns true/false depending on whether the connection already exists.
>>
>> Can't you do this already?
>>
>>     SELECT 'myconn' = ANY (dblink_get_connections());
>>
>> A dedicated function might be a tad faster, but it probably isn't going to matter compared to the overhead of sending a remote query.
>
> I agree. The above is about as simple as SELECT dblink_exists('dtest1'); and probably not measurably slower. If you still think a dedicated function is needed, please send the output of some performance testing to justify it. If you really want the notational simplicity, you could use an SQL function to wrap it:
>
>     CREATE OR REPLACE FUNCTION dblink_exists(text)
>     RETURNS bool AS $$
>         SELECT $1 = ANY (dblink_get_connections())
>     $$ LANGUAGE sql;

dblink_get_connections() returns NULL if there are no connections, though, so the above will fail if you haven't already established a connection, unless you also check for NULL and not just false. I guess you could rewrite the above function to something like:

    CREATE OR REPLACE FUNCTION dblink_exists(text)
    RETURNS bool AS $$
        SELECT COALESCE($1 = ANY (dblink_get_connections()), false)
    $$ LANGUAGE sql;

--
Tommy Gildseth

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
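The NULL pitfall is easy to demonstrate in a fresh session where no connection has been established yet; this is a sketch, with 'dtest1' just an example connection name:

```sql
-- No connections yet: dblink_get_connections() returns NULL,
-- and '= ANY' over a NULL array yields NULL, not false.
SELECT 'dtest1' = ANY (dblink_get_connections());

-- COALESCE collapses the NULL case to false, as in the wrapper above.
SELECT COALESCE('dtest1' = ANY (dblink_get_connections()), false);
```

The first query returns NULL rather than false, which is why the wrapper without COALESCE misbehaves before any dblink_connect() call.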
Re: [HACKERS] Core team statement on replication in PostgreSQL
On Mon, 2008-06-02 at 22:40 +0200, Andreas 'ads' Scherbaum wrote:
> On Mon, 02 Jun 2008 11:52:05 -0400 Chris Browne wrote:
>> [EMAIL PROTECTED] (Andreas 'ads' Scherbaum) writes:
>>> On Thu, 29 May 2008 23:02:56 -0400 Andrew Dunstan wrote:
>>>> Well, yes, but you do know about archive_timeout, right? No need to wait 2 hours.
>>> Then you ship 16 MB of binary stuff every 30 seconds or every minute, but you only have a few kbytes of real data in the logfile. This must be taken into account, especially if you ship the logfile over the internet (meaning: no high-speed connection, maybe even pay-per-traffic) to the slave.
>> If you have that kind of scenario, then you have painted yourself into a corner, and there isn't anything that can be done to extract you from it.
> You are misunderstanding something. It's perfectly possible that you have a low-traffic database with changes every now and then. But you have to copy a full 16 MB logfile every 30 seconds or every minute just to keep the slave up-to-date.

To repeat my other post in this thread: we can already do better than file-by-file by using pg_xlogfile_name_offset(), which was added sometime in 2006. walmgr.py from the SkyTools package, for example, does this to get no more than a few seconds' failure window, and it copies just the changed part of the WAL to the slave.

pg_xlogfile_name_offset() was added just for this purpose: to let WAL-shipping scripts query where inside the logfile the current write pointer is. It is not synchronous, but it can be made very close, within a subsecond window if you poll it frequently enough.

---
Hannu
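For reference, a shipping script can combine pg_xlogfile_name_offset() with pg_current_xlog_location() (both present in 8.2-era servers) to find how far into the current segment the write pointer is; a sketch:

```sql
-- Where is the current WAL write pointer, as a (file, byte offset) pair?
-- A polling script can copy only the bytes up to file_offset instead of
-- waiting for the full 16 MB segment to be archived.
SELECT file_name, file_offset
FROM pg_xlogfile_name_offset(pg_current_xlog_location());
```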
Re: [HACKERS] Case-Insensitve Text Comparison
Martijn van Oosterhout wrote:
> On Mon, Jun 02, 2008 at 11:08:55AM -0700, Jeff Davis wrote:
>> http://wiki.postgresql.org/wiki/Todo:Collate
>> The last reference I see on that page is from 2005. Is there any updated information? Are there any major obstacles holding this up aside from the platform issues mentioned on that page?
> Well, a review of the patch and a bit of work in the optimiser. However, I think the patch will have bitrotted beyond any use by now. It touched many of the areas the operator families stuff touched, for example. I believe it is being reimplemented as a GSoC project; that's probably a better approach. Should probably just delete the page from the wiki altogether.

The GSoC proposal is here: http://archives.postgresql.org/pgsql-hackers/2008-05/msg00857.php

It should create a basic framework for full SQL COLLATION support. All comments are welcome.

Zdenek
Re: [HACKERS] intercepting WAL writes
Mike wrote:
> Hello, I'm new to the core PostgreSQL code, so pardon the question if the answer is really obvious and I'm just missing it. I've got a relatively large web application that uses PostgreSQL as a back-end database, and we're heavily using memcached to cache frequently accessed data. I'm looking at modifying PostgreSQL (in some way) to push changes directly to our memcache servers, in hopes of moving towards a system where only writes are actually sent to the databases, and reads are exclusively sent to the memcache servers. I'm guessing that I could intercept the WAL writes and use this information to push out to my memcache servers, similar to a replication model. Can somebody point to the most logical place in the code to intercept the WAL writes? (Just a rough direction would be enough.) Or, if this doesn't make sense at all, another suggestion on where to get the data? (I'm trying to avoid doing it using triggers.) Thanks, Mike

Why not use rules? They are far easier to use than patching at the C level, and you can simply write some functions in C, load those as a PostgreSQL extension, and interact with memcache at that level.
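A minimal sketch of the rules approach, assuming the pgmemcache extension (which provides memcache_delete()) is loaded; the accounts table, its id key, and the cache key format are illustrative only:

```sql
-- Invalidate the cached copy whenever a row changes, instead of
-- intercepting WAL writes. Deleting the key forces the next read
-- to fall through to the database and repopulate the cache.
CREATE RULE accounts_mc_invalidate AS
    ON UPDATE TO accounts
    DO ALSO SELECT memcache_delete('accounts:' || old.id);
```

A matching ON DELETE rule would handle removals the same way; pushing the new value (rather than just invalidating) could use pgmemcache's set functions, at the cost of caching rows nobody reads.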
Re: [HACKERS] Overhauling GUCS
To add some complexity to this topic :-)

Please note, I admit upfront that I am not familiar with every parameter out there. But during my quest to find bottlenecks while stressing the back-end, I find many GUC parameters whose names suggest they should be interesting. I read the comments and the docs (that I know of), and go into the source to learn the parameters and their units, wondering whether altering a setting would do miracles in my case. Indeed, how can I measure the effect of a new setting? Enhanced throughput?

I learned in the past that fiddling with parameter A in a database that was not properly set up or used -- and there were many of these -- had side effects in other parts of the engine. Yes, increased throughput was observed. A new hype was created. In the end it turned out that parameter A was not set correctly at all: parameter B, once set to a sane value, cured the wrong behavior, and parameter A was not optimal at all after the cure. We were just sidetracked because we did not know. Incorrect knowledge was born (the parameter A setting). Over the years this database product has matured, many more parameters have been added, and much, much more instrumentation has been implemented. It is still quite a challenge to understand what is happening, but proper analysis is possible indeed. The black box is much more open now.

One current example: wal_writer_delay. In my team the advice is to set this parameter to 100. However, after implementing a counter (home-grown instrumentation) I now know that the background log flush routine is never called when stressing the database. Therefore I now think the best setting is its maximum, since it does not do useful work (in my context) and therefore should wake up as few times as possible. Without this instrumentation I could only guess about the usefulness of this parameter and would have to run many tests to get an impression of its validity for me.
So overhauling the GUC parameters is one step, but adding proper instrumentation in order to really measure the impact of a new setting is necessary too, especially when looking in the direction of auto-tuning. Proper measurement is crucial to enable correct analysis. Of course I am in favor of doing this with DTrace, however not all platforms can benefit in that case :-)

--Paul

On 2 jun 2008, at 20:06, Josh Berkus wrote:
> Greg,
>> Like some of the other GUC simplification ideas that show up sometimes (unifying all I/O and limiting background processes based on that total is another), this is hard to do internally. Josh's proposal has a fair amount of work involved, but the code itself doesn't need to be clever or too intrusive. Unifying all the memory settings would require being both clever and intrusive, and I doubt you'll find anybody who could pull it off who isn't already overtasked with more important improvements for the 8.4 timeframe.
> Plus, I'm a big fan of "enable an API" rather than "write a feature". I think that there are people and companies out there who can write better autotuning tools than I can, and I'd rather give them a place to plug those tools in than trying to write autotuning into the postmaster.
>
> --Josh Berkus, PostgreSQL @ Sun, San Francisco

Regards,
Paul van den Bogaard

Paul van den Bogaard [EMAIL PROTECTED]
ISV-E -- ISV Engineering, Opensource Engineering group
Sun Microsystems, Inc
phone: +31 334 515 918   extension: x (70)15918
mobile: +31 651 913 354  fax: +31 334 515 001
Saturnus 1, 3824 ME Amersfoort, The Netherlands
[HACKERS] DISTINCT - GROUP BY
Folks,

I've noticed that queries of the form

    SELECT DISTINCT foo, bar, baz FROM quux WHERE ...

perform significantly worse than the equivalent using GROUP BY:

    SELECT foo, bar, baz FROM quux WHERE ... GROUP BY foo, bar, baz

Where would I start looking in order to make them actually equivalent from the planner's point of view? Also, would back-patching this make sense? It doesn't change any APIs, but it does make some queries go faster.

Cheers, David.
--
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter  Skype: davidfetter  XMPP: [EMAIL PROTECTED]
Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
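Until the planner treats the two forms identically, the rewrite has to be done by hand; the difference shows up in the plans. A sketch (the plan shapes are typical for this era's planner, not guaranteed):

```sql
EXPLAIN SELECT DISTINCT foo, bar, baz FROM quux;
-- typically a Unique node over a full Sort of the input

EXPLAIN SELECT foo, bar, baz FROM quux GROUP BY foo, bar, baz;
-- may use a HashAggregate instead, avoiding the sort entirely
```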
Re: [HACKERS] DISTINCT - GROUP BY
Hello David,

From http://www.postgresql.org/docs/faqs.TODO.html:

    Consider using hash buckets to do DISTINCT, rather than sorting.
    This would be beneficial when there are few distinct values.
    This is already used by GROUP BY.

Regards, Pavel Stehule

2008/6/3 David Fetter [EMAIL PROTECTED]:
> Folks, I've noticed that queries of the form SELECT DISTINCT foo, bar, baz FROM quux WHERE ... perform significantly worse than the equivalent using GROUP BY: SELECT foo, bar, baz FROM quux WHERE ... GROUP BY foo, bar, baz. Where would I start looking in order to make them actually equivalent from the planner's point of view? Also, would back-patching this make sense? It doesn't change any APIs, but it does make some queries go faster. Cheers, David.
Re: [HACKERS] DISTINCT - GROUP BY
On Tue, Jun 03, 2008 at 03:36:44PM +0200, Pavel Stehule wrote:
> Hello David
> From http://www.postgresql.org/docs/faqs.TODO.html: Consider using hash buckets to do DISTINCT, rather than sorting. This would be beneficial when there are few distinct values. This is already used by GROUP BY.

It's nice to see that this is kinda on the TODO, but it doesn't address the question I asked, which is: how would I get the planner to rewrite DISTINCTs as the equivalent GROUP BYs? :) Any hints?

Cheers, David.
--
David Fetter [EMAIL PROTECTED] http://fetter.org/
Re: [HACKERS] DISTINCT - GROUP BY
2008/6/3 David Fetter [EMAIL PROTECTED]:
> It's nice to see that this is kinda on the TODO, but it doesn't address the question I asked, which is: how would I get the planner to rewrite DISTINCTs as the equivalent GROUP BYs? :) Any hints?

You can't do it :(

Pavel
Re: [HACKERS] rfc: add pg_dump options to dump output
On Tue, Jun 03, 2008 at 01:17:43AM -0400, Tom Lane wrote:
> Robert Treat [EMAIL PROTECTED] writes:
>> I would like to add the flags given to pg_dump into the output of the pg_dump file.
> +1, FWIW
>> Anyone see any issues with this?
> I'm a bit worried about breaking diff-equality of matching dumps,

I don't tend to do this too often, but I'm curious if others make a habit of it for some process. When I have done it, it's normally been a one-off case.

> but mainly I don't see the point.

It's informative, no? Sure, if every dump used the same flags it would be pointless, but that's certainly not the case: --schema, --exclude-schema, etc. I.e., I imagine it would be nice to have the information about how a dump was performed without having to track down someone's script, or pester the person in the know via e-mail, if you wanted or needed to repeat the process.
Re: [HACKERS] Case-Insensitve Text Comparison
On Jun 3, 2008, at 02:27, Zdenek Kotala wrote:
> The GSoC proposal is here: http://archives.postgresql.org/pgsql-hackers/2008-05/msg00857.php It should create a basic framework for full SQL COLLATION support. All comments are welcome.

That looks great, Zdenek. I'm very excited to have improved SQL COLLATION support in core. But if I could ask a dumb question: how would I specify a case-insensitive collation? Or maybe the Unicode Collation Algorithm is case-insensitive or has case-insensitive support?

Thanks, David
Re: [HACKERS] phrase search
> This is far more complicated than I thought. Of course, phrase search should be able to use indexes. I can probably look into how to use an index. Any pointers on this?

src/backend/utils/adt/tsginidx.c -- if you invent an operation # in tsquery, then you will have index support with minimal effort.

> Yes, this is exactly how I am using it in my application. Do you think this will solve a lot of common cases, or should we try to get phrase search

Yeah, it solves a lot of useful cases. For simple use, a function similar to the existing plainto_tsquery needs to be invented, say phraseto_tsquery. It should produce a correct tsquery with the operations described above.

--
Teodor Sigaev  E-mail: [EMAIL PROTECTED]  WWW: http://www.sigaev.ru/
Re: [HACKERS] Case-Insensitve Text Comparison
David E. Wheeler wrote:
> That looks great, Zdenek. I'm very excited to have improved SQL COLLATION support in core. But if I could ask a dumb question: how would I specify a case-insensitive collation? Or maybe the Unicode Collation Algorithm is case-insensitive or has case-insensitive support?

It is simple. The SQL standard does not specify a notation for that (chapter 11.34), but there is a proposed notation:

    CREATE COLLATION <collation name> FOR <character set specification>
        FROM <existing collation name>
        [ <pad characteristic> ] [ <case sensitive> ] [ <accent sensitive> ]
        [ LC_COLLATE <lc_collate> ] [ LC_CTYPE <lc_ctype> ]

    <pad characteristic>  := NO PAD | PAD SPACE
    <case sensitive>      := CASE SENSITIVE | CASE INSENSITIVE
    <accent sensitive>    := ACCENT SENSITIVE | ACCENT INSENSITIVE

You can specify for each collation whether it is case sensitive or not, and the collation function is responsible for correctly handling this flag.

Zdenek
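Under that proposed (and, at this point, unimplemented) notation, a case-insensitive collation might be declared like this; the collation name, character set, and source collation are illustrative only:

```sql
-- Hypothetical, per the proposed syntax above: derive a
-- case-insensitive collation from an existing locale collation.
CREATE COLLATION nocase_en
    FOR "UTF8"
    FROM "en_US"
    CASE INSENSITIVE;
```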
Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline
Why do we need norms? We don't need norms at all -- every matched HeadlineWordEntry is already marked by HeadlineWordEntry->item! If it equals NULL, then that word isn't contained in the tsquery.

> hlCover does the exact thing that Cover in tsrank does, which is to find the cover that contains the query. However, hlCover has to go through words that do not match the query. Cover, on the other hand, operates on position indexes for just the query words, and so it should be faster.

A cover, by definition, is a minimal continuous piece of text matched by the query. There may be several covers in a text, and hlCover will find all of them. Next, prsd_headline() (for now) tries to pick the best one. "Best" means: the cover contains a lot of words from the query, not fewer than MinWords, not more than MaxWords, has no words shorter than ShortWord at the beginning or end of the cover, etc.

> The main reason why I would like it to be fast is that I want to generate all covers for a given query. Then choose covers with smallest

hlCover generates all covers.

> Let me know what you think of this patch and I will update the patch to respect other options like MinWords and ShortWord.

As I understand it, you really want to call the Cover() function instead of hlCover(). By design they should be identical, but they accept different representations of the document. So the best way is to generalize them: develop a new one which can be called with some kind of callback and/or opaque structure, so it can be used in both rank and headline.

> NumFragments 2: I wanted people to use the new headline marker if they specify NumFragments = 1. If they do not specify the NumFragments or put it to

Ok, but if you unify cover generation and NumFragments == 1, then the results of the old and new algorithms should be the same...

> On another note, I found that make_tsvector crashes if it receives a ParsedText with curwords = 0. Specifically, uniqueWORD returns curwords as 1 even when it gets 0 words. I am not sure if this is the desired behavior.

In all places there is a check before the call of make_tsvector.

--
Teodor Sigaev  E-mail: [EMAIL PROTECTED]  WWW: http://www.sigaev.ru/
Re: [HACKERS] Case-Insensitve Text Comparison
On Jun 3, 2008, at 12:06, Zdenek Kotala wrote:
> You can specify for each collation whether it is case sensitive or not, and the collation function is responsible for correctly handling this flag.

Wooo! Now if only I could apply that on a per-column basis. Still, it'll be great to have this for a whole database. Thanks, looking forward to it.

David
Re: [HACKERS] Case-Insensitve Text Comparison
David E. Wheeler wrote:
> Wooo! Now if only I could apply that on a per-column basis. Still, it'll be great to have this for a whole database.

The first step is per database, because that is relatively easy. Collation per column is very difficult. It requires a lot of changes (parser, planner, executor, ...) throughout the source code, because you need to keep the collation information together with the text data. That is the reason why this task is split into several parts.

Zdenek
Re: [HACKERS] Case-Insensitve Text Comparison
2008/6/3 Zdenek Kotala [EMAIL PROTECTED]:
> It is simple. The SQL standard does not specify a notation for that (chapter 11.34), but there is a proposed notation: CREATE COLLATION <collation name> FOR <character set specification> FROM <existing collation name> [ NO PAD | PAD SPACE ] [ CASE SENSITIVE | CASE INSENSITIVE ] [ ACCENT SENSITIVE | ACCENT INSENSITIVE ] [ LC_COLLATE <lc_collate> ] [ LC_CTYPE <lc_ctype> ]

Is this in conformance with other databases? What syntax do other databases use?

Regards, Pavel
Re: [HACKERS] Case-Insensitve Text Comparison
Pavel Stehule wrote:
> Is this in conformance with other databases? What syntax do other databases use?

It seems to me that only Firebird supports a CREATE COLLATION command; other databases like MySQL and MS SQL have a hardcoded list of collations.

Zdenek
Re: [HACKERS] BUG #4204: COPY to table with FK has memory leak
On May 28, 2008, at 1:22 PM, Gregory Stark wrote:
> Tom Lane [EMAIL PROTECTED] writes:
>> Tomasz Rybak [EMAIL PROTECTED] writes:
>>> I tried to use COPY to import 27M rows to table:
>>>     CREATE TABLE sputnik.ccc24 (
>>>         station CHARACTER(4) NOT NULL REFERENCES sputnik.station24 (id),
>>>         moment INTEGER NOT NULL,
>>>         flags INTEGER NOT NULL
>>>     ) INHERITS (sputnik.sputnik);
>>>     COPY sputnik.ccc24(id, moment, station, strength, sequence, flags)
>>>     FROM '/tmp/24c3' WITH DELIMITER AS ' ';
>> This is expected to take lots of memory because each row-requiring-check generates an entry in the pending trigger event list. Even if you had not exhausted memory, the actual execution of the retail checks would have taken an unreasonable amount of time. The recommended way to do this sort of thing is to add the REFERENCES constraint *after* you load all the data; that'll be a lot faster in most cases because the checks are done in bulk using a JOIN rather than one at a time.
> Hm, it occurs to me that we could still do a join against the pending event trigger list...

I wonder how feasible it would be to store the pending trigger event list in a temporary table instead of in RAM. Related to that, I really wish that our statement-level triggers provided NEW and OLD recordsets like some other databases do. That would allow RI triggers to run on a per-statement basis, and they could aggregate the keys to be checked.

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
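The recommended workflow, sketched against Tomasz's tables (the constraint name here is illustrative): create the table without the REFERENCES clause, COPY the data, then add the constraint so the check runs as one bulk join instead of 27M queued trigger events:

```sql
-- 1. Create sputnik.ccc24 with station declared plain, no REFERENCES.
-- 2. COPY the 27M rows as before.
-- 3. Add the FK afterwards; the validation is a single join.
ALTER TABLE sputnik.ccc24
    ADD CONSTRAINT ccc24_station_fkey
    FOREIGN KEY (station) REFERENCES sputnik.station24 (id);
```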
Re: [HACKERS] Case-Insensitve Text Comparison
On Tue, 2008-06-03 at 21:26 +0200, Zdenek Kotala wrote:
> The first step is per database, because that is relatively easy. Collation per column is very difficult. It requires a lot of changes (parser, planner, executor, ...) throughout the source code, because you need to keep the collation information together with the text data.

Right now typmod is already passed to all those layers, right? Would it be a useful intermediate step to use typmod to hold this information for the text type? I suppose that would only work for text and not varchar, because varchar already has a use for typmod.

Regards, Jeff Davis
Re: [HACKERS] Hint Bits and Write I/O
On May 27, 2008, at 2:35 PM, Simon Riggs wrote:
> After some discussions at PGCon, I'd like to make some proposals for hint bit setting with the aim to reduce write overhead.

For those that missed it... http://wiki.postgresql.org/wiki/Hint_Bits (see archive reference at bottom).

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Re: [HACKERS] Change lock requirements for adding a trigger
On May 30, 2008, at 9:51 AM, Simon Riggs wrote:
> On Thu, 2008-05-29 at 19:18 -0500, Decibel! wrote:
>> Is there a reason that we can't add a trigger to a table while a select is running? This is a serious pain when trying to set up londiste or slony.
> This is constrained by locking. There is a subset of DDL commands that might be able to be performed with just an ExclusiveLock or ShareLock rather than an AccessExclusiveLock. Nobody has studied which sub-statements this might apply to, but it's doable since CREATE INDEX already does this.

Is there a good way to determine this other than depending on knowledge of the source code?

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
Re: [HACKERS] Change lock requirements for adding a trigger
On Tue, 2008-06-03 at 16:48 -0500, Decibel! wrote:
> Is there a good way to determine this other than depending on knowledge of the source code?

The source doesn't know yet, so just analysis and thinking. The mechanism to hold less than an AccessExclusiveLock doesn't exist yet, and it never will unless we have a list of the things that might be performed correctly with a lower level of lock.

--
Simon Riggs  www.2ndQuadrant.com  PostgreSQL Training, Services and Support
Re: [HACKERS] proposal: Preference SQL
On May 29, 2008, at 6:08 PM, Jan Urbański wrote:
> Now about the idea itself: http://www.informatik.uni-augsburg.de/de/lehrstuehle/dbis/db/publications/all_db_tech-reports/tr-2001-7_kie_koe/tr-2001-7_kie_koe.pdf
> That's one of the basic papers about Preference SQL, explaining what it's all about. For those who don't feel like reading through it just because I said it's interesting, here's some more info (warning, it's a bit formal):
> Preference SQL is an extension to regular SQL that allows expressing preferences in SQL queries. Preferences are like soft WHERE clauses. A preference doesn't need to be satisfied by a tuple for that tuple to appear in the result set, but tuples that satisfy it are preferred. More strictly, a set of preference clauses in a SQL query defines a partial order on the result set as it would appear without any preference clauses, and then returns the maximal elements. The partial order imposed by the preferences P[1], P[2], ..., P[n] is such that tuple T1 > T2 iff T1 satisfies all the preferences that T2 satisfies, and there is a preference satisfied by T1 and not satisfied by T2 (or there is a preference that one tuple satisfies better than the other while all others are equally satisfied).
> As can be seen, there can be an order defined on the degree to which a preference is satisfied, and the exact semantics are not all that well defined for all conceivable cases. Defining a complete semantics will be part of my thesis.

This seems like a subset of http://pgfoundry.org/projects/qbe/ ... or do I misunderstand?

--
Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED]
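Spelled out, the dominance relation from Jan's informal description (for the boolean-satisfaction case, setting aside degrees of satisfaction) can be written as:

```latex
T_1 \succ T_2 \iff
  \forall i\,\bigl( T_2 \text{ satisfies } P_i \Rightarrow T_1 \text{ satisfies } P_i \bigr)
  \;\wedge\;
  \exists j\,\bigl( T_1 \text{ satisfies } P_j \wedge T_2 \text{ does not satisfy } P_j \bigr)
```

The preference query then returns the maximal elements of the result set under this order.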
Re: [HACKERS] proposal: Preference SQL
Decibel! wrote: On May 29, 2008, at 6:08 PM, Jan Urbański wrote: Preference SQL is an extension to regular SQL that allows expressing preferences in SQL queries. Preferences are like soft WHERE clauses. This seems like a subset of http://pgfoundry.org/projects/qbe/ ... or do I misunderstand? I skimmed through the QBE howto, and I think it's actually far from it. The thing that closely resembles preference clauses is the SKYLINE OF operator, mentioned earlier in the thread - there is some archives coverage on it. I'm still working on producing a comparison of Preference SQL and the skyline operator; more to follow soon. -- Jan Urbanski GPG key ID: E583D7D2 ouden estin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: Preference SQL
Jan, I'm still working on producing a comparison of Preference SQL and the skyline operator, more to follow soon. The big problem with all of these is that there are no standards for approximate queries yet. So we're reluctant to support syntax extensions for them. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: Preference SQL
Josh Berkus wrote: Jan, I'm still working on producing a comparison of Preference SQL and the skyline operator, more to follow soon. The big problem with all of these is that there are no standards for approximate queries yet. So we're reluctant to support syntax extensions for them. Yes, I realized it after some thought - adding nonstandard syntax isn't so great after all. Right now I'm wondering if SQL standard window functions can do the things I thought could be done with preferences. Maybe I should talk to my thesis supervisor and find out if implementing window functions would be an equally good subject... I suppose having window functions would be a nice thing? To be honest - I need a thesis subject and I like fiddling with Postgres. I'm trying to find an area in which my work would be useful to the community and enough of a standalone feature that I can use it as the basis of my dissertation. Also, going through the PG development process will ensure that the resulting code will be of topmost quality ;) -- Jan Urbanski GPG key ID: E583D7D2 ouden estin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: Preference SQL
Jan Urbański wrote: Maybe I should talk to my thesis supervisor and find out if implementing window functions would be an equally good subject... I suppose having window functions would be a nice thing? To be honest - I need a thesis subject and I like fiddling with Postgres. Window functions are on the TODO list and a good implementation is something we really really want. You will get far less resistance than for your original idea. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline
My main argument for using Cover instead of hlCover was that Cover will be faster. I tested the default headline generation that uses hlCover against the current patch that uses Cover. There was not much difference. So I think you are right in that we do not need norms and we can just use hlCover. I also compared the performance of ts_headline with my first patch to headline generation (one which was a separate function and took a tsvector as input). The performance was dramatically different. For one query ts_headline took roughly 200 ms while headline_with_fragments took just 70 ms. On another query ts_headline took 76 ms while headline_with_fragments took 24 ms. You can find 'explain analyze' for the first query at the bottom of the page. These queries were run multiple times to ensure that I never hit the disk. This is a machine with a 2.0 GHz Pentium 4 CPU and 512 MB RAM running Linux 2.6.22-gentoo-r8. A couple of caveats: 1. ts_headline testing was done with current cvs head whereas headline_with_fragments was done with postgres 8.3.1. 2. For headline_with_fragments, the TSVector for the document was obtained by joining with another table. Are these differences understandable? If you think these caveats are the reasons or there is something I am missing, then I can repeat the entire experiments with exactly the same conditions. -Sushant. 
Here is 'explain analyze' for both the functions:

ts_headline
-----------
lawdb=# explain analyze SELECT ts_headline('english', doc, q, '')
        FROM docraw, plainto_tsquery('english', 'freedom of speech') as q
        WHERE docraw.tid = 125596;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..8.31 rows=1 width=497) (actual time=199.692..200.207 rows=1 loops=1)
   ->  Index Scan using docraw_pkey on docraw  (cost=0.00..8.29 rows=1 width=465) (actual time=0.041..0.065 rows=1 loops=1)
         Index Cond: (tid = 125596)
   ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual time=0.010..0.014 rows=1 loops=1)
 Total runtime: 200.311 ms

headline_with_fragments
-----------------------
lawdb=# explain analyze SELECT headline_with_fragments('english', docvector, doc, q, 'MaxWords=40')
        FROM docraw, docmeta, plainto_tsquery('english', 'freedom of speech') as q
        WHERE docraw.tid = 125596 and docmeta.tid = 125596;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..16.61 rows=1 width=883) (actual time=70.564..70.949 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..16.59 rows=1 width=851) (actual time=0.064..0.094 rows=1 loops=1)
         ->  Index Scan using docraw_pkey on docraw  (cost=0.00..8.29 rows=1 width=454) (actual time=0.040..0.044 rows=1 loops=1)
               Index Cond: (tid = 125596)
         ->  Index Scan using docmeta_pkey on docmeta  (cost=0.00..8.29 rows=1 width=397) (actual time=0.017..0.040 rows=1 loops=1)
               Index Cond: (docmeta.tid = 125596)
   ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual time=0.012..0.016 rows=1 loops=1)
 Total runtime: 71.076 ms
(8 rows)

On Tue, 2008-06-03 at 22:53 +0400, Teodor Sigaev wrote: Why we need norms? We don't need norms at all - all matched HeadlineWordEntry are already marked by HeadlineWordEntry->item! If it equals NULL then this word isn't contained in the tsquery. hlCover does the exact thing that Cover in tsrank does, which is to find the cover that contains the query. However hlCover has to go through words that do not match the query. Cover, on the other hand, operates on position indexes for just the query words and so it should be faster. 
Cover, by definition, is a minimal continuous piece of text matched by the query. There may be several covers in a text, and hlCover will find all of them. Next, prsd_headline() (for now) tries to choose the best one. Best means: the cover contains a lot of words from the query, no fewer than MinWords, no more than MaxWords, has no words shorter than ShortWord at the beginning and end of the cover, etc. The main reason why I would like it to be fast is that I want to generate all covers for a given query. Then choose covers with smallest hlCover generates all covers. Let me know what you think of this patch and I will update the patch to respect other options like MinWords and ShortWord. As I understand, you really want to call the Cover() function instead of hlCover() - by design they should be identical, but they accept different representations of the document. So the best way is to generalize them: develop a new one which can be called with some kind of callback or/and opaque structure, to use it in both rank and headline. NumFragments 2: I wanted people to use the new headline marker if they specify NumFragments = 1. If they do not
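The cover-finding step discussed above reduces to a classic minimal-window problem. A rough sketch (my own illustration, not the actual tsrank/hlCover code), assuming each query lexeme's document positions are available as a sorted list, which is exactly what a tsvector stores:

```python
# Sketch of cover finding: given each query lexeme's sorted positions in
# the document, find the shortest span containing every lexeme at least once.
def min_cover(positions_per_lexeme):
    # Flatten into (position, lexeme_id) pairs, sorted by position.
    events = sorted(
        (pos, lex)
        for lex, plist in enumerate(positions_per_lexeme)
        for pos in plist
    )
    need = len(positions_per_lexeme)
    count = {}              # lexeme_id -> occurrences inside current window
    best = None
    left = 0
    for rpos, rlex in events:
        count[rlex] = count.get(rlex, 0) + 1
        while len(count) == need:        # window covers every lexeme: shrink
            lpos, llex = events[left]
            if best is None or rpos - lpos < best[1] - best[0]:
                best = (lpos, rpos)
            count[llex] -= 1
            if count[llex] == 0:
                del count[llex]
            left += 1
    return best   # (start, end) positions of the smallest cover, or None
```

A full headline generator would enumerate all such covers rather than just the smallest, then rank them by the MinWords/MaxWords/ShortWord criteria described above.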
Re: [HACKERS] rfc: add pg_dump options to dump output
On Tuesday 03 June 2008 01:17:43 Tom Lane wrote: Robert Treat [EMAIL PROTECTED] writes: Anyone see any issues with this? I'm a bit worried about breaking diff-equality of matching dumps. If you are calling pg_dump with different flags, it seems likely you're breaking diff-equality anyway, so I'm not sure how valid that is. Actually, in some cases I'd expect the difference in flags might actually make other differences clearer; for example, comparing a dump with a -T flag vs. one without, the differences might be scattered throughout the dump (table, triggers, indexes), so seeing the difference in the dump flags would likely make things more concrete. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] phrase search
On Tue, 2008-06-03 at 22:16 +0400, Teodor Sigaev wrote: This is far more complicated than I thought. Of course, phrase search should be able to use indexes. I can probably look into how to use the index. Any pointers on this? src/backend/utils/adt/tsginidx.c; if you invent an operation # in tsquery then you will have index support with minimal effort. Yes, this is exactly how I am using it in my application. Do you think this will solve a lot of common cases or we should try to get phrase search Yeah, it solves a lot of useful cases; for simple use it's needed to invent a function similar to the existing plainto_tsquery, say phraseto_tsquery. It should produce a correct tsquery with the operations described above. I can add index support and support for arbitrary distance between lexemes. It appears to me that supporting arbitrary boolean expressions will be complicated. Can we pull out something from TSQuery? -Sushant. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
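For illustration, the position check that a `#` (phrase) operator would have to perform on tsvector data can be sketched as follows. The helper names are hypothetical; a real implementation would live in the tsquery executor:

```python
# Hypothetical helpers sketching what a '#' (phrase) tsquery operator must
# check, given the position lists a tsvector stores for each lexeme.
def phrase_match(pos_a, pos_b, dist=1):
    """True if some occurrence of lexeme A is exactly `dist` positions
    before an occurrence of lexeme B (dist=1 means adjacent)."""
    bset = set(pos_b)
    return any(p + dist in bset for p in pos_a)

def phrase_positions(position_lists):
    """Start positions where an n-lexeme phrase matches: lexeme i must
    occur exactly i positions after the phrase start."""
    sets = [set(p) for p in position_lists]
    return sorted(p for p in sets[0]
                  if all(p + i in s for i, s in enumerate(sets)))
```

Supporting an arbitrary distance, as mentioned above, is just the `dist` parameter; supporting phrases nested in arbitrary boolean tsquery expressions is the part that gets complicated.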
Re: [HACKERS] Core team statement on replication in PostgreSQL
On Sun, Jun 01, 2008 at 01:43:22PM -0400, Tom Lane wrote: power to him. (Is the replica-hooks-discuss list still working?) But Yes. And silent as ever. :-) A -- Andrew Sullivan [EMAIL PROTECTED] +1 503 667 4564 x104 http://www.commandprompt.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Tue, 3 Jun 2008, Paul van den Bogaard wrote: So overhauling the GUC parameters is one step, but adding proper instrumentation in order to really measure the impact of the new setting is necessary too. Correct, but completely off-topic regardless. One problem to be solved here is to take PostgreSQL tuning from zero to, say, 50% automatic. Wander the user lists for a few months; the number of completely misconfigured systems out there is considerable, partly because the default values for many parameters are completely unreasonable for modern hardware and there's no easy way to improve on that without someone educating themselves. Getting distracted by the requirements of the high-end systems will give you a problem you have no hope of executing in a reasonable time period. By all means bring that up as a separate (and much, much larger) project: Database Benchmarking and Sensitivity Analysis of Performance Tuning Parameters would make a nice PhD project for somebody, and there's probably a good patent in there somewhere. Even if you had such a tool, it wouldn't be usable by non-experts unless the mundane GUC generation issues are dealt with first, and that's where this is at right now. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rfc: add pg_dump options to dump output
James William Pye [EMAIL PROTECTED] writes: On Tue, Jun 03, 2008 at 01:17:43AM -0400, Tom Lane wrote: but mainly I don't see the point. It's informative, no? Well, the stuff included into the dump by pg_dump -v is informative, too. But we stopped doing that by default because of complaints. I remain unconvinced that this proposal won't suffer the same fate. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Case-Insensitve Text Comparison
Jeff Davis [EMAIL PROTECTED] writes: Right now typmod is already passed to all those layers, right? Would it be a useful intermediate step to use typmod to hold this information for the text type? No, it would not, because typmod doesn't propagate through functions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Core team statement on replication in PostgreSQL
Hannu Krosing wrote: The simplest form of synchronous wal shipping would not even need postgresql running on slave, just a small daemon which reports when wal blocks are a) received and b) synced to disk. While that does sound simple, I'd presume that most people would want the guarantee of the same version of postgresql installed wherever the logs are ending up, with the log receiver speaking the same protocol version as the log sender. I imagine that would be most easily achieved through using something like the continuously restoring startup mode of current postgresql. However variations on this kind of daemon can be used to perform testing, configuring it to work well, go slow, pause, not respond, disconnect, or fail in particular ways, emulating disk full, etc. Regards, Stephen Denne. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
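The receive/sync acknowledgement behaviour of such a daemon can be sketched in a few lines. This is a toy illustration with an invented message format; a real tool would speak a proper WAL-shipping protocol and handle framing, partial writes, and errors:

```python
# Toy sketch of the "report received, then synced" daemon behaviour
# described above (invented ack format, not a real protocol).
import os

def handle_wal_chunk(wal_file, chunk):
    """Append one chunk of WAL to an open binary file, acknowledging
    a) receipt and b) durable sync, in that order."""
    acks = []
    wal_file.write(chunk)
    acks.append(b"received")          # a) received
    wal_file.flush()
    os.fsync(wal_file.fileno())       # b) synced to disk
    acks.append(b"synced")
    return acks
```

The master would only report a transaction as durably replicated once it has seen the `synced` acknowledgement; the gap between the two acks is what distinguishes received-but-volatile from truly synchronous shipping.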
Re: [HACKERS] Case-Insensitve Text Comparison
On Tue, Jun 03, 2008 at 01:53:56PM -0700, Jeff Davis wrote: On Tue, 2008-06-03 at 21:26 +0200, Zdenek Kotala wrote: The first step is per database, because it is relatively easy. Collation per column is very difficult. It requires a lot of changes (parser, planner, executor...) in the whole source code, because you need to keep collation information together with the text data. Right now typmod is already passed to all those layers, right? Would it be a useful intermediate step to use typmod to hold this information for the text type? In SQL the collation is associated with a node in the parse tree and not with the values at all. It's a sort of extra parameter to functions (at least, that's how I implemented it). So you can say things like: SELECT text COLLATE case_insensitive; Here the COLLATE clause does nothing, though if you had a SELECT INTO it would control the default collation for that column. The standard has rules on how to determine the collation at any point (explicit overrides implicit, which overrides default). If two columns have conflicting collations, you are required to disambiguate when comparing them, or it's a (parse-time) error. Check the archives for details on how it works precisely, but it's far nicer than merely adding a typmod, since that would cause you to throw errors at runtime if there's a problem. Have a nice day, -- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ Please line up in a tree and maintain the heap invariant while boarding. Thank you for flying nlogn airlines.
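To make the standard's expression-level model concrete, a short sketch (syntax per the SQL standard; PostgreSQL itself did not support COLLATE at the time of this thread, and the table and collation names here are hypothetical):

```sql
-- Collation attaches to expressions, not to stored values:
SELECT name FROM users ORDER BY name COLLATE "en_US";

-- Two columns with conflicting implicit collations cannot be compared
-- directly; an explicit COLLATE disambiguates
-- (explicit overrides implicit overrides default):
SELECT a < b COLLATE "C" FROM t;

-- In SELECT INTO, an explicit collation becomes the new column's default:
SELECT name COLLATE case_insensitive INTO new_table FROM users;
```

Because the collation is resolved while building the parse tree, a conflict is reported at parse time rather than surfacing as a runtime error, which is the advantage over encoding it in typmod.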
[HACKERS] cannot use result of (insert..returning)
Hello, I need to use a query like: select (insert into test (a) values (x) returning b), c from anytable where condition but it says ERROR: syntax error at or near into. Is this a bug? A function does not work either: create function addt(..) returns .. as 'insert ... returning ..' language 'sql' gives ERROR: ... DETAIL: Function's final statement must be a SELECT. BUT: create function addt(..) returns .. as 'insert...(nextval(..)...); select currval(..)' language 'sql' works in select addt(x), c from anytable where condition - yet this function is an analog of insert...returning in any case. Why does the analog work better than the original? What is my mistake? (I don't want to use functions here!) dvs -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
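For reference, a cleaned-up version of the currval workaround described above. This assumes `test.b` is filled from a sequence; the sequence name `test_b_seq` is hypothetical:

```sql
-- Wrapper function emulating INSERT ... RETURNING b as an expression,
-- assuming test.b defaults to nextval('test_b_seq'):
CREATE FUNCTION addt(x integer) RETURNS bigint AS $$
    INSERT INTO test (a) VALUES ($1);
    SELECT currval('test_b_seq');
$$ LANGUAGE sql;

-- Usable anywhere an expression is allowed:
SELECT addt(x), c FROM anytable WHERE condition;
```

Later PostgreSQL releases added data-modifying WITH clauses (`WITH ins AS (INSERT ... RETURNING b) SELECT ...`), which address this limitation directly, but they did not exist at the time of this thread.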