Re: [GENERAL] Fault Tolerant Postgresql (two machines, two postmasters, one disk array)
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Ron Johnson
Sent: Thursday 17 May 2007 22:56
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Fault Tolerant Postgresql (two machines, two postmasters, one disk array)

On 05/17/07 09:35, Andrew Sullivan wrote:
[snip]
> The problems come when you get a false detection of machine failure. Consider a case, for instance, where machine A gets overloaded, goes into swap madness, or has a billion runaway processes that cause it to stagger. In this case, A might not respond in time on the heartbeat monitor, and then the standby machine B thinks A has failed. But A doesn't know that, of course, because it is working as hard as it can just to stay up. Now, if B mounts the disk and starts the postmaster, but doesn't have a way to make _sure_ that A is completely disconnected from the disk, then it's entirely possible A will flush buffers out to the still-mounted data area. Poof! Instant data corruption.

Aren't there PCI heartbeat cards that are independent of the load on the host machine?

A solution commonly seen is to cut the power on the 'failed' machine just before a take-over is done. Solutions for that are available...

Besides, you don't want a separate PCI heartbeat card: it only tells you the hardware is alive, not whether your software happens to work. It's the same situation with a watchdog: you don't want the watchdog to reset itself continuously, as you lose the benefit of the watchdog. Generally your software should also check whether PostgreSQL is operating as expected, i.e. that it is not stopped or non-responsive. In those cases the system should fail over.

The 'cut power' solution works. For details on how to set it up, look at Heartbeat (www.linux-ha.org) and search for "stonith". They have lots and lots of very useful information about high-availability solutions. Furthermore, the package is used around the world for these solutions by large companies and is part of several other software packages.
It supports linux and BSD... - Joris ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] about the pgadmin3
My OS is Red Hat AS 4 and my PostgreSQL is 7.4. When I use pgAdmin (Windows version 1.6.3) to connect to the database, the returned dialog tells me pg_user doesn't exist and pg_settings doesn't exist. How should I handle this problem?
Re: [GENERAL] Data replication through disk replication
On Fri, May 18, 2007 at 05:03:30PM -0700, Ben wrote:
> that all changes are replicated, it won't say an fsync is finished until it's finished on the remote host too, and it won't let you mount the block device on the slave system (at least with 0.7x).

How can it guarantee these things? The web pages say this:

  If the primary node fails, heartbeat switches the secondary device into primary state and starts the application there. (If you are using it with a non-journaling FS this involves running fsck.) If the failed node comes up again, it is a new secondary node and has to synchronise its content to the primary. This, of course, will happen without interruption of service in the background.

So what happens in those cases where the primary node gets in trouble but isn't actually dead yet? I see a potential for a race condition here that is really troubling to me. (Especially since it uses the TCP/IP stack, which is notoriously subject to DoS on Linux.) I think you really had better have something like STONITH running to use this.

In case you think it matters, IBM recommends DRBD for DB2. IBM also sells HACMP. I refuse to comment on whether IBM's advice on high availability products is worth taking seriously.

A

--
Andrew Sullivan | [EMAIL PROTECTED]
The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun
Re: [GENERAL] basic postgres questions...
Hey, can someone point me to the commands I'd use to see what databases exist in my Postgres installation? I need to see what's there, and then I need the command to delete a given database and its underlying tables.

From the OS command line, "psql -l" will list the available Postgres databases. For general info on this sort of admin task, see http://www.postgresql.org/docs/8.2/static/managing-databases.html

Also, is there an IRC channel for Postgres?

See: http://www.postgresql.org/community/irc.html; there are several.

Cheers, Brent Wood
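To sketch the delete side of the question (the database name below is illustrative, not from the thread): dropping a database removes the database and every table inside it in one step.

```sql
-- Inside psql, \l lists databases (same info as "psql -l" from the shell).
-- DROP DATABASE removes the named database and all of its tables;
-- you cannot be connected to the database you are dropping.
DROP DATABASE old_project;
```
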
Re: [GENERAL] How to implement GOMONTH function
Thank you all very much for the great suggestions. I created the function:

CREATE OR REPLACE FUNCTION PUBLIC.GOMONTH(DATE, INTEGER, OUT DATE)
IMMUTABLE AS $_$
  SELECT ($1 + ($2 * '1 MONTH'::INTERVAL))::DATE;
$_$ LANGUAGE SQL;

I got the errors "function gomonth(date, numeric) does not exist" and "function gomonth(date, bigint) does not exist". How can I fix these errors?

Andrus.
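One possible fix, as a sketch (not from the original thread): the errors arise because the callers pass NUMERIC or BIGINT arguments, and PostgreSQL will not implicitly downcast those to INTEGER when resolving the call. You can either cast at each call site (GOMONTH(d, n::integer)) or add delegating overloads:

```sql
-- Sketch: overloads that accept numeric and bigint second arguments
-- and delegate to the existing gomonth(date, integer) function.
CREATE OR REPLACE FUNCTION PUBLIC.GOMONTH(DATE, NUMERIC, OUT DATE)
IMMUTABLE AS $_$
  SELECT PUBLIC.GOMONTH($1, $2::INTEGER);
$_$ LANGUAGE SQL;

CREATE OR REPLACE FUNCTION PUBLIC.GOMONTH(DATE, BIGINT, OUT DATE)
IMMUTABLE AS $_$
  SELECT PUBLIC.GOMONTH($1, $2::INTEGER);
$_$ LANGUAGE SQL;
```

The casts assume the passed values fit in an integer; if months counts can exceed the integer range, the base function's parameter type should change instead.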
[GENERAL] tsearch2 keep throw-away characters
postgres=# select to_tsvector('default','I know how to code in C#, java and C++');
             to_tsvector
-------------------------------------
 'c':7,10 'code':5 'java':8 'know':2
(1 row)

postgres=# select to_tsvector('simple','I know how to code in C#, java and C++');
                               to_tsvector
-------------------------------------------------------------------------
 'c':7,10 'i':1 'in':6 'to':4 'and':9 'how':3 'code':5 'java':8 'know':2
(1 row)

I'd like to get the lexemes/tokens 'c#' and 'c++' out of this query. Everything I can find has to do with stop words. How do I keep characters that tsearch throws out? I've already tried 'c\#' and 'c\\#' etc., which don't work.

Kimball
[GENERAL] FULL JOIN is only supported with merge-joinable join conditions
I am trying to port an application to PostgreSQL 8.1+. The following query runs OK in VFP but causes an error in Postgres:

FULL JOIN is only supported with merge-joinable join conditions

How can I fix this?

Andrus.

SELECT ametikoh.Nimetus as ametikoht,
       Isik.nimi, Isik.eesnimi, koosseis.kogus,
       COALESCE(iandmed.Kuluobj, koosseis.objekt1) as osakond
FROM iandmed
FULL JOIN koosseis ON iandmed.ametikoht=koosseis.ametikoht
  AND iandmed.Kuluobj=koosseis.objekt1
  AND iandmed.AmetiKoht is not null
JOIN ametikoh ON COALESCE(iandmed.ametikoht,koosseis.ametikoht)=ametikoh.AmetiKoht
LEFT JOIN isik ON iandmed.isik=isik.isik
WHERE true
ORDER BY 1,2

The relevant parts of the table structures are:

CREATE TABLE iandmed (
  reanr integer NOT NULL DEFAULT nextval('iandmed_reanr_seq'::regclass),
  isik character(10) NOT NULL,
  miskuup date,
  plopp date,
  summavrt numeric(12,2),
  kuluobj character(10),
  ametikoht numeric(7),
  CONSTRAINT iandmed_pkey PRIMARY KEY (reanr)
);

CREATE TABLE koosseis (
  id numeric(7) NOT NULL,
  ametikoht numeric(7) NOT NULL,
  objekt1 character(10) NOT NULL,
  kogus numeric(4) NOT NULL DEFAULT 0,
  algus date,
  lopp date,
  CONSTRAINT koosseis_pkey PRIMARY KEY (id)
);
[GENERAL] adding missing FROM-clause entry for table
I am porting from 7.4 to 8.1 and I turned off the option to add a missing FROM clause. Is there a way to know how Postgres rewrites the queries, i.e. to see the missing FROM clauses it adds, in the Postgres log?

Thank you, Vivian
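One way to see this, as a sketch (it assumes the add_missing_from GUC still present in 8.1): with the option turned on, the server emits a NOTICE each time it implicitly adds a table, naming the table; NOTICEs also land in the server log if log_min_messages is at NOTICE or lower. The table names below are illustrative.

```sql
-- Re-enable the old behaviour for this session only, so each implicit
-- FROM addition is reported as a NOTICE instead of an error:
SET add_missing_from = on;

-- A query that references t without listing it in FROM now produces:
-- NOTICE:  adding missing FROM-clause entry for table "t"
SELECT t.id FROM other_table o WHERE t.id = o.id;
```

This lets you locate and fix each offending query before turning the option off again.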
Re: [GENERAL] a few questions on backup
Marco Colombo wrote:
> I'll try that out. Maybe my ideas are so far from the truth that I'm having a hard time explaining them to people who actually know how things work. I'll be back with results. Meanwhile, thanks for your time.

I think I finally got it. Segment 34 in my pg_xlog got archived and recycled. It became segment 39, but PG is still working on segment 35, after some hours. Now pg_xlog contains 5 segments, named from 35 to 39, 35 being the most recently modified. 39 won't be used for about another couple of days.

Now I see what you mean by "recycled": I thought it meant marked free for later use, but it means renamed for future use. My mistake was assuming that the rename part happens lazily when PG starts using the file. Instead, it happens right after (the eventual) archiving. That makes the strategy in my original post somewhat unfeasible. Still, I was not completely wrong:

# cmp /var/lib/pgsql/data/pg_xlog/000100010039 \
      /u1/pg_wal_archive/000100010034 && echo Yes
Yes

They do contain the same data, that of segment 34, and the *39 file will stay there, untouched, for quite a while after the backup. So the WAL segment I need *is* there, just with a different name. The only problem is figuring out what segment that data actually belongs to. I know only because I can compare it with the archived one. Now, I could still make some educated guesses by looking at modification times, but a guessing game is definitely not something you want to play when restoring your precious data. :)

Archiving the WAL segments and letting the recovery procedure handle them at restore time is easier anyway. Again, thanks a lot.

.TM.
--
Marco Colombo
Technical Manager
ESI s.r.l.
[EMAIL PROTECTED]
[GENERAL] Are foreign key's automatically set as indicies?
Hey all, I'm sorta new to PG and am just wondering, when I create a foreign key is it automatically set as an indexed column or do I need to also set that manually? Thanks in advance, Cam
[GENERAL] Committing in a trigger
Hi there, I'm using PG 8.2.3. Is it possible to (somehow) commit a specific statement in a trigger function if the function itself is rolled back because of an error (eg, for a unique index error)? For example:

create table tab1 (col1 int unique);
create table tab2 (col1 int);

CREATE OR REPLACE FUNCTION f_func1 () RETURNS trigger AS $$
BEGIN
  -- ... some processing ...
  INSERT INTO tab2 (col1) VALUES (new.col1);
  -- COMMIT the above statement, irrespective of whether this
  -- trigger/function is rolled back or not.
  RETURN new;
end;
$$ LANGUAGE 'plpgsql';

CREATE TRIGGER tiu_t1 BEFORE UPDATE OR INSERT ON tab1
  FOR EACH ROW EXECUTE PROCEDURE f_func1();

* * *

Now, if you:

insert into tab1 (col1) values (1);

and tab1 already has a row with col1=1, tab2 must be updated even when the statement fails with:

ERROR: duplicate key violates unique constraint...

I know this can be achieved outside the DB (ie, with checks, etc), but I'd like to keep this aspect inside the DB. Also, I could perform selects inside the trigger to pre-empt a unique constraint error, but this would slow the inserts down. I can't wrap BEGIN/COMMIT around the INSERT in the trigger. Is there another way of achieving this? Any suggestions are appreciated.

Regards
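The thread leaves the question open, so here is one hedged sketch: PostgreSQL has no autonomous transactions, but the contrib/dblink module can open a second connection whose work commits independently of the outer transaction. The connection string and the assumption that tab2.col1 is an int are illustrative, and this costs one extra connection per call.

```sql
-- Sketch using contrib/dblink for an autonomous write: the second
-- connection commits on its own, so the row inserted into tab2 survives
-- even if the outer INSERT into tab1 later fails on the unique constraint.
CREATE OR REPLACE FUNCTION f_func1() RETURNS trigger AS $$
BEGIN
  PERFORM dblink_exec('dbname=mydb',  -- connection string is an assumption
    'INSERT INTO tab2 (col1) VALUES (' || new.col1 || ')');
  RETURN new;
END;
$$ LANGUAGE plpgsql;
```

The flip side is that if the outer transaction rolls back for any other reason, the tab2 row still persists, which matches the behaviour asked for here but may surprise elsewhere.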
[GENERAL] change database encoding without corrupting data (latin9 to utf8)
I have a problem entering data into a Postgres database (LATIN9) from my Perl/Tk application running on Windows (UTF8). Whenever I try to enter a letter with accents, it looks corrupted once stored in the database. A workaround is to set the client encoding to UTF8 after creating the database connection:

$dbh->do(qq/SET client_encoding to 'UTF8'/);

To avoid this kind of workaround I'd like to convert the whole database from LATIN9 to UTF8. How can I do it without corrupting the data?

Thanks, Filippo
Re: [GENERAL] Are foreign key's automatically set as indicies?
On 16/05/2007 05:09, camb wrote:
> I'm sorta new to PG and am just wondering, when I create a foreign key is it automatically set as an indexed column or do I need to also set that manually?

AFAIK you need to create the index separately. I use PgAdmin most of the time, which does that by default, so I'm a bit hazy on doing it by hand.

Ray.

---
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
[EMAIL PROTECTED]
---
Re: [GENERAL] Are foreign key's automatically set as indicies?
On Wed, 16 May 2007 06:09:15 +0200, camb [EMAIL PROTECTED] wrote:
> Hey all, I'm sorta new to PG and am just wondering, when I create a foreign key is it automatically set as an indexed column or do I need to also set that manually?

A primary key creates a unique index. Foreign keys do not create any indexes; it is up to you to decide whether you need them or not. Say:

CREATE TABLE items (
  ...
  category_id INTEGER REFERENCES categories( id )
);

If you want cascaded deletes/updates to your items table to be fast, or you need it for other reasons, create an index on category_id.
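Making the advice above concrete (index name is illustrative):

```sql
-- Index the referencing column so foreign-key checks and cascaded
-- deletes/updates from categories don't have to scan all of items.
CREATE INDEX items_category_id_idx ON items (category_id);
```
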
Re: [GENERAL] Data replication through disk replication
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Andrew Sullivan
Sent: Saturday 19 May 2007 15:28
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Data replication through disk replication

On Fri, May 18, 2007 at 05:03:30PM -0700, Ben wrote:
> that all changes are replicated, it won't say an fsync is finished until it's finished on the remote host too, and it won't let you mount the block device on the slave system (at least with 0.7x).

How can it guarantee these things? The web pages say this:

  If the primary node fails, heartbeat switches the secondary device into primary state and starts the application there. (If you are using it with a non-journaling FS this involves running fsck.) If the failed node comes up again, it is a new secondary node and has to synchronise its content to the primary. This, of course, will happen without interruption of service in the background.

So what happens in those cases where the primary node gets in trouble but isn't actually dead yet? I see a potential for a race condition here that is really troubling to me. (Especially since it uses the TCP/IP stack, which is notoriously subject to DoS on Linux.) I think you really had better have something like STONITH running to use this.

The general advice you see at linux-ha is to use redundant heartbeat paths. You can use a serial link if you want to; another option is redundant networks. This is to reduce the probability of a split-brain situation. As you guessed, STONITH is very much required to guard against possible race conditions caused by hanging nodes.

As a note, DRBD will also do a lot of work for you and prevents some mistakes you could make. While starting, it waits if it does not detect the other node, because it then doesn't know which node has the latest data available. This can be overridden by a timeout if desired. (In practice either only a single node will fail, or both will come up at the same time.) It prevents the nodes from running out of sync, and it also detects when they are out of sync, requiring administrator intervention.

Another piece of advice is to take a look at the linux-ha web site and mailing list. Though the web site might not be the best, the mailing list is quite active and a lot of knowledge is available. In general, high availability is complex and requires a lot of thought to cover all possible cases.

[snip]

- Joris Dobbelsteen
Re: [GENERAL] Data replication through disk replication
Er, yes, sorry, I didn't mean to imply that you should run without some kind of STONITH solution, to catch the case when the link DRBD uses goes down but the other network links are still working fine. It's in the common case, when everything is working, that DRBD won't accidentally let you mount the same device on both machines. (In my experience, operator error is more common than hardware error.)

On May 19, 2007, at 6:28 AM, Andrew Sullivan wrote:
> On Fri, May 18, 2007 at 05:03:30PM -0700, Ben wrote:
>> that all changes are replicated, it won't say an fsync is finished until it's finished on the remote host too, and it won't let you mount the block device on the slave system (at least with 0.7x).
> How can it guarantee these things? The web pages say this: If the primary node fails, heartbeat switches the secondary device into primary state and starts the application there. (If you are using it with a non-journaling FS this involves running fsck.) If the failed node comes up again, it is a new secondary node and has to synchronise its content to the primary. This, of course, will happen without interruption of service in the background.
> So what happens in those cases where the primary node gets in trouble but isn't actually dead yet? I see a potential for a race condition here that is really troubling to me. (Especially since it uses the TCP/IP stack, which is notoriously subject to DoS on Linux.) I think you really had better have something like STONITH running to use this.
> In case you think it matters, IBM recommends DRBD for DB2. IBM also sells HACMP. I refuse to comment on whether IBM's advice on high availability products is worth taking seriously.
> A
> --
> Andrew Sullivan | [EMAIL PROTECTED]
> The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun
Re: [GENERAL] Data replication through disk replication
Alvaro Herrera wrote:
> Ben wrote:
>> If you're just looking for a way to have high availability and you're OK being tied to Linux, DRBD is a good way to go. It keeps things simple in that all changes are replicated, and it won't say an fsync is finished until it's finished on the remote host too,
> Oh, so that's how it works. I assume performance must be, huh, not stellar?

It depends... if the machines are equal, and you have bonded two gigabit links together...

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
[GENERAL] Against legal disclaimers in email
[ redirected for a wider audience ]

Lew [EMAIL PROTECTED] writes:
> John Summerfield wrote:
>> Penchalaiah P. wrote:
>>> Information transmitted by this e-mail is proprietary to Infinite Computer Solutions
>> It may be proprietary, but it shore ain't confidential!
> Placing "confidential" on every document without regard for its content, especially when some of it's publicly disseminated, can remove the protection of confidentiality at law from all such marked documents in many jurisdictions, including the U.S. There must be discrimination applied in the marking of information as confidential. Quite aside from the foolishness you pointed out of marking something confidential then placing it into the public eye in an archived forum where it will be visible by everybody forever. Now we can publish everything ever written at or by Infinite Computer Solutions without fear of liability.

Wow, that is a great argument that I've not seen before. The only way we'll ever make a dent in the plague of legalistic disclaimers auto-attached to email is if we can convince corporate lawyers that there is a downside *to them* of insisting on the damned things. This is the first argument I've seen that there might actually be a serious legal downside to the practice.

Can you cite chapter and verse to back up this position? It'd be great to have a well-reasoned FAQ document that people could just automatically send to anyone exhibiting this folly ...

regards, tom lane
[GENERAL] Performance tuning?
We have an MS SQL Server db that we successfully migrated to pgsql 8.2, and I am now working on some views and notice select queries being very slow. I vacuumed last night; running the exact same query (with minor changes to syntax, of course), it runs in just a few seconds on MS SQL compared to several minutes in pgsql. Since I've never dealt with MS SQL, I wanted to ask here whether this is expected performance for this type of query, *or* do I need to just learn how to properly tune performance on the pgsql server? Perhaps some commands or tests may help me determine where issues may lie?

I am running the following query on a Linux server with comparable processor and memory to the Windows server. The query was taken from the SQL Server as-is and the syntax adjusted... the query only returns 3 records, but several tables have tens of thousands of records, and the tblactivitytag table has over 100K...

SELECT DISTINCT A.fldClientNumber as cNumber, A.fldClientName as cName,
       B.fldContactNumber as contactNumber, B.fldContactCity as cCity,
       B.fldContactState as cState, B.fldContactFirstName as contactFName,
       B.fldContactLastName as contactLName, B.fldContactEmail as ContactEmail,
       B.fldContactTitle as cTitle, B.fldContactPhone1_Num as B1Phonenumber,
       B.fldContactPhone4_Type as Num4Type, B.fldContactPhone4_Num as CellNum
FROM tblClientMaster A, tblContactInfo B, tblClientProductPreference C,
     tblClientRoomSize D, tblProductMaster F, tblClientProductRelation G,
     tblclientcomments H, tblgeopreference E, tblClientActivityTag
WHERE A.fldClientNumber = B.fldClientNumber
  AND A.fldClientNumber = C.fldClientNumber
  AND A.fldClientNumber = D.fldClientNumber
  AND A.fldClientName ilike '%ADVISOR%'
  AND B.fldContactFirstName ilike '%%%'
  AND A.fldClientNumber = G.fldClientNumber
  AND G.fldProductNumber = F.fldProductNumber
  AND F.fldProductName ilike '%%%'
  AND A.fldClientNumber = H.fldClientNumber
  AND H.fldenable = 't'
  AND H.fldcontactnumber = b.fldcontactnumber
  AND H.fldClientcomments ilike '%%%'
  AND (A.fldBuyingStatus = 'Now')
  AND (A.fldSellingStatus = 'Now')
  AND (C.fldFullService = 't')
  AND (D.fldSize149 = 't')
  AND (E.fldW = 't')
  AND A.fldClientNumber = E.fldClientNumber
  AND A.fldclientnumber = tblClientActivityTag.fldclientnumber
  AND tblClientActivityTag.fldcontactnumber = b.fldcontactnumber
  AND tblClientActivityTag.fldcontactactivitytag like 'A%'
  AND b.fldcontactnumber in (select fldcontactnumber from tblclientcomments where tblclientcomments$
  A.fldEnable = 't' AND B.fldEnable = 't'
ORDER BY A.fldClientName, B.fldContactLastName;

-- Robert
Re: [GENERAL] Performance tuning?
Robert Fitzpatrick [EMAIL PROTECTED] writes:
> I am running the following query on a linux server with comparable processor and memory as the windows server.

Show us the table definitions and the EXPLAIN ANALYZE output, please.

regards, tom lane
Re: [GENERAL] change database encoding without corrupting data (latin9 to utf8)
On Fri, May 18, 2007 at 02:46:26AM -0700, filippo wrote:
> I have a problem to entry data to postgres database (latin9) from my perl/tk application running on windows (utf8). Whenever I try to entry letter with accents, these looks corrupted once stored into database. A workaround is to set client encoding to UTF8 after creating the database connection: $dbh->do(qq/SET client_encoding to 'UTF8'/);

"Workaround" has a negative connotation that's perhaps misused in this case because setting client_encoding is the proper way of telling the database what the client's encoding is. If the connecting role will always use UTF8 then you could use ALTER ROLE (or ALTER USER in 8.0 and earlier) to automatically set client_encoding for every connection:

ALTER ROLE rolename SET client_encoding TO 'UTF8';

> To avoid such kind of workaround I'd like to convert the whole database from LATIN9 to UTF8, how can I do it without corrupting the data?

If all of the data is uncorrupted LATIN9 then you could use pg_dump to dump the LATIN9 database and then restore it into a UTF8 database. But if you have a mix of uncorrupted and corrupted characters (UTF8 byte sequences stored as LATIN9) then you have a bit of a problem because some data needs to be converted from LATIN9 to UTF8 but other data is already UTF8 and shouldn't be converted.

--
Michael Fuhr
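A small sketch of the dump-and-restore route Michael describes (the database name is illustrative): create the target database with the new encoding, then restore the dump taken from the LATIN9 database into it.

```sql
-- The new database gets UTF8 encoding up front. pg_dump records the
-- source database's encoding in the dump, so when the dump is restored
-- here the server converts the data from LATIN9 to UTF8 automatically.
CREATE DATABASE mydb_utf8 ENCODING 'UTF8';
```

This only works cleanly if all stored data really is valid LATIN9, per the caveat above about mixed-encoding data.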
Re: [GENERAL] Performance tuning?
On Sat, 2007-05-19 at 17:05 -0400, Tom Lane wrote:
> Robert Fitzpatrick [EMAIL PROTECTED] writes:
>> I am running the following query on a linux server with comparable processor and memory as the windows server.
> Show us the table definitions and the EXPLAIN ANALYZE output, please.

Thanks Tom...

Unique  (cost=2326081.07..2354383.40 rows=12445 width=998) (actual time=71931.967..71989.731 rows=3 loops=1)
  ->  Sort  (cost=2326081.07..2328258.17 rows=870841 width=998) (actual time=71931.959..71943.845 rows=9110 loops=1)
        Sort Key: a.fldclientname, b.fldcontactlastname, a.fldclientnumber, b.fldcontactnumber, b.fldcontactcity, b.fldcontactstate, b.fldcontactfirstname, b.fldcontactemail, b.fldcontacttitle, b.fldcontactphone1_num, b.fldcontactphone4_type, b.fldcontactphone4_num
        ->  Merge Join  (cost=55798.98..60543.68 rows=870841 width=998) (actual time=46902.686..70218.041 rows=9110 loops=1)
              Merge Cond: ("outer".fldclientnumber = "inner".fldclientnumber)
              ->  Merge Join  (cost=679.89..4617.75 rows=224283 width=8) (actual time=17.104..74.653 rows=125 loops=1)
                    Merge Cond: ("outer".fldclientnumber = "inner".fldclientnumber)
                    ->  Index Scan using ix_tblgeopreference_fldclientnumber on tblgeopreference e  (cost=0.00..556.87 rows=6699 width=4) (actual time=0.205..56.266 rows=143 loops=1)
                          Filter: (fldw = true)
                    ->  Sort  (cost=679.89..696.63 rows=6696 width=4) (actual time=16.844..17.005 rows=247 loops=1)
                          Sort Key: c.fldclientnumber
                          ->  Seq Scan on tblclientproductpreference c  (cost=0.00..254.39 rows=6696 width=4) (actual time=0.084..15.884 rows=663 loops=1)
                                Filter: (fldfullservice = true)
              ->  Materialize  (cost=55119.09..55127.13 rows=804 width=1014) (actual time=46827.886..70028.280 rows=9110 loops=1)
                    ->  Merge Join  (cost=53060.03..55118.29 rows=804 width=1014) (actual time=46827.877..69956.976 rows=9110 loops=1)
                          Merge Cond: ("outer".fldclientnumber = "inner".fldclientnumber)
                          ->  Nested Loop  (cost=53060.03..54565.61 rows=24 width=1010) (actual time=37189.898..69232.176 rows=25048 loops=1)
                                ->  Nested Loop  (cost=53060.03..54420.94 rows=24 width=1014) (actual time=37148.445..67472.468 rows=25048 loops=1)
                                      ->  Nested Loop IN Join  (cost=53060.03..53581.73 rows=1 width=1006) (actual time=37129.788..66642.591 rows=1017 loops=1)
                                            Join Filter: ("inner".fldcontactnumber = "outer".fldcontactnumber)
                                            ->  Nested Loop  (cost=53060.03..53565.72 rows=1 width=1040) (actual time=36584.031..37402.166 rows=1017 loops=1)
                                                  Join Filter: ("outer".fldcontactnumber = ("inner".fldcontactnumber)::numeric)
                                                  ->  Merge Join  (cost=53060.03..53087.19 rows=1 width=210) (actual time=36561.298..36603.979 rows=1873 loops=1)
                                                        Merge Cond: (("outer".fldclientnumber = "inner".fldclientnumber) AND ("outer".fldcontactnumber = "inner".fldcontactnumber))
                                                        ->  Sort  (cost=50577.52..50585.04 rows=3008 width=189) (actual time=36156.473..36159.932 rows=6167 loops=1)
                                                              Sort Key: a.fldclientnumber, h.fldcontactnumber
                                                              ->  Nested Loop  (cost=0.00..50403.74 rows=3008 width=189) (actual time=6.180..36110.024 rows=6167 loops=1)
                                                                    Join Filter: (("outer".fldclientnumber)::numeric = "inner".fldclientnumber)
                                                                    ->  Seq Scan on tblclientmaster a  (cost=0.00..728.70 rows=1 width=172) (actual time=0.680..197.224 rows=4 loops=1)
                                                                          Filter: (((fldclientname)::text ~~* '%ADVISOR%'::text) AND ((fldbuyingstatus)::text = 'Now'::text) AND ((fldsellingstatus)::text = 'Now'::text) AND (fldenable = true))
                                                                    ->  Seq Scan on tblclientcomments h  (cost=0.00..40651.36 rows=601579 width=34) (actual time=0.019..7026.388 rows=1202169 loops=4)
                                                                          Filter: ((fldenable = true) AND
Re: [GENERAL] Performance tuning?
Robert Fitzpatrick [EMAIL PROTECTED] writes:
> On Sat, 2007-05-19 at 17:05 -0400, Tom Lane wrote:
>> Show us the table definitions and the EXPLAIN ANALYZE output, please.

There seem to be a couple of problems visible in the EXPLAIN output:

->  Nested Loop  (cost=53060.03..53565.72 rows=1 width=1040) (actual time=36584.031..37402.166 rows=1017 loops=1)
      Join Filter: ("outer".fldcontactnumber = ("inner".fldcontactnumber)::numeric)
->  Nested Loop  (cost=0.00..50403.74 rows=3008 width=189) (actual time=6.180..36110.024 rows=6167 loops=1)
      Join Filter: (("outer".fldclientnumber)::numeric = "inner".fldclientnumber)

You're comparing fields of distinct types, which not only incurs run-time type conversions but can interfere with the ability to use some plan types at all. Looking at the table definitions, you've got primary keys declared as SERIAL (ie, integer) and the referencing columns declared as NUMERIC(18,0). This is just horrid for performance :-( --- NUMERIC arithmetic is pretty slow, and it's really pointless when the referenced columns are only integers. I suspect you should have translated these column types as BIGINT (and BIGSERIAL).

->  Merge Join  (cost=53060.03..53087.19 rows=1 width=210) (actual time=36561.298..36603.979 rows=1873 loops=1)
      Merge Cond: (("outer".fldclientnumber = "inner".fldclientnumber) AND ("outer".fldcontactnumber = "inner".fldcontactnumber))

The planner is drastically underestimating the number of rows out of this join, probably because it does not know that there is any correlation between fldclientnumber and fldcontactnumber, where in reality I bet there's a lot. Is it possible that one of these fields is actually functionally dependent on the other, such that you could use just one of them in the join? The one-result-row estimate is bad because it leads to inappropriate choices of nestloop joins.

There may not be much you can do about that part, but I suspect if you get rid of the pointless use of NUMERIC arithmetic you'll find a lot of the performance issue goes away.

Another thing that might be worth fixing is the rather silly use of '%%%' rather than '%' for a no-op LIKE pattern. It looks like the planner's LIKE-estimator gets fooled by that and doesn't realize it's a match-everything pattern. (Yeah, we should fix that, but it won't help you today...) Again, underestimating the number of rows is bad for the quality of the plan.

regards, tom lane
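A sketch of the column-type fix suggested here (the table and column names are taken from the plan above, but verify them against your schema; every referencing column would need the same treatment, and ALTER TABLE ... TYPE requires 8.0 or later):

```sql
-- Convert a NUMERIC(18,0) referencing column to BIGINT so it matches
-- the integer key it references and joins avoid NUMERIC arithmetic
-- and run-time casts. This rewrites the table, so expect it to lock
-- and take a while on large tables.
ALTER TABLE tblcontactinfo
  ALTER COLUMN fldclientnumber TYPE bigint;
```

After changing the column types, re-run ANALYZE and EXPLAIN ANALYZE to confirm the casts are gone from the join filters.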
Re: [GENERAL] Performance tuning?
I wrote:
> Another thing that might be worth fixing is the rather silly use of '%%%' rather than '%' for a no-op LIKE pattern. It looks like the planner's LIKE-estimator gets fooled by that and doesn't realize it's a match-everything pattern.

Uh, scratch that advice, I fat-fingered my test. It does seem to estimate that '%%%' matches every row. But that leads into another question, because some of the scan estimates are further off than one would like:

->  Seq Scan on tblclientcomments  (cost=0.00..40651.36 rows=601579 width=17) (actual time=0.014..17.342 rows=6912 loops=1017)
      Filter: (((fldproductcode)::text ~~* '%%%'::text) AND (fldenable = true))

I had thought that this was explained by a bad LIKE estimate but that seems not the case, which means that your statistics for fldenable must be way off. Have you ANALYZEd these tables since loading the data?

regards, tom lane
Re: [GENERAL] Performance tuning?
On Sat, 2007-05-19 at 19:19 -0400, Tom Lane wrote:
> You're comparing fields of distinct types, which not only incurs run-time type conversions but can interfere with the ability to use some plan types at all. Looking at the table definitions, you've got primary keys declared as SERIAL (ie, integer) and the referencing columns declared as NUMERIC(18,0). This is just horrid for performance :-( --- NUMERIC arithmetic is pretty slow, and it's really pointless when the referenced columns are only integers. I suspect you should have translated these column types as BIGINT (and BIGSERIAL).

Thanks again, I'll be sure to get this straightened out and tested again tomorrow. I thought my nightly backup was ANALYZE'ing the database afterward; I'll be sure to check that as well. I really appreciate your analysis! It is my first migration from another SQL database.

--
Robert
Re: [GENERAL] FULL JOIN is only supported with merge-joinable join conditions
Andrus [EMAIL PROTECTED] writes:
> I'm trying to port an application to PostgreSQL 8.1+. The following query runs OK in VFP but causes an error in Postgres:
>
>   FULL JOIN is only supported with merge-joinable join conditions
>
> SELECT ...
> FROM iandmed
> FULL JOIN koosseis ON iandmed.ametikoht = koosseis.ametikoht
>   AND iandmed.Kuluobj = koosseis.objekt1
>   AND iandmed.AmetiKoht is not null

Uh, can't you just drop the "iandmed.AmetiKoht is not null" condition? It seems redundant, considering that iandmed.ametikoht = koosseis.ametikoht isn't going to succeed when ametikoht is null.

In the long run we should teach hash join to support full-join behavior, which would allow cases like this one to work; but it seems not very high priority, since I've yet to see a real-world case where a non-merge-joinable full-join condition was really needed. (FULL JOIN being inherently symmetric, the join condition should usually be symmetric as well...)

			regards, tom lane
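Concretely, Tom's suggestion is just to delete the non-equality condition from the ON clause, leaving only merge-joinable equalities. A sketch of the rewritten query (the column list is elided in the original, so `SELECT ...` stays a placeholder):

```sql
-- The IS NOT NULL test was redundant: the first equality already
-- fails whenever iandmed.ametikoht is NULL, so dropping it changes
-- no results and leaves only merge-joinable conditions.
SELECT ...
FROM iandmed
FULL JOIN koosseis
       ON iandmed.ametikoht = koosseis.ametikoht
      AND iandmed.kuluobj   = koosseis.objekt1;
```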
Re: [GENERAL] tsearch2 keep throw-away characters
Hello,

Your problem is not about stop words; it's about the fact that the tsearch parser treats the '+' and '#' symbols as lexemes of the "blank" type (use the ts_debug() function to figure it out) and drops them without any further processing. AFAIK, the typical solution for this is to rewrite your text, and then your queries, to use some auxiliary words, like 'SYScpp' and 'SYScsharp', that will be included in tsvectors and indexed without any problems. Usually you can do the replacements in a tsvector trigger when indexing documents, and via query rewriting (in tsearch or your application) when querying the database. Trivial examples:

test=# select to_tsvector('english','I know how to code in SYScsharp, java and SYScpp');
                     to_tsvector
-----------------------------------------------------
 'code':5 'java':8 'know':2 'syscpp':10 'syscsharp':7
(1 row)

and, sure:

test=# select 'I know how to code in SYScsharp, java and SYScpp' @@ 'SYScpp';
 ?column?
----------
 t
(1 row)

There might be a more sophisticated solution, like preventing the parser from treating '++' as blank lexemes, but Oleg will explain this much better as soon as he has time.

--
Regards, Ivan

On 5/16/07, Kimball [EMAIL PROTECTED] wrote:
> postgres=# select to_tsvector('default','I know how to code in C#, java and C++');
>             to_tsvector
> -------------------------------------
>  'c':7,10 'code':5 'java':8 'know':2
> (1 row)
>
> postgres=# select to_tsvector('simple','I know how to code in C#, java and C++');
>                                 to_tsvector
> --------------------------------------------------------------------------
>  'c':7,10 'i':1 'in':6 'to':4 'and':9 'how':3 'code':5 'java':8 'know':2
> (1 row)
>
> I'd like to get the lexemes/tokens 'c#' and 'c++' out of this query. Everything I can find has to do with stop words. How do I keep characters that tsearch throws out? I've already tried 'c\#' and 'c\\#' etc., which don't work.
>
> Kimball
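The substitution Ivan describes can be centralized in a small SQL function applied both when indexing and when rewriting queries. A minimal sketch: the SYScsharp/SYScpp placeholder words are from Ivan's message, but the helper function name is made up here for illustration:

```sql
-- Hypothetical helper: fold symbol-bearing terms into plain words
-- before the text reaches the tsearch parser.  Apply the same mapping
-- to documents (e.g. inside a tsvector trigger) and to user queries.
CREATE OR REPLACE FUNCTION fold_symbols(txt text) RETURNS text AS $$
    SELECT replace(replace($1, 'C++', 'SYScpp'), 'C#', 'SYScsharp');
$$ LANGUAGE sql IMMUTABLE;

-- 'C#' and 'C++' now survive as the 'syscsharp' and 'syscpp' lexemes:
SELECT to_tsvector('english',
                   fold_symbols('I know how to code in C#, java and C++'));
```

Keeping the mapping in one IMMUTABLE function means the index-time and query-time rewrites cannot drift apart.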