Re: [GENERAL] Fault Tolerant Postgresql (two machines, two postmasters, one disk array)

2007-05-19 Thread Joris Dobbelsteen
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Ron Johnson
Sent: Thursday, 17 May 2007 22:56
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Fault Tolerant Postgresql (two 
machines, two postmasters, one disk array)

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 05/17/07 09:35, Andrew Sullivan wrote:
[snip]
 
 The problems come when you get a false detection of machine failure.
 Consider a case, for instance, where machine A gets overloaded,
 goes into swap madness, or has a billion runaway processes that cause
 it to stagger.  In this case, A might not respond in time on the
 heartbeat monitor, and then the standby machine B thinks A has failed.
 But A doesn't know that, of course, because it is working as hard as
 it can just to stay up.  Now, if B mounts the disk and starts the
 postmaster, but doesn't have a way to make _sure_ that A is completely
 disconnected from the disk, then it's entirely possible A will flush
 buffers out to the still-mounted data area.  Poof!
 Instant data corruption.

Aren't there PCI heartbeat cards that are independent of the 
load on the host machine?

A solution commonly seen is to cut the power on the 'failed' machine
just before a takeover is done. Solutions for that are available...

Besides this, a separate PCI heartbeat card won't tell you whether your
software happens to work. It's the same situation with a watchdog: you
don't want the watchdog to 'reset' itself continuously, as you lose the
benefit of the watchdog.

Generally your software should also check whether PostgreSQL is operating
as expected: that it's not stopped or non-responsive. In those cases the
system should fail over. The 'cut power' solution works here too.

For details on how to set this up, look at Heartbeat (www.linux-ha.org)
and search for STONITH. They have lots and lots of very useful information
about high-availability solutions. Furthermore, the package is used around
the world for these solutions by large companies and is part of several
other software packages. It supports Linux and BSD...

- Joris 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] about the pgadmin3

2007-05-19 Thread ee e

My OS is Red Hat AS4 and my PostgreSQL is 7.4. When I use pgAdmin
(Windows version 1.6.3) to connect to the database, the error dialog
tells me pg_user doesn't exist and pg_settings doesn't exist. How shall I
handle the problem?


Re: [GENERAL] Data replication through disk replication

2007-05-19 Thread Andrew Sullivan
On Fri, May 18, 2007 at 05:03:30PM -0700, Ben wrote:

 that all changes are replicated, it won't say an fsync is finished until 
 it's finished on the remote host too, and it won't let you mount the block 
 device on the slave system (at least with 0.7x). 

How can it guarantee these things?  The web pages say this:

If the primary node fails, heartbeat is switching the
secondary device into primary state and starts the
application there. (If you are using it with a non-journaling
FS this involves running fsck)

If the failed node comes up again, it is a new secondary node
and has to synchronise its content to the primary. This, of
course, will happen without interruption of service in the
background.

So what happens in those cases where the primary node gets in trouble
but isn't actually dead yet?  I see a potential for a race condition
here that is really troubling to me.  (Especially since it uses the
TCP/IP stack, which is notoriously subject to DoS on Linux.)  I think
you really had better have something like STONITH running to use
this.

 In case you think it matters, IBM recommends DRBD for DB2.

IBM also sells HACMP.  I refuse to comment on whether IBM's advice on
high availability products is worth taking seriously.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The fact that technology doesn't work is no bar to success in the marketplace.
--Philip Greenspun

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] basic postgres questions...

2007-05-19 Thread woodb
 Hey...

 Can someone point me to the commands I'd use in order to see what
 databases have been created in my Postgres installation? I need to see
 what's there, and then I need to know the command to delete a given
 database and the underlying tables.

From the OS command line, psql -l will list the available Postgres databases.

For general info on this sort of admin task, see
http://www.postgresql.org/docs/8.2/static/managing-databases.html
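
To delete a given database (and with it all of its tables), connect to a
different database such as template1 and drop it; a minimal sketch, with a
hypothetical database name:

DROP DATABASE mydb;

To remove only tables, use DROP TABLE from within the database instead.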


 Also, is there an IRC channel for Postgres?
See: http://www.postgresql.org/community/irc.html

there are several...


Cheers,

  Brent Wood


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] How to implement GOMONTH function

2007-05-19 Thread Andrus

Thank you all very much for the great suggestions.

I created function

CREATE OR REPLACE FUNCTION PUBLIC.GOMONTH(DATE, INTEGER, OUT DATE) IMMUTABLE
AS
$_$
SELECT ($1 + ($2 * '1 MONTH'::INTERVAL))::DATE;
$_$ LANGUAGE SQL;

I got errors:

function gomonth(date, numeric) does not exist

and

function gomonth(date, bigint) does not exist

How can I fix those errors?


Andrus.
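
[A likely fix, sketched: PostgreSQL will not implicitly cast numeric or
bigint arguments down to integer when resolving a function call, so either
cast at the call site or add overloads that delegate to the integer
version.]

SELECT GOMONTH(CURRENT_DATE, 2::INTEGER);

CREATE OR REPLACE FUNCTION PUBLIC.GOMONTH(DATE, NUMERIC, OUT DATE) IMMUTABLE
AS
$_$
SELECT PUBLIC.GOMONTH($1, $2::INTEGER);
$_$ LANGUAGE SQL;

CREATE OR REPLACE FUNCTION PUBLIC.GOMONTH(DATE, BIGINT, OUT DATE) IMMUTABLE
AS
$_$
SELECT PUBLIC.GOMONTH($1, $2::INTEGER);
$_$ LANGUAGE SQL;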


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[GENERAL] tsearch2 keep throw-away characters

2007-05-19 Thread Kimball

postgres=# select to_tsvector('default','I know how to code in C#, java and
C++');
to_tsvector
-
'c':7,10 'code':5 'java':8 'know':2
(1 row)

postgres=# select to_tsvector('simple','I know how to code in C#, java and
C++');
  to_tsvector
-
'c':7,10 'i':1 'in':6 'to':4 'and':9 'how':3 'code':5 'java':8 'know':2
(1 row)


I'd like to get lexemes/tokens 'c#' and 'c++' out of this query.  Everything
I can find has to do with stop words.   How do I keep characters that
tsearch throws out?  I've already tried 'c\#' and 'c\\#' etc, which don't
work.

Kimball


[GENERAL] FULL JOIN is only supported with merge-joinable join conditions

2007-05-19 Thread Andrus

I am trying to port an application to PostgreSQL 8.1+.

The following query runs OK in VFP but causes an error in Postgres:

FULL JOIN is only supported with merge-joinable join conditions

How can I fix this?

Andrus.


SELECT
 ametikoh.Nimetus as ametikoht,
 Isik.nimi,
 Isik.eesnimi,
 koosseis.kogus,
 COALESCE( iandmed.Kuluobj, koosseis.objekt1) as osakond

FROM iandmed
 FULL JOIN koosseis ON iandmed.ametikoht=koosseis.ametikoht
   AND  iandmed.Kuluobj= koosseis.objekt1
   AND iandmed.AmetiKoht is not null
 JOIN ametikoh ON COALESCE(iandmed.ametikoht,koosseis.ametikoht)= 
ametikoh.AmetiKoht

 LEFT JOIN isik ON iandmed.isik=isik.isik
WHERE true
ORDER BY 1,2


Relevant parts of the table structures are:

CREATE TABLE iandmed
(
 reanr integer NOT NULL DEFAULT nextval('iandmed_reanr_seq'::regclass),
 isik character(10) NOT NULL,
 miskuup date,
 plopp date,
 summavrt numeric(12,2),
 kuluobj character(10),
 ametikoht numeric(7),
 CONSTRAINT iandmed_pkey PRIMARY KEY (reanr)
) ;


CREATE TABLE koosseis
(
 id numeric(7) NOT NULL,
 ametikoht numeric(7) NOT NULL,
 objekt1 character(10) NOT NULL,
 kogus numeric(4) NOT NULL DEFAULT 0,
 algus date,
 lopp date,
 CONSTRAINT koosseis_pkey PRIMARY KEY (id)
); 



---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] adding missing FROM-clause entry for table

2007-05-19 Thread Vivian Aguilar
I am porting from 7.4 to 8.1 and I turned off the option to add missing
FROM-clause entries.
Is there a way to know how Postgres rewrites the queries, or to see the
added missing FROM clauses in the Postgres log?


thank you

Vivian
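
[A sketch of one way to see what the server would have done: re-enable the
legacy behavior for a session, and each rewrite is reported as a NOTICE,
which also reaches the server log if log_min_messages permits. Table names
here are hypothetical.]

SET add_missing_from = on;
SELECT * FROM t1 WHERE t2.x = 1;
NOTICE:  adding missing FROM-clause entry for table "t2"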

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] a few questions on backup

2007-05-19 Thread Marco Colombo
Marco Colombo wrote:
 I'll try that out. Maybe my ideas are so far from the truth that I'm
 having a hard time explaining them to people who actually know how
 things work. I'll be back with results. Meanwhile, thanks for your time.

I think I finally got it.

Segment 34 in my pg_xlog got archived and recycled. It became segment
39, but PG is still working on segment 35 after some hours. Now pg_xlog
contains 5 segments, named from 35 to 39, 35 being the most recently
modified. 39 won't be used for about a couple of days yet.

Now I see what you mean by recycled: I thought it meant marked free
for later use, but it means renamed for future use. My mistake was
assuming that the rename part happens lazily when PG starts using the
file. Instead, it happens right after (the eventual) archiving.

That makes the strategy in my original post somewhat unfeasible. Still, I
was not completely wrong:

# cmp /var/lib/pgsql/data/pg_xlog/000100010039 \
    /u1/pg_wal_archive/000100010034 && echo Yes
Yes

They do contain the same data, that of segment 34, and the *39 file will
stay there, untouched, for quite a while after the backup. So the WAL
segment I need *is* there, just with a different name.

The only problem is figuring out what segment that data actually belongs
to. I know only because I can compare it with the archived one. Now, I
could still make some educated guesses, by looking at modification
times, but definitely a guessing game is not something you want to play
when restoring your precious data. :) Archiving the WAL segments and
letting the recovery procedure handle them at restore time is easier anyway.
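
[For reference, the archiving itself is driven by archive_command in
postgresql.conf; a minimal sketch, where the server expands %p to the
segment's path and %f to its file name:

archive_command = 'cp %p /u1/pg_wal_archive/%f' ]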

Again, thanks a lot.

.TM.
-- 
  /  /   /
 /  /   /   Marco Colombo
___/  ___  /   /  Technical Manager
   /  /   /  ESI s.r.l.
 _/ _/  _/ [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] Are foreign key's automatically set as indicies?

2007-05-19 Thread camb
Hey all,
I'm sorta new to PG and am just wondering, when I create a foreign key
is it automatically set as an indexed column or do I need to also set
that manually?

Thanks in advance,
Cam


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


[GENERAL] Committing in a trigger

2007-05-19 Thread Henry

Hi there,

I'm using PG 8.2.3.

Is it possible to (somehow) commit a specific statement in a trigger function 
if the function itself is rolled back because of an error (eg, for a unique 
index error)?

For example:

create table tab1 (col1 int unique);
create table tab2 (col1 int);

CREATE OR REPLACE FUNCTION
f_func1 () RETURNS trigger AS $$
BEGIN
--  ... some processing ... 

INSERT INTO tab2 (col1) VALUES (new.col1);
-- COMMIT the above statement, irrespective of whether this 
-- trigger/function is rolled back or not.

RETURN new;
end;
$$ LANGUAGE 'plpgsql';

CREATE TRIGGER tiu_t1
BEFORE UPDATE OR INSERT ON tab1
FOR EACH ROW
EXECUTE PROCEDURE f_func1();


* * *

Now, if you:

insert into tab1 (col1) values (1);

and tab1 already has a row with col1=1, tab2 must be updated even when the 
statement fails with:

ERROR:  duplicate key violates unique constraint...

I know this can be achieved outside the DB (ie, with checks, etc), but I'd 
like to keep this aspect inside the DB.  Also, I could perform selects inside 
the trigger to pre-empt a unique constraint error, but this will slow the 
inserts down.

I can't wrap BEGIN/COMMIT around the INSERT in the trigger.  Is there another 
way of achieving this?

Any suggestions are appreciated.

Regards
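
[One commonly suggested workaround, sketched: perform the INSERT over a
second connection via the contrib dblink module (assuming it is installed;
the connection string is hypothetical), so it commits independently of the
outer transaction:

-- inside f_func1(), instead of the plain INSERT:
PERFORM dblink_exec('dbname=mydb',
    'INSERT INTO tab2 (col1) VALUES (' || new.col1 || ')');

The dblink INSERT then persists even when the outer statement fails on the
unique constraint -- but also when the transaction aborts for any other
reason.]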

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] change database encoding without corrupting data (latin9 to utf8)

2007-05-19 Thread filippo
I have a problem entering data into a Postgres database (LATIN9) from my
Perl/Tk application running on Windows (UTF8). Whenever I try to enter
letters with accents, they look corrupted once stored in the database.

A workaround is to set client encoding to UTF8 after creating the
database connection:

$dbh->do(qq/SET client_encoding to 'UTF8'/);

To avoid this kind of workaround I'd like to convert the whole
database from LATIN9 to UTF8. How can I do it without corrupting the
data?

Thanks,
Filippo


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Are foreign key's automatically set as indicies?

2007-05-19 Thread Raymond O'Donnell

On 16/05/2007 05:09, camb wrote:


I'm sorta new to PG and am just wondering, when I create a foreign key
is it automatically set as an indexed column or do I need to also set
that manually?


AFAIK you need to create the index separately. I use PgAdmin most of the 
time, which does that by default, so I'm a bit hazy on doing it by hand.


Ray.

---
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
[EMAIL PROTECTED]
---

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Are foreign key's automatically set as indicies?

2007-05-19 Thread PFC

On Wed, 16 May 2007 06:09:15 +0200, camb [EMAIL PROTECTED] wrote:


Hey all,
I'm sorta new to PG and am just wondering, when I create a foreign key
is it automatically set as an indexed column or do I need to also set
that manually?


Primary key creates a unique index.
	Foreign keys do not create any indexes; it is up to you to decide
whether you need indexes or not.


Say :

CREATE TABLE items ( ... category_id integer REFERENCES categories( id ) )

	If you want cascaded deletes/updates to your items table to be fast, or
you need the index for other reasons, create an index on category_id.
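
	For example, a minimal sketch (the index name is arbitrary):

CREATE INDEX items_category_id_idx ON items (category_id);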


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Data replication through disk replication

2007-05-19 Thread Joris Dobbelsteen
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of 
Andrew Sullivan
Sent: Saturday, 19 May 2007 15:28
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Data replication through disk replication

On Fri, May 18, 2007 at 05:03:30PM -0700, Ben wrote:

 that all changes are replicated, it won't say an fsync is finished
 until it's finished on the remote host too, and it won't let you mount
 the block device on the slave system (at least with 0.7x).

How can it guarantee these things?  The web pages say this:

   If the primary node fails, heartbeat is switching the
   secondary device into primary state and starts the
   application there. (If you are using it with a non-journaling
   FS this involves running fsck)

   If the failed node comes up again, it is a new secondary node
   and has to synchronise its content to the primary. This, of
   course, will happen without interruption of service in the
   background.

So what happens in those cases where the primary node gets in 
trouble but isn't actually dead yet?  I see a potential for a 
race condition here that is really troubling to me.  
(Especially since it uses the TCP/IP stack, which is 
notoriously subject to DoS on Linux.)  I think you really had 
better have something like STONITH running to use this.

The general advice you see at linux-ha is to use redundant heartbeat
paths. You can use a serial link if you want to; another option is
redundant networks. This reduces the probability of a split-brain
situation.
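
For example, a minimal sketch of two paths in /etc/ha.d/ha.cf (directive
names per the Linux-HA docs; the device, interface and node names here are
assumptions):

serial /dev/ttyS0   # serial crossover link as one heartbeat path
bcast  eth1         # dedicated network interface as a second path
node   nodea nodeb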

As you guessed, STONITH is very much required to guard against possible
'race' conditions caused by hanging nodes.

As a note, DRBD will also do a lot of work for you and prevents some
mistakes you could make. While starting, it waits if it does not detect
the other node, since it then doesn't know who has the latest data
available; this can be overridden by a timeout if desired. (In practice
only a single node will fail, or both will come up at the same time.)
This prevents running out of sync. It also detects when it is out of
sync, requiring administrator intervention.

Another piece of advice is to take a look at the linux-ha web site and
mailing list. Though the web site might not be the best, the mailing
list is quite active and a lot of knowledge is available.

In general high availability is complex and requires a lot of thought to
cover all possible cases.

[snip]

- Joris Dobbelsteen


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Data replication through disk replication

2007-05-19 Thread Ben
Er, yes, sorry, I didn't mean to imply that you should run without
some kind of STONITH solution, to catch the case when the link DRBD
uses goes down but the other network links are still working fine.
It's in the common case, when everything is working, that DRBD won't
accidentally let you mount the same device on both machines. (In my
experience, operator error is more common than hardware error.)


On May 19, 2007, at 6:28 AM, Andrew Sullivan wrote:


On Fri, May 18, 2007 at 05:03:30PM -0700, Ben wrote:

that all changes are replicated, it won't say an fsync is finished
until it's finished on the remote host too, and it won't let you mount
the block device on the slave system (at least with 0.7x).


How can it guarantee these things?  The web pages say this:

If the primary node fails, heartbeat is switching the
secondary device into primary state and starts the
application there. (If you are using it with a non-journaling
FS this involves running fsck)

If the failed node comes up again, it is a new secondary node
and has to synchronise its content to the primary. This, of
course, will happen without interruption of service in the
background.

So what happens in those cases where the primary node gets in trouble
but isn't actually dead yet?  I see a potential for a race condition
here that is really troubling to me.  (Especially since it uses the
TCP/IP stack, which is notoriously subject to DoS on Linux.)  I think
you really had better have something like STONITH running to use
this.


In case you think it matters, IBM recommends DRBD for DB2.


IBM also sells HACMP.  I refuse to comment on whether IBM's advice on
high availability products is worth taking seriously.

A

--
Andrew Sullivan  | [EMAIL PROTECTED]
The fact that technology doesn't work is no bar to success in the
marketplace.
--Philip Greenspun

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Data replication through disk replication

2007-05-19 Thread Joshua D. Drake

Alvaro Herrera wrote:

Ben wrote:
If you're just looking for a way to have high availability and you're ok 
being tied to linux, DRBD is a good way to go. It keeps things simple in 
that all changes are replicated, it won't say an fsync is finished until 
it's finished on the remote host too,


Oh, so that's how it works.  I assume performance must be, huh, not stellar?


It depends... if the machines are equal, and you have bonded two 
gigabits together


Joshua D. Drake






--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[GENERAL] Against legal disclaimers in email

2007-05-19 Thread Tom Lane
[ redirected for a wider audience ]

Lew [EMAIL PROTECTED] writes:
 John Summerfield wrote:
 Penchalaiah P. wrote:
 Information transmitted by this e-mail is proprietary to Infinite 
 Computer Solutions 
 
 It may be proprietary, but it shore ain't confidential!

 Placing confidential on every document without regard for its
 content, especially when some of it's publicly disseminated, can
 remove the protection of confidentiality at law from all such marked
 documents in many jurisdictions, including the U.S.  There must be
 discrimination applied in the marking of information as
 confidential.

 Quite aside from the foolishness you pointed out of marking something
 confidential then placing it into the public eye in an archived
 forum where it will be visible by everybody forever.

 Now we can publish everything ever written at or by Infinite Computer 
 Solutions without fear of liability.

Wow, that is a great argument that I've not seen before.  The only
way we'll ever make a dent in the plague of legalistic disclaimers
auto-attached to email is if we can convince corporate lawyers that
there is a downside *to them* of insisting on the damned things.
This is the first argument I've seen that there might actually be
a serious legal downside to the practice.

Can you cite chapter and verse to back up this position?  It'd be
great to have a well-reasoned FAQ document that people could just
automatically send to anyone exhibiting this folly ...

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] Performance tuning?

2007-05-19 Thread Robert Fitzpatrick
We have an MS SQL Server db that we successfully migrated to pgsql 8.2,
and I am now working on some views and notice select queries being very
slow. I vacuumed last night, and running the exact same query (with
minor changes to syntax, of course) takes just a few seconds on MS SQL
compared to several minutes in pgsql. Since I've never dealt with MS
SQL, I wanted to ask here whether this is expected performance for this
type of query, *or* do I need to just learn how to properly tune
performance on the pgsql server? Perhaps some commands or tests may help
me determine where the issues lie?

I am running the following query on a Linux server with a processor and
memory comparable to the Windows server. The query was just taken
from the SQL server as is and the syntax adjusted... the query only
returns 3 records, but several tables have tens of thousands of records,
and the tblactivitytag table has over 100K...

SELECT distinct A.fldClientNumber as cNumber, A.fldClientName as cName,
 B.fldContactNumber as contactNumber, B.fldContactCity as cCity,
 B.fldContactState as cState, B.fldContactFirstName as contactFName,
 B.fldContactLastName as contactLName, B.fldContactEmail as ContactEmail,
 B.fldContactTitle as cTitle, B.fldContactPhone1_Num as B1Phonenumber,
 B.fldContactPhone4_Type as Num4Type, B.fldContactPhone4_Num as CellNum
FROM tblClientMaster A, tblContactInfo B,tblClientProductPreference C,
 tblClientRoomSize D,tblProductMaster F, tblClientProductRelation G,
 tblclientcomments H, tblgeopreference E ,tblClientActivityTag
WHERE
 A.fldClientNumber = B.fldClientNumber AND
 A.fldClientNumber = C.fldClientNumber AND
 A.fldClientNumber = D.fldClientNumber AND
 A.fldClientName ilike '%ADVISOR%' AND
 B.fldContactFirstName ilike '%%%' AND
 A.fldClientNumber = G.fldClientNumber AND
 G.fldProductNumber = F.fldProductNumber AND
 F.fldProductName ilike '%%%' AND
 A.fldClientNumber = H.fldClientNumber AND
 H.fldenable = 't' AND
 H.fldcontactnumber = b.fldcontactnumber AND
 H.fldClientcomments ilike '%%%' AND
 (A.fldBuyingStatus = 'Now' ) AND
 (A.fldSellingStatus = 'Now' ) AND
 (C.fldFullService = 't' ) AND
 (D.fldSize149 = 't' ) AND
 (E.fldW = 't' ) AND
 A.fldClientNumber = E.fldClientNumber AND
 A.fldclientnumber = tblClientActivityTag.fldclientnumber AND
 tblClientActivityTag.fldcontactnumber = b.fldcontactnumber AND
 tblClientActivityTag.fldcontactactivitytag like 'A%' AND
 b.fldcontactnumber in (select fldcontactnumber from tblclientcomments where 
tblclientcomments$
 A.fldEnable = 't' AND B.fldEnable = 't'
ORDER BY A.fldClientName, B.fldContactLastName;

-- 
Robert


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Performance tuning?

2007-05-19 Thread Tom Lane
Robert Fitzpatrick [EMAIL PROTECTED] writes:
 I am running the following query on a linux server with comparable
 processor and memory as the windows server.

Show us the table definitions and the EXPLAIN ANALYZE output, please.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] change database encoding without corrupting data (latin9 to utf8)

2007-05-19 Thread Michael Fuhr
On Fri, May 18, 2007 at 02:46:26AM -0700, filippo wrote:
 I have a problem entering data into a Postgres database (LATIN9) from my
 Perl/Tk application running on Windows (UTF8). Whenever I try to enter
 letters with accents, they look corrupted once stored in the database.
 
 A workaround is to set client encoding to UTF8 after creating the
 database connection:
 
 $dbh->do(qq/SET client_encoding to 'UTF8'/);

Workaround has a negative connotation that's perhaps misused in
this case because setting client_encoding is the proper way of
telling the database what the client's encoding is.  If the connecting
role will always use UTF8 then you could use ALTER ROLE (or ALTER
USER in 8.0 and earlier) to automatically set client_encoding for
every connection:

ALTER ROLE rolename SET client_encoding TO 'UTF8';

 To avoid this kind of workaround I'd like to convert the whole
 database from LATIN9 to UTF8. How can I do it without corrupting the
 data?

If all of the data is uncorrupted LATIN9 then you could use pg_dump
to dump the LATIN9 database and then restore it into a UTF8 database.
But if you have a mix of uncorrupted and corrupted characters (UTF8
byte sequences stored as LATIN9) then you have a bit of a problem
because some data needs to be converted from LATIN9 to UTF8 but
other data is already UTF8 and shouldn't be converted.
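
In the all-clean case the conversion itself is short; a sketch from the
shell (pg_dump records the source encoding in the dump, so restoring into
a UTF8 database converts automatically; database names are hypothetical):

createdb -E UTF8 newdb
pg_dump latin9db | psql newdb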

-- 
Michael Fuhr

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Performance tuning?

2007-05-19 Thread Robert Fitzpatrick
On Sat, 2007-05-19 at 17:05 -0400, Tom Lane wrote:
 Robert Fitzpatrick [EMAIL PROTECTED] writes:
  I am running the following query on a linux server with comparable
  processor and memory as the windows server.
 
 Show us the table definitions and the EXPLAIN ANALYZE output, please.
 

Thanks Tom...

---
 Unique  (cost=2326081.07..2354383.40 rows=12445 width=998) (actual time=71931.967..71989.731 rows=3 loops=1)
   ->  Sort  (cost=2326081.07..2328258.17 rows=870841 width=998) (actual time=71931.959..71943.845 rows=9110 loops=1)
         Sort Key: a.fldclientname, b.fldcontactlastname, a.fldclientnumber, b.fldcontactnumber, b.fldcontactcity, b.fldcontactstate, b.fldcontactfirstname, b.fldcontactemail, b.fldcontacttitle, b.fldcontactphone1_num, b.fldcontactphone4_type, b.fldcontactphone4_num
         ->  Merge Join  (cost=55798.98..60543.68 rows=870841 width=998) (actual time=46902.686..70218.041 rows=9110 loops=1)
               Merge Cond: ("outer".fldclientnumber = "inner".fldclientnumber)
               ->  Merge Join  (cost=679.89..4617.75 rows=224283 width=8) (actual time=17.104..74.653 rows=125 loops=1)
                     Merge Cond: ("outer".fldclientnumber = "inner".fldclientnumber)
                     ->  Index Scan using ix_tblgeopreference_fldclientnumber on tblgeopreference e  (cost=0.00..556.87 rows=6699 width=4) (actual time=0.205..56.266 rows=143 loops=1)
                           Filter: (fldw = true)
                     ->  Sort  (cost=679.89..696.63 rows=6696 width=4) (actual time=16.844..17.005 rows=247 loops=1)
                           Sort Key: c.fldclientnumber
                           ->  Seq Scan on tblclientproductpreference c  (cost=0.00..254.39 rows=6696 width=4) (actual time=0.084..15.884 rows=663 loops=1)
                                 Filter: (fldfullservice = true)
               ->  Materialize  (cost=55119.09..55127.13 rows=804 width=1014) (actual time=46827.886..70028.280 rows=9110 loops=1)
                     ->  Merge Join  (cost=53060.03..55118.29 rows=804 width=1014) (actual time=46827.877..69956.976 rows=9110 loops=1)
                           Merge Cond: ("outer".fldclientnumber = "inner".fldclientnumber)
                           ->  Nested Loop  (cost=53060.03..54565.61 rows=24 width=1010) (actual time=37189.898..69232.176 rows=25048 loops=1)
                                 ->  Nested Loop  (cost=53060.03..54420.94 rows=24 width=1014) (actual time=37148.445..67472.468 rows=25048 loops=1)
                                       ->  Nested Loop IN Join  (cost=53060.03..53581.73 rows=1 width=1006) (actual time=37129.788..66642.591 rows=1017 loops=1)
                                             Join Filter: ("inner".fldcontactnumber = "outer".fldcontactnumber)
                                             ->  Nested Loop  (cost=53060.03..53565.72 rows=1 width=1040) (actual time=36584.031..37402.166 rows=1017 loops=1)
                                                   Join Filter: ("outer".fldcontactnumber = ("inner".fldcontactnumber)::numeric)
                                                   ->  Merge Join  (cost=53060.03..53087.19 rows=1 width=210) (actual time=36561.298..36603.979 rows=1873 loops=1)
                                                         Merge Cond: (("outer".fldclientnumber = "inner".fldclientnumber) AND ("outer".fldcontactnumber = "inner".fldcontactnumber))
                                                         ->  Sort  (cost=50577.52..50585.04 rows=3008 width=189) (actual time=36156.473..36159.932 rows=6167 loops=1)
                                                               Sort Key: a.fldclientnumber, h.fldcontactnumber
                                                               ->  Nested Loop  (cost=0.00..50403.74 rows=3008 width=189) (actual time=6.180..36110.024 rows=6167 loops=1)
                                                                     Join Filter: (("outer".fldclientnumber)::numeric = "inner".fldclientnumber)
                                                                     ->  Seq Scan on tblclientmaster a  (cost=0.00..728.70 rows=1 width=172) (actual time=0.680..197.224 rows=4 loops=1)
                                                                           Filter: (((fldclientname)::text ~~* '%ADVISOR%'::text) AND ((fldbuyingstatus)::text = 'Now'::text) AND ((fldsellingstatus)::text = 'Now'::text) AND (fldenable = true))
                                                                     ->  Seq Scan on tblclientcomments h  (cost=0.00..40651.36 rows=601579 width=34) (actual time=0.019..7026.388 rows=1202169 loops=4)
                                                                           Filter: ((fldenable = true) AND

Re: [GENERAL] Performance tuning?

2007-05-19 Thread Tom Lane
Robert Fitzpatrick [EMAIL PROTECTED] writes:
 On Sat, 2007-05-19 at 17:05 -0400, Tom Lane wrote:
 Show us the table definitions and the EXPLAIN ANALYZE output, please.

There seem to be a couple of problems visible in the EXPLAIN output:

  ->  Nested Loop  (cost=53060.03..53565.72 rows=1 width=1040) (actual time=36584.031..37402.166 rows=1017 loops=1)
        Join Filter: ("outer".fldcontactnumber = ("inner".fldcontactnumber)::numeric)

  ->  Nested Loop  (cost=0.00..50403.74 rows=3008 width=189) (actual time=6.180..36110.024 rows=6167 loops=1)
        Join Filter: (("outer".fldclientnumber)::numeric = "inner".fldclientnumber)

You're comparing fields of distinct types, which not only incurs
run-time type conversions but can interfere with the ability to
use some plan types at all.  Looking at the table definitions,
you've got primary keys declared as SERIAL (ie, integer) and the
referencing columns declared as NUMERIC(18,0).  This is just horrid for
performance :-( --- NUMERIC arithmetic is pretty slow, and it's really
pointless when the referenced columns are only integers.  I suspect
you should have translated these column types as BIGINT (and BIGSERIAL).
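
For instance, a sketch of the fix for one such referencing column (ALTER
TABLE can rewrite the column type in place in 8.0 and later; check
dependent views, indexes and foreign keys first):

ALTER TABLE tblclientcomments
    ALTER COLUMN fldclientnumber TYPE bigint
    USING fldclientnumber::bigint;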

    ->  Merge Join  (cost=53060.03..53087.19 rows=1 width=210) (actual time=36561.298..36603.979 rows=1873 loops=1)
          Merge Cond: (("outer".fldclientnumber = "inner".fldclientnumber) AND ("outer".fldcontactnumber = "inner".fldcontactnumber))

The planner is drastically underestimating the number of rows out of
this join, probably because it does not know that there is any
correlation between fldclientnumber and fldcontactnumber, where in
reality I bet there's a lot.  Is it possible that one of these fields is
actually functionally dependent on the other, such that you could use
just one of them in the join?  The one-result-row estimate is bad
because it leads to inappropriate choices of nestloop joins.

There may not be much you can do about that part, but I suspect if you
get rid of the pointless use of NUMERIC arithmetic you'll find a lot
of the performance issue goes away.

Another thing that might be worth fixing is the rather silly use of '%%%'
rather than '%' for a no-op LIKE pattern.  It looks like the planner's
LIKE-estimator gets fooled by that and doesn't realize it's a
match-everything pattern.  (Yeah, we should fix that, but it won't
help you today...)  Again, underestimating the number of rows is bad
for the quality of the plan.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Performance tuning?

2007-05-19 Thread Tom Lane
I wrote:
 Another thing that might be worth fixing is the rather silly use of '%%%'
 rather than '%' for a no-op LIKE pattern.  It looks like the planner's
 LIKE-estimator gets fooled by that and doesn't realize it's a
 match-everything pattern.

Uh, scratch that advice, I fat-fingered my test.  It does seem to
estimate that '%%%' matches every row.

But that leads into another question, because some of the scan estimates
are further off than one would like:

  ->  Seq Scan on tblclientcomments  (cost=0.00..40651.36 rows=601579 width=17) (actual time=0.014..17.342 rows=6912 loops=1017)
        Filter: (((fldproductcode)::text ~~* '%%%'::text) AND (fldenable = true))

I had thought that this was explained by a bad LIKE estimate but that
seems not the case, which means that your statistics for fldenable must
be way off.   Have you ANALYZEd these tables since loading the data?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Performance tuning?

2007-05-19 Thread Robert Fitzpatrick
On Sat, 2007-05-19 at 19:19 -0400, Tom Lane wrote:
 You're comparing fields of distinct types, which not only incurs
 run-time type conversions but can interfere with the ability to
 use some plan types at all.  Looking at the table definitions,
 you've got primary keys declared as SERIAL (ie, integer) and the
 referencing columns declared as NUMERIC(18,0).  This is just horrid for
 performance :-( --- NUMERIC arithmetic is pretty slow, and it's really
 pointless when the referenced columns are only integers.  I suspect
 you should have translated these column types as BIGINT (and
 BIGSERIAL). 

Thanks again, I'll be sure to get this straightened out and tested again
tomorrow. I thought my nightly backup was analyze'ing the database
afterward, I'll be sure to check that as well.

I really appreciate your analysis! It is my first migration from another
SQL database.

-- 
Robert


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] FULL JOIN is only supported with merge-joinable join conditions

2007-05-19 Thread Tom Lane
Andrus [EMAIL PROTECTED] writes:
 I try to port application to PostgreSQL 8.1+
 The following query runs OK in VFP but causes error in Postgres
 FULL JOIN is only supported with merge-joinable join conditions

 SELECT
 ...
 FROM iandmed
   FULL JOIN koosseis ON iandmed.ametikoht=koosseis.ametikoht
 AND  iandmed.Kuluobj= koosseis.objekt1
 AND iandmed.AmetiKoht is not null

Uh, can't you just drop the iandmed.AmetiKoht is not null condition?
It seems redundant considering that iandmed.ametikoht=koosseis.ametikoht
isn't going to succeed when ametikoht is null.
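
That is, the join clause reduces to two plain, merge-joinable equalities:

FROM iandmed
  FULL JOIN koosseis ON iandmed.ametikoht = koosseis.ametikoht
                    AND iandmed.Kuluobj = koosseis.objekt1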

In the long run we should teach hash join to support full-join behavior,
which would allow cases like this one to work; but it seems not very
high priority, since I've yet to see a real-world case where a
non-merge-joinable full-join condition was really needed.  (FULL JOIN
being inherently symmetric, the join condition should usually be
symmetric as well...)

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] tsearch2 keep throw-away characters

2007-05-19 Thread Ivan Zolotukhin

Hello,

Your problem is not about stop words; it's about the fact that the tsearch
parser treats the '+' and '#' symbols as lexemes of a blank type (use the
ts_debug() function to see this) and drops them without any further
processing. AFAIK, the typical solution for this is to rewrite your text,
and then your queries, to use auxiliary words, like 'SYScpp' and
'SYScsharp', that will be included in tsvectors and indexed without
any problems. Usually you do the replacements in a tsvector trigger when
indexing documents, and via query rewriting (in tsearch or your
application) when querying the database.

Trivial examples:

test=# select to_tsvector('english','I know how to code in SYScsharp,
java and SYScpp');
to_tsvector
--
'code':5 'java':8 'know':2 'syscpp':10 'syscsharp':7
(1 row)

and, sure:

test=# select 'I know how to code in SYScsharp, java and SYScpp' @@ 'SYScpp';
?column?
--
t
(1 row)
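
The substitution itself can be done with nested replace() calls before
indexing; a sketch, assuming a hypothetical documents table with a body
text column and a body_tsv tsvector column:

UPDATE documents
   SET body_tsv = to_tsvector('english',
         replace(replace(body, 'C++', 'SYScpp'), 'C#', 'SYScsharp'));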

There might be a more sophisticated solution, like preventing the parser
from treating '++' as a blank lexeme, but Oleg will explain this much
better as soon as he has time.

--
Regards,
Ivan


On 5/16/07, Kimball [EMAIL PROTECTED] wrote:


postgres=# select to_tsvector('default','I know how to code in C#, java and
C++');
  to_tsvector
-
 'c':7,10 'code':5 'java':8 'know':2
 (1 row)

postgres=# select to_tsvector('simple','I know how to code in C#, java and
C++');
   to_tsvector
-
 'c':7,10 'i':1 'in':6 'to':4 'and':9 'how':3 'code':5 'java':8 'know':2
(1 row)


I'd like to get lexemes/tokens 'c#' and 'c++' out of this query.  Everything
I can find has to do with stop words.   How do I keep characters that
tsearch throws out?  I've already tried 'c\#' and 'c\\#' etc, which don't
work.

Kimball


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/