Re: [GENERAL] Implementing a change log

2005-09-20 Thread Berend Tober

Greg Sabino Mullane wrote:



My original intention was to keep two sets of tables. The first
containing only the working set of current records. The second
containing all prior versions. I haven't experimented with such a setup
yet and I'm wondering if it is even necessary. The alternative being to
keep only a single set of tables.

Can anyone relate their experiences with such a thing? Which approaches
should I take into consideration?



I like the multi-table approach; I use a schema named audit that contains
a copy of some of the important tables (sans constraints). The nice part is
that I can use the exact same table name, which makes things easier. A few
extra columns on each audit table track who made the change, what type it
was (insert, update, or delete [trigger event]), and the time of the change
[default timestamptz]. Throw in some triggers and you're done.
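A minimal sketch of that trigger approach, for the archives (the schema,
table, and column names here are invented for illustration, not Greg's
actual code; it covers UPDATE and DELETE only):

CREATE SCHEMA audit;

-- audit copy of the table, sans constraints (LIKE copies columns only)
CREATE TABLE audit.mytable (LIKE public.mytable);
ALTER TABLE audit.mytable
    ADD COLUMN change_type VARCHAR(6)  NOT NULL,
    ADD COLUMN changed_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    ADD COLUMN changed_by  NAME        NOT NULL DEFAULT current_user;

CREATE OR REPLACE FUNCTION audit_mytable() RETURNS trigger AS $$
BEGIN
    -- OLD holds the pre-change row image for both UPDATE and DELETE
    INSERT INTO audit.mytable SELECT OLD.*, TG_OP, now(), current_user;
    RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_audit AFTER UPDATE OR DELETE ON public.mytable
    FOR EACH ROW EXECUTE PROCEDURE audit_mytable();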

 

There was a very exciting discussion of this last May, in which Greg 
Patnude suggested the most insightful, and in hindsight obviously 
appropriate, use of table inheritance ever (IMHO). I slightly refined 
the idea and posted documentation comments at the time. See User 
Comments at


http://www.postgresql.org/docs/8.0/interactive/tutorial-inheritance.html

for something that should set you afire.




Re: [GENERAL] Implementing a change log

2005-09-20 Thread Berend Tober

Berend Tober wrote:


...See User Comments at

http://www.postgresql.org/docs/8.0/interactive/tutorial-inheritance.html



for something that should set you afire.


And, commenting on my own post, try this cool function:

/*
The following is based on suggestion by Mike Rylander posted on 
Postgresql-General
Sun, 18 Sep 2005 23:29:51 + 


Rylander's original suggestion employed a trigger and tracked
only row updates. My implementation makes use of rules and
handles both updates and deletions.
*/

\o output.txt
\set ON_ERROR_STOP OFF

DROP SCHEMA auditor CASCADE;
DROP SCHEMA test CASCADE;

\set ON_ERROR_STOP ON

-- Create a schema to contain all of our audit tables and the creator function

CREATE SCHEMA auditor;

CREATE OR REPLACE FUNCTION auditor.create_auditor(name, name)
 RETURNS bool AS
'
BEGIN
   -- This is the function that does the heavy lifting of creating
   -- audit tables and the rules that will populate them.

   -- Create the audit table: auditor.{schema}_{table}
   EXECUTE \'
   CREATE TABLE auditor.\' || $1 || \'_\' || $2 || \' (
       update_action VARCHAR(6) NOT NULL,
       update_date TIMESTAMP NOT NULL DEFAULT NOW(),
       update_user NAME NOT NULL DEFAULT CURRENT_USER
   ) INHERITS (\' || $1 || \'.\' || $2 || \') WITHOUT OIDS;
   \';

   -- Capture the pre-update row image
   EXECUTE \'
   CREATE RULE \' || $2 || \'_ru AS ON UPDATE TO \' || $1 || \'.\' || $2 || \'
       DO INSERT INTO auditor.\' || $1 || \'_\' || $2 || \'
          SELECT OLD.*, \'\'UPDATE\'\';
   \';

   -- Capture the deleted row image
   EXECUTE \'
   CREATE RULE \' || $2 || \'_rd AS ON DELETE TO \' || $1 || \'.\' || $2 || \'
       DO INSERT INTO auditor.\' || $1 || \'_\' || $2 || \'
          SELECT OLD.*, \'\'DELETE\'\';
   \';

   RETURN TRUE;
END;
'
 LANGUAGE 'plpgsql' VOLATILE;



/* BEGIN EXAMPLE */

CREATE SCHEMA test AUTHORIZATION postgres;

-- This option makes it unnecessary to use the ONLY keyword in your
-- SELECT and UPDATE statements.
SET SQL_INHERITANCE TO OFF;

SET search_path = test, pg_catalog;
SET default_with_oids = false;

CREATE TABLE test.person (
   first_name character varying(24),
   last_name character varying(24),
   gender character(1),
   marital_status character(1)
) WITHOUT OIDS;

INSERT INTO test.person VALUES ('Charlie', 'Bucket', 'M', 'S');
INSERT INTO test.person VALUES ('Grandpa', 'Joe', 'M', NULL);
INSERT INTO test.person VALUES ('Veruca', 'Salt', NULL, 'S');
INSERT INTO test.person VALUES ('Augustus', 'Gloop', 'M', 'S');
INSERT INTO test.person VALUES ('Micheal', 'Teevee', 'M', 'S');
INSERT INTO test.person VALUES ('Violet', 'Beaureguard', 'M', 'S');

SELECT auditor.create_auditor('test', 'person');

UPDATE test.person set marital_status = 'M' WHERE last_name = 'Joe';
SELECT * FROM auditor.test_person;
/*
 first_name | last_name | gender | marital_status | update_action |        update_date         | update_user
------------+-----------+--------+----------------+---------------+----------------------------+-------------
 Grandpa    | Joe       | M      |                | UPDATE        | 2005-09-20 03:26:23.063965 | postgres
(1 row)
*/

UPDATE test.person set first_name = 'Joe', last_name = 'Bucket' WHERE last_name = 'Joe';

SELECT * FROM auditor.test_person;
/*
 first_name | last_name | gender | marital_status | update_action |        update_date         | update_user
------------+-----------+--------+----------------+---------------+----------------------------+-------------
 Grandpa    | Joe       | M      |                | UPDATE        | 2005-09-20 03:26:23.063965 | postgres
 Grandpa    | Joe       | M      | M              | UPDATE        | 2005-09-20 03:26:23.13654  | postgres
(2 rows)
*/

UPDATE test.person set gender = 'F' WHERE last_name = 'Salt';
SELECT * FROM auditor.test_person;
/*
 first_name | last_name | gender | marital_status | update_action |        update_date         | update_user
------------+-----------+--------+----------------+---------------+----------------------------+-------------
 Grandpa    | Joe       | M      |                | UPDATE        | 2005-09-20 03:26:23.063965 | postgres
 Grandpa    | Joe       | M      | M              | UPDATE        | 2005-09-20 03:26:23.13654  | postgres
 Veruca     | Salt      |        | S              | UPDATE        | 2005-09-20 03:26:23.175714 | postgres
(3 rows)
*/

DELETE FROM test.person WHERE last_name = 'Salt';
SELECT * FROM auditor.test_person;
/*
 first_name | last_name | gender | marital_status | update_action |        update_date         | update_user
------------+-----------+--------+----------------+---------------+----------------------------+-------------
 Grandpa    | Joe       | M      |                | UPDATE        | 2005-09-20 03:26:23.063965 | postgres
 Grandpa    | Joe       | M      | M              | UPDATE        | 2005-09-20 03:26:23.13654  | postgres
 Veruca     | Salt      |        | S              | UPDATE        | 2005-09-20 03:26:23.175714 | postgres
 Veruca     | Salt      | F      | S              | DELETE        |
[GENERAL] shared Locks

2005-09-20 Thread Daniel Schuchardt

Hi group,

I have the following problem:

We have developed an ERP/PPS system with pgsql over the last 4 years.
Now we are introducing it at some of our customers {pgsql works great and
gets good ratings :-)} and so we have to change table structures and so
on very often. For technology reasons, every client starts a connection
and holds it until the client terminates its program.


So if we want to change a table structure (add a field or something like
this), many clients own AccessShareLocks, because it seems that a simple
SELECT * FROM table will take an AccessShareLock and not release it until
the connection is terminated. Is that true? Is it possible to release
this lock without a disconnect? {The problem is that about 30 users have
to disconnect sometimes. :-( }



thnx for comments,
Daniel



Re: [GENERAL] Blob data type and it's efficiency on PostgreSQL

2005-09-20 Thread Daniel Schuchardt

Stas Oskin schrieb:


Hi.

 

We are using PostgreSQL as the RDBMS for our product, and are very 
happy with it. Recently, we have encountered a need to store a lot of 
binary files, mainly images (up to ~100,000 files, with sizes varying 
from 300K-2MB).


 

The question is, how well PostgreSQL performs with the blob data type, 
and is it practical to store these files as blobs?


 


Thanks in advance,

Stas Oskin.

We save binary data in pgsql without problems (especially icons, report
definitions, images, ...). Be careful with dump and restore, because it
is a bit complicated to work with blobs here.


Daniel
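For the archives, a minimal sketch of the inline bytea route Daniel
describes (the table and column names are invented): bytea values live in
the table itself, so an ordinary pg_dump picks them up, whereas large
objects need pg_dump's -b/--blobs switch (and, with the 8.0-era tools, a
non-plain-text archive format):

CREATE TABLE image_store (
    image_id   SERIAL PRIMARY KEY,
    file_name  TEXT  NOT NULL,
    image_data BYTEA NOT NULL   -- stored inline, dumped like any other column
);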



Re: [GENERAL] shared Locks

2005-09-20 Thread Martijn van Oosterhout
On Tue, Sep 20, 2005 at 11:18:48AM +0200, Daniel Schuchardt wrote:
 So if we want to change a table structure (add a field or something like
 this), many clients own AccessShareLocks, because it seems that a simple
 SELECT * FROM table will take an AccessShareLock and not release it until
 the connection is terminated. Is that true? Is it possible to release
 this lock without a disconnect? {The problem is that about 30 users have
 to disconnect sometimes. :-( }

I think you'll find that locks are held to the end of the transaction.
You're not holding a transaction open but not doing anything, are you?

Otherwise, provide an example including output of pg_locks.
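For example, something along these lines shows who holds which lock on a
given table (a sketch against the standard pg_locks view; substitute your
table name):

SELECT c.relname, l.pid, l.mode, l.granted
  FROM pg_locks l
  JOIN pg_class c ON c.oid = l.relation
 WHERE c.relname = 'your_table';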

Hope this helps,

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [GENERAL] shared Locks

2005-09-20 Thread Daniel Schuchardt

Martijn van Oosterhout schrieb:


On Tue, Sep 20, 2005 at 11:18:48AM +0200, Daniel Schuchardt wrote:
 

So if we want to change a table structure (add a field or something like
this), many clients own AccessShareLocks, because it seems that a simple
SELECT * FROM table will take an AccessShareLock and not release it until
the connection is terminated. Is that true? Is it possible to release
this lock without a disconnect? {The problem is that about 30 users have
to disconnect sometimes. :-( }
   



I think you'll find that locks are held to the end of the transaction.
You're not holding a transaction open but not doing anything, are you?
 

Yes, you're right here. Because we use cursor fetches, every statement
starts a transaction. So you're right; I tested it, and this holds a table
lock. Hm... I will look at how to do this in another way.


thnx,
Daniel



Re: [GENERAL] shared Locks

2005-09-20 Thread Martijn van Oosterhout
On Tue, Sep 20, 2005 at 12:01:46PM +0200, Daniel Schuchardt wrote:
 Martijn van Oosterhout schrieb:
 I think you'll find that locks are held to the end of the transaction.
 You're not holding a transaction open but not doing anything, are you?
  
 
 Yes, you're right here. Because we use cursor fetches, every statement
 starts a transaction. So you're right; I tested it, and this holds a table
 lock. Hm... I will look at how to do this in another way.

Just COMMIT when you're done. This does kill the cursor though...

If you put a timeout in your app so that it commits that transaction
after, say, 30 seconds idle then your ALTER commands will only wait for
a while. Although, your ALTER will in turn block the following users...

Is your biggest problem that people tend to leave connections open
overnight or something? A simple timeout would work fine if you only
want to make changes when there are just a few active users.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [GENERAL] shared Locks

2005-09-20 Thread Daniel Schuchardt

Martijn van Oosterhout schrieb:



Yes, you're right here. Because we use cursor fetches, every statement
starts a transaction. So you're right; I tested it, and this holds a table
lock. Hm... I will look at how to do this in another way.
   



Just COMMIT when you're done. This does kill the cursor though...

If you put a timeout in your app so that it commits that transaction
after, say, 30 seconds idle then your ALTER commands will only wait for
a while. Although, your ALTER will in turn block the following users...

Is your biggest problem that people tend to leave connections open
overnight or something? I simple timeout would work fine if there you
only want to make changes when there are just a few active users.
 

Yes, I will try it this way. There are some other problems: some
connections collect CNC-center and other machine data all the time,
so it's not that easy at all. But I will try it with a commit on idle,
and a reconnect for the 24h connections. (They sleep most of the time,
so it would be a better technique to terminate the connection and
reconnect every 5 minutes. It would save resources too.)


Daniel



Re: [GENERAL] Implementing a change log

2005-09-20 Thread Mike Rylander
On 9/20/05, Berend Tober [EMAIL PROTECTED] wrote:
 /*
 The following is based on suggestion by Mike Rylander posted on
 Postgresql-General
 Sun, 18 Sep 2005 23:29:51 +
 
 Rylander's original suggestion employed a trigger and tracked
 only row updates. My implementation makes use of rules and
 handles both updates and deletions.
 */

I'm glad that was inspirational for you.  Not to nitpick, but my
trigger-based version did in fact track deletions:

CREATE TRIGGER audit_$$ || sch || $$_$$ || tbl || $$_update_trigger
AFTER UPDATE OR DELETE ON $$ || sch || $$.$$ || tbl || $$ FOR EACH ROW
  ^^^
EXECUTE PROCEDURE auditor.audit_$$ || sch || $$_$$ || tbl || $$_func ();

You may want to consider using the LIKE style of table copying, as it
strips all constraints from the new table.  It's safer IMHO, as this
way you wouldn't have to worry about the primary key being propagated
to the new table (and accidentally forgetting to remove it).
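For the audit table from earlier in the thread, the LIKE variant would
look something like this (a sketch; LIKE copies only the column
definitions and NOT NULL constraints, so no primary key comes across):

CREATE TABLE auditor.test_person (
    LIKE test.person,
    update_action VARCHAR(6) NOT NULL,
    update_date   TIMESTAMP  NOT NULL DEFAULT now(),
    update_user   NAME       NOT NULL DEFAULT current_user
) WITHOUT OIDS;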

-- 
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org



[GENERAL] running vacuum in scripts

2005-09-20 Thread Belinda M. Giardine
This seems like it should be a frequently asked question, but I am having
trouble finding the answer.  I am in the process of switching to using
Postgres, and realize that I need to run vacuum analyze regularly on the
tables.  This is on a Unix system so cron is the obvious choice.  The
problem is I don't want to put the user name and password in the script.
As far as I can tell vacuum must be run by the table or database owner.
It wouldn't be as bad to have the password in the script if it was a
limited permissions user.  Any suggestions on the best methods?

Thanks,
Belinda




Re: [GENERAL] Implementing a change log

2005-09-20 Thread Berend Tober

Mike Rylander wrote:


On 9/20/05, Berend Tober [EMAIL PROTECTED] wrote:
 


/*
The following is based on suggestion by Mike Rylander posted on
Postgresql-General
Sun, 18 Sep 2005 23:29:51 +

Rylander's original suggestion employed a trigger and tracked
only row updates. My implementation makes use of rules and
handles both updates and deletions.
*/
   



I'm glad that was inspirational ...
 



That was indeed pretty cool.


...did in fact track deletions:
 



Guess I was too excited to actually read the whole thing more closely 
once I grasped the direction you were going!!



You may want to consider using the LIKE style of table copying, as it
strips all constraints from the new table.  It's safer IMHO, as this
way you wouldn't have to worry about the primary key being propagated
to the new table (and accidentally forgetting to remove it).
 

I'm glad you pointed that out, because you reminded me that when I tried 
the original idea from Greg Patnude in Mar 2005 using inheritance, I did 
indeed run into a problem with constraints. The problem there, I think, 
was that I had a check constraint on the table for which I created the 
audit log table, but the check constraint was defined in a different 
schema than the original table. Something about the way inherited-table 
creation works makes this a problematic situation. I'll have to 
revisit that and see if using LIKE overcomes the problem.


I guess my original thought in using INHERIT rather than LIKE was that, 
having the audit history, I might at some point present a select view 
across both the base and descendant tables or something (...if you 
record it, they (PHB's) will eventually ask for a report on it...), but 
I haven't actually had an implementation where such an audit history 
table was actually required in production -- I'm just exercising the 
functionality and exploring the quirks in order to be prepared for when 
such a requirement is actually promulgated.


Any other significant distinguishing features of INHERIT versus LIKE for 
this kind of use that you (or others) can think of?





Re: [GENERAL] Implementing a change log

2005-09-20 Thread Michael Schuerig
On Tuesday 20 September 2005 08:44, Berend Tober wrote:
 Greg Sabino Mullane wrote:

[MS: on audit tables]
 Can anyone relate their experiences with such a thing? Which
  approaches should I take into consideration?
 
 I like the multi-table approach; I use a schema named audit that
  contains a copy of some of the important tables (sans constraints).
  The nice part is that I can use the exact same table name, which
  makes things easier. A few extra columns on each audit table track
  who made the change, what type it was (insert, update, or delete
  [trigger event]), and the time of the change [default timestamptz].
  Throw in some triggers and you're done.

 There was a very exciting discussion of this last May, in which Greg
 Patnude suggested the most insightful, and in hindsight obviously
 appropriate, use of table inheritance ever (IMHO). I slightly refined
 the idea and posted documentation comments at the time. See User
 Comments at

 http://www.postgresql.org/docs/8.0/interactive/tutorial-inheritance.html

Looks interesting, but I'm hesitant to use the approach as I don't (yet) 
understand it well enough. Also, I'm using the Rails framework and I 
don't think it would play well with too much intelligence in the DB.

The consequence is that for my current project I'll do the change 
logging at the application level. However, this doesn't preclude learning 
more about programming PgSQL and possibly writing an extension for 
Rails later on.

Regarding the former, learning about programming on the DB server, my 
impression is that documentation is scarce. There is reference 
documentation, of course, and the book by Douglas & Douglas devotes a 
few chapters to the topic. Still, coverage appears pretty introductory. 
In comparison, there's a wealth of books on everything above the 
database: languages, design and architecture, frameworks. Literature on 
database-server programming seems to be tied to specific products, 
mostly Oracle. I'd like to be wrong, of course. What are good ways to 
learn about these things in a free database world?

Michael

-- 
Michael Schuerig  Failures to use one's frontal lobes
mailto:[EMAIL PROTECTED]can result in the loss of them.
http://www.schuerig.de/michael/   --William H. Calvin



Re: [GENERAL] running vacuum in scripts

2005-09-20 Thread Csaba Nagy
You could use the system wide crontab, which is configured via
/etc/crontab, and there you can specify what user should execute the
command, without needing passwords. The system wide crontab is executed
as root and it will su to the user you specify.
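For example, a line along these lines in /etc/crontab (a sketch; the
schedule and user name are placeholders, and vacuumdb must be on the
cron PATH):

# m h dom mon dow user      command
0 3 *   *   *    postgres  vacuumdb --all --analyze --quiet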

For vacuuming, you could try to use the autovacuum daemon, it is
included in the contrib part of postgres. If you installed from source,
you will likely need to separately install autovacuum, if you installed
a prepackaged postgres, chances are that you already have the
pg_autovacuum executable installed. It is fairly easy to set up.

HTH,
Csaba.


On Mon, 2005-09-19 at 20:27, Belinda M. Giardine wrote:
 This seems like it should be a frequently asked question, but I am having
 trouble finding the answer.  I am in the process of switching to using
 Postgres, and realize that I need to run vacuum analyze regularly on the
 tables.  This is on a Unix system so cron is the obvious choice.  The
 problem is I don't want to put the user name and password in the script.
 As far as I can tell vacuum must be run by the table or database owner.
 It wouldn't be as bad to have the password in the script if it was a
 limited permissions user.  Any suggestions on the best methods?
 
 Thanks,
 Belinda
 
 




Re: [GENERAL] running vacuum in scripts

2005-09-20 Thread John DeSoi


On Sep 19, 2005, at 2:27 PM, Belinda M. Giardine wrote:

As far as I can tell vacuum must be run by the table or database  
owner.

It wouldn't be as bad to have the password in the script if it was a
limited permissions user.  Any suggestions on the best methods?


Set up a .pgpass file so you don't need to embed the password in the
script:


http://www.postgresql.org/docs/8.0/interactive/libpq-pgpass.html
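The file is one connection per line, and must be mode 0600 or libpq will
ignore it (the values below are placeholders):

# ~/.pgpass -- hostname:port:database:username:password
localhost:5432:mydb:vacuum_user:secret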


John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL




Re: [GENERAL] Backup and Restore mechanism in Postgres

2005-09-20 Thread Vivek Khera


On Sep 14, 2005, at 9:45 AM, vinita bansal wrote:

I have a 4 proc. AMD Opteron machine with 32 GB RAM and ~400GB HDD  
and a 40GB database. I need to take backup of this database and  
restore it some other location (say some test environment). I am  
currently using pg_dump and pg_restore utilities to get this done  
which takes 4-5 hrs for a dump and 8-9 hrs for restore  
respectively. I am using custom format for taking dumps.




I'll bet you've saturated your disk I/O bandwidth, since for me
dumping a db a bit larger than that takes roughly 1 hour, and a restore
about 4.


You could also investigate making a copy using a replication system
like Slony (http://slony.info), then, once the copy is made, turning off
the replication.


Vivek Khera, Ph.D.
+1-301-869-4449 x806





Re: [GENERAL] Problem with 64-bit Postgres

2005-09-20 Thread Vivek Khera


On Sep 15, 2005, at 9:56 AM, Peter Alberer wrote:

I compiled postgres in 64-bit mode by adding the following switches  
in the

make file:
To the gcc lines: -m64 -mcpu=power5 -mtune=power5
To ld lines: -m elf64ppc



FWIW Postgres works splendidly in 64-bit mode on FreeBSD on Opteron  
systems.


Perhaps it is a bug in the power5 gcc compiler?  Turn off machine  
specific tuning and see what happens.


Vivek Khera, Ph.D.
+1-301-869-4449 x806





Re: [GENERAL] BIG installations of PostgresQL?

2005-09-20 Thread Vivek Khera
On Sep 15, 2005, at 11:25 PM, Logan Bowers wrote:

 Many of the other folks at the company feel an open-source DB is more
 risky because it is less well tested compared to commercial counterparts.
 I'm looking for examples of large installations of Postgres with huge
 data sets, high traffic volumes, high update rates, etc, particularly
 large, recognizable names. I know you guys get this question a lot, but
 can anyone share experiences with using Postgres in large settings with
 huge query rates, replication, etc? Any stories you guys can share (in
 public or private) would be greatly appreciated. Thanks in advance!

Every so often on the slony (Postgres replication) list, a fellow from
Skype pops up. I suspect they use postgres under high volume, huge
dataset situations...

Also, do a "whois slony.info" or any *.info domain and you hit a
replicated postgres database. The entire .info (and now .org) domain
registries are stored on postgres.

Vivek Khera, Ph.D.
+1-301-869-4449 x806

Re: [GENERAL] Replication

2005-09-20 Thread Scott Ribe
 Indeed. But just to stress the point, I wasn't stating that the included
 replication in MySQL was any good (though it's not terrible as we're
 using it heavily in an extremely high-volume situation with few
 problems), I was just bringing up the idea of getting a decent
 replication solution included in PostgreSQL for relatively little effort.

No, but IIRC, you didn't state that it was a substandard solution, and, also
IIRC, it really sounded as though you believed it was a good one.

 So, feel free to mention MySQL, but know that mostly when it's mentioned
 here, it's mentioned as an example of how things shouldn't be done.  In
 terms of coding, marketing, testing, or licensing.
 
 
 I think in future I'll just stick to not mentioning it. :)

Probably not necessary; just make it clear whether you're saying "MySQL
claims... and it would be good for Postgres to have its own solution" or
"MySQL has an actual working full-blown good solution for... that Postgres
would do well to emulate."


-- 
Scott Ribe
[EMAIL PROTECTED]
http://www.killerbytes.com/
(303) 665-7007 voice





[GENERAL] deactivating/activating constraint

2005-09-20 Thread Oleg

Dear All,
is it possible to temporarily deactivate a constraint in PostgreSQL?
There is a constraint that does not allow me to write some data (see 
e-mail below). But after all datasets are written the constraint is 
satisfied. So I was wondering whether it is possible to deactivate a 
constraint, write all records in all tables, then activate the constraint 
again. Somebody told me that it is possible in Oracle.

Thanks a lot in advance
Oleg


Dear All
I am a PostgreSQL beginner. I am trying to write some values from an 
external source table “Table1” (using the PostGIS plugin) to 2 destination 
PostgreSQL tables “Table2” and “Table3” that describe relations of the data:


Table1:
| A | B |
-
| 1 | 1 |
| 2 | 3 |
| 3 | 1 |

Table2
| C |
-
| 1 |
| 2 |

Table3
| D | E |
-
| 1 | 1 | case 1
| 2 | 3 | case 2

Table3 has the constraint:
FOREIGN KEY (E) REFERENCES Table2 (C) ON UPDATE NO ACTION ON DELETE
RESTRICT;


I use updateable view “View1” (due to PostGIS interface) with columns 
“C” and “E” and rule:
AS ON INSERT TO View1 DO INSTEAD (INSERT INTO Table2 VALUES (NEW.C); 
INSERT INTO Table3 VALUES (NEW.D, NEW.E));


I faced the following problem: as shown in Table3, while trying to 
write data, case 1 works fine, but case 2 is not possible due to the 
constraint on Table3. However, my aim is to write column A to column C, 
column A to column D, and column B to column E without removing the 
constraint on Table3. Maybe there is a way to adjust the constraint? Or 
maybe the rule can be adjusted somehow to make it write all data from 
column A to column C first and after that fill in Table3?

Thanks a lot in advance
Oleg






Re: [GENERAL] Backup and Restore mechanism in Postgres

2005-09-20 Thread Lincoln Yeoh

At 10:00 AM 9/20/2005 -0400, Vivek Khera wrote:



On Sep 14, 2005, at 9:45 AM, vinita bansal wrote:


I have a 4 proc. AMD Opteron machine with 32 GB RAM and ~400GB HDD
and a 40GB database. I need to take backup of this database and
restore it some other location (say some test environment). I am
currently using pg_dump and pg_restore utilities to get this done
which takes 4-5 hrs for a dump and 8-9 hrs for restore
respectively. I am using custom format for taking dumps.


i'll bet you've saturated your disk I/O bandwidth, since for me
dumping a db a bit larger than that takes roughly 1 hour, and restore
about 4.


I don't think disk I/O is saturated, unless the database is very fragmented 
on disk.


Most modern drives can manage at least 40MB/sec sequential throughput. Even 
random seeks of 64KB or 128KB blocks should get you about 6-12MB/sec. So 4 
hours is quite slow. And 8-9 hours for a restore of 40GB probably won't be 
very pleasant if you have a boss or customer breathing down your neck...


Any reason why Postgresql would only get 2.8MB/sec (40GB over 4 hours) for dumps, or slower?

Regards,
Link.




Re: [GENERAL] deactivating/activating constraint

2005-09-20 Thread Sebastian Böck

Oleg wrote:

Dear All,
is it possible to temporarily deactivate a constraint in PostgreSQL?
There is a constraint that does not allow me to write some data (see 
e-mail below). But after all datasets are written the constraint is 
satisfied. So I was wondering whether it is possible to deactivate a 
constraint, write all records in all tables, then activate the constraint 
again. Somebody told me that it is possible in Oracle.

Thanks a lot in advance
Oleg


Have you tried to make the Foreign Key deferrable and initially deferred?

See:
http://www.postgresql.org/docs/8.0/interactive/sql-createtable.html
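For the foreign key from the original post, that would look roughly like
this (a sketch; the constraint name is invented):

ALTER TABLE Table3
    ADD CONSTRAINT table3_e_fkey
        FOREIGN KEY (E) REFERENCES Table2 (C)
        ON UPDATE NO ACTION ON DELETE RESTRICT
        DEFERRABLE INITIALLY DEFERRED;

-- the check now runs at COMMIT, so insert order no longer matters:
BEGIN;
INSERT INTO Table3 VALUES (2, 3);  -- parent row does not exist yet
INSERT INTO Table2 VALUES (3);     -- it arrives later in the same transaction
COMMIT;                            -- FK is checked here and passes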

HTH

Sebastian



Re: [GENERAL] deactivating/activating constraint

2005-09-20 Thread Csaba Nagy
You might be looking for a DEFERRED constraint.
You can declare your constraint to be checked only at transaction end,
and then make all your data changes in one transaction.

You will find details here:
http://www.postgresql.org/docs/8.0/static/sql-createtable.html
Search for DEFERRED.

HTH,
Csaba.


On Tue, 2005-09-20 at 16:26, Oleg wrote:
 Dear All,
 is it possible to temporarily deactivate a constraint in PostgreSQL?
 There is a constraint that does not allow me to write some data (see 
 e-mail below). But after all datasets are written the constraint is 
 satisfied. So I was wondering whether it is possible to deactivate a 
 constraint, write all records in all tables, then activate the constraint 
 again. Somebody told me that it is possible in Oracle.
 Thanks a lot in advance
 Oleg
 
 [snip]


Re: [GENERAL] running vacuum in scripts

2005-09-20 Thread Belinda M. Giardine


On Tue, 20 Sep 2005, Csaba Nagy wrote:

 You could use the system wide crontab, which is configured via
 /etc/crontab, and there you can specify what user should execute the
 command, without needing passwords. The system wide crontab is executed
 as root and it will su to the user you specify.

 For vacuuming, you could try to use the autovacuum daemon, it is
 included in the contrib part of postgres. If you installed from source,
 you will likely need to separately install autovacuum, if you installed
 a prepackaged postgres, chances are that you already have the
 pg_autovacuum executable installed. It is fairly easy to set up.

 HTH,
 Csaba.



Thanks.  I didn't find autovacuum anywhere in our install.  It was done
from source so I sent a request to the sysadmin.  It does sound like what
I was looking for.

Belinda




Re: [GENERAL] running vacuum in scripts

2005-09-20 Thread Belinda M. Giardine


On Tue, 20 Sep 2005, John DeSoi wrote:


 On Sep 19, 2005, at 2:27 PM, Belinda M. Giardine wrote:

  As far as I can tell vacuum must be run by the table or database
  owner.
  It wouldn't be as bad to have the password in the script if it was a
  limited permissions user.  Any suggestions on the best methods?

 Setup a .pgpass file so you don't need to embed the password in the
 script:

 http://www.postgresql.org/docs/8.0/interactive/libpq-pgpass.html


 John DeSoi, Ph.D.
 http://pgedit.com/
 Power Tools for PostgreSQL


Thanks, this also could be useful.  It sounds like I won't need it for
this if I use autovacuum, but could come in handy later.

Belinda




[GENERAL] Slow search.. quite clueless

2005-09-20 Thread Yonatan Ben-Nes

Hi all,

I'm building a site where users can search for products with up to 4 
different keywords, all of which MUST match each product found as a 
result of the search.


I got 2 tables (which are relevant to the issue :)); one is the product 
table (5 million rows) and the other is the keyword table, which holds 
the keywords of each product (60 million rows).


The scheme of the tables is as follows:

        Table "public.product"
    Column    |     Type      |  Modifiers
--------------+---------------+------------
 product_id   | text          | not null
 product_name | text          | not null
 retail_price | numeric(10,2) | not null
 etc...
Indexes:
    "product_product_id_key" UNIQUE, btree (product_id)

     Table "public.keyword"
   Column   | Type |  Modifiers
------------+------+------------
 product_id | text | not null
 keyword    | text | not null
Indexes:
    "keyword_keyword" btree (keyword)

The best query I have succeeded with till now joins the keyword table 
once for each keyword searched; for example, if someone searches for 
'belt', 'black' and 'pants' it will create the following query:


poweraise.com=# EXPLAIN ANALYZE SELECT 
product_id,product_name,product_image_url,short_product_description,long_product_description,discount,discount_type,sale_price,retail_price 
FROM product INNER JOIN keyword t1 USING(product_id) INNER JOIN keyword 
t2 USING(product_id) INNER JOIN keyword t3 USING(product_id) WHERE 
t1.keyword='belt' AND t2.keyword='black' AND t3.keyword='pants' LIMIT 13;


                                   QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=37734.15..39957.20 rows=13 width=578) (actual time=969.798..1520.354 rows=6 loops=1)
   ->  Hash Join  (cost=37734.15..3754162.82 rows=21733 width=578) (actual time=969.794..1520.337 rows=6 loops=1)
         Hash Cond: ("outer".product_id = "inner".product_id)
         ->  Nested Loop  (cost=18867.07..2858707.34 rows=55309 width=612) (actual time=82.266..1474.018 rows=156 loops=1)
               ->  Hash Join  (cost=18867.07..2581181.09 rows=55309 width=34) (actual time=82.170..1462.104 rows=156 loops=1)
                     Hash Cond: ("outer".product_id = "inner".product_id)
                     ->  Index Scan using keyword_keyword on keyword t2  (cost=0.00..331244.43 rows=140771 width=17) (actual time=0.033..1307.167 rows=109007 loops=1)
                           Index Cond: (keyword = 'black'::text)
                     ->  Hash  (cost=18851.23..18851.23 rows=6337 width=17) (actual time=16.145..16.145 rows=0 loops=1)
                           ->  Index Scan using keyword_keyword on keyword t1  (cost=0.00..18851.23 rows=6337 width=17) (actual time=0.067..11.050 rows=3294 loops=1)
                                 Index Cond: (keyword = 'belt'::text)
               ->  Index Scan using product_product_id_key on product  (cost=0.00..5.01 rows=1 width=578) (actual time=0.058..0.060 rows=1 loops=156)
                     Index Cond: (product.product_id = "outer".product_id)
         ->  Hash  (cost=18851.23..18851.23 rows=6337 width=17) (actual time=42.863..42.863 rows=0 loops=1)
               ->  Index Scan using keyword_keyword on keyword t3  (cost=0.00..18851.23 rows=6337 width=17) (actual time=0.073..36.120 rows=3932 loops=1)
                     Index Cond: (keyword = 'pants'::text)
 Total runtime: 1521.441 ms
(17 rows)

Sometimes the query works fast even for 3 keywords, but that doesn't help 
me if at other times it takes ages...


Now, finding a result for 1 keyword really flies, so I also tried to 
make 3 queries and do an INTERSECT between them, but that turned out to be 
extremely slow...


What makes this query slow, as far as I understand, is all the merging 
between the results of each table... I tried to divide the keyword table 
into lots of keyword tables, each holding only keywords starting with a 
specific letter; it did improve the speed, but not in a really 
significant way. I tried clusters, indexes, SET STATISTICS, WITHOUT OIDS on 
the keyword table and what not... I'm quite clueless...


Actually I even started to look at other solutions, and maybe you can say 
something about them too... maybe they can help me:

1. Omega (From the Xapian project) - http://www.xapian.org/
2. mnoGoSearch - http://www.mnogosearch.org/doc.html
3. Swish-e - http://swish-e.org/index.html

On top of everything, I want in the end to be able to ORDER BY the 
results, like ordering the products by price, but I'm less concerned about 
that because I saw that with CLUSTER I can do it without any extra overhead.


Thanks a lot in advance,
Yonatan Ben-Nes



Re: [GENERAL] Slow search.. quite clueless

2005-09-20 Thread Oleg Bartunov

contrib/tsearch2 ( http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ )
might work for you. It might, because performance depends on the
cardinality of your keywords.
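A rough sketch of what that can look like here, assuming contrib/tsearch2
is installed and its default configuration is in use (the column and
index names are invented, and the keyword aggregation step is schematic):

-- collapse each product's keywords into one indexed tsvector
ALTER TABLE product ADD COLUMN keywords_ts tsvector;
UPDATE product p
   SET keywords_ts = to_tsvector(
       array_to_string(ARRAY(SELECT keyword FROM keyword k
                              WHERE k.product_id = p.product_id), ' '));
CREATE INDEX product_keywords_idx ON product USING gist (keywords_ts);

-- one index probe instead of three self-joins on keyword:
SELECT product_id, product_name
  FROM product
 WHERE keywords_ts @@ to_tsquery('belt & black & pants')
 LIMIT 13;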


Oleg
On Tue, 20 Sep 2005, Yonatan Ben-Nes wrote:


 [snip]


Re: [GENERAL] Slow search.. quite clueless

2005-09-20 Thread Philip Hallstrom

contrib/tsearch2 ( http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ )
might work for you. It might, because performance depends on the
cardinality of your keywords.


Seconded.  We use tsearch2 to search about 40,000 rows containing 
manufacturer, brand, and product name and it returns a result almost 
instantly.  Before, when we did normal SQL (manufacturer LIKE ..., etc.) it 
would take 20-30 seconds.


One thing to check is the english.stop file, which contains words to skip 
(i, a, the, etc.).  In our case we removed almost all of them, since one of 
our products is "7 up" (the drink) and it would remove "up". Made it 
really hard to pull up "7 up" in the results :)


-philip



 [snip]

Re: [GENERAL] Slow search.. quite clueless

2005-09-20 Thread Alex Turner
I'm by no means an expert on this, and perhaps someone with more
knowledge can help, but it looks to me like the planner estimate and
the actual cost are significantly different, which to me means that an
ANALYZE is required, and/or increasing the stats on these tables would
be useful. Also I'm wondering if you can avoid the dereference OID
lookup by creating the index as (keyword, product_id) instead of just
(keyword).

Alex Turner
NetEconomist

On 9/20/05, Oleg Bartunov oleg@sai.msu.su wrote:
 [snip]

Re: [GENERAL] Slow search.. quite clueless

2005-09-20 Thread Oleg Bartunov

On Tue, 20 Sep 2005, Philip Hallstrom wrote:

contrib/tsearch2 ( http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ )
might work for you. It might, because performance depends on the
cardinality of your keywords.


Seconded.  We use tsearch2 to search about 40,000 rows containing 
manufacturer, brand, and product name and it returns a result almost 
instantly.  Before, when we did normal SQL (manufacturer LIKE ..., etc.) it 
would take 20-30 seconds.


One thing to check is the english.stop file, which contains words to skip 
(i, a, the, etc.).  In our case we removed almost all of them, since one of 
our products is "7 up" (the drink) and it would remove "up". Made it really 
hard to pull up "7 up" in the results :)


we have query rewriting support (thesaurus search) on our todo list
(http://www.sai.msu.su/~megera/oddmuse/index.cgi/todo).




 [snip]

[GENERAL] More efficient INs when comparing two columns

2005-09-20 Thread Magnus Naeslund(t)
I was thinking if this was possible in some way..
I have this table where we have X and Y coordinates, and i need to
select several in one go.

# select * from xy where (x = 1 and y = 2) or (x = 2 and y = 2);

This works but is not so nice looking.
It would be nice to be able to do it like this:

# select * from xy where (x, y) in ((1, 2), (2, 2));

But that doesn't work.
A funny thing is that this works:

# select * from xy where (x, y) = (1, 2);

What's the most efficient way of doing these kinds of selects?
A multicolumn index on x and y plus the first select above is what we
currently do...
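One alternative sketch that stays friendly to the multicolumn index is to
join against the wanted pairs as a derived table (assuming, per the report
above, that the row-wise IN form is not available in this version):

SELECT xy.*
  FROM xy
  JOIN (SELECT 1 AS x, 2 AS y
        UNION ALL
        SELECT 2, 2) AS wanted
    ON xy.x = wanted.x AND xy.y = wanted.y;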

Regards,
Magnus




Re: [GENERAL] running vacuum in scripts

2005-09-20 Thread Dean Gibson (DB Administrator)

You can set up pg_hba.conf so that only certain Unix users with
access to the local Unix PostgreSQL socket can access the database
without a password (every other process uses a TCP/IP connection); then
move the socket location to somewhere other than /tmp and restrict its
access with Unix controls.  Details are in the PostgreSQL documentation,
and it works fine.
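
A minimal sketch of that setup, with hypothetical paths, group name, and 
database names (the directive names are from the 8.0 configuration docs):

# pg_hba.conf: trust Unix-socket connections (who can reach the socket
# is controlled by the Unix permissions below); TCP/IP still needs md5.
local   all   all                  trust
host    all   all   127.0.0.1/32   md5

# postgresql.conf: move the socket out of /tmp and restrict access.
unix_socket_directory   = '/var/run/postgresql'
unix_socket_group       = 'pgsocket'
unix_socket_permissions = 0770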

-- Dean

On 2005-09-19 11:27, Belinda M. Giardine wrote:


This seems like it should be a frequently asked question, but I am having
trouble finding the answer.  I am in the process of switching to using
Postgres, and realize that I need to run vacuum analyze regularly on the
tables.  This is on a Unix system so cron is the obvious choice.  The
problem is I don't want to put the user name and password in the script.
As far as I can tell vacuum must be run by the table or database owner.
It wouldn't be as bad to have the password in the script if it was a
limited permissions user.  Any suggestions on the best methods?

Thanks,
Belinda


---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] pg_autovacuum not sleeping

2005-09-20 Thread Brandon Metcalf
We're using pg_autovacuum with PostgreSQL 8.0.3 and it isn't sleeping
as long as it claims it will.  For example,

  ...
  [2005-09-20 15:40:23 CDT] INFO: last_analyze_count: 21735550; 
last_vacuum_count: 21735009
  [2005-09-20 15:40:23 CDT] INFO: analyze_threshold: 40676; 
vacuum_threshold: 81352
  [2005-09-20 15:40:23 CDT] DEBUG:   347 All DBs checked in: 1428748500 usec, 
will sleep for 4886 secs.
  [2005-09-20 15:50:14 CDT] DEBUG:   Performing: VACUUM ANALYZE 
public.model_lastexport

So, pg_autovacuum says it's going to sleep for 4886 seconds, but fires
up again after just under 600 seconds.

Can anyone explain what I'm seeing?

Thanks.

-- 
Brandon

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Implementing a change log

2005-09-20 Thread Mike Rylander
On 9/20/05, Berend Tober [EMAIL PROTECTED] wrote:

[snip]

 I guess I originally thought using INHERIT rather than LIKE was that,
 having the audit history, I might at some point present a select view
 across both the base and descendant tables or something (...if you
 record it, they (PHB's) will eventually ask for a report on it...), but
 I haven't actually had an implementation where such an audit history
 table was actually required in production -- I'm just exercising the
 functionality and exploring the quirks in order to be prepared for when
 such a requirement is actually promulgated.
 

I can see your point.  You could use a UNION ALL view to combine the
main table with the audit table, though.
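
As a rough sketch (table names are illustrative; the three trailing 
columns match the audit layout discussed earlier in the thread):

CREATE VIEW mytable_history AS
SELECT t.*,
       'LIVE'::varchar(6) AS update_action,
       NULL::timestamp    AS update_date,
       NULL::name         AS update_user
FROM ONLY public.mytable t
UNION ALL
SELECT a.*
FROM auditor.public_mytable a;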

 Any other significant distinguishing features of INHERITS versus LIKE for
 this kind of use that you (or others) can think of?
 

I would personally still go with LIKE simply for the CONSTRAINT
stripping since the audit table will, by definition, have duplicates
for the main table's primary key.  In fact, after looking at the
documentation more closely, it seems that all child tables must contain
all CHECK constraints from the base table when using INHERITS.  If you
decided to add a CHECK constraint to the base table at some future
time then you might have to modify the data in the audit table to
match it (big no-no!).

Read the INHERITS and LIKE sections of this* closely to see what I'm
talking about.

However, INHERITS may be useful for the audit table.  If the base
table is updated very frequently you could set up RULE-based
partitioning for the audit table.  Then you would have the option of
archiving and dropping older sections of the audit table without
creating holes in the on-disk layout of the rest of the audit data
(or having to CLUSTER the table).
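
A sketch of that idea, following the usual RULE-partitioning pattern (all 
table names and date ranges here are hypothetical):

-- One child per quarter, inheriting the audit table's layout:
CREATE TABLE auditor.mytable_2005q3 (
    CHECK (update_date >= '2005-07-01' AND update_date < '2005-10-01')
) INHERITS (auditor.public_mytable);

-- Route new audit rows into the current partition:
CREATE RULE mytable_audit_2005q3 AS
ON INSERT TO auditor.public_mytable
WHERE NEW.update_date >= '2005-07-01' AND NEW.update_date < '2005-10-01'
DO INSTEAD INSERT INTO auditor.mytable_2005q3 VALUES (NEW.*);

Retiring a quarter is then just a DROP TABLE, which leaves no holes behind.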


* http://www.postgresql.org/docs/8.0/interactive/sql-createtable.html

-- 
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Question about a query plan

2005-09-20 Thread Bill Moseley
I'm still trying to understand EXPLAIN ANALYZE output.

ws2=> select count(*) from person_role;
 count 
-------
   123
(1 row)

ws2=> select count(*) from person;
 count 
-------
 11033
(1 row)

ws2=> EXPLAIN ANALYZE select id, first_name, last_name from person, person_role 
where id = 94 and person_role.person = person.id and (person_role.role = 2);
                                  QUERY PLAN
------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..8.28 rows=1 width=23) (actual time=0.198..0.237 rows=1 loops=1)
   ->  Index Scan using person_pkey on person  (cost=0.00..5.44 rows=1 width=23) (actual time=0.054..0.056 rows=1 loops=1)
         Index Cond: (id = 94)
   ->  Seq Scan on person_role  (cost=0.00..2.83 rows=1 width=4) (actual time=0.130..0.165 rows=1 loops=1)
         Filter: ((role = 2) AND (person = 94))
 Total runtime: 0.379 ms
(6 rows)


Why does it say "Seq Scan on person_role"?  The query has both the
person and role to use as a primary key -- which is indexed.
Indeed, rows=1 so it looks like an index fetch.

Perhaps, I'm reading that incorrectly?


ws2=> \d person_role;
  Table "public.person_role"
 Column |  Type   | Modifiers 
--------+---------+-----------
 person | integer | not null
 role   | integer | not null
Indexes:
    "person_role_pkey" primary key, btree (person, role)
Foreign-key constraints:
    "$2" FOREIGN KEY (role) REFERENCES role(id) ON DELETE RESTRICT
    "$1" FOREIGN KEY (person) REFERENCES person(id) ON DELETE CASCADE

Thanks,



-- 
Bill Moseley
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] pg_autovacuum not sleeping

2005-09-20 Thread Bruce Momjian
Brandon Metcalf wrote:
 We're using pg_autovacuum with PostgreSQL 8.0.3 and it isn't sleeping
 as long as it claims it will.  For example,
 
   ...
   [2005-09-20 15:40:23 CDT] INFO: last_analyze_count: 21735550; 
 last_vacuum_count: 21735009
   [2005-09-20 15:40:23 CDT] INFO: analyze_threshold: 40676; 
 vacuum_threshold: 81352
   [2005-09-20 15:40:23 CDT] DEBUG:   347 All DBs checked in: 1428748500 usec, 
 will sleep for 4886 secs.
   [2005-09-20 15:50:14 CDT] DEBUG:   Performing: VACUUM ANALYZE 
 public.model_lastexport
 
 So, pg_autovacuum says it's going to sleep for 4886 seconds, but fires
 up again after just under 600 seconds.
 
 Can anyone explain what I'm seeing?

Yep, this was fixed in 8.0.X CVS in May, two days after the release of
8.0.3:

revision 1.27.4.4
date: 2005/05/11 17:58:32;  author: momjian;  state: Exp;  lines: +11 -2
Fix pg_autovacuum -s flag to handle values > 2000 by using sleep()
instead of pg_usleep.

Backpatch to 8.0.X.

We are thinking of putting out an 8.0.4 in a few days, but until then
you can pull from CVS branch REL8_0_STABLE and use that fix in
pg_autovacuum.c.
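
The 2000-second boundary in that fix lines up with 32-bit microsecond 
arithmetic -- an inference on my part, not something the commit message 
states: pg_usleep() takes microseconds, and much past 2000 seconds the 
value no longer fits in a signed 32-bit integer. A quick check:

SELECT 2147483647 / 1000000 AS max_whole_secs_as_int4_usec;
-- returns 2147, so a 4886-second sleep expressed in microseconds overflows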

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Question about a query plan

2005-09-20 Thread Stephan Szabo
On Tue, 20 Sep 2005, Bill Moseley wrote:

 ws2=> select count(*) from person_role;
  count
 -------
    123
 (1 row)

 ws2=> select count(*) from person;
  count
 -------
  11033
 (1 row)

 ws2=> EXPLAIN ANALYZE select id, first_name, last_name from person, 
 person_role where id = 94 and person_role.person = person.id and 
 (person_role.role = 2);
                                   QUERY PLAN
 ------------------------------------------------------------------------------
  Nested Loop  (cost=0.00..8.28 rows=1 width=23) (actual time=0.198..0.237 rows=1 loops=1)
    ->  Index Scan using person_pkey on person  (cost=0.00..5.44 rows=1 width=23) (actual time=0.054..0.056 rows=1 loops=1)
          Index Cond: (id = 94)
    ->  Seq Scan on person_role  (cost=0.00..2.83 rows=1 width=4) (actual time=0.130..0.165 rows=1 loops=1)
          Filter: ((role = 2) AND (person = 94))
  Total runtime: 0.379 ms
 (6 rows)


 Why does it say "Seq Scan on person_role"?  The query has both the
 person and role to use as a primary key -- which is indexed.
 Indeed, rows=1 so it looks like an index fetch.

IIRC, that's how many rows met the filter.

My guess is that unless there's dead space, 123 rows of person_role should
fit in 1 page, so it's probably deciding that using the index would
involve more disk access than not.
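
One way to check that reasoning (a standard planner-debugging trick, not 
from this thread) is to disable sequential scans for the session and 
compare the costs of the forced index plan:

SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT id, first_name, last_name
FROM person, person_role
WHERE id = 94
  AND person_role.person = person.id
  AND person_role.role = 2;
SET enable_seqscan = on;  -- restore the default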

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Question about a query plan

2005-09-20 Thread Tom Lane
Bill Moseley [EMAIL PROTECTED] writes:
 ws2=> select count(*) from person_role;
  count 
 -------
    123
 (1 row)
 ...
    ->  Seq Scan on person_role  (cost=0.00..2.83 rows=1 width=4) (actual 
 time=0.130..0.165 rows=1 loops=1)
          Filter: ((role = 2) AND (person = 94))

 Why does it say "Seq Scan on person_role"?

Probably because it doesn't consider that table big enough to warrant
using an index.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] ltree and ordering - what index?

2005-09-20 Thread Jim C. Nasby
On Mon, Sep 19, 2005 at 07:23:54AM +0200, hubert depesz lubaczewski wrote:
 On 9/18/05, Jim C. Nasby [EMAIL PROTECTED] wrote:
  
  ltree is part of contrib, right?
 
 
 
 yes.
 
 You probably need to define a functional index of some kind. How are you
  querying now? IIRC you'll be doing something like region IN (ltree)?
  
 
 i didn't think about functional indices, but this might be a solution. 
 have to think about it for a while. 
 searching through ltrees is done using specific operators (@).

Yes, which is something I don't think the indexing code can deal with.

IIRC you're looking for a way to index something that's in a specific
place in the ltree, so you'd want to index that actual function
expression.
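
A sketch of that expression-index idea, assuming a hypothetical table and 
that the interesting label sits in the first two levels of the path 
(subltree ships with contrib/ltree):

-- Index the fixed-position slice of the path...
CREATE INDEX region_top2_idx ON regions (subltree(region, 0, 2));

-- ...and query with the same expression so the index qualifies:
SELECT * FROM regions
WHERE subltree(region, 0, 2) = 'world.europe'::ltree;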
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] running vacuum in scripts

2005-09-20 Thread Jim C. Nasby
On Tue, Sep 20, 2005 at 11:04:44AM -0400, Belinda M. Giardine wrote:
  For vacuuming, you could try to use the autovacuum daemon, it is
  included in the contrib part of postgres. If you installed from source,
  you will likely need to separately install autovacuum, if you installed
  a prepackaged postgres, chances are that you already have the
  pg_autovacuum executable installed. It is fairly easy to set up.

 Thanks.  I didn't find autovacuum anywhere in our install.  It was done
 from source so I sent a request to the sysadmin.  It does sound like what
 I was looking for.

http://cvs.distributed.net/viewcvs.cgi/stats-sql/tools/ has some scripts
you might find useful for running autovacuum.
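
Combined with the passwordless local-socket setup described in the other 
branch of this thread, a plain cron entry under the table owner's Unix 
account is enough; a sketch with hypothetical paths and database name:

# run VACUUM ANALYZE nightly at 03:15
15 3 * * *  /usr/local/pgsql/bin/vacuumdb --analyze --quiet mydb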
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[GENERAL] Preventing duplicate records according to several fields

2005-09-20 Thread CSN
I have a table like so:

id, title, yield, directions

and would like to prevent duplicate records from being
added (i.e. according to the title, yield, and
directions fields). I won't normally be querying on
the yield or directions fields, so I just have indexes
for id and title. What's the best way to prevent
duplicates from being added?

- Before inserting, do a 'select id from stuff where
title=? and yield=? and directions=?'. This would want
the title and directions fields indexed (which seems
like a waste of space since they won't be used except
for rare inserts).

- Create a unique index across the title, yield, and
directions fields.

- Create a 'hash' field by md5'ing the title, yield,
and directions fields, and create a unique index on
it. Then when inserting new records, first create a
hash and check if it already exists, or have the
database automatically handle this (trigger to compute
hash field at insert time - unique index will raise an
exception).
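
A rough sketch of that hash option (table name assumed from the first 
bullet; note that a NULL in any field makes the whole hash NULL, and the 
delimiter guards against accidental concatenation collisions):

ALTER TABLE stuff ADD COLUMN content_hash text;
UPDATE stuff
   SET content_hash = md5(title || '|' || yield || '|' || directions);
CREATE UNIQUE INDEX stuff_content_hash_key ON stuff (content_hash);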

Thanks for any help, insights, suggestions, etc.
CSN




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Preventing duplicate records according to several fields

2005-09-20 Thread Michael Glaesemann


On Sep 21, 2005, at 10:34 AM, CSN wrote:


id, title, yield, directions

and would like to prevent duplicate records from being
added (i.e. according to the title, yield, and
directions fields).


<snip />


- Create a unique index across the title, yield, and
directions fields.


This scenario is exactly what the UNIQUE constraint is for.
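
In concrete terms, a one-liner against the table described above (table 
name assumed from the post):

ALTER TABLE stuff
    ADD CONSTRAINT stuff_title_yield_directions_key
    UNIQUE (title, yield, directions);

One caveat: very long directions values can exceed the btree entry-size 
limit, which is the case where the md5 variant sketched earlier earns 
its keep.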

Michael Glaesemann
grzm myrealbox com



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] help me pls

2005-09-20 Thread suresh ramasamy
hi, the following words from Dinesh worked for me.  Thanks guys,
I really appreciate your help and advice.

PG_DATA=/usr/local/pgsql/data

regards
suresh

---(end of broadcast)---
TIP 6: explain analyze is your friend