[GENERAL] Problem with index not being chosen inside PL/PgSQL function...

2007-12-18 Thread Weber, Geoffrey M.
Hi all,

Thanks to those on this list who contribute their knowledge for our 
consumption!  I have another question:

The problem I'm having is in one particular spot where I'm trying to run a 
parametized query inside a PL/PgSQL function.  When I run the query directly, 
the planner correctly chooses to use an index.  When I run the query with 
hard-coded values inside the function, the planner uses the index again.  
However, when I try to pass in one of the values as a parameter, the query 
suddenly takes 45+ seconds to run (and is obviously no longer using the query). 
 Inside the function, I've tried using bound cursors, unbound cursors, and a 
direct query as shown in the test_unlock function below.


Here are the details:

EXPLAIN ANALYZE SELECT ah.* FROM alert ah WHERE ( (ah.replaced_by_id = '0') AND 
(ah.not_displayed_id = '7714598') );

Index Scan using idx_acurr on alert ah  (cost=0.00..4.44 rows=1 width=768) 
(actual time=61.100..61.100 rows=0 loops=1)
  Index Cond: ((replaced_by_id = 0) AND (not_displayed_id = 7714598))
Total runtime: 61.459 ms


CREATE OR REPLACE FUNCTION test_unlock ( id_locked alert.id%TYPE ) RETURNS 
alert.id%TYPE AS $test_unlock$
DECLARE
last_alert  alert%ROWTYPE;
BEGIN
RAISE NOTICE 'Fetching data...';
SELECT ah.* INTO last_alert FROM alert ah where ( (ah.replaced_by_id = 
'0') AND (ah.not_displayed_id = id_locked ) );
RAISE NOTICE 'Data fetched...';

IF NOT FOUND THEN
RAISE NOTICE 'No locked out alert was found!';
ELSE
RAISE NOTICE 'Alert id % was found!', last_alert.id;
END IF;

RETURN last_alert.id;
END;
$test_unlock$ LANGUAGE plpgsql;

SELECT * FROM test_unlock( '7714598');


Using the function, I get a 45+ second delay between the fetching notice and 
the fetched notice.  Is there some way I can help the planner out further?  I 
ever tried adding an 'ORDER BY replaced_by_id, not_displayed_id to help it 
find the index, but no luck there...

Thanks in advance for any help!

NOTICE: This electronic mail transmission may contain confidential information 
and is intended only for the person(s) named.  Any use, copying, or disclosure 
by any other person is strictly prohibited. If you have received this 
transmission in error, please notify the sender via e-mail.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL function...

2007-12-18 Thread Weber, Geoffrey M.
Hmm - good question!  However, it is - both the id and not_displayed_id are 
INTEGERs.  Changing the function header to:

CREATE OR REPLACE FUNCTION test_unlock ( id_locked alert.not_displayed_id%TYPE 
) RETURNS alert.id%TYPE AS $test_unlock$

sadly doesn't affect the performance at all.  I should have been a little more 
careful with the datatypes there, but this was a temporary function used to 
help me debug the problem and also help show it to the world.  The original 
function has a bit more to it and is called by a higher-level function, but 
I've tracked the slowness down to this issue :)...

Just for grins, I also changed the query to:

SELECT ah.* INTO last_alert FROM alert ah where ( (ah.replaced_by_id = 
'0') AND (not_displayed_id = id_locked::INTEGER ) ) ORDER BY replaced_by_id, 
not_displayed_id;

Still no improvement :(.

Thanks for the suggestion though!



From: Tom Lane [EMAIL PROTECTED]
Sent: Tuesday, December 18, 2007 10:11 AM
To: Weber, Geoffrey M.
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL  
 function...

Weber, Geoffrey M. [EMAIL PROTECTED] writes:
 The problem I'm having is in one particular spot where I'm trying to
 run a parametized query inside a PL/PgSQL function.

I wonder whether the parameter is actually of the same datatype as the
indexed column.

regards, tom lane

__
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email
__



NOTICE: This electronic mail transmission may contain confidential information 
and is intended only for the person(s) named.  Any use, copying or disclosure 
by any other person is strictly prohibited. If you have received this 
transmission in error, please notify the sender via e-mail.




NOTICE: This electronic mail transmission may contain confidential information 
and is intended only for the person(s) named.  Any use, copying, or disclosure 
by any other person is strictly prohibited. If you have received this 
transmission in error, please notify the sender via e-mail.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL function...

2007-12-18 Thread Weber, Geoffrey M.
Tom,

Yes, the distribution must be what's doing it.  I guess I knew that 
subconciously, but was looking for something like hints to force the planner to 
do what I wanted.  Instead it looks like I'll have to do a bit of tweaking with 
my indexes.  Probably a partial index on the 'not_displayed_id' column.  It'll 
be very small and shouldn't cause much overhead.  I was trying to keep my index 
count down, and have had a dual-column index on (replaced_by_id, 
not_displayed_id) to this point.

Thanks once again for your help!



From: Tom Lane [EMAIL PROTECTED]
Sent: Tuesday, December 18, 2007 10:36 AM
To: Weber, Geoffrey M.
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL  
 function...

Weber, Geoffrey M. [EMAIL PROTECTED] writes:
 Hmm - good question!  However, it is - both the id and
 not_displayed_id are INTEGERs.

Well, in that case it must be a statistics issue --- does the indexed
column have a badly skewed distribution?

You could investigate how many rows the planner thinks will be fetched
via

PREPARE foo(int) AS
SELECT ah.* FROM alert ah where ( (ah.replaced_by_id = '0') AND 
(not_displayed_id = $1 ) ) ORDER BY replaced_by_id, not_displayed_id;

EXPLAIN EXECUTE foo(42);

which will set up exactly the same planning situation as occurs in the
plpgsql function: no knowledge of the exact value being compared to.

regards, tom lane

__
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email
__



NOTICE: This electronic mail transmission may contain confidential information 
and is intended only for the person(s) named.  Any use, copying or disclosure 
by any other person is strictly prohibited. If you have received this 
transmission in error, please notify the sender via e-mail.




NOTICE: This electronic mail transmission may contain confidential information 
and is intended only for the person(s) named.  Any use, copying, or disclosure 
by any other person is strictly prohibited. If you have received this 
transmission in error, please notify the sender via e-mail.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL function...

2007-12-18 Thread Weber, Geoffrey M.
OK - in that same vain, I thought of something even better - using dynamic SQL 
instead.  It sped things up right away!

Thanks for putting me on the right track!



From: Scott Marlowe [EMAIL PROTECTED]
Sent: Tuesday, December 18, 2007 12:02 PM
To: Weber, Geoffrey M.
Cc: Tom Lane; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL  
 function...

On Dec 18, 2007 10:54 AM, Weber, Geoffrey M.
[EMAIL PROTECTED] wrote:
 Tom,

 Yes, the distribution must be what's doing it.  I guess I knew that 
 subconciously, but was looking for something like hints to force the planner 
 to do what I wanted.  Instead it looks like I'll have to do a bit of tweaking 
 with my indexes.  Probably a partial index on the 'not_displayed_id' column.  
 It'll be very small and shouldn't cause much overhead.  I was trying to keep 
 my index count down, and have had a dual-column index on (replaced_by_id, 
 not_displayed_id) to this point.

Fix not with a hammer that which you can fix with a screwdriver.  Fix
not with a screwdriver that which you can fix with a knob

Have you tried increasing the stats target of the guilty column and
reanalyzing to see if that helps?

__
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email
__



NOTICE: This electronic mail transmission may contain confidential information 
and is intended only for the person(s) named.  Any use, copying or disclosure 
by any other person is strictly prohibited. If you have received this 
transmission in error, please notify the sender via e-mail.




NOTICE: This electronic mail transmission may contain confidential information 
and is intended only for the person(s) named.  Any use, copying, or disclosure 
by any other person is strictly prohibited. If you have received this 
transmission in error, please notify the sender via e-mail.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[GENERAL] Continual Postgres headaches...

2007-12-06 Thread Weber, Geoffrey M.
I've been trying for quite a while to get Postgresql tuned for use as an
OLTP system.  I have several PL/pgSQL functions that handle inserts and
updates to the main table and several near-real-time daemons written that
access the data and can take automated actions on it (email/page concerned
people, get complimentary information from a different system, etc.).  I
started with Postgres 8.1 and am now using 8.2.4 (and have been since its
release).  I'll try to provide enough information for a decent response, but
as I can't obviously put my entire schema and database out there, I'm hoping
that I can get some decent guidelines beyond that what I've found though
Google, etc. to get this thing tuned better.

 

Most of the data centers in on a central table and has 23 columns, 1
constraint, and 9 indexes.  4 of the indexes are partial.  The table usually
contains about 3-4 million rows, but I've cut it down to 1.2 million (cut
out 2/3 of the data) in an effort to migrate the database to a 2nd sever for
more testing.  The two partial indexes used the most: 242MB accessed nearly
constantly, and 15MB accessed every 5 seconds - but also updated constantly
via inserts using the 242MB index.  Other than one other 25MB index, the
others seem to average around 300MB each, but these aren't used quite as
often (usually about every minute or so).

 

My problems really are with performance consistency.  I have tweaked the
execution so that everything should run with sub-second execution times, but
even after everything is running well, I can get at most a week or two of
steady running before things start to degrade.

 

In my efforts to correct these consistency in execution problems, I have
gone from vacuuming (with analyze) twice a day to every 30 minutes (how long
it takes a vacuum analyze to run - another seeming problem because it
shouldn't take so long?).  I've done a VACUUM FULL and that sometimes helps,
but obviously is not a long-term solution for a true OLTP system.  Why I
said sometimes helps: earlier this week I did a VACUUM FULL, VACUUM
ANALYZE on the entire schema. A query that was supposed to use the 25MB
index above (that matched the parameters of the partial index exactly) was
still not doing so.  I had to DROP and re-CREATE the index (and do another
ANALYZE) to get the database to (correctly) use the index.

 

Another problem: sometimes I get these unexplainable delays in using a
'stored procedure' even though all its SQL calls run just fine individually.
For example, calling a particular function will take 1 minute to execute
even though entering the SQL commands individually through psql will total
up to about 1 second.  When I log 'long-duration queries', I only see the
top function call in the log.  When I put 'RAISE NOTICE' statements in the
various PL/pgSQL functions, I only see them displayed for the function I
call directly; any underlying functions called from that function does not
show any of their NOTICE statements.  Because of this I can't track down
where the delays are occurring for me to try and correct the problem (not
that there should be one in the first place)!  I should also note that there
is not any lock contention (double-checked with pgadmin3) showing that would
artificially delay the queries...

 

I've played with quite a few of the parameters like vacuum_cost_delay,
work_mem, shared_buffers, turned fsync off, etc. in the postgresql.conf
file.  The server I've been given to use is quite old (read: not the fastest
anymore), but normally shows a load average of 0.25.to 0.50 or so.

 

I guess my questions boil down to what I can do other than use the EXPLAIN,
EXPLAIN ANALYZE, VACUUM ANALYZE, etc. to get my schema performing acceptably
across the board in a consistent manner.  If there is more information that
I can give out that would improve responses, please let me know.

 

Thanks in advance for any help you send my way!

 

 

 




NOTICE: This electronic mail transmission may contain confidential
information and is intended only for the person(s) named.  Any use, copying
or disclosure by any other person is strictly prohibited. If you have
received this transmission in error, please notify the sender via e-mail.





[GENERAL] Triggers: Detecting if a column value is explicitly set in an UPD ATE statement or not... (IS NULL not working?)

2007-08-01 Thread Weber, Geoffrey M.
I'm having a problem, and can't seem to find a good answer in the mailing
list archives... sorry if I'm missing something obvious!

Postgres version: 8.2.4
O/S: Solaris 10


I want to set a BOOLEAN column value to FALSE by default for all INSERT and
UPDATE statements performed against a particular table _UNLESS_ it's
explicitly set to TRUE in the SQL statement.  Here is the trigger I created:


CREATE TABLE table1 ( id INTEGER PRIMARY KEY, data VARCHAR(64), b_flag
BOOLEAN);

CREATE OR REPLACE FUNCTION func1() RETURNS trigger AS $func1$
BEGIN
 RAISE NOTICE 'NEW.b_flag=%', NEW.b_flag;
IF ( NEW.b_flag IS NULL ) THEN
NEW.b_flag := FALSE;
END IF;

RETURN NEW;
END;
$func1$ LANGUAGE plpgsql;

CREATE TRIGGER func1 BEFORE INSERT OR UPDATE ON table1 FOR EACH ROW EXECUTE
PROCEDURE func1();



What I get, however, is that for an UPDATE, the NEW.b_flag value evaluates
to TRUE if it has been set to TRUE by a previous UPDATE that explicitly sets
it.  So...

tqa= INSERT INTO table1 VALUES ( '1', 'some data');
NOTICE:  NEW.b_flag=NULL
INSERT 0 1
tqa= SELECT b_flag FROM table1 WHERE id='1';
 b_flag

 f
(1 row)

tqa= UPDATE table1 SET b_flag=TRUE where id='1';
NOTICE:  NEW.b_flag=t
UPDATE 1
tqa= SELECT b_flag FROM table1 WHERE id='1';
 b_flag

 t
(1 row)

tqa= UPDATE table1 SET data='new data' where id='1';
NOTICE:  NEW.b_flag=t
UPDATE 1
tqa= SELECT b_flag FROM table1 WHERE id='1';
 b_flag

 t
(1 row)



As you can see, I put a RAISE NOTICE and verified that for some reason, the
NEW row contains a 'TRUE' value for b_flag, even though I didn't explicitly
set it in the last UPDATE statement.  Why does it seem to be reading the
value from the OLD row for that column unless I override it inside the SQL
statement?  Is there any way to achieve the desired result without having to
explicitly set 'b_flag' each time I touch a row in the table?




NOTICE: This electronic mail transmission may contain confidential
information and is intended only for the person(s) named.  Any use, copying
or disclosure by any other person is strictly prohibited. If you have
received this transmission in error, please notify the sender via e-mail.