[GENERAL] Problem with index not being chosen inside PL/PgSQL function...
Hi all, Thanks to those on this list who contribute their knowledge for our consumption! I have another question: The problem I'm having is in one particular spot where I'm trying to run a parametized query inside a PL/PgSQL function. When I run the query directly, the planner correctly chooses to use an index. When I run the query with hard-coded values inside the function, the planner uses the index again. However, when I try to pass in one of the values as a parameter, the query suddenly takes 45+ seconds to run (and is obviously no longer using the query). Inside the function, I've tried using bound cursors, unbound cursors, and a direct query as shown in the test_unlock function below. Here are the details: EXPLAIN ANALYZE SELECT ah.* FROM alert ah WHERE ( (ah.replaced_by_id = '0') AND (ah.not_displayed_id = '7714598') ); Index Scan using idx_acurr on alert ah (cost=0.00..4.44 rows=1 width=768) (actual time=61.100..61.100 rows=0 loops=1) Index Cond: ((replaced_by_id = 0) AND (not_displayed_id = 7714598)) Total runtime: 61.459 ms CREATE OR REPLACE FUNCTION test_unlock ( id_locked alert.id%TYPE ) RETURNS alert.id%TYPE AS $test_unlock$ DECLARE last_alert alert%ROWTYPE; BEGIN RAISE NOTICE 'Fetching data...'; SELECT ah.* INTO last_alert FROM alert ah where ( (ah.replaced_by_id = '0') AND (ah.not_displayed_id = id_locked ) ); RAISE NOTICE 'Data fetched...'; IF NOT FOUND THEN RAISE NOTICE 'No locked out alert was found!'; ELSE RAISE NOTICE 'Alert id % was found!', last_alert.id; END IF; RETURN last_alert.id; END; $test_unlock$ LANGUAGE plpgsql; SELECT * FROM test_unlock( '7714598'); Using the function, I get a 45+ second delay between the fetching notice and the fetched notice. Is there some way I can help the planner out further? I ever tried adding an 'ORDER BY replaced_by_id, not_displayed_id to help it find the index, but no luck there... Thanks in advance for any help! NOTICE: This electronic mail transmission may contain confidential information and is intended only for the person(s) named. Any use, copying, or disclosure by any other person is strictly prohibited. If you have received this transmission in error, please notify the sender via e-mail. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL function...
Hmm - good question! However, it is - both the id and not_displayed_id are INTEGERs. Changing the function header to: CREATE OR REPLACE FUNCTION test_unlock ( id_locked alert.not_displayed_id%TYPE ) RETURNS alert.id%TYPE AS $test_unlock$ sadly doesn't affect the performance at all. I should have been a little more careful with the datatypes there, but this was a temporary function used to help me debug the problem and also help show it to the world. The original function has a bit more to it and is called by a higher-level function, but I've tracked the slowness down to this issue :)... Just for grins, I also changed the query to: SELECT ah.* INTO last_alert FROM alert ah where ( (ah.replaced_by_id = '0') AND (not_displayed_id = id_locked::INTEGER ) ) ORDER BY replaced_by_id, not_displayed_id; Still no improvement :(. Thanks for the suggestion though! From: Tom Lane [EMAIL PROTECTED] Sent: Tuesday, December 18, 2007 10:11 AM To: Weber, Geoffrey M. Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL function... Weber, Geoffrey M. [EMAIL PROTECTED] writes: The problem I'm having is in one particular spot where I'm trying to run a parametized query inside a PL/PgSQL function. I wonder whether the parameter is actually of the same datatype as the indexed column. regards, tom lane __ This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email __ NOTICE: This electronic mail transmission may contain confidential information and is intended only for the person(s) named. Any use, copying or disclosure by any other person is strictly prohibited. If you have received this transmission in error, please notify the sender via e-mail. NOTICE: This electronic mail transmission may contain confidential information and is intended only for the person(s) named. Any use, copying, or disclosure by any other person is strictly prohibited. If you have received this transmission in error, please notify the sender via e-mail. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL function...
Tom, Yes, the distribution must be what's doing it. I guess I knew that subconciously, but was looking for something like hints to force the planner to do what I wanted. Instead it looks like I'll have to do a bit of tweaking with my indexes. Probably a partial index on the 'not_displayed_id' column. It'll be very small and shouldn't cause much overhead. I was trying to keep my index count down, and have had a dual-column index on (replaced_by_id, not_displayed_id) to this point. Thanks once again for your help! From: Tom Lane [EMAIL PROTECTED] Sent: Tuesday, December 18, 2007 10:36 AM To: Weber, Geoffrey M. Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL function... Weber, Geoffrey M. [EMAIL PROTECTED] writes: Hmm - good question! However, it is - both the id and not_displayed_id are INTEGERs. Well, in that case it must be a statistics issue --- does the indexed column have a badly skewed distribution? You could investigate how many rows the planner thinks will be fetched via PREPARE foo(int) AS SELECT ah.* FROM alert ah where ( (ah.replaced_by_id = '0') AND (not_displayed_id = $1 ) ) ORDER BY replaced_by_id, not_displayed_id; EXPLAIN EXECUTE foo(42); which will set up exactly the same planning situation as occurs in the plpgsql function: no knowledge of the exact value being compared to. regards, tom lane __ This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email __ NOTICE: This electronic mail transmission may contain confidential information and is intended only for the person(s) named. Any use, copying or disclosure by any other person is strictly prohibited. If you have received this transmission in error, please notify the sender via e-mail. NOTICE: This electronic mail transmission may contain confidential information and is intended only for the person(s) named. Any use, copying, or disclosure by any other person is strictly prohibited. If you have received this transmission in error, please notify the sender via e-mail. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL function...
OK - in that same vain, I thought of something even better - using dynamic SQL instead. It sped things up right away! Thanks for putting me on the right track! From: Scott Marlowe [EMAIL PROTECTED] Sent: Tuesday, December 18, 2007 12:02 PM To: Weber, Geoffrey M. Cc: Tom Lane; pgsql-general@postgresql.org Subject: Re: [GENERAL] Problem with index not being chosen inside PL/PgSQL function... On Dec 18, 2007 10:54 AM, Weber, Geoffrey M. [EMAIL PROTECTED] wrote: Tom, Yes, the distribution must be what's doing it. I guess I knew that subconciously, but was looking for something like hints to force the planner to do what I wanted. Instead it looks like I'll have to do a bit of tweaking with my indexes. Probably a partial index on the 'not_displayed_id' column. It'll be very small and shouldn't cause much overhead. I was trying to keep my index count down, and have had a dual-column index on (replaced_by_id, not_displayed_id) to this point. Fix not with a hammer that which you can fix with a screwdriver. Fix not with a screwdriver that which you can fix with a knob Have you tried increasing the stats target of the guilty column and reanalyzing to see if that helps? __ This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email __ NOTICE: This electronic mail transmission may contain confidential information and is intended only for the person(s) named. Any use, copying or disclosure by any other person is strictly prohibited. If you have received this transmission in error, please notify the sender via e-mail. NOTICE: This electronic mail transmission may contain confidential information and is intended only for the person(s) named. Any use, copying, or disclosure by any other person is strictly prohibited. If you have received this transmission in error, please notify the sender via e-mail. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] Continual Postgres headaches...
I've been trying for quite a while to get Postgresql tuned for use as an OLTP system. I have several PL/pgSQL functions that handle inserts and updates to the main table and several near-real-time daemons written that access the data and can take automated actions on it (email/page concerned people, get complimentary information from a different system, etc.). I started with Postgres 8.1 and am now using 8.2.4 (and have been since its release). I'll try to provide enough information for a decent response, but as I can't obviously put my entire schema and database out there, I'm hoping that I can get some decent guidelines beyond that what I've found though Google, etc. to get this thing tuned better. Most of the data centers in on a central table and has 23 columns, 1 constraint, and 9 indexes. 4 of the indexes are partial. The table usually contains about 3-4 million rows, but I've cut it down to 1.2 million (cut out 2/3 of the data) in an effort to migrate the database to a 2nd sever for more testing. The two partial indexes used the most: 242MB accessed nearly constantly, and 15MB accessed every 5 seconds - but also updated constantly via inserts using the 242MB index. Other than one other 25MB index, the others seem to average around 300MB each, but these aren't used quite as often (usually about every minute or so). My problems really are with performance consistency. I have tweaked the execution so that everything should run with sub-second execution times, but even after everything is running well, I can get at most a week or two of steady running before things start to degrade. In my efforts to correct these consistency in execution problems, I have gone from vacuuming (with analyze) twice a day to every 30 minutes (how long it takes a vacuum analyze to run - another seeming problem because it shouldn't take so long?). I've done a VACUUM FULL and that sometimes helps, but obviously is not a long-term solution for a true OLTP system. Why I said sometimes helps: earlier this week I did a VACUUM FULL, VACUUM ANALYZE on the entire schema. A query that was supposed to use the 25MB index above (that matched the parameters of the partial index exactly) was still not doing so. I had to DROP and re-CREATE the index (and do another ANALYZE) to get the database to (correctly) use the index. Another problem: sometimes I get these unexplainable delays in using a 'stored procedure' even though all its SQL calls run just fine individually. For example, calling a particular function will take 1 minute to execute even though entering the SQL commands individually through psql will total up to about 1 second. When I log 'long-duration queries', I only see the top function call in the log. When I put 'RAISE NOTICE' statements in the various PL/pgSQL functions, I only see them displayed for the function I call directly; any underlying functions called from that function does not show any of their NOTICE statements. Because of this I can't track down where the delays are occurring for me to try and correct the problem (not that there should be one in the first place)! I should also note that there is not any lock contention (double-checked with pgadmin3) showing that would artificially delay the queries... I've played with quite a few of the parameters like vacuum_cost_delay, work_mem, shared_buffers, turned fsync off, etc. in the postgresql.conf file. The server I've been given to use is quite old (read: not the fastest anymore), but normally shows a load average of 0.25.to 0.50 or so. I guess my questions boil down to what I can do other than use the EXPLAIN, EXPLAIN ANALYZE, VACUUM ANALYZE, etc. to get my schema performing acceptably across the board in a consistent manner. If there is more information that I can give out that would improve responses, please let me know. Thanks in advance for any help you send my way! NOTICE: This electronic mail transmission may contain confidential information and is intended only for the person(s) named. Any use, copying or disclosure by any other person is strictly prohibited. If you have received this transmission in error, please notify the sender via e-mail.
[GENERAL] Triggers: Detecting if a column value is explicitly set in an UPD ATE statement or not... (IS NULL not working?)
I'm having a problem, and can't seem to find a good answer in the mailing list archives... sorry if I'm missing something obvious! Postgres version: 8.2.4 O/S: Solaris 10 I want to set a BOOLEAN column value to FALSE by default for all INSERT and UPDATE statements performed against a particular table _UNLESS_ it's explicitly set to TRUE in the SQL statement. Here is the trigger I created: CREATE TABLE table1 ( id INTEGER PRIMARY KEY, data VARCHAR(64), b_flag BOOLEAN); CREATE OR REPLACE FUNCTION func1() RETURNS trigger AS $func1$ BEGIN RAISE NOTICE 'NEW.b_flag=%', NEW.b_flag; IF ( NEW.b_flag IS NULL ) THEN NEW.b_flag := FALSE; END IF; RETURN NEW; END; $func1$ LANGUAGE plpgsql; CREATE TRIGGER func1 BEFORE INSERT OR UPDATE ON table1 FOR EACH ROW EXECUTE PROCEDURE func1(); What I get, however, is that for an UPDATE, the NEW.b_flag value evaluates to TRUE if it has been set to TRUE by a previous UPDATE that explicitly sets it. So... tqa= INSERT INTO table1 VALUES ( '1', 'some data'); NOTICE: NEW.b_flag=NULL INSERT 0 1 tqa= SELECT b_flag FROM table1 WHERE id='1'; b_flag f (1 row) tqa= UPDATE table1 SET b_flag=TRUE where id='1'; NOTICE: NEW.b_flag=t UPDATE 1 tqa= SELECT b_flag FROM table1 WHERE id='1'; b_flag t (1 row) tqa= UPDATE table1 SET data='new data' where id='1'; NOTICE: NEW.b_flag=t UPDATE 1 tqa= SELECT b_flag FROM table1 WHERE id='1'; b_flag t (1 row) As you can see, I put a RAISE NOTICE and verified that for some reason, the NEW row contains a 'TRUE' value for b_flag, even though I didn't explicitly set it in the last UPDATE statement. Why does it seem to be reading the value from the OLD row for that column unless I override it inside the SQL statement? Is there any way to achieve the desired result without having to explicitly set 'b_flag' each time I touch a row in the table? NOTICE: This electronic mail transmission may contain confidential information and is intended only for the person(s) named. Any use, copying or disclosure by any other person is strictly prohibited. If you have received this transmission in error, please notify the sender via e-mail.