[SQL] DROP IF ...
PostgreSQL 7.4 ...

I'm trying to find a way to drop a table via SQL if it contains 0 rows. Here was my thought:

    CREATE OR REPLACE FUNCTION dropif(text, bool) RETURNS bool AS '
    DECLARE
        tblname ALIAS FOR $1;
        condition ALIAS FOR $2;
    BEGIN
        IF (condition) THEN
            EXECUTE(\'DROP TABLE "\' || tblname || \'";\');
        END IF;
        RETURN \'t\'::bool;
    END;' LANGUAGE 'plpgsql' VOLATILE;

... then ...

    BEGIN;
    CREATE TABLE testtbl (i int4);
    SELECT dropif('testtbl',(SELECT count(*)::int4 FROM testtbl) = 0);

    ERROR:  relation 286000108 is still open
    CONTEXT:  PL/pgSQL function "dropif" line 6 at execute statement

It makes sense: the select is still open when the table is going to be dropped. I need a different strategy. Please advise!

CG

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
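One way around the open-relation error might be to move the emptiness test inside the function, so no outer query is holding the target table open when the DROP executes. A sketch (untested on 7.4; the function name `drop_if_empty` is made up, and the `FOR ... IN EXECUTE` loop is used because `EXECUTE ... INTO` does not exist in 7.4's PL/pgSQL):

```sql
-- Hypothetical variant: the row-count check happens inside the function,
-- so the caller's SELECT never has the target table open.
CREATE OR REPLACE FUNCTION drop_if_empty(text) RETURNS bool AS '
DECLARE
    tblname ALIAS FOR $1;
    r RECORD;
BEGIN
    FOR r IN EXECUTE \'SELECT count(*) AS n FROM "\' || tblname || \'"\' LOOP
        IF r.n = 0 THEN
            EXECUTE \'DROP TABLE "\' || tblname || \'"\';
            RETURN \'t\'::bool;
        END IF;
    END LOOP;
    RETURN \'f\'::bool;
END;' LANGUAGE 'plpgsql' VOLATILE;

-- Usage: SELECT drop_if_empty('testtbl');
```

By the time the dynamic `count(*)` finishes, its scan of the table is complete, so the subsequent `DROP TABLE` should not find the relation still open.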
[SQL] Clustering problem
I have what I call a "dictionary" table which supports a "master table". This dictionary table is designed to hold generic data:

    CREATE TABLE sup_data (
        link_id uniqueidentifier,
        field_name varchar(255),
        field_data text
    ) WITH OIDS;

It works well when you're digging into it to pull the supplementary information for a small number of rows in the master table. It uses an index on the link_id, and can jump right to the few pages for the supplemental data. That was the design.

Now "the powers that be" want to do some aggregate inquiries on subsets of the generic data, based on many rows from the master table. This doesn't work so well: it's having to pull many pages to create the result set to aggregate on. If I could cluster the generic data to match the clustering on the master table, it would reduce the number of pulled pages considerably, and the speedup would make it work well. I'm trying to avoid replicating the column and index used to cluster the main table in this dictionary table.

Is it even possible to cluster a table based on the clustering scheme (which is not the link_id ...) from the master table?

Can you gurus think of a better strategy? :) (Please??) :)

CG
Re: [SQL] Clustering problem
Why would you then drop the index? Performance and storage issues? I imagine that I would cluster the table at regular intervals to maintain the ordering, so I'd need to keep the index around, yes?

--- PFC <[EMAIL PROTECTED]> wrote:
> > Is it even possible to cluster a table based on the clustering scheme
> > (which is not the link_id ...) from the master table?
> >
> > Can you gurus think of a better strategy? :) (Please??) :)
>
> You can create a functional index on a function which returns the desired
> order by looking in the main table, cluster it, then drop the index...
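PFC's suggestion could be sketched like this (all names here are hypothetical, and the master table's cluster key is assumed to be a timestamp). One caveat: a function used in an index must be declared IMMUTABLE, and a function that reads another table is not truly immutable, so this is a known hack that is only safe for a one-shot CLUSTER, not for an index you keep around:

```sql
-- Sketch of the functional-index-then-cluster trick; names are made up.
-- master_sort_key() looks up the master table's cluster key for a link_id.
CREATE OR REPLACE FUNCTION master_sort_key(uniqueidentifier) RETURNS timestamp AS '
    SELECT sort_col FROM master_table WHERE link_id = $1;
' LANGUAGE 'sql' IMMUTABLE;  -- "immutable" only for the duration of the CLUSTER

CREATE INDEX sup_data_cluster_idx ON sup_data (master_sort_key(link_id));
CLUSTER sup_data_cluster_idx ON sup_data;  -- pre-8.3 CLUSTER syntax
DROP INDEX sup_data_cluster_idx;           -- then drop it, per PFC
```

Dropping the index afterward avoids paying its maintenance cost (and the risk of the planner trusting the fake-immutable function) between re-clusterings; the physical row order persists until enough churn undoes it.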
[SQL] empty set
PostgreSQL 8.1

I've been trying to write a SQL prepare routine. One of the challenging elements I'm running into is an empty set ...

    "select foo from bar where foo in ? ;"

What if "?" is a set with zero elements? What is the proper value to use to replace "?" indicating an empty set?

CG
Re: [SQL] empty set
"select 1 where false" does indeed indicate an empty set. I was hoping for something more elegant, but I'll take what I can get. :)

--- Bruno Wolff III <[EMAIL PROTECTED]> wrote:
> On Thu, Jun 08, 2006 at 14:40:12 -0700,
>   CG <[EMAIL PROTECTED]> wrote:
> > PostgreSQL 8.1
> >
> > I've been trying to write a SQL prepare routine. One of the challenging
> > elements I'm running into is an empty set ...
> >
> > "select foo from bar where foo in ? ;"
> >
> > What if "?" is a set with zero elements? What is the proper value to use
> > to replace "?" indicating an empty set?
>
> Something like the following suggests you can put a subquery there that
> returns 0 rows. I don't think that will work for prepared queries though,
> so it may not help you.
>
> bruno=> select * from test where test in (select 1 where false);
>  test
> ------
> (0 rows)
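Spelled out, the substitution the prepare routine would make might look like this (a sketch against the hypothetical `bar` table from the question; an empty literal list `IN ()` is a syntax error, so the zero-element case needs a different rewrite):

```sql
-- Non-empty case: splice the literals into the IN list as usual.
SELECT foo FROM bar WHERE foo IN (1, 2, 3);

-- Empty case: substitute a subquery that returns zero rows,
-- so the predicate is false for every row.
SELECT foo FROM bar WHERE foo IN (SELECT 1 WHERE false);

-- Equivalent, and arguably clearer: short-circuit the whole predicate.
SELECT foo FROM bar WHERE false;
```

Either rewrite returns zero rows; the `WHERE false` form also lets the planner skip the scan entirely.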
[SQL] Nested loops are killing throughput
Postgresql 8.1

I've tried turning off nested loops to see what the query planner would choose instead of nested loops. It chose a hash join, and it still had a nested loop in the mix! How can I entice the query planner to use a simpler join scheme? What criteria are used to determine whether or not a merge join will be used? Here's an explain analyze...

    Subquery Scan foo  (cost=762.21..762.50 rows=1 width=12) (actual time=10784.849..10786.992 rows=9 loops=1)
      ->  GroupAggregate  (cost=762.21..762.24 rows=1 width=30) (actual time=10767.702..10768.539 rows=9 loops=1)
            ->  Sort  (cost=762.21..762.21 rows=1 width=30) (actual time=10767.525..10767.989 rows=360 loops=1)
                  Sort Key: (dpdl.field_value)::timestamp with time zone
                  ->  Nested Loop  (cost=2.01..762.20 rows=1 width=30) (actual time=283.824..10749.007 rows=360 loops=1)
                        ->  Nested Loop  (cost=0.00..744.28 rows=1 width=16) (actual time=31.227..6980.765 rows=5436 loops=1)
                              ->  Nested Loop  (cost=0.00..135.29 rows=101 width=16) (actual time=25.514..273.660 rows=5436 loops=1)
                                    ->  Index Scan using ut_company_name_idx on user_table ut  (cost=0.00..21.96 rows=5 width=11) (actual time=6.596..6.649 rows=9 loops=1)
                                          Index Cond: ((company_name)::text = 'Acme, Inc.'::text)
                                    ->  Index Scan using packet_user_idx on packet dp  (cost=0.00..19.89 rows=222 width=27) (actual time=16.939..28.025 rows=604 loops=9)
                                          Index Cond: ((dp.username)::text = ("outer".username)::text)
                                          Filter: (trans_date > (date_trunc('month'::text, (now() - '1 mon'::interval)) - '1 year 6 mons'::interval))
                              ->  Index Scan using packet_status_puuid_pkey on packet_status dps  (cost=0.00..6.02 rows=1 width=16) (actual time=1.226..1.228 rows=1 loops=5436)
                                    Index Cond: (dps.packet_uuid = "outer".packet_uuid)
                                    Filter: ((status & 2) = 0)
                        ->  Bitmap Heap Scan on packet_datalink dpdl  (cost=2.01..17.87 rows=4 width=30) (actual time=0.685..0.688 rows=0 loops=5436)
                              Recheck Cond: (dpdl.packet_uuid = "outer".packet_uuid)
                              Filter: (((field_name)::text = 'event_date'::text) AND ((field_value)::date >= ('now'::text)::date))
                              ->  Bitmap Index Scan on packet_dl_puuid_idx  (cost=0.00..2.01 rows=4 width=0) (actual time=0.402..0.402 rows=4 loops=5436)
                                    Index Cond: (dpdl.packet_uuid = "outer".packet_uuid)
    Total runtime: 10787.198 ms

Also, no one here can figure out why the row count prediction is off on the packet table either. It's vacuumed and analyzed. ...
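Two standard experiments for a plan like this (a sketch; these are stock planner settings and statistics knobs, not a guaranteed fix):

```sql
-- Discourage nested loops for this session only, then re-run the
-- EXPLAIN ANALYZE to see what the planner picks instead:
SET enable_nestloop = off;
-- EXPLAIN ANALYZE <the query above>;
SET enable_nestloop = on;  -- restore the default afterward

-- For the row-count misestimate on packet, raising the per-column
-- statistics target and re-analyzing sometimes helps:
ALTER TABLE packet ALTER COLUMN username SET STATISTICS 100;
ANALYZE packet;
```

`enable_nestloop = off` doesn't forbid nested loops outright; it just makes them look very expensive, which is why one can still appear when no other join method is possible.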
Re: [SQL] Nested loops are killing throughput
- Original Message From: Tom Lane <[EMAIL PROTECTED]> To: CG <[EMAIL PROTECTED]> Cc: pgsql-sql@postgresql.org Sent: Tuesday, September 19, 2006 11:03:07 AM Subject: Re: [SQL] Nested loops are killing throughput

CG <[EMAIL PROTECTED]> writes:
> ->  Nested Loop  (cost=0.00..744.28 rows=1 width=16) (actual time=31.227..6980.765 rows=5436 loops=1)
>       ->  Nested Loop  (cost=0.00..135.29 rows=101 width=16) (actual time=25.514..273.660 rows=5436 loops=1)
> ...
>       ->  Index Scan using packet_status_puuid_pkey on packet_status dps  (cost=0.00..6.02 rows=1 width=16) (actual time=1.226..1.228 rows=1 loops=5436)
>             Index Cond: (dps.packet_uuid = "outer".packet_uuid)
>             Filter: ((status & 2) = 0)

> One problem you've got is that the planner has no stats about the
> selectivity of that status condition.  My advice would be to forget the
> cute bitmask and store one or more plain boolean columns.  Easier to
> write queries against and way more tractable statistically.

n/p ... I'll ditch it. Even if I simplify that query down to a straightforward example...

    select dp.*, dps.status
    from dpo.packet dp, dpo.packet_status dps
    where dp.packet_uuid = dps.packet_uuid
    and dp.username = 'joeuser';

I'm still getting the nested loop for a join.

    Nested Loop  (cost=1.00..100013378.98 rows=2206 width=145) (actual time=46.743..18202.318 rows=2225 loops=1)
      ->  Index Scan using packet_user_idx on packet dp  (cost=0.00..88.03 rows=2206 width=125) (actual time=42.263..124.519 rows=2225 loops=1)
            Index Cond: ((username)::text = 'joeuser'::text)
      ->  Index Scan using packet_status_puuid_pkey on packet_status dps  (cost=0.00..6.01 rows=1 width=20) (actual time=8.115..8.117 rows=1 loops=2225)
            Index Cond: ("outer".packet_uuid = dps.packet_uuid)
    Total runtime: 18205.880 ms

These indexes are being used, and the tables are freshly vacuum-analyzed...

    CREATE UNIQUE INDEX packet_puuid_idx ON dpo.packet USING btree (packet_uuid);
    CREATE INDEX packet_user_idx ON dpo.packet USING btree (username);
    CREATE UNIQUE INDEX packet_status_puuid_pkey ON dpo.packet_status USING btree (packet_uuid);

I have no idea why I'm having such a difficult time digging through this data. It should be very straightforward.

CG
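Tom's bitmask-to-boolean advice might look like this (a sketch; the column name `archived` and the meaning of bit 2 are assumptions, not from the thread):

```sql
-- Replace the opaque bitmask with an explicit boolean the planner can
-- gather statistics on. Bit 2 is assumed here to mean "archived".
ALTER TABLE dpo.packet_status ADD COLUMN archived boolean;
UPDATE dpo.packet_status SET archived = ((status & 2) <> 0);
ANALYZE dpo.packet_status;

-- The old filter   ((status & 2) = 0)
-- then becomes     WHERE NOT archived
-- ... which eqsel-style selectivity estimation can actually reason about.
```

The planner has no selectivity estimator for `(status & 2) = 0`, so it falls back on a default guess; a plain boolean column gets real statistics from ANALYZE.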
Re: [SQL] Nested loops are killing throughput
> If packet_status is large, that seems like a perfectly reasonable plan
> to me.  If not ... what data type is packet_uuid?  Is its equality op
> marked mergeable or hashable?

It is of type uniqueidentifier ...

    CREATE OPERATOR =(
        PROCEDURE = uniqueidentifier_eq,
        LEFTARG = uniqueidentifier,
        RIGHTARG = uniqueidentifier,
        COMMUTATOR = =,
        RESTRICT = eqsel,
        JOIN = eqjoinsel,
        HASHES,
        MERGES,
        SORT1 = <,
        SORT2 = <,
        LTCMP = <,
        GTCMP = >);

It looks like it is marked both mergeable and hashable.

So what size (I have no idea what units to use to think about table size... kilobytes? pages? rows?) is "large"? It seems like we recently stepped over that imaginary line between medium and large.

We could re-tune PostgreSQL... What settings would make sense to tweak? Sort memory?

We could partition the tables... I understand the theory behind table partitioning and partial indexes, but I've never come across a howto or even a best-practices list.

We could remove data from the table... I don't like this as much, but if that's the only way, that's the only way.
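The operator's actual markings, as the planner sees them, can be checked in the system catalogs rather than trusting the CREATE OPERATOR script (a sketch; these are standard `pg_operator`/`pg_proc` columns):

```sql
-- How is the uuid equality operator, and its underlying function, marked?
SELECT o.oprname, o.oprcanhash, p.proname, p.provolatile, p.proisstrict
FROM pg_operator o
JOIN pg_proc p ON p.oid = o.oprcode
WHERE o.oprname = '=' AND p.proname = 'uniqueidentifier_eq';
-- provolatile: 'i' = immutable, 's' = stable, 'v' = volatile
```

The HASHES/MERGES flags on the operator are necessary but not sufficient: the planner also looks at the support function's properties, which is where the surprise turns up in the next message.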
Re: [SQL] Nested loops are killing throughput
> Is that the one off gborg?  It's broken because the equality function is
> marked volatile, and so the planner is afraid to try to use it for
> merging or hashing.  (It's also not marked strict, which means you can
> trivially crash the backend by passing it a null ...)

It is the one from gborg. That's good to know. So to fix, I should alter the supporting functions, set the ones that should not be VOLATILE to IMMUTABLE (newid() should be VOLATILE, and that's about it), and set them to STRICT as well. I haven't had a problem with it crashing my backend in the three years we've been using it, but I'd rather not take any chances.

I know the hackers list has been batting back and forth the idea of a built-in UUID type. I'll be interested to see if you end up with a standard UUID type, a fixed field type with multiple generators for differing levels of paranoia, or the status quo.

CG
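The fix described above might be sketched like this (function names are assumed from the gborg uniqueidentifier module; on 8.1 `ALTER FUNCTION` can change these flags in place, and on older servers `CREATE OR REPLACE FUNCTION` with the corrected attributes works too):

```sql
-- Mark the comparison support functions immutable and strict so the
-- planner will consider merge and hash joins:
ALTER FUNCTION uniqueidentifier_eq(uniqueidentifier, uniqueidentifier)
    IMMUTABLE STRICT;
ALTER FUNCTION uniqueidentifier_lt(uniqueidentifier, uniqueidentifier)
    IMMUTABLE STRICT;
-- ... and likewise for the remaining comparators. newid() stays
-- VOLATILE, since it must return a different value on every call.
```

STRICT makes the functions return NULL on NULL input instead of being called with nulls they don't guard against, which is what closes the crash window Tom mentions.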
Re: [SQL] Nested loops are killing throughput
> Is that the one off gborg?  It's broken because the equality function is
> marked volatile, and so the planner is afraid to try to use it for
> merging or hashing.  (It's also not marked strict, which means you can
> trivially crash the backend by passing it a null ...)

Tom, you are a genius. I changed the supporting functions and started getting what I expected. You were right on the money and saved my bacon. I don't say it often enough: Thanks!