[SQL] bug in information_schema?
I noticed that it seemed a bit slow to query information_schema.view_column_usage. As I look at the code in information_schema.sql, I'm not sure why pg_user is referenced twice (once without an alias). It looks like we can take out the first pg_user and remove the DISTINCT keyword, and this improves the efficiency significantly. It seems to return the same result, but in half the time. Anyone see a problem with this? (The same problem may also be in view_table_usage, but I haven't done any testing there yet.)

Code from information_schema.sql:

    CREATE VIEW view_column_usage AS
        SELECT DISTINCT
               CAST(current_database() AS sql_identifier) AS view_catalog,
               CAST(nv.nspname AS sql_identifier) AS view_schema,
               CAST(v.relname AS sql_identifier) AS view_name,
               CAST(current_database() AS sql_identifier) AS table_catalog,
               CAST(nt.nspname AS sql_identifier) AS table_schema,
               CAST(t.relname AS sql_identifier) AS table_name,
               CAST(a.attname AS sql_identifier) AS column_name
        FROM pg_user, pg_namespace nv, pg_class v, pg_depend dv,
             pg_depend dt, pg_class t, pg_namespace nt, pg_attribute a,
             pg_user u
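For reference, a minimal sketch of the rewrite being proposed (the WHERE clause, not shown above, is assumed unchanged from information_schema.sql): drop the unaliased pg_user from the FROM list and drop DISTINCT:

    CREATE VIEW view_column_usage AS
        SELECT CAST(current_database() AS sql_identifier) AS view_catalog,
               CAST(nv.nspname AS sql_identifier) AS view_schema,
               CAST(v.relname AS sql_identifier) AS view_name,
               CAST(current_database() AS sql_identifier) AS table_catalog,
               CAST(nt.nspname AS sql_identifier) AS table_schema,
               CAST(t.relname AS sql_identifier) AS table_name,
               CAST(a.attname AS sql_identifier) AS column_name
        FROM pg_namespace nv, pg_class v, pg_depend dv,
             pg_depend dt, pg_class t, pg_namespace nt, pg_attribute a,
             pg_user u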
[SQL] Trigger on select?
Hey, anyone know if it is possible to fire a trigger before a select? I'm considering creating some tables which contain data summarized from other tables, as a kind of cache mechanism. The hope is that I can speed up some queries that get executed a lot (and are kind of slow) by maintaining data (like sums of big columns, for example).

I was all ready to go, and then I discovered that trigger-before-select is not supported. (Is it in any DB?) The idea is this: any time I execute a query that would access the summary data, the "before select" trigger fires and goes out and builds any summary data missing from the summary table. When I do an insert, update, or delete on the primary data table, another trigger fires that removes the applicable data from the summary table. This way, I only cache the information I need in the summary table, right before I need it. But it can stay there as long as the base information doesn't change, so I don't have to redo the expensive operation of summarizing it any more often than necessary.

It's kind of like an index in a way, but it is not maintained at insert/update time. Rather, it is updated as it is needed. Anyone have any ideas about how I can accomplish this?

Kyle
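Lacking before-select triggers, one workaround is to hide the summary table behind a function and select through that, so the "refresh if missing" step runs on every access. A minimal plpgsql sketch; all names here (summary, base_data, grp, big_col) are hypothetical stand-ins, not from the original post:

    create function get_summary(int4) returns numeric as '
    begin
        -- build the missing summary row on demand
        if not exists (select 1 from summary where grp = $1) then
            insert into summary (grp, total)
                select grp, sum(big_col) from base_data
                where grp = $1 group by grp;
        end if;
        return (select total from summary where grp = $1);
    end;' language 'plpgsql';

    -- usage: select get_summary(42);

The delete-on-change triggers on the base table would work exactly as described above, so a later get_summary() call rebuilds the row.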
[SQL] information_schema problem
I'm trying to use information_schema.view_column_usage to determine the native table from which various view columns descend. This is so my interface can automatically generate the correct foreign key links from one view to another. But in the case where a view references two tables linked by a foreign key, the key fields are reported as belonging to both tables.

The enclosed example shows two tables related by a foreign key relationship. The view "event_remind_v" really only produces fields from event_remind. But in order to satisfy the "exists" clause, it also references fields from the event table. view_column_usage reports the fields "own_id" and "seq" as belonging to both table "event_remind" and table "event". My code needs a way to know that "event_remind" is the table they "really" come from.

1. Is this the correct behavior for information_schema.view_column_usage? Should it report a column as belonging to more than one table? If not, how can I fix it? The enclosed script includes a (slightly revised) version of view_column_usage that is easier to hack on than the one inside information_schema.

2. If information_schema.view_column_usage is working right, is there a way I can modify my local view_column_usage to distinguish between columns that actually "belong" to the view and related columns from a foreign key relationship?

Example code:

    drop view event_remind_v;
    drop table event_remind;
    drop table event;
    drop view view_column_usage;

    -- Contains an entry for each scheduled calendar event
    create table event (
        own_id  int4,
        seq     int4,
        status  varchar,
        summary varchar,
        primary key (own_id, seq)
    );

    -- Contains an entry for each reminder for each event
    create table event_remind (
        own_id  int4,
        seq     int4,
        advance interval,
        primary key (own_id, seq, advance),
        foreign key (own_id, seq) references event
            on update cascade on delete cascade
    );

    create view event_remind_v as
        select * from event_remind r
        where exists (select * from event
                      where own_id = r.own_id and seq = r.seq
                        and status = 'open');

    create view view_column_usage as
        select v.relname as "view_name", t.relname as "table_name",
               at.attname as "column_name"
        from pg_depend dv, pg_class v, pg_namespace nv,
             pg_depend dt, pg_class t, pg_namespace nt, pg_attribute at
        where dv.objid = dt.objid
          and dv.refobjid <> dt.refobjid
          and dv.deptype = 'i'
          and v.relkind = 'v'
          and t.relkind in ('r', 'v')
          and v.oid = dv.refobjid
          and t.oid = dt.refobjid
          and t.relnamespace = nt.oid
          and v.relnamespace = nv.oid
          and dv.classid = dt.classid
          and dv.classid = 'pg_catalog.pg_rewrite'::regclass
          and dv.refclassid = dt.refclassid
          and dv.refclassid = 'pg_catalog.pg_class'::regclass
          and t.oid = at.attrelid
          and dt.refobjsubid = at.attnum
          and nv.nspname = 'public'
          and nt.nspname = 'public';

    select view_name, column_name, table_name
        from view_column_usage
        where view_name = 'event_remind_v';

    select view_name, column_name, table_name
        from information_schema.view_column_usage
        where view_name = 'event_remind_v';
[SQL] unplanned sub-select error?
I have a query:

    insert into mtr_reg_v_wt
        (ropnum, inum, pnum, rquant, value, status, ddate, fr_proj, to_proj)
    values
        (28985, 1, 1014, 1,
         (select cost from prd_part where pnum = 1014),
         'work', '2005-Nov-15', 50, 75);

That used to work fine under 7.1.3 but now gives the error:

    ERROR: cannot handle unplanned sub-select

Anyone know what this means? Is there a good reason why this insert should no longer work? Or is this a bug?

Kyle
wyatterp.com
Re: [SQL] unplanned sub-select error?
Tom Lane wrote:

> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> I have a query:
>>
>>     insert into mtr_reg_v_wt
>>         (ropnum, inum, pnum, rquant, value, status, ddate, fr_proj, to_proj)
>>     values (28985, 1, 1014, 1,
>>             (select cost from prd_part where pnum = 1014),
>>             'work', '2005-Nov-15', 50, 75);
>>
>> That used to work fine under 7.1.3 but now gives the error:
>> ERROR: cannot handle unplanned sub-select
>
> You need to offer a little more context, like what PG version you are
> using now and what is the underlying DDL --- I suspect some rules or
> views are involved here, but you didn't show them to us.

Sorry, you're right. I have now confirmed that this only happens when updating via a view/rule (as you suspected). Attached is a minimalist sql file that demonstrates the same error message from a blank database. I'm using 8.1.0. I'm pretty sure this problem did not exist on 8.0.3.

Kyle

-- Expose the "unplanned sub-select" error message

    create table parts (
        partnum varchar(18) primary key,
        cost    float8
    );

    create table shipped (
        ttype   char(2),
        ordnum  int4,
        partnum varchar(18) references parts,
        value   float8,
        primary key (ttype, ordnum)
    );

    create view shipped_view as
        select * from shipped where ttype = 'wt';

    create rule shipped_view_insert as on insert to shipped_view
        do instead insert into shipped (ttype, ordnum, partnum, value)
            values ('wt', new.ordnum, new.partnum, new.value);

    insert into parts (partnum, cost) values (1, 1234.56);

    insert into shipped_view (ordnum, partnum, value)
        values (100, 1, (select cost from parts where partnum = 1));
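Until the underlying cause is sorted out, one workaround that may sidestep the rule rewriter's complaint is to move the sub-select out of the VALUES list and use INSERT ... SELECT instead. A sketch against the test schema above, not verified against 8.1.0:

    insert into shipped_view (ordnum, partnum, value)
        select 100, partnum, cost from parts where partnum = 1;

Since the subquery is now a plain FROM-clause item rather than a sub-select buried in an expression, the rewritten rule query should plan normally.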
[SQL] permissions from system catalogs
Can anyone point me where to look in the system catalogs (or information schema) to determine in advance whether a given user (or the current user) will have select permission to a given table or view?

Kyle
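Rather than decoding pg_class.relacl by hand, the access privilege inquiry functions answer this directly; has_table_privilege has been available since 7.2. A minimal sketch ("someuser" and "mytable" are placeholder names):

    -- will the current user be able to select?
    select has_table_privilege('mytable', 'select');

    -- will a named user be able to select?
    select has_table_privilege('someuser', 'mytable', 'select');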
[SQL] Schema management tool
I just saw this very old post and thought I'd respond for the benefit of the list:

Patrick,

> Could someone give me some advice or point me to an article that would
> help manage views? I have a view that I use in conjunction with other
> views, and since I have to drop the view and then recreate it, I'm forced
> to drop and recreate all other views that the original view was used in.
> TIA

As I recall, the IT staff at Action Target, Inc. developed a non-free tool for this purpose two years ago. They were planning to start a spin-off company to market PostgreSQL admin tools, but I don't know how far they got with the idea. Try contacting Action Target's webmaster to see if you can find anything out: webmaster ( at ) actiontarget ( dot ) com

We have a free (GPL) tool called "wyseman" that is a framework for authoring schemas in postgres. You create a meta-sql structure for each database object you want in the database. Then you call wyseman to drop and/or re-create the object, along with all its dependencies.

This is a great tool for working on a live database. You can maintain a pristine, authoritative definition of your schema and still migrate your live database along as you make changes to the design. When you want to modify a section of the schema that is deep within the tree of dependencies, wyseman takes care of dropping, rebuilding, and granting permissions on all the necessary objects in order to get at the part you want to change.

Wyseman also takes care of building a "data dictionary" where you can query for things like printable titles and pop-up helps for tables and columns, in multiple languages. This is a nice layer to put on top of postgres and just under your ERP.

These tools are accessible on wyatterp.com. Due to lack of time, I have not kept current software in the download area. We are using much later stuff in-house, but I would be happy to share more recent versions if there is any interest out there. It would be particularly helpful if anyone were willing to help maintain the website and the (currently a sad excuse for) documentation.

There is also a run-time library (wylib) for rapid construction of ERP applications. We have about 30 applications built in the framework that run our entire business.

Kyle Bateman
Action Target Inc.
[SQL] clock command regression in pltcl?
I have the following function defined:

    create function _date_week(int4,int4,int4) returns text
        language pltcl immutable as $$
        return [clock format [clock scan "$2/$3/$1"] -format "%U"]
    $$;

It worked fine in 8.3, but now in 8.4, when I try to build an index using the function, I get:

    SQL ERROR: In database query:
    begin; create index i_pay_req_empl_id_week on pay_req
        (empl_id, (date_week(wdate)));:
    ERROR: invalid command name "clock"
    CONTEXT: invalid command name "clock"
        invoked from within "clock scan "$2/$3/$1""
        (procedure "__PLTcl_proc_12360682" line 3)
        invoked from within "__PLTcl_proc_12360682 2003 12 20"
        in PL/Tcl function "_date_week"
    PL/pgSQL function "date_week" line 13 at assignment

Is this a regression, or is there a reason the clock command is no longer accessible?

Kyle
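One way to sidestep the question entirely is to compute the week number in SQL rather than Tcl. A minimal sketch, with two caveats: to_char's 'IW' is ISO week numbering, which does not exactly match Tcl's Sunday-based %U, and the arguments are assumed to be (year, month, day) as in the call shown in the error context; like the original, the immutable marking is on the honor system:

    create or replace function _date_week(int4, int4, int4) returns text
        language sql immutable as $$
        select to_char(($1 || '-' || $2 || '-' || $3)::date, 'IW');
    $$;

If the Sunday-based week is really required, another option may be to move the Tcl version to pltclu, where an unrestricted Tcl interpreter runs.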
[SQL] Using bitmap index scans-more efficient
    Merge Cond: (("outer".vendid = "inner".vendid) AND
                 (("outer".invnum)::text = "inner"."?column10?"))
    -> Index Scan using apinv_hdr_pkey on apinv_hdr h
           (cost=0.00..3148.16 rows=51016 width=21)
    -> Sort (cost=13543.42..13693.47 rows=60020 width=55)
         Sort Key: i.vendid, (i.invnum)::text
         -> Seq Scan on apinv_items i (cost=0.00..7197.27 rows=60020 width=55)
              Filter: (((status = 'en'::bpchar) OR (status = 'cl'::bpchar)
                       OR (status = 'pd'::bpchar)) AND (proj = ANY ($0)))
    -> Index Scan using vend_org_pkey on vend_org v
           (cost=0.00..145.52 rows=1799 width=26)
    -> Materialize (cost=3.54..3.55 rows=1 width=4)
         -> Seq Scan on acct a (cost=0.00..3.54 rows=1 width=4)
              Filter: ((code)::text = 'ap'::text)

The worst case is the following types of queries (about 5 seconds):

    select * from ledger
        where proj in (select prog_id from proj_prog where proj_id = 4737);

    select l.* from ledger l, proj_prog p
        where l.proj = p.prog_id and p.proj_id = 4737;

    Hash Join (cost=19032.47..23510.23 rows=6 width=85)
      Hash Cond: ("outer".proj = "inner".prog_id)
      -> Nested Loop Left Join (cost=18994.38..23378.41 rows=1700 width=85)
           -> Hash Join (cost=18990.84..23340.87 rows=1700 width=81)
                Hash Cond: ("outer".vendid = "inner".org_id)
                -> Merge Join (cost=18935.35..23255.64 rows=1700 width=63)
                     Merge Cond: (("outer".vendid = "inner".vendid) AND
                                  (("outer".invnum)::text = "inner"."?column10?"))
                     -> Index Scan using apinv_hdr_pkey on apinv_hdr h
                            (cost=0.00..3148.16 rows=51016 width=21)
                     -> Sort (cost=18935.35..19235.45 rows=120041 width=55)
                          Sort Key: i.vendid, (i.invnum)::text
                          -> Seq Scan on apinv_items i
                                 (cost=0.00..4152.99 rows=120041 width=55)
                               Filter: ((status = 'en'::bpchar) OR
                                        (status = 'cl'::bpchar) OR
                                        (status = 'pd'::bpchar))
                -> Hash (cost=50.99..50.99 rows=1799 width=26)
                     -> Seq Scan on vend_org v
                            (cost=0.00..50.99 rows=1799 width=26)
           -> Materialize (cost=3.54..3.55 rows=1 width=4)
                -> Seq Scan on acct a (cost=0.00..3.54 rows=1 width=4)
                     Filter: ((code)::text = 'ap'::text)
      -> Hash (cost=38.04..38.04 rows=21 width=4)
           -> Index Scan using proj_prog_pkey on proj_prog p
                  (cost=0.00..38.04 rows=21 width=4)
                Index Cond: (proj_id = 4737)

I would like to be able to get the best performance, like in the first query, but without having to enumerate the projects (i.e. using a single query). The secret seems to be the bitmap index scans. Any ideas about whether/how this can be done?

Thanks!

Kyle Bateman

BTW, the ledger view is built roughly as follows:

    create view rp_v_api as
        select h.adate as adate,
               (i.price * i.quant)::numeric(14,2) as amount,
               substring(v.org_name from 1 for 40) as descr,
               i.proj as proj,
               i.acct as acct,
               1 as cr_proj,
               a.acct_id as cr_acct
        from (
            apinv_hdr h
            join apinv_items i on i.vendid = h.vendid and i.invnum = h.invnum
            join vend_org v on v.org_id = h.vendid
            left join acct a on a.code = 'ap'
        )
        where i.status in ('en','cl','pd');
Re: [SQL] Using bitmap index scans-more efficient
Florian Weimer wrote:

> * Kyle Bateman:
>> Any ideas about whether/how this can be done?
>
> If the project tree is fairly consistent, it's convenient to encode it
> using intervals instead of parent/child relationships. IIRC, Celko's
> "SQL for smarties" explains how to do this, and Kristian Koehntopp has
> written some PHP code to implement it.

I agree that this produces a more efficient query for finding the projects that are the progeny of another project, but I'm trying to figure out how that helps me select the right project transactions from my ledger efficiently.

This query produces wonderful results (very fast):

    select * from ledger where proj >= 4737 and proj <= 4740;

But I'm assuming that using an interval-encoded project tree, I would have to do something like the following to get a progeny group:

    select * from ledger l, proj p
        where p.proj_id = l.proj and p.left > 1234 and p.right < 2345;

The problem (at least according to my initial testing) is that this forces a join of the entire ledger, and I get my slowest performance group (5 seconds). What am I missing?

Kyle
Re: [SQL] Using bitmap index scans-more efficient
Tom Lane wrote:

> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> But I'm assuming that using an interval-encoded project tree, I would
>> have to do something like the following to get a progeny group:
>>     select * from ledger l, proj p
>>         where p.proj_id = l.proj and p.left > 1234 and p.right < 2345;
>
> btree has no idea about the constraint (that I imagine exists) that
> left <= right. If you're just doing a simple index on (left, right)
> then the above query requires scanning all index entries with
> left > 1234. It would probably help to say
>
>     select * from ledger l, proj p
>         where p.proj_id = l.proj
>           and p.left > 1234 and p.left < 2345 and p.right < 2345;
>
> so that you can constrain the range of "left" values scanned.

Thanks for the replies, Tom and Florian. My problem is not that it is difficult (or costly) to determine the progeny of a given project. I can determine this in about 90 msec regardless of whether I use an adjacency model, interval encoding, or a materialized path (the current implementation). The problem is that when I try to extract the ledger entries belonging to that progeny from a set of a million records, it seems to need to process all million records rather than being able to index right into the ones I want.

I'm not very good at reading explain output, but I tried to set up the query Tom suggests by creating an interval-encoded project table (proj_int) and then joining it to my ledger like so:

    select l.* from ledger l, proj_int i
        where l.proj = i.proj_id
          and i.lft >= 5283 and i.lft < 5300 and i.rgt <= 5300;

On my mini-test-ledger of 100,000 entries, this takes the longest time (5 seconds) with the following explain output:

    Hash Join (cost=19018.46..23411.52 rows=14 width=85)
      Hash Cond: ("outer".proj = "inner".proj_id)
      -> Nested Loop Left Join (cost=18994.38..23378.41 rows=1700 width=85)
           -> Hash Join (cost=18990.84..23340.87 rows=1700 width=81)
                Hash Cond: ("outer".vendid = "inner".org_id)
                -> Merge Join (cost=18935.35..23255.64 rows=1700 width=63)
                     Merge Cond: (("outer".vendid = "inner".vendid) AND
                                  (("outer".invnum)::text = "inner"."?column10?"))
                     -> Index Scan using apinv_hdr_pkey on apinv_hdr h
                            (cost=0.00..3148.16 rows=51016 width=21)
                     -> Sort (cost=18935.35..19235.45 rows=120041 width=55)
                          Sort Key: i.vendid, (i.invnum)::text
                          -> Seq Scan on apinv_items i
                                 (cost=0.00..4152.99 rows=120041 width=55)
                               Filter: ((status = 'en'::bpchar) OR
                                        (status = 'cl'::bpchar) OR
                                        (status = 'pd'::bpchar))
                -> Hash (cost=50.99..50.99 rows=1799 width=26)
                     -> Seq Scan on vend_org v
                            (cost=0.00..50.99 rows=1799 width=26)
           -> Materialize (cost=3.54..3.55 rows=1 width=4)
                -> Seq Scan on acct a (cost=0.00..3.54 rows=1 width=4)
                     Filter: ((code)::text = 'ap'::text)
      -> Hash (cost=24.06..24.06 rows=10 width=4)
           -> Bitmap Heap Scan on proj_int i (cost=2.26..24.06 rows=10 width=4)
                Recheck Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <= 5300))
                -> Bitmap Index Scan on i_proj_int_lft_rgt
                       (cost=0.00..2.26 rows=10 width=0)
                     Index Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <= 5300))

That is roughly equivalent to my materialized path method:

    select l.* from ledger l
        where l.proj in (select proj_id from proj_v where 4737 = any(ppath));

And it is quite slow compared to 150 msec when enumerating the progeny projects like so:

    select l.* from ledger l
        where l.proj in (4737,4789,4892,4893,4894,4895,4933,4934,4935);

    Nested Loop Left Join (cost=19.73..4164.10 rows=7 width=85)
      -> Nested Loop (cost=19.73..4139.08 rows=7 width=81)
           -> Nested Loop (cost=19.73..4100.07 rows=7 width=63)
                -> Bitmap Heap Scan on apinv_items i
                       (cost=19.73..1185.71 rows=487 width=55)
                     Recheck Cond: ((proj = 4737) OR (proj = 4789) OR
                                    (proj = 4892) OR (proj = 4893) OR
                                    (proj = 4894) OR (proj = 4895) OR
                                    (proj = 4933) OR (proj = 4934) OR
                                    (proj = 4935))
                     Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar)
                              OR (status = 'pd'::bpchar))
                     -> BitmapOr (cost=19.73..19.73 rows=495 width=0)
                          -> Bitmap Index Scan on i_apinv_items_proj
                                 (cost=0.00..2.19 rows=55 width=0)
                               Index Cond: (proj = 4737)
                          -> Bitmap Index Scan on i_apinv_items_proj
                                 (cost=0.00..2.19 rows=55 width=0)
Re: [SQL] Using bitmap index scans-more efficient
Tom Lane wrote:

> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> But I'm assuming that using an interval-encoded project tree, I would
>> have to do something like the following to get a progeny group:
>>     select * from ledger l, proj p
>>         where p.proj_id = l.proj and p.left > 1234 and p.right < 2345;

Here's an interesting result: I created a function proj_left(int4) that returns the left interval number for a given project. Then I created an index on the underlying table for the ledger view (which took forever to build) like so:

    create index i_test on apinv_items (proj_left(proj));

Now my query:

    select * from ledger
        where proj_left(dr_proj) >= 5283 and proj_left(dr_proj) < 5300;

is very speedy. The problem is, I had to mark the function proj_left() as immutable, which it cannot be, since the left and right values for a given project will change any time a project is added, removed, or moved around the hierarchy :(

So is there any good way to tell the planner to do several individual index scans for the projects involved in the desired progeny, OR the results together, and return the result? That is what it seems to be choosing in the case of the query:

    select * from ledger
        where proj in (4737,4789,4892,4893,4894,4895,4933,4934,4935);
Re: [SQL] Using bitmap index scans-more efficient
Tom Lane wrote:

> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> I'm wondering if this might expose a weakness in the optimizer having
>> to do with left joins.
>
> Before 8.2 the optimizer has no ability to rearrange the order of outer
> joins. Do you have time to try your test case against CVS HEAD?

OK, I figured it out--grabbed the latest snapshot (hope that is what you need). My results are similar:

    select l.* from ledg_v1 l, proj p
        where l.proj = p.proj_id and 5 = p.par;    (24 msec)

    Nested Loop (cost=0.00..1991.93 rows=480 width=23)
      -> Nested Loop (cost=0.00..4.68 rows=6 width=8)
           -> Seq Scan on acct a (cost=0.00..1.12 rows=1 width=4)
                Filter: ((code)::text = 'ap'::text)
           -> Index Scan using i_proj_par on proj p
                  (cost=0.00..3.49 rows=6 width=4)
                Index Cond: (5 = par)
      -> Index Scan using i_ledg_proj on ledg l
             (cost=0.00..330.17 rows=83 width=19)
           Index Cond: (l.proj = "outer".proj_id)

    select l.* from ledg_v2 l, proj p
        where l.proj = p.proj_id and 5 = p.par;    (1.25 sec)

    Hash Join (cost=4.63..16768.43 rows=480 width=23)
      Hash Cond: ("outer".proj = "inner".proj_id)
      -> Nested Loop Left Join (cost=1.13..14760.13 rows=40 width=23)
           -> Seq Scan on ledg l (cost=0.00..6759.00 rows=40 width=19)
           -> Materialize (cost=1.13..1.14 rows=1 width=4)
                -> Seq Scan on acct a (cost=0.00..1.12 rows=1 width=4)
                     Filter: ((code)::text = 'ap'::text)
      -> Hash (cost=3.49..3.49 rows=6 width=4)
           -> Index Scan using i_proj_par on proj p
                  (cost=0.00..3.49 rows=6 width=4)
                Index Cond: (5 = par)
[SQL] Finding context for error log
I get the following message in my postgres log:

    ERROR: 42601: syntax error at or near "%" at character 269
    LOCATION: yyerror, scan.l:761

I have a fairly complex schema (lots of database objects) and many users hitting the db with lots of connections, so I'm not really sure what bit of code is causing the problem. Is there an easy way to get postgres to spit out the SQL statement it was parsing when it generated the error? I've experimented with log_error_verbosity and log_min_messages but still haven't seen any SQL in the log.

Kyle
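The setting that governs this is log_min_error_statement: when it is at or below "error", the server writes the offending statement text to the log along with the error (in many 8.x releases the default was "panic", which would explain why no SQL shows up). A sketch:

    -- in postgresql.conf, then reload:
    --     log_min_error_statement = error

    -- or per-session, as superuser:
    set log_min_error_statement = 'error';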
[SQL] PG 8.2beta reordering working for this case?
Kyle Bateman wrote:

> Tom Lane wrote:
>> Before 8.2 the optimizer has no ability to rearrange the order of
>> outer joins. Do you have time to try your test case against CVS HEAD?

I've done some more refinement on my accounting ledger system that has clarified some of the problems I was having with performance joining to a union. Here's an interesting test case. I just tried it with PG 8.2beta1. I don't claim to understand the new feature of reordering queries very well, but it seems like this is related to that feature. Since it is still a performance problem in 8.2, I thought this might be a helpful test case during beta.

To see this demo, run create.sql on a clean database. It will create all the needed tables, views, and test data. Then run q1 and q2, which are very efficient. Then q3 is the slow one. The reason is that it does the union, producing 300,000 records, before trying to select by project. It seems like the optimizer should internally rewrite the query to look more like what is in q4 (which is very fast). Is there a way to make the optimizer do this?

Kyle Bateman

-- Make test schema for demonstrating how the postgres optimizer might
-- improve performance on joins with unions

    -- Contains a record for each project (job-costing) code
    -- Projects are arranged in a hierarchical structure (parent/child)
    create table proj (
        proj_id int4 primary key,
        title   varchar,
        par     int4 references proj on update cascade
    );
    create index i_proj_par on proj (par);

    -- Contains a record for every combination of two projects which are
    -- related to each other in the hierarchical project tree
    -- (parent/child, ancestor/progenitor, etc.)
    create table proj_rel (
        anst_id int4 references proj on update cascade on delete cascade,
                    -- ancestor project number
        prog_id int4 references proj on update cascade on delete cascade,
                    -- progenitor project number
        rel     int4,       -- 0=self, 1=child, 2=grandchild, etc.
        primary key (anst_id, prog_id)
    );

    -- Contains a record for each account number and an optional alpha code
    -- to identify a sub-ledger
    create table acct (
        acct_id int4 primary key,   -- account number
        title   varchar,            -- name of the account
        code    varchar             -- alpha code for the account
    );
    create index i_acct_code on acct (code);

    -- Each sub-ledger contains transactions unique to a certain part of the
    -- business. In addition to the standard fields they all share in common,
    -- each sub-ledger contains additional fields that are unique to it (so
    -- they can not all be stored in a single table). In our actual
    -- implementation, these sub-ledgers are actually implemented as views
    -- joining even lower level tables.
    create table subledg_A (
        rid      int4 primary key,  -- record ID
        amount   numeric(14,2),
        proj     int4 references proj on update cascade on delete cascade,
        unique_A varchar            -- some other data
    );
    create index i_subledg_A_proj on subledg_A (proj);

    create table subledg_B (
        rid      int4 primary key,  -- record ID
        amount   numeric(14,2),
        proj     int4 references proj on update cascade on delete cascade,
        unique_B varchar            -- some other data
    );
    create index i_subledg_B_proj on subledg_B (proj);

    create table subledg_C (
        rid      int4 primary key,  -- record ID
        amount   numeric(14,2),
        proj     int4 references proj on update cascade on delete cascade,
        unique_C varchar            -- some other data
    );
    create index i_subledg_C_proj on subledg_C (proj);

    -- These views allow a standard account code to be presented in the
    -- appropriate ledgers
    create view subview_A as
        select 'AP ' || rid as trans_id, l.amount, l.proj, a.acct_id as acct
        from subledg_A l join acct a on a.code = 'ap';

    create view subview_B as
        select 'AR ' || rid as trans_id, l.amount, l.proj, a.acct_id as acct
        from subledg_B l join acct a on a.code = 'ar';

    create view subview_C as
        select 'PR ' || rid as trans_id, l.amount, l.proj, a.acct_id as acct
        from subledg_C l join acct a on a.code = 'pr';

    -- General ledger - this should contain all transactions from all
    -- sub-ledgers
    create view gen_ledg as
        select trans_id, amount, proj, acct from subview_A
        union
        select trans_id, amount, proj, acct from subview_B
        union
        select trans_id, amount, proj, acct from subview_C;
Re: [SQL] PG 8.2beta reordering working for this case?
Tom Lane wrote:

> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> Is there a way to make the optimizer do this?
>
> Sorry, that's not happening for 8.2. Consider using a union all (not
> union) across the subledg_N tables directly and then joining to that.
> That boils down to being a partitioning case and I think probably will
> be covered by the 8.2 improvements.

Yup, union all is much more efficient. The difference between union and union all hadn't really occurred to me, but it makes sense to eliminate the need for a unique sort. The q3 query went from 10 seconds to 1 second with just the addition of union all in the general ledger.

BTW, explain analyze still says 10 seconds of run time (and takes 10 seconds to run), but when I remove the explain analyze, the query runs in about a second. What's that all about?

Also, I came up with the view shown in the attachment. It is still much faster than joining to the union-all ledger (40 ms). I'm not sure why, because I'm not sure if explain analyze is telling me the real story (I see a sequential scan of the ledgers in there when it runs 10 seconds). I'm not sure what it's doing when it runs in 1 second.

Kyle

-- This view is a possible workaround for the problem

    drop view gen_ledg_pr;

    --explain analyze
    create view gen_ledg_pr as
        select lg.*, pr.anst_id
            from subview_A lg join proj_rel pr on pr.prog_id = lg.proj
        union all
        select lg.*, pr.anst_id
            from subview_B lg join proj_rel pr on pr.prog_id = lg.proj
        union all
        select lg.*, pr.anst_id
            from subview_C lg join proj_rel pr on pr.prog_id = lg.proj;
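For concreteness, the union all change described above amounts to rebuilding gen_ledg from the earlier create.sql like this (a sketch; same column lists as the original view):

    create or replace view gen_ledg as
        select trans_id, amount, proj, acct from subview_A
        union all
        select trans_id, amount, proj, acct from subview_B
        union all
        select trans_id, amount, proj, acct from subview_C;

Since the sub-ledger rows are disjoint by construction (each trans_id carries a distinct 'AP '/'AR '/'PR ' prefix), union all returns the same rows as union but skips the duplicate-eliminating sort.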
Re: [SQL] olympics ranking query
Several years ago someone posted a question about how to achieve a running total of columns in SQL. I have been trying to find a solution to a similar problem for some time and just came up with something that works great for me, so I thought I'd post it for the benefit of the list. The problem is to produce a query like the following:

    select date, amount, run_sum(amount) from ledger order by date;

    Date         Amount   Sum
    -----------  -------  ------
    2007-Jan-01   10.00    10.00
    2007-Jan-02   20.00    30.00
    2007-Jan-05    5.00    35.00
    2007-Jan-10   -3.00    32.00
    . . .

Using pl/tcl, I made the following function:

    #Keep a running sum on a column where tag and trans are used to keep the
    #results distinct from other calls to the same function
    #Usage: run_sum(value,tag,trans)
    #---
    function run_sum(numeric,text,text) {} {
        returns numeric language pltcl called on null input as $$
        if {![info exists GD(sum.$2.$3)]} {
            set GD(sum.$2.$3) 0.00
        }
        if {[argisnull 1]} {
            return $GD(sum.$2.$3)
        } else {
            return [set GD(sum.$2.$3) [expr $GD(sum.$2.$3) + $1]]
        }
    $$;}

Then I added a wrapper function to automatically produce a separate namespace for each statement instance that uses the query:

    #Keep a running sum on a column where tag is used to keep the results
    #distinct. Automatically keeps results distinct for each separate statement
    #Usage: run_sum(value,tag)
    #---
    function run_sum(numeric,text) {run_sum(numeric,text,text)} {
        returns numeric language sql as $$
        select run_sum($1,$2,statement_timestamp()::text);
    $$;}

Now you can do:

    select date, amount, run_sum(amount,'amount') from ledger;

to get an initial result. The only problem is now ordering the data. If you just put an 'order by' clause on the end, you don't get what you might expect, because the ordering happens after the function has produced its result. So I do the following to order and sum it correctly:

    select date, amount, run_sum(amount,'amount')
        from (select * from ledger order by date) ss;

The use of the "tag" parameter allows you to use this on multiple columns, such as:

    select date, debit, credit,
           run_sum(debit,'debit')::numeric(14,2) as debits,
           run_sum(credit,'credit')::numeric(14,2) as credits
        from (select * from ledger order by date) ss;

Enjoy,

Kyle
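A side note for readers on 8.4 or later (after this post was written): the same running total is available without any custom function via a window aggregate, which also handles the ordering problem directly:

    select date, amount,
           sum(amount) over (order by date) as run_sum
        from ledger;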
[SQL] Clarified Question
> How can I create a function that will take in two variables and return
> an integer, when one of the variables is the tablename?! I have tried:
>
>     create function tst_func(text, varchar(16)) as '
>     BEGIN
>         result = select max(history_id)+1 from $1 where client_id = $2;
>         return result;
>     END;'

This kind of thing does work in pltcl, but AFAIK you can't specify a table name with a parameter in plpgsql. This may give you the idea (after spi_exec, each selected column is available in a same-named Tcl variable, so $f1 holds the result):

    create function myfunc(text,text) returns text as '
        spi_exec "select f1 from $1 where f2 = \'$2\'"
        return $f1
    ' language 'pltcl';
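A hedged addition, assuming a plpgsql from 7.1 or later where EXECUTE exists: dynamic SQL can also do this in plpgsql by building the statement as a string. history_id and client_id are the poster's columns; the rest is illustrative:

    create function tst_func(text, text) returns int4 as '
    declare
        rec record;
    begin
        for rec in execute ''select max(history_id) + 1 as newid from ''
                || $1 || '' where client_id = '' || quote_literal($2) loop
            return rec.newid;
        end loop;
        return null;
    end;
    ' language 'plpgsql';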
[SQL] create constraint trigger
Hi Jan:

In response to your suggestion about possibly being able to use "create constraint trigger," I have tried the following. I have this trigger function (probably not all that important what it does...):

    create function prd_parm_tf_iu () returns opaque as '
    declare
        trec record;
        baseholder varchar;
    begin
        select into trec min(pos) as n, max(pos) as m, count(*) as c
            from prd_parm where new.base = base;
        if trec.n != 1 or trec.m != trec.c then
            raise exception \'Parameters for base % not in order\', new.base;
        end if;
        return new;
    end;' language 'plpgsql';

This seems to work pretty well:

    create trigger prd_part_tr_iu after insert or update on prd_parm
        for each row execute procedure prd_parm_tf_iu();

But when I create it with "create constraint trigger" as shown next, the trigger doesn't seem to be invoked. It says it is created, but it allows data to pass that would not be allowed with the plain "create trigger." So I'm assuming I'm doing something wrong:

    create constraint trigger prd_part_tr_iu
        after insert or update on prd_parm
        deferrable initially deferred
        for each row execute procedure prd_parm_tf_iu();

Any ideas? Also, it's not clear to me how to drop a trigger created with "create constraint trigger."

It would be very helpful if I could figure out how to make my triggers deferred. There are things stored in one table that have to be a certain way if other things are a certain way in a related table. It is more than what foreign key relationships can handle. It would really be great if I could have the backend monitor the consistency of these details.

Without deferral, I can check the consistency in one direction. But if I try to constrain it in both directions, I get into a problem where I can't update either table without breaking the constraints. I'm guessing that if I can get deferral to work, I should be able to update both tables in a transaction and be OK.

Kyle
[SQL] Re: create constraint trigger
Kyle Bateman wrote:

> Hi Jan:
>
> But when I create it with "create constraint trigger" as shown next,
> the trigger doesn't seem to be invoked. It says it is created, but it
> allows data to pass that would not be allowed with the "create
> trigger." So I'm assuming I'm doing something wrong.
>
>     create constraint trigger prd_part_tr_iu
>         after insert or update on prd_parm
>         deferrable initially deferred
>         for each row execute procedure prd_parm_tf_iu();

OK, here's a reply to my own post. I've got the constraint trigger working now. The problem was that it was never getting dropped. I still don't know how to drop the constraint. I thought that by dropping the trigger function it would get the idea. When I ran the create constraint again, it said it was created (no errors), so I figured everything was OK, but it was still running the previously defined constraint (apparently). I dropped the table and started from scratch, and it seems to work fine now.

So is there a way to "drop constraint trigger" without having to drop the table?

Kyle
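A hedged note on the drop question: constraint triggers are stored in pg_trigger under the name given at creation, and at least in later PostgreSQL releases the ordinary DROP TRIGGER syntax removes them, so something like this should work:

    drop trigger prd_part_tr_iu on prd_parm;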
[SQL] Deferral of primary key constraint
Is it possible to defer the check on a primary key constraint (or a check constraint, for that matter)? Here is an example that shows why it would be nice to be able to do so. We have a real-life scenario that is similar (but much more complex).

    drop table btab;
    create table btab (
        base varchar,
        pos  int4,
        cmt  varchar,
        primary key (base, pos)
    );

    insert into btab (base,pos,cmt) values ('aa',1,'The');
    insert into btab (base,pos,cmt) values ('aa',2,'quick');
    insert into btab (base,pos,cmt) values ('aa',3,'grey');
    insert into btab (base,pos,cmt) values ('aa',4,'fox');
    insert into btab (base,pos,cmt) values ('bb',3,'dog');
    insert into btab (base,pos,cmt) values ('bb',2,'brown');
    insert into btab (base,pos,cmt) values ('bb',1,'The');

    select * from btab order by base,pos;

    begin;
    delete from btab where base = 'aa' and pos = 2;
    update btab set pos = pos - 1 where pos > 2 and base = 'aa';
    commit;

    select * from btab order by base,pos;

    begin;
    update btab set pos = pos + 1 where pos >= 2 and base = 'bb';
    insert into btab (base,pos,cmt) values ('bb',2,'slow');
    commit;

    select * from btab order by base,pos;

The last transaction fails (on my box, anyway) because of the primary key index. We would like to be able to do inserts/deletes to a list of records and still ensure that they are in contiguous sequential order, so we want to renumber higher records if a new record is inserted in the middle. The sequence is part of the primary key, and we want to ensure uniqueness.

Some renumbering will work (by chance) if the records happen to get adjusted in the right order. But if one of the updates tries to renumber to an already existing record, it fails.

How hard is it to take the deferral mechanism you have for foreign key references and apply it to the primary key too? It would also be handy to be able to defer a check constraint.
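Two sketches for the record. First, a workaround for servers without deferrable unique constraints: route the renumbering through a value range that cannot collide (here, negative numbers), so no intermediate state duplicates a key:

    begin;
    update btab set pos = -pos where base = 'bb' and pos >= 2;     -- 2,3 -> -2,-3
    update btab set pos = -pos + 1 where base = 'bb' and pos < 0;  -- -2,-3 -> 3,4
    insert into btab (base,pos,cmt) values ('bb',2,'slow');
    commit;

Second, a note about versions well after this post: PostgreSQL 9.0 added exactly the requested feature, so the original transaction works once the key is declared deferrable:

    create table btab (
        base varchar,
        pos  int4,
        cmt  varchar,
        primary key (base, pos) deferrable initially deferred
    );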
[SQL] References and privileges
[EMAIL PROTECTED] (Jan Wieck) writes:

>> If one does:
>>
>> [...]
>> grant select on master to a_user;
>> grant select, insert, update, delete on detail to a_user;
>>
>> then if one logs in as "a_user" and does:
>>
>> insert into detail (id, master_id) values (1, 10);
>>
>> this will result in: "ERROR: master: Permission denied".
>>
>> This seems a bug to me? Isn't it?
>
> Outch, yes, we missed something here. Peter, you said you'll probably
> work on the ACL stuff after 7.0. We need to coordinate that work with
> the function manager redesign to go for SETUID triggers and functions.

Hey Jan:

This one just bit me hard. We're trying to implement RI so that a foreign key (employee ID number) is verified to exist in the employee table. The problem is, we have to grant everyone and their dog access to the employee table (select, update, no less), which contains all kinds of sensitive data about employees.

We even tried making a limited view of the employee table and referencing that. No luck: the RI triggers try to find an OID column in the referenced table (view). If we could name a column "oid" in a view, we might have a chance with this approach.

How hard would it be to get the "grant references" thing going? Is there anything I could do to help? Grant permissions on columns would also solve this problem. Is that one huge or not so huge?

Kyle
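For reference, the SQL-standard form being asked for looks like the sketch below. Table-level REFERENCES privilege did eventually land in PostgreSQL (7.2), and column-level privileges much later (8.4), so these are sketches of where things were headed rather than anything that worked at the time; empl_id is a stand-in for the employee ID column:

    -- let users reference employee keys as FK targets without select access:
    grant references on employee to public;

    -- column-level form:
    grant references (empl_id) on employee to public;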
[SQL] finding (and recycling) holes in sequences
> If one has a unique-id generating sequence that sometimes is bound to
> have holes in it (i.e. it could happen that a nextval(seq) happens
> without a corresponding INSERT in the table), then how could one
> efficiently scan for these holes to recycle them in subsequent INSERTs?
> I'm just looking for a "standard" way of doing this if such a thing
> exists.
>
> Louis-David Mitterrand - [EMAIL PROTECTED] - http://www.apartia.fr

If you want to create a sequence of numbers with no holes, you might consider NOT using sequences at all (they are really meant for something different). Rather, each time you wish to do an insert, query the existing data to see what the maximum existing value in the database is. Add 1 to that and use this as your new value. This is guaranteed to leave no holes in the sequence. To make inserts easier, you can define an SQL function that will return the max value. Here's an example:

    create function maxid () returns int4 as '
    declare
        mv int4;
    begin
        select max(id) into mv from mytab;
        if mv is null then return 0; end if;
        return mv;
    end;
    ' language 'plpgsql';

    create table mytab (
        id   int4 primary key default (maxid() + 1),
        data text
    );

    insert into mytab (data) values ('Hi there');
    insert into mytab (data) values ('Howdy');
    insert into mytab (data) values ('Guten Tag');
    insert into mytab (data) values ('Terve');

    select * from mytab;

BTW hackers, this is a common enough request that I wonder if there should be a built-in feature to make it easier for people to put this into the default value in the table definition, i.e.:

    create table mytab (
        keycolumn int4 primary key default coalesce(max(keycolumn),0)+1,
        data text
    );

Wouldn't this eliminate the need for rollback-able sequences?
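One caveat worth spelling out: unlike sequences, max(id)+1 is not safe under concurrency; two sessions can read the same max and then collide on the primary key. A minimal sketch of one way to serialize inserts, using the mytab example above:

    begin;
    lock table mytab in exclusive mode;  -- blocks other writers; readers proceed
    insert into mytab (data) values ('Bonjour');
    commit;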