[SQL] bug in information_schema?

2005-07-29 Thread Kyle Bateman
I noticed that it seemed a bit slow to query
information_schema.view_column_usage.  As I look at the code in
information_schema.sql, I'm not sure why pg_user is referenced twice
(once without an alias).  It looks like we can take out the first
pg_user and remove the DISTINCT keyword, and this improves the efficiency
significantly.  It seems to return the same result, but in half the
time.  Anyone see a problem with this?  (The same problem may also be in
view_table_usage, but I haven't done any testing there yet.)


Code from information_schema.sql:

CREATE VIEW view_column_usage AS
    SELECT DISTINCT
           CAST(current_database() AS sql_identifier) AS view_catalog,
           CAST(nv.nspname AS sql_identifier) AS view_schema,
           CAST(v.relname AS sql_identifier) AS view_name,
           CAST(current_database() AS sql_identifier) AS table_catalog,
           CAST(nt.nspname AS sql_identifier) AS table_schema,
           CAST(t.relname AS sql_identifier) AS table_name,
           CAST(a.attname AS sql_identifier) AS column_name

    FROM pg_user, pg_namespace nv, pg_class v, pg_depend dv,
         pg_depend dt, pg_class t, pg_namespace nt,
         pg_attribute a, pg_user u





[SQL] Trigger on select?

2005-08-02 Thread Kyle Bateman
Hey, anyone know if it's possible to fire a trigger before a select?
I'm considering creating some tables that contain data summarized from
other tables, as a kind of cache mechanism.  The hope is that I can speed up
some queries that get executed a lot (and are kind of slow) by
maintaining precomputed data (like sums of big columns, for example).  I was
all ready to go when I discovered that trigger-before-select is not
supported.  (Is it in any DB?)


The idea is this:

Any time I execute a query that would access the summary data, the
"before select" trigger fires and builds any summary data missing
from the summary table.  When I do an insert, update, or delete on
the primary data table, another trigger fires that removes the
applicable rows from the summary table.  This way, I only cache the
information I need in the summary table, right before I need it.  But it
can stay there as long as the base information doesn't change, so I don't
have to redo the expensive operation of summarizing it any more often
than necessary.  It's kind of like an index in a way, but it is not
maintained at insert/update time; rather, it is updated as it is needed.
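
A minimal sketch of this invalidate-and-rebuild idea, using an ordinary
function in place of the unsupported select trigger (all table and column
names here are hypothetical, not from the original posting):

-- Hypothetical base table: sales(acct int4, amount numeric)
create table sales_sum (
    acct    int4 primary key,
    total   numeric(14,2)
);

-- Invalidate the cached row whenever the base data changes.
create function sales_sum_inval() returns trigger as $$
begin
    if TG_OP in ('DELETE', 'UPDATE') then
        delete from sales_sum where acct = old.acct;
    end if;
    if TG_OP in ('INSERT', 'UPDATE') then
        delete from sales_sum where acct = new.acct;
    end if;
    return null;  -- result is ignored for AFTER triggers
end;
$$ language plpgsql;

create trigger sales_inval after insert or update or delete on sales
    for each row execute procedure sales_sum_inval();

-- Callers use this in place of a select trigger: it rebuilds the
-- missing summary row right before returning it.
create function get_sales_sum(int4) returns numeric as $$
declare
    t numeric;
begin
    select total into t from sales_sum where acct = $1;
    if not found then
        select coalesce(sum(amount), 0) into t from sales where acct = $1;
        insert into sales_sum (acct, total) values ($1, t);
    end if;
    return t;
end;
$$ language plpgsql;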


Anyone have any ideas about how I can accomplish this?

Kyle



[SQL] information_schema problem

2005-10-28 Thread Kyle Bateman

I'm trying to use information_schema.view_column_usage to determine the
native table from which various view columns descend.  This is so my
interface can automatically generate the correct foreign key links from
one view to another.

But in the case where a view references two tables linked by a foreign
key, the key fields are reported as belonging to both tables.

The enclosed example shows two tables related by a foreign key relationship.
The view "event_remind_v" really only produces fields from event_remind,
but in order to satisfy the "exists" clause, it also references fields
from the event table.

view_column_usage reports the fields "own_id" and "seq" as belonging to
both table "event_remind" and table "event".  My code needs a way to know
that "event_remind" is the table they "really" come from.

1. Is this the correct behavior for information_schema.view_column_usage?
   Should it report a column as belonging to more than one table?
   If not, how can I fix it?

   The enclosed script includes a (slightly revised) version of
   view_column_usage that is easier to hack on than the one inside
   information_schema.

2. If information_schema.view_column_usage is working right, is there a
   way I can modify my local view_column_usage to distinguish between
   columns that actually "belong" to the view and related columns pulled
   in through a foreign key relationship?  (See the sketch after the
   example below.)

Example code:
-
drop view event_remind_v;
drop table event_remind;
drop table event;
drop view view_column_usage;

-- Contains an entry for each scheduled calendar event
create table event (
   own_id  int4,
   seq int4,
   status  varchar,
   summary varchar,

   primary key (own_id,seq)
);

-- Contains an entry for each reminder for each event
create table event_remind (
   own_id  int4,
   seq     int4,
   advance interval,

   primary key (own_id, seq, advance),
   foreign key (own_id, seq) references event
       on update cascade on delete cascade
);

create view event_remind_v as
   select *
   from event_remind r
   where exists (select * from event
                 where own_id = r.own_id and seq = r.seq
                   and status = 'open');

create view view_column_usage as
 select
   v.relname   as "view_name",
   t.relname   as "table_name",
   at.attname  as "column_name"

   from pg_depend dv, pg_class v, pg_namespace nv,
        pg_depend dt, pg_class t, pg_namespace nt, pg_attribute at
   where dv.objid = dt.objid
     and dv.refobjid <> dt.refobjid
     and dv.deptype = 'i'

     and v.relkind = 'v'
     and t.relkind IN ('r', 'v')

     and v.oid = dv.refobjid
     and t.oid = dt.refobjid

     and t.relnamespace = nt.oid
     and v.relnamespace = nv.oid
     and dv.classid = dt.classid
     and dv.classid = 'pg_catalog.pg_rewrite'::regclass
     and dv.refclassid = dt.refclassid
     and dv.refclassid = 'pg_catalog.pg_class'::regclass

     and t.oid = at.attrelid and dt.refobjsubid = at.attnum
     and nv.nspname = 'public' and nt.nspname = 'public'
;

select view_name, column_name, table_name from
    view_column_usage where view_name = 'event_remind_v';
select view_name, column_name, table_name from
    information_schema.view_column_usage where view_name = 'event_remind_v';
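
A hedged starting point for question 2 (a sketch only, not part of the
enclosed script): the columns the view itself defines can be read directly
from pg_attribute, and that list can then be compared against what
view_column_usage reports for each underlying table:

-- Columns defined by the view itself (attnum > 0 skips system columns)
select attname
    from pg_attribute
    where attrelid = 'event_remind_v'::regclass
      and attnum > 0 and not attisdropped;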





[SQL] unplanned sub-select error?

2005-11-21 Thread Kyle Bateman

I have a query:

insert into mtr_reg_v_wt (ropnum, inum, pnum, rquant, value, status, 
ddate, fr_proj, to_proj) values (28985,1,1014,1,(select cost from 
prd_part where pnum = 1014),'work','2005-Nov-15',50,75);


That used to work fine under 7.1.3 but now gives the error:

ERROR:  cannot handle unplanned sub-select

Anyone know what this means?  Is there a good reason why this statement
should no longer work?  Or is this a bug?


Kyle
wyatterp.com




Re: [SQL] unplanned sub-select error?

2005-11-22 Thread Kyle Bateman

Tom Lane wrote:

> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> I have a query:
>>
>> insert into mtr_reg_v_wt (ropnum, inum, pnum, rquant, value, status,
>> ddate, fr_proj, to_proj) values (28985,1,1014,1,(select cost from
>> prd_part where pnum = 1014),'work','2005-Nov-15',50,75);
>>
>> That used to work fine under 7.1.3 but now gives the error:
>>
>> ERROR:  cannot handle unplanned sub-select
>
> You need to offer a little more context, like what PG version you are
> using now and what is the underlying DDL --- I suspect some rules or
> views are involved here, but you didn't show them to us.

Sorry, you're right.  I have now confirmed that this only happens when 
updating via a view/rule (as you suspected).  Attached is a minimalist 
sql file that demonstrates the same error message from a blank 
database.  I'm using 8.1.0.  I'm pretty sure this problem did not exist 
on 8.0.3.


Kyle

-- Expose the "unplanned sub-select" error message

create table parts (
partnum	varchar(18) primary key,
cost	float8
);

create table shipped (
ttype	char(2),
ordnum	int4,
partnum	varchar(18) references parts,
value	float8,

primary key (ttype, ordnum)
);

create view shipped_view as
select * from shipped where ttype = 'wt';

create rule shipped_view_insert as on insert to shipped_view
do instead insert into shipped
(ttype, ordnum, partnum, value) 
values
('wt', new.ordnum, new.partnum, new.value);

insert into parts (partnum, cost) values (1, 1234.56);

insert into shipped_view 
	(ordnum, partnum, value)
values
(100,1,(select cost from parts where partnum = 1));
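
A hedged workaround while the rule path chokes on the sub-select: recast
the VALUES form as INSERT ... SELECT, so that no sub-select has to survive
into the rewritten query (a sketch; untested against this exact rule):

insert into shipped_view (ordnum, partnum, value)
    select 100, partnum, cost from parts where partnum = 1;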



[SQL] permissions from system catalogs

2006-06-26 Thread Kyle Bateman
Can anyone point me where to look in the system catalogs (or information 
schema) to determine in advance whether a given user (or the current 
user) will have select permission to a given table or view?
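
One hedged pointer: the built-in privilege inquiry functions may already
cover this, without decoding pg_class.relacl by hand; has_table_privilege
has been in PostgreSQL since 7.2.  A sketch (mytab and someuser are
placeholders):

-- will the current user be able to select from mytab?
select has_table_privilege('mytab', 'select');

-- the same check for a named user
select has_table_privilege('someuser', 'mytab', 'select');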


Kyle




[SQL] Schema management tool

2006-06-27 Thread Kyle Bateman




I just saw this very old post and thought I'd respond for the benefit
of the list:

Patrick,

> Could someone give me some advice or point me to an article that would help
> manage views?  I have a view that I use in conjunction with other views,
> and since I have to drop the view and then recreate it, I'm force to drop
> and recreate all other views that the original view was used in.
> TIA

As I recall, the IT staff at Action Target, Inc. two years ago developed a 
non-free tool for this purpose.   They were planning to start a spin-off 
company to market PostgreSQL admin tools, but I don't know how far they got 
with the idea.   

Try contacting Action Target's webmaster to see if you can find anything out:
webmaster ( at ) actiontarget ( dot ) com


We have a free (GPL) tool called "wyseman" that is a framework for
authoring schemas in postgres.  You create a meta-sql structure for
each database object you want in the database.  Then you call wyseman
to drop and/or re-create the object, along with all its dependencies.

This is a great tool for working on a live database.  You can maintain
a pristine, authoritative definition of your schema, and still migrate
your live database along as you make changes to the design.  When you
want to modify a section of the schema that is deep within the tree of
dependencies, wyseman will take care of dropping, rebuilding, and
granting permissions on all the necessary objects in order to get into
the part you want to change.

Wyseman also takes care of building a "data dictionary" where you can
query for things like printable titles and pop-up help text for tables and
columns in multiple languages.  This is a nice layer to put on top of
postgres and just under your ERP.

These tools are accessible on wyatterp.com.  Due to lack of time, I
have not kept current software in the download area.  We are using much
later stuff in-house.  But I would be happy to share more recent
versions if there is any interest out there.  It would be particularly
helpful if anyone were willing to help maintain the website and the
(currently a sad excuse for) documentation.

There is also a run-time library (wylib) for rapid construction of ERP
applications.  We have about 30 applications built in the framework
that run our entire business.

Kyle Bateman
Action Target Inc.





[SQL] clock command regression in pltcl?

2010-01-22 Thread Kyle Bateman

I have the following function defined:

create function _date_week(int4,int4,int4) returns text language pltcl
immutable as $$
    return [clock format [clock scan "$2/$3/$1"] -format "%U"]
$$;

It worked fine in 8.3, but in 8.4, when I try to build an index using the
function, I get:


SQL ERROR: In database query: begin;
create index i_pay_req_empl_id_week on pay_req 
(empl_id,(date_week(wdate)));: ERROR:  invalid command name "clock"

CONTEXT:  invalid command name "clock"
invoked from within
"clock scan "$2/$3/$1""
(procedure "__PLTcl_proc_12360682" line 3)
invoked from within
"__PLTcl_proc_12360682 2003 12 20"
in PL/Tcl function "_date_week"
PL/pgSQL function "date_week" line 13 at assignment

Is this a regression or is there a reason the clock command is no longer 
accessible?
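
In the meantime, a hedged workaround is to compute the week number in
plain SQL instead of Tcl.  Note that to_char's 'IW' (ISO week) does not
match Tcl's %U (Sunday-based week-of-year) exactly, so this is only a
sketch, and date_week_sql is a hypothetical name:

create function date_week_sql(int4, int4, int4) returns text language sql
immutable as $$
    -- ISO week number for the given year/month/day
    select to_char(to_date($1 || '-' || $2 || '-' || $3, 'YYYY-MM-DD'), 'IW');
$$;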


Kyle




[SQL] Using bitmap index scans-more efficient

2006-08-13 Thread Kyle Bateman
[beginning of message clipped in the archive]

                    Merge Cond: (("outer".vendid = "inner".vendid) AND (("outer".invnum)::text = "inner"."?column10?"))
                    ->  Index Scan using apinv_hdr_pkey on apinv_hdr h  (cost=0.00..3148.16 rows=51016 width=21)
                    ->  Sort  (cost=13543.42..13693.47 rows=60020 width=55)
                          Sort Key: i.vendid, (i.invnum)::text
                          ->  Seq Scan on apinv_items i  (cost=0.00..7197.27 rows=60020 width=55)
                                Filter: (((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar)) AND (proj = ANY ($0)))
              ->  Index Scan using vend_org_pkey on vend_org v  (cost=0.00..145.52 rows=1799 width=26)
        ->  Materialize  (cost=3.54..3.55 rows=1 width=4)
              ->  Seq Scan on acct a  (cost=0.00..3.54 rows=1 width=4)
                    Filter: ((code)::text = 'ap'::text)

---

The worst case is the following types of queries (about 5 seconds):

select * from ledger where proj in
    (select prog_id from proj_prog where proj_id = 4737);
select l.* from ledger l, proj_prog p
    where l.proj = p.prog_id and p.proj_id = 4737;

---
Hash Join  (cost=19032.47..23510.23 rows=6 width=85)
  Hash Cond: ("outer".proj = "inner".prog_id)
  ->  Nested Loop Left Join  (cost=18994.38..23378.41 rows=1700 width=85)
        ->  Hash Join  (cost=18990.84..23340.87 rows=1700 width=81)
              Hash Cond: ("outer".vendid = "inner".org_id)
              ->  Merge Join  (cost=18935.35..23255.64 rows=1700 width=63)
                    Merge Cond: (("outer".vendid = "inner".vendid) AND (("outer".invnum)::text = "inner"."?column10?"))
                    ->  Index Scan using apinv_hdr_pkey on apinv_hdr h  (cost=0.00..3148.16 rows=51016 width=21)
                    ->  Sort  (cost=18935.35..19235.45 rows=120041 width=55)
                          Sort Key: i.vendid, (i.invnum)::text
                          ->  Seq Scan on apinv_items i  (cost=0.00..4152.99 rows=120041 width=55)
                                Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar))
              ->  Hash  (cost=50.99..50.99 rows=1799 width=26)
                    ->  Seq Scan on vend_org v  (cost=0.00..50.99 rows=1799 width=26)
        ->  Materialize  (cost=3.54..3.55 rows=1 width=4)
              ->  Seq Scan on acct a  (cost=0.00..3.54 rows=1 width=4)
                    Filter: ((code)::text = 'ap'::text)
  ->  Hash  (cost=38.04..38.04 rows=21 width=4)
        ->  Index Scan using proj_prog_pkey on proj_prog p  (cost=0.00..38.04 rows=21 width=4)
              Index Cond: (proj_id = 4737)
---

I would like to be able to get the best performance like in the first query but 
without having to enumerate the projects (i.e. using a single query).

The secret seems to be the bitmap index scans.

Any ideas about whether/how this can be done?

Thanks!

Kyle Bateman



---
BTW, the ledger view is built roughly as follows:

create view rp_v_api as
    select
        h.adate                             as adate,
        (i.price * i.quant)::numeric(14,2)  as amount,
        substring(v.org_name from 1 for 40) as descr,
        i.proj                              as proj,
        i.acct                              as acct,
        1                                   as cr_proj,
        a.acct_id                           as cr_acct
    from (
        apinv_hdr        h
        join apinv_items i on i.vendid = h.vendid and i.invnum = h.invnum
        join vend_org    v on v.org_id = h.vendid
        left join acct   a on a.code   = 'ap'
    )
    where i.status in ('en','cl','pd');





Re: [SQL] Using bitmap index scans-more efficient

2006-08-14 Thread Kyle Bateman

Florian Weimer wrote:

> * Kyle Bateman:
>
>> Any ideas about whether/how this can be done?
>
> If the project tree is fairly consistent, it's convenient to encode it
> using intervals instead of parent/child intervals.  IIRC, Celko's "SQL
> for smarties" explains how to do this, and Kristian Koehntopp has
> written some PHP code to implement it.

I agree that this produces a more efficient query for finding the 
projects that are the progeny of another project, but I'm trying to 
figure out how that helps me select the right project transactions from 
my ledger efficiently.


This query produces wonderful results (very fast):

select * from ledger where proj >= 4737 and proj <= 4740;

But I'm assuming that using an interval-encoded project tree, I would
have to do something like the following to get a progeny group:

select * from ledger l, proj p where p.proj_id = l.proj
    and p.left > 1234 and p.right < 2345;


The problem (at least according to my initial testing) is that this 
forces a join of the entire ledger and I get my slowest performance 
group (5 seconds).


What am I missing?

Kyle




Re: [SQL] Using bitmap index scans-more efficient

2006-08-15 Thread Kyle Bateman

Tom Lane wrote:

> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> But I'm assuming that using an interval-encoded project tree, I would
>> have to do something like the following to get a progeny group:
>>
>> select * from ledger l, proj p where p.proj_id = l.proj and p.left >
>> 1234 and p.right < 2345;
>
> btree has no idea about the constraint (that I imagine exists) that left
> <= right.  If you're just doing a simple index on (left, right) then the
> above query requires scanning all index entries with left > 1234.  It
> would probably help to say
>
> select * from ledger l, proj p where p.proj_id = l.proj and
>     p.left > 1234 and p.left < 2345 and p.right < 2345;
>
> so that you can constrain the range of "left" values scanned.

Thanks for the replies, Tom and Florian.

My problem is not that it is difficult (or costly) to determine the 
progeny of a given project.  I can determine this in about 90 msec 
regardless of whether I use an adjacency model, interval-encoding, or 
materialized path (current implementation).  The problem is, when I try 
to extract the ledger entries belonging to that progeny from a set of a 
million records, it seems to need to process all million records rather 
than being able to index right into the ones I want.


I'm not very good at reading explain output, but I tried to set up the
query Tom suggests by creating an interval-encoded project table
(proj_int) and then joining it to my ledger like so:

select l.* from ledger l, proj_int i where l.proj = i.proj_id
    and i.lft >= 5283 and i.lft < 5300 and i.rgt <= 5300;


On my mini test ledger of 100,000 entries, this takes the longest time
(5 seconds), with the following explain output:

Hash Join  (cost=19018.46..23411.52 rows=14 width=85)
  Hash Cond: ("outer".proj = "inner".proj_id)
  ->  Nested Loop Left Join  (cost=18994.38..23378.41 rows=1700 width=85)
        ->  Hash Join  (cost=18990.84..23340.87 rows=1700 width=81)
              Hash Cond: ("outer".vendid = "inner".org_id)
              ->  Merge Join  (cost=18935.35..23255.64 rows=1700 width=63)
                    Merge Cond: (("outer".vendid = "inner".vendid) AND (("outer".invnum)::text = "inner"."?column10?"))
                    ->  Index Scan using apinv_hdr_pkey on apinv_hdr h  (cost=0.00..3148.16 rows=51016 width=21)
                    ->  Sort  (cost=18935.35..19235.45 rows=120041 width=55)
                          Sort Key: i.vendid, (i.invnum)::text
                          ->  Seq Scan on apinv_items i  (cost=0.00..4152.99 rows=120041 width=55)
                                Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar))
              ->  Hash  (cost=50.99..50.99 rows=1799 width=26)
                    ->  Seq Scan on vend_org v  (cost=0.00..50.99 rows=1799 width=26)
        ->  Materialize  (cost=3.54..3.55 rows=1 width=4)
              ->  Seq Scan on acct a  (cost=0.00..3.54 rows=1 width=4)
                    Filter: ((code)::text = 'ap'::text)
  ->  Hash  (cost=24.06..24.06 rows=10 width=4)
        ->  Bitmap Heap Scan on proj_int i  (cost=2.26..24.06 rows=10 width=4)
              Recheck Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <= 5300))
              ->  Bitmap Index Scan on i_proj_int_lft_rgt  (cost=0.00..2.26 rows=10 width=0)
                    Index Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <= 5300))


That is roughly equivalent to my materialized-path method:

select l.* from ledger l where l.proj in
    (select proj_id from proj_v where 4737 = any(ppath));



And is quite slow compared to 150 msec when enumerating the progeny 
projects like so:


select l.* from ledger l where l.proj in 
(4737,4789,4892,4893,4894,4895,4933,4934,4935);


Nested Loop Left Join  (cost=19.73..4164.10 rows=7 width=85)
  ->  Nested Loop  (cost=19.73..4139.08 rows=7 width=81)
        ->  Nested Loop  (cost=19.73..4100.07 rows=7 width=63)
              ->  Bitmap Heap Scan on apinv_items i  (cost=19.73..1185.71 rows=487 width=55)
                    Recheck Cond: ((proj = 4737) OR (proj = 4789) OR (proj = 4892) OR (proj = 4893) OR (proj = 4894) OR (proj = 4895) OR (proj = 4933) OR (proj = 4934) OR (proj = 4935))
                    Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar))
                    ->  BitmapOr  (cost=19.73..19.73 rows=495 width=0)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4737)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)

[rest of message clipped in the archive]

Re: [SQL] Using bitmap index scans-more efficient

2006-08-15 Thread Kyle Bateman

Tom Lane wrote:

> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> But I'm assuming that using an interval-encoded project tree, I would
>> have to do something like the following to get a progeny group:
>>
>> select * from ledger l, proj p where p.proj_id = l.proj and p.left >
>> 1234 and p.right < 2345;
Here's an interesting result:

I created a function proj_left(int4) that returns the left interval
number for a given project.  Then I created an index on the underlying
table for the ledger view (which took forever to build), like so:

create index i_test on apinv_items (proj_left(proj));

Now my query:

select * from ledger where proj_left(dr_proj) >= 5283
    and proj_left(dr_proj) < 5300;

is very speedy.  Problem is, I had to mark the function proj_left() as
immutable, which it cannot be, since the left and right values for a
given project will change any time a project is added, removed, or moved
around the hierarchy :(


So is there any good way to tell the planner to do several individual
index scans for the projects involved in the desired progeny, OR the
results together, and return the combined result?  This is what it seems
to be choosing in the case of the query:

select * from ledger where proj in
    (4737,4789,4892,4893,4894,4895,4933,4934,4935);






Re: [SQL] Using bitmap index scans-more efficient

2006-08-16 Thread Kyle Bateman

Tom Lane wrote:

> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> I'm wondering if this might expose a weakness in the optimizer having to
>> do with left joins.
>
> Before 8.2 the optimizer has no ability to rearrange the order of outer
> joins.  Do you have time to try your test case against CVS HEAD?


OK, I figured it out--grabbed the latest snapshot (hope that is what you 
need).


My results are similar:

select l.* from ledg_v1 l, proj p where l.proj = p.proj_id and 5 = p.par;  (24 msec)

Nested Loop  (cost=0.00..1991.93 rows=480 width=23)
  ->  Nested Loop  (cost=0.00..4.68 rows=6 width=8)
        ->  Seq Scan on acct a  (cost=0.00..1.12 rows=1 width=4)
              Filter: ((code)::text = 'ap'::text)
        ->  Index Scan using i_proj_par on proj p  (cost=0.00..3.49 rows=6 width=4)
              Index Cond: (5 = par)
  ->  Index Scan using i_ledg_proj on ledg l  (cost=0.00..330.17 rows=83 width=19)
        Index Cond: (l.proj = "outer".proj_id)

select l.* from ledg_v2 l, proj p where l.proj = p.proj_id and 5 = p.par;  (1.25 sec)

Hash Join  (cost=4.63..16768.43 rows=480 width=23)
  Hash Cond: ("outer".proj = "inner".proj_id)
  ->  Nested Loop Left Join  (cost=1.13..14760.13 rows=40 width=23)
        ->  Seq Scan on ledg l  (cost=0.00..6759.00 rows=40 width=19)
        ->  Materialize  (cost=1.13..1.14 rows=1 width=4)
              ->  Seq Scan on acct a  (cost=0.00..1.12 rows=1 width=4)
                    Filter: ((code)::text = 'ap'::text)
  ->  Hash  (cost=3.49..3.49 rows=6 width=4)
        ->  Index Scan using i_proj_par on proj p  (cost=0.00..3.49 rows=6 width=4)
              Index Cond: (5 = par)



[SQL] Finding context for error log

2006-09-26 Thread Kyle Bateman

I get the following message in my postgres log:

ERROR:  42601: syntax error at or near "%" at character 269
LOCATION:  yyerror, scan.l:761

I have a fairly complex schema (lots of database objects) and many users 
hitting the db with lots of connections, so I'm not really sure what bit 
of code is causing the problem.  Is there an easy way to get postgres to 
spit out the SQL statement it was parsing when it generated the error?


I've experimented with log_error_verbosity and log_min_messages but 
still haven't seen any SQL in the log.
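
A hedged pointer: log_error_verbosity changes how much detail each entry
carries, but whether the failing SQL itself is attached is controlled by a
separate setting, log_min_error_statement (in releases of this era it
defaults to panic, which suppresses the statement text for ordinary
errors).  A sketch for postgresql.conf:

# attach the offending statement to every message of ERROR or higher
log_min_error_statement = error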


Kyle




[SQL] PG 8.2beta reordering working for this case?

2006-10-07 Thread Kyle Bateman

Kyle Bateman wrote:

> Tom Lane wrote:
>
>> Before 8.2 the optimizer has no ability to rearrange the order of outer
>> joins.  Do you have time to try your test case against CVS HEAD?

I've done some more refinement on my accounting ledger system that has
clarified some of the problems I was having with performance joining to
a union.  Here's an interesting test case; I just tried it with PG
8.2beta1.  I don't claim to understand the new join-reordering feature
very well, but it seems like this is related to it.  Since it's still a
performance problem in 8.2, I thought this might be a helpful test case
during beta.


To see this demo, run create.sql on a clean database.  It will create 
all the needed tables, views and test data.


Then run q1 and q2, which are very efficient.  Then q3 is the slow one.
The reason is, it does the union, producing 300,000 records, before
trying to select by project.  It seems like the optimizer should
internally rewrite the query to look more like what is in q4 (which is
very fast).


Is there a way to make the optimizer do this?

Kyle Bateman

-- Make test schema for demonstrating how the postgres optimizer might improve
-- performance on joins with unions

-- Contains a record for each project (job-costing) code
-- Projects are arranged in a hierarchical structure (parent/child)
-- --
create table proj (
proj_id		int4 primary key,
title		varchar,
par			int4 references proj on update cascade
);
create index i_proj_par on proj (par);

-- Contains a record for every 2 combinations of projects which are related
-- to each other in the hierarchical project tree 
-- (parent/child, ancestor/progenitor, etc.)
-- --
create table proj_rel (
anst_id		int4 references proj on update cascade on delete cascade,	-- ancestor project number
prog_id		int4 references proj on update cascade on delete cascade,	-- progenitor project number
rel			int4,		-- 0=self, 1=child, 2=grandchild, etc.
primary key (anst_id, prog_id)
);

-- Contains a record for each account number and an optional alpha code to identify a sub-ledger
-- --
create table acct (
acct_id		int4 primary key,	-- account number
title		varchar,		-- name of the account
code		varchar			-- alpha code for the account
);
create index i_acct_code on acct (code);

-- Each sub-ledger contains transactions unique to a certain part of the business.
-- In addition to the standard fields they all share in common, each sub-ledger
-- contains additional fields that are unique to it (so they cannot all be
-- stored in a single table).  In our actual implementation, these sub-ledgers
-- are actually implemented as views joining even lower-level tables.
-- --
create table subledg_A (
rid			int4 primary key,		-- record ID
amount		numeric(14,2),
proj		int4 references proj on update cascade on delete cascade,
unique_A	varchar		-- some other data
);
create index i_subledg_A_proj on subledg_A (proj);
-- --
create table subledg_B (
rid			int4 primary key,		-- record ID
amount		numeric(14,2),
proj		int4 references proj on update cascade on delete cascade,
unique_B	varchar		-- some other data
);
create index i_subledg_B_proj on subledg_B (proj);
-- --
create table subledg_C (
rid			int4 primary key,		-- record ID
amount		numeric(14,2),
proj		int4 references proj on update cascade on delete cascade,
unique_C	varchar		-- some other data
);
create index i_subledg_C_proj on subledg_C (proj);

-- These views allow a standard account code to presented in the appropriate ledgers
-- --
create view subview_A as select
   'AP ' || rid as trans_id,
l.amount, l.proj,
a.acct_id as acct
from	subledg_A	l
join	acct		a on a.code = 'ap';
-- --
create view subview_B as select
   'AR ' || rid as trans_id,
l.amount, l.proj,
a.acct_id as acct
from	subledg_B	l
join	acct		a on a.code = 'ar';
-- --
create view subview_C as select
   'PR ' || rid as trans_id,
l.amount, l.proj,
a.acct_id as acct
from	subledg_C	l
join	acct		a on a.code = 'pr';

-- General ledger - this should contain all transactions from all subledgers
-- --
create view gen_ledg as
select trans_id, amount, proj, acct from subview_A
union
select trans_id, amount, proj, acct from subview_B
union
select trans_id, amount, proj, acct from subview_C;

[remainder of the attachment, including the test data and the q1-q4 queries, clipped in the archive]

Re: [SQL] PG 8.2beta reordering working for this case?

2006-10-08 Thread Kyle Bateman

Tom Lane wrote:

> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> Is there a way to make the optimizer do this?
>
> Sorry, that's not happening for 8.2.  Consider using a union all (not
> union) across the subledg_N tables directly and then joining to that.
> That boils down to being a partitioning case and I think probably will
> be covered by the 8.2 improvements.

Yup, union all is much more efficient.  It hadn't really occurred to me
what the difference is between union and union all, but it makes sense to
eliminate the need for a unique sort.  The q3 query went from 10 seconds
to 1 second with just the addition of union all in the general ledger.

BTW, explain analyze still says 10 seconds of run time (and takes 10
seconds to run), but when I remove the explain analyze, the query runs
in about a second.  What's that all about?


Also, I came up with the view shown in the attachment.  It is still much
faster (40 ms) than joining to the union-all ledger.  I'm not sure why,
because I'm not sure explain analyze is telling me the real story (I see
a sequential scan of the ledgers in there when it runs 10 seconds); I'm
not sure what it's doing when it runs in 1 second.


Kyle

-- This view is a possible workaround for the problem
drop view gen_ledg_pr;

--explain analyze
create view gen_ledg_pr as
    select lg.*, pr.anst_id
        from subview_A lg
        join proj_rel  pr on pr.prog_id = lg.proj

    union all select lg.*, pr.anst_id
        from subview_B lg
        join proj_rel  pr on pr.prog_id = lg.proj

    union all select lg.*, pr.anst_id
        from subview_C lg
        join proj_rel  pr on pr.prog_id = lg.proj
;



Re: [SQL] olympics ranking query

2007-03-29 Thread Kyle Bateman
Several years ago someone posted a question about how to achieve a
running total of columns in SQL.  I have been trying to find a solution
to a similar problem for some time and just came up with something that
works great for me, so I thought I'd post it for the benefit of the list.


The problem is to produce a query like the following:

select date,amount,run_sum(amount) from ledger order by date;

Date         Amount    Sum
-----------  --------  --------
2007-Jan-01   10.00     10.00
2007-Jan-02   20.00     30.00
2007-Jan-05    5.00     35.00
2007-Jan-10   -3.00     32.00
...

Using pl/tcl, I made the following function:

#Keep a running sum on a column where tag and trans are used to keep the
#results distinct from other calls to the same function
#Usage: run_sum(value,tag,trans)
#---
function run_sum(numeric,text,text) {} {
 returns numeric language pltcl called on null input as $$
   if {![info exists GD(sum.$2.$3)]} {
   set GD(sum.$2.$3) 0.00
   }
   if {[argisnull 1]} {
   return $GD(sum.$2.$3)
   } else {
   return [set GD(sum.$2.$3) [expr $GD(sum.$2.$3) + $1]]
   }
$$;}

Then I added a wrapper function to automatically produce a separate 
namespace for each statement instance that uses the query:


#Keep a running sum on a column where tag is used to keep the results distinct

#Automatically keeps results distinct for each separate statement
#Usage: run_sum(value,tag)
#---
function run_sum(numeric,text) {run_sum(numeric,text,text)} {
 returns numeric language sql as $$
   select run_sum($1,$2,statement_timestamp()::text);
$$;}

Now you can do:

select date,amount,run_sum(amount,'amount') from ledger;

to get an initial result.  The only problem is now ordering the data.  
If you just put an 'order by' clause on the end, you don't get what you 
might expect because the ordering happens after the function has 
produced its result.  So I do the following to order and sum it correctly:


select date,amount,run_sum(amount,'amount') from (select * from ledger 
order by date) ss;


The use of the "tag" parameter allows you to use this on multiple 
columns such as:


select date,debit,credit,
   run_sum(debit,'debit')::numeric(14,2) as debits,
   run_sum(credit,'credit')::numeric(14,2) as credits
   from (select * from ledger order by date) ss;
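
For readers on newer versions: PostgreSQL 8.4 and later can express the
same running total directly with a window function, which also handles
the ordering in one statement; a minimal equivalent of the first query
above:

select date, amount,
       sum(amount) over (order by date) as run_sum
    from ledger;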

Enjoy,

Kyle




[SQL] Clarified Question

2000-05-24 Thread Kyle Bateman



 
> How can I create a function that will take in two variables and return
> an integer, when one of the variables is the tablename?!  I have tried:
>
> create function tst_func(text, varchar(16))
> as '
> BEGIN
>     result = select max(history_id)+1 from $1
>         where client_id = $2;
>     return result;
> END;'

This kind of thing does work in pltcl, but AFAIK you can't specify a table
name with a parameter in plpgsql.  This may give you the idea:

create function myfunc(text,text) returns text as '
    spi_exec "select f1 from $1 where f2 = \'$2\'"
    # spi_exec makes the selected column available as the Tcl variable $f1
    return $f1
' language 'pltcl';
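
Hypothetical usage, assuming a table mytab with text columns f1 and f2:

select myfunc('mytab', 'abc');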
 




[SQL] create constraint trigger

2000-05-29 Thread Kyle Bateman


Hi Jan:

In response to your suggestion about possibly being able to use "create
constraint trigger," I have tried the following.  I have this trigger
function (probably not all that important what it does...):

create function prd_parm_tf_iu () returns opaque as '
    declare
        trec        record;
        baseholder  varchar;
    begin
        select into trec min(pos) as n, max(pos) as m, count(*) as c
            from prd_parm where new.base = base;
        if trec.n != 1 or trec.m != trec.c then
            raise exception \'Parameters for base % not in order\', new.base;
        end if;
        return new;
    end;' language 'plpgsql';

This seems to work pretty well:

create trigger prd_part_tr_iu after insert or update on prd_parm
    for each row execute procedure prd_parm_tf_iu();

But when I create it with "create constraint trigger" as shown next, the
trigger doesn't seem to be invoked.  It says it is created, but it allows
data to pass that would not be allowed with the plain "create trigger,"
so I'm assuming I'm doing something wrong:

create constraint trigger prd_part_tr_iu after insert or update on prd_parm
    deferrable initially deferred
    for each row execute procedure prd_parm_tf_iu();

Any ideas?

Also, it's not clear to me how to drop a trigger created with "create
constraint trigger."

It would be very helpful if I could figure out how to make my triggers
deferred.  There are things stored in one table that have to be a certain
way if other things are a certain way in a related table; it is more than
what foreign key relationships can handle.  It would really be great if I
could have the backend monitor the consistency of these details.

Without deferral, I can check the consistency in one direction.  But if I
try to constrain it in both directions, I get into a problem where I
can't update either table without breaking the constraints.  I'm guessing
that if I can get deferral to work, I should be able to update both
tables in a transaction and be OK.

Kyle
 
 




[SQL] Re: create constraint trigger

2000-05-31 Thread Kyle Bateman


Kyle Bateman wrote:

> But when I create it with "create constraint trigger" as shown next,
> the trigger doesn't seem to be invoked.  It says it is created, but
> it allows data to pass that would not be allowed with the "create
> trigger," so I'm assuming I'm doing something wrong.
>
> create constraint trigger prd_part_tr_iu after insert or update on prd_parm
>     deferrable initially deferred
>     for each row execute procedure prd_parm_tf_iu();

OK, here's a reply to my own post.  I've got the constraint trigger
working now.  The problem was that the old trigger was never getting
dropped.  I still don't know how to drop the constraint; I thought that
by dropping the trigger function it would get the idea.  When I ran the
create constraint again, it said it was created (no errors), so I figured
everything was OK, but it was still running the previously defined
constraint (apparently).  I dropped the table and started from scratch,
and it seems to work fine now.

So is there a way to "drop constraint trigger" without having to drop
the table?
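
A hedged note: a constraint trigger lands in pg_trigger like any other
trigger, so a plain DROP TRIGGER may be all that's needed (a sketch,
untested on this version):

drop trigger prd_part_tr_iu on prd_parm;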
Kyle
 




[SQL] Deferral of primary key constraint

2000-06-02 Thread Kyle Bateman


Is it possible to defer the check on a primary key constraint (or a check
constraint, for that matter)?  Here is an example that shows why it would
be nice to be able to do so.  We have a real-life scenario that is
similar (but much more complex).

drop table btab;
create table btab (
    base    varchar,
    pos     int4,
    cmt     varchar,
    primary key (base, pos)
);
insert into btab (base,pos,cmt) values ('aa',1,'The');
insert into btab (base,pos,cmt) values ('aa',2,'quick');
insert into btab (base,pos,cmt) values ('aa',3,'grey');
insert into btab (base,pos,cmt) values ('aa',4,'fox');
insert into btab (base,pos,cmt) values ('bb',3,'dog');
insert into btab (base,pos,cmt) values ('bb',2,'brown');
insert into btab (base,pos,cmt) values ('bb',1,'The');
select * from btab order by base,pos;
begin;
delete from btab where base = 'aa' and pos = 2;
update btab set pos = pos - 1 where pos > 2 and base = 'aa';
commit;
select * from btab order by base,pos;
begin;
update btab set pos = pos + 1 where pos >= 2 and base = 'bb';
insert into btab (base,pos,cmt) values ('bb',2,'slow');
commit;
select * from btab order by base,pos;
The last transaction fails (on my box, anyway) because of the primary key
index.

We would like to be able to do inserts/deletes to a list of records and
still ensure that they are in contiguous sequential order, so we want to
renumber higher records if a new record is inserted in the middle.  The
sequence is part of the primary key, and we want to ensure uniqueness.
Some renumbering will work (by chance) if the records happen to get
adjusted in the right order, but if one of the updates tries to renumber
to an already existing record, it fails.

How hard would it be to take the deferral mechanism you have for foreign
key references and apply it to the primary key too?  It would also be
handy to be able to defer a check constraint.
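
For what it's worth, a hedged workaround for the renumbering case is to
route the update through values that cannot collide with live keys, so
each individual row update stays unique:

begin;
-- shift pos >= 2 up by one, detouring through negatives to dodge the index
update btab set pos = -(pos + 1) where pos >= 2 and base = 'bb';
update btab set pos = -pos where pos < 0 and base = 'bb';
insert into btab (base,pos,cmt) values ('bb',2,'slow');
commit;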
 




[SQL] References and privileges

2000-06-07 Thread Kyle Bateman

[EMAIL PROTECTED] (Jan Wieck) writes:
>> If one does:
>>
>> [...]
>> grant select on master to a_user;
>> grant select, insert, update, delete on detail to a_user;
>>
>> then if login as "a_user" and does:
>>
>> insert into detail (id, master_id) values (1, 10);
>>
>> this will result in: "ERROR:   master: Permission denied".
>>
>> This seems a bug to me ? Isn't it ?

> Outch,

> yes,  we  missed  something  here.  Peter,  you  said  you'll
> probably work  on  the  ACL  stuff  after  7.0.  We  need  to
> coordinate that work with the function manager redesign to go
> for SETUID triggers and functions.

Hey Jan:

This one just bit me hard.  We're trying to implement RI so a foreign
key (employee ID number) is verified to exist in the employee table.
Problem is, we have to grant everyone and their dog access to the
employee table (select,update, no less) which contains all kinds of
sensitive data about employees.

We even tried making a limited view of the employee table and
referencing that.  No luck, the RI triggers try to find an OID column in
the referenced table (view).  If we could name a column "oid" in a view,
we might have a chance with this approach.

How hard would it be to get the "grant references" thing going?  Is
there anything I could do to help?

Grant permissions on columns would also solve this problem.  Is that one
huge or not so huge?

Kyle






[SQL] finding (and recycling) holes in sequences

2000-06-26 Thread Kyle Bateman



 
> If one has a unique-id generating sequence that sometimes is bound to
> have holes in it (ie: it could happen that a nextval(seq) happens
> without a corresponding INSERT in the table), then how could one
> efficiently scan for these holes to recycle them in subsequent INSERTs?
> I'm just looking for a "standard" way of doing this if such a thing
> exists.
>
> Louis-David Mitterrand - [EMAIL PROTECTED] - http://www.apartia.fr

If you want to create a sequence of numbers with no holes, you might
consider NOT using sequences at all (they are really meant for something
different).  Rather, each time you wish to do an insert, query the
existing data to see what the maximum value is in the database.  Add 1 to
that and use this as your new value.  This is guaranteed to leave no
holes in the sequence.  (Note, though, that two concurrent inserts can
compute the same value; the primary key will reject one of them, so this
works best where inserts are serialized or retried.)  To make inserts
easier, you can define a function that will return the max value.
Here's an example:
create function maxid () returns int4 as '
    declare
        mv int4;
    begin
        select max(id) into mv from mytab;
        if mv is null then return 0; end if;
        return mv;
    end;
' language 'plpgsql';

create table mytab (
    id    int4 primary key default (maxid() + 1),
    data  text
);

insert into mytab (data) values ('Hi there');
insert into mytab (data) values ('Howdy');
insert into mytab (data) values ('Guten Tag');
insert into mytab (data) values ('Terve');
select * from mytab;
 
 
BTW hackers, this is a common enough request that I wonder if there
should be a built-in feature to make it easier for people to put this
into the default value in the table definition, i.e.:

create table mytab (
    keycolumn  int4 primary key default coalesce(max(keycolumn),0)+1,
    data       text
);

Wouldn't this eliminate the need for rollback-able sequences?
 
