Re: [SQL] pl/pgsql - code review + question

2001-07-19 Thread Richard Huxton

From: "Tom Lane" <[EMAIL PROTECTED]>

> "Richard Huxton" <[EMAIL PROTECTED]> writes:
> > Don't know why the parser for RAISE doesn't like string concat.
>
> Laziness ;-).  Someone should fix plpgsql so that RAISE does take
> expressions, not just literals and simple variables.  It probably
> wouldn't be a big change, but I've no time to look at it myself;
> any volunteers out there?

False Laziness perhaps (spot my Perl background)

OK - stick me down for having a look at it. Had a quick peek and I think
it's within my abilities. Give me a couple of weeks, because I haven't
looked at a YACC file since my university days (gram.y is YACC isn't it?)

I'll read up the rules for submitting patches and get something workable
back by August 3rd.

Josh - if I try and do OFFSET at the same time (presumably it's the same
change) do you fancy acting as a sanity test site?

- Richard Huxton


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



[SQL] Query optimizing - paradox behave

2001-07-19 Thread David M. Richter

Hallo !

I want to tune a database. There a many redundant datas in the database
, because of all the relations were consider as n:m relations. But the
most of them are 1:n Relations. So my approach was to cut the
redundancies to get more performance. But .. happens!

The query with the 3 tables is faster than the query with 2 tables. 

That is paradox to the Explain output.
And: the real database functions like dbPG95GetIndex and all functions
defined by me are slower.
The whole program is slower than before.
I disabled all the index.(since with index the behavior is the same) The
database pacs ist only restructured. They have the same data. With
database pacs and compare a vacuum was made.
I looked at the user time , since system time is faked because my
testprogram hands over the control to the postmaster and the postmaster
is doing his own work. So I made a lot of tests to get a average
usertime. So escapes will be catched and eliminated.

Here are the tabledescriptions for the original database "compare":

tables i.e.
There is a n:m relationship between patient and study realized with
relpatient_study000 relationtable.

Table "patient"
  Attribute   |  Type  | Modifier 
--++--
 chilioid | character varying(80)  | 
 name | text   | 
 id   | character varying(256) | 
 birthdate| date   | 
 birthtime| time   | 
 sex  | character(1)   | 
 medicalrecordlocator | character varying(128) | 

   Table "study"
   Attribute|  Type  | Modifier 
++--
 chilioid   | character varying(80)  | 
 instanceuid| character varying(64)  | 
 id | character varying(64)  | 
 studydate  | date   | 
 studytime  | time   | 
 modality   | character varying(2)   | 
 manufacturer   | character varying(128) | 
 referingphysician  | text   | 
 description| character varying(128) | 
 manufacturersmodelname | character varying(128) | 
 importtime | double precision   | 
 chilisenderid  | character varying(80)  | 
 accessionnumber| character varying(64)  | 
 institutionname| character varying(128) | 
 workflowstate  | character varying(8)   | 
 flags  | character varying(8)   | 
 performingphysician| character varying(128) | 
 reportingphysician | character varying(128) | 


 Table "relpatient_study000"
 Attribute | Type  | Modifier 
---+---+--
 chilioid  | character varying(80) | 
 parentoid | character varying(80) | 
 childoid  | character varying(80) | 

 parentoid is here the oid of the patient and childoid is here the oid
of the study.

Thats the query with the original database "compare":

time psql -d compare -c "SELECT patient.*,study.* FROM
patient,study,relpatient_study000 r0 WHERE
(patient.chiliOID=r0.parentOID AND study.chiliOID=r0.childOID) order by
patient.name using <" > 3tableni
1.671u 0.130s 0:11.14 16.4% 0+0k 0+0io 208pf+0w

psql -d compare -c "EXPLAIN SELECT patient.*,study.* FROM
patient,study,relpatient_study000 r0 WHERE
(patient.chiliOID=r0.parentOID AND study.chiliOID=r0.childOID) order by
patient.name using <" > 3tableni
NOTICE:  QUERY PLAN:

Sort  (cost=1135170635.79..1135170635.79 rows=748802386 width=296)
  ->  Merge Join  (cost=1025510.64..1057837.48 rows=748802386 width=296)
->  Sort  (cost=1017989.22..1017989.22 rows=2556861 width=96)
  ->  Merge Join  (cost=4287.84..4763.21 rows=2556861
width=96)
->  Sort  (cost=990.43..990.43 rows=8725 width=72)
  ->  Seq Scan on patient  (cost=0.00..212.25
rows=8725 width=72)
->  Sort  (cost=3297.40..3297.40 rows=29305
width=24)
  ->  Seq Scan on relpatient_study000 r0 
(cost=0.00..774.05 rows=29305 width=24)
->  Sort  (cost=7521.42..7521.42 rows=29286 width=200)
  ->  Seq Scan on study  (cost=0.00..1116.86 rows=29286
width=200)

-



Thats the query with the new restructured database "pacs":

time psql -d pacs -c "SELECT patient.*,study.* FROM patient,study WHERE
(patient.chiliOID=study.patientOID ) order by patient.name using <" >
2tableni
1.730u 0.120s 0:09.44 19.5% 0+0k 0+0io 208pf+0w


psql -d pacs -c "EXPLAIN SELECT patient.*,study.* FROM patient,study
WHERE (patient.chiliOID=study.patientOID ) order by patient.name using
<" > 2tableni   NOTICE:  QUERY PLAN:

Sort  (cost=2194791.19..2194791.19 rows=2555204 width=284)
  ->  Merge Join  (cost=8978.44..9453.57 rows=2555204 width=284)

Re: [SQL] pl/pgsql - code review + question

2001-07-19 Thread Richard Huxton

From: "Richard Huxton" <[EMAIL PROTECTED]>

> False Laziness perhaps (spot my Perl background)
>
> OK - stick me down for having a look at it. Had a quick peek and I think
> it's within my abilities. Give me a couple of weeks, because I haven't
> looked at a YACC file since my university days (gram.y is YACC isn't it?)
>
> I'll read up the rules for submitting patches and get something workable
> back by August 3rd.
>
> Josh - if I try and do OFFSET at the same time (presumably it's the same
> change) do you fancy acting as a sanity test site?

What am I talking about - OFFSET is going to be parsed by the SQL parser not
the plpgsql parser.

Ignore me I'm blathering

- Richard Huxton



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [SQL] Query optimizing - paradox behave

2001-07-19 Thread Stephan Szabo


What version are you using? (dbPG95GetIndex?)

On Thu, 19 Jul 2001, David M. Richter wrote:

> Hallo !
> 
> I want to tune a database. There a many redundant datas in the database
> , because of all the relations were consider as n:m relations. But the
> most of them are 1:n Relations. So my approach was to cut the
> redundancies to get more performance. But .. happens!
> 
> The query with the 3 tables is faster than the query with 2 tables. 
> 
> That is paradox to the Explain output.
> And: the real database functions like dbPG95GetIndex and all functions
> defined by me are slower.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [SQL] pl/pgsql - code review + question

2001-07-19 Thread Josh Berkus

Richard,

> > Josh - if I try and do OFFSET at the same time (presumably it's the
> same
> > change) do you fancy acting as a sanity test site?
> 
> What am I talking about - OFFSET is going to be parsed by the SQL
> parser not
> the plpgsql parser.

Not so, not so!

Try the following two PL/pgSQL functions:

DECLARE
h INT;
k INT;
a_row a%ROWTYPE;
BEGIN
h := 1;
k := 3;
SELECT * FROM a INTO a_row
ORDER BY a.1
LIMIT 1 OFFSET (h + k);
END;

... blows up, but  ...


DECLARE
h INT;
k INT;
a_row a%ROWTYPE;
BEGIN
h := 1;
k := (3 + h);
SELECT * FROM a INTO a_row
ORDER BY a.1
LIMIT 1 OFFSET k;
END;

 ... works.


As far as I can tell, PL/pgSQL is not evaluating the expression before
passing it on to the SQL parser.  Or is the SQL parser supposed to
accept (1 + 3) 

Oh, I see what you mean.   Sorry!  Should I bug-traq this problem?  

-Josh


__AGLIO DATABASE SOLUTIONS___
   Josh Berkus
  Complete information technology  [EMAIL PROTECTED]
   and data management solutions   (415) 565-7293
  for law firms, small businessesfax 621-2533
and non-profit organizations.  San Francisco












---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [SQL] Query optimizing - paradox behave

2001-07-19 Thread Josh Berkus

David,

You will no doubt hear later from the tuning experts on the list.
However, let me save everybody some time by verifying some basics:

1. When you restructured the database, you ran VACUUM ANALYZE on the new
database (pacs)?

2. You said that you "eliminated the indexes" because they weren't
helping performance.  Is this right?  It seems a little hard to figure
from here.

3. General Advice:  If you're concerned about query performance, get rid
of those VARCHAR(80) primary keys and replace them with INT4 or INT8!
The math is easy to do:
If you're processing INT8 keys for 1,000,000 table rows that's
8,000,000 (roughly 8mb) data on disk and data being processed.
If you're processing VARCHAR(80) keys for 1,000,000 table rows, thats
82,000,000 bytes (82 mb) on disk and in ram to be processed.
In theory, you could get a 10-fold increase in JOIN performance by
switching to INT8 keys.  In practice, its probably more like double or
triple but that ain't bad, either.

-Josh Berkus



__AGLIO DATABASE SOLUTIONS___
   Josh Berkus
  Complete information technology  [EMAIL PROTECTED]
   and data management solutions   (415) 565-7293
  for law firms, small businessesfax 621-2533
and non-profit organizations.  San Francisco












---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



[SQL] nextval on insert by arbitrary sequence

2001-07-19 Thread Dado Feigenblatt


I'm not sure if I worded the subject right, but my problem is this:

I have a few entries in one table. Each row is the parent of many 
entries in a second table.
In the second table I have a lot of entries referencing the entries on 
the first table.
So far so good. Basic foreign key thing.
The entries on the second table need to be numbered, but instead of a 
single sequence for all rows,
I need a sequence per group of rows, according to their parent record.

I have some ideas but they are not coming together.
I thought of creating individual sequences for each new parent row, and 
maybe store its name in the parent row itself, so it can be accessed by 
it's children row.
But then, how do I get a field on the second table to DEFAULT to 
nextval() on it's parent's sequence?

INSERT using SELECT?
TRIGGER?
RULE?
FUNCTION?
Is it best to handle this things entirely on the client ?
What is the approach for this problem?

Thanks.


PS: Hmmm... It doesn't look like I got my terminology right. Sorry for 
that.

-- 
Dado Feigenblatt Wild Brain, Inc.   
Technical Director   (415) 553-8000 x???
[EMAIL PROTECTED]   San Francisco, CA.




---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



Re: [SQL] nextval on insert by arbitrary sequence

2001-07-19 Thread Josh Berkus

Dado,

> I'm not sure if I worded the subject right, but my problem is this:
> 
> I have a few entries in one table. Each row is the parent of many 
> entries in a second table.
> In the second table I have a lot of entries referencing the entries
> on 
> the first table.
> So far so good. Basic foreign key thing.
> The entries on the second table need to be numbered, but instead of a
> single sequence for all rows,
> I need a sequence per group of rows, according to their parent
> record.

You *can* do this through PL/pgSQL triggers.  *however*, there's a
couple of problems with that idea:
1. It would be fairly elaborate for a trigger (i.e. lots of debugging).
2. It would only work for ON INSERT.  Deleting one row in the middle
could not reasonably be made to make all the rest re-number.
3. None of this makes sense if you intend to re-arrange the rows
according to some external criteria.

If it were me, I'd do it through interface (or better) middleware code,
disabling the user's ability to insert or delete rows directly and
forcing them to push inserts and deletes through some kind of function,
whether PL/pgSQL or Java-ORB middleware or whatever.

-Josh

__AGLIO DATABASE SOLUTIONS___
   Josh Berkus
  Complete information technology  [EMAIL PROTECTED]
   and data management solutions   (415) 565-7293
  for law firms, small businessesfax 621-2533
and non-profit organizations.  San Francisco












---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl