Re: [SQL] Does Postgresql have a similar pseudo-column "ROWNUM" as Oracle?

2005-05-17 Thread Alain

Andrew Sullivan wrote:
On Thu, May 12, 2005 at 01:07:00PM -0600, [EMAIL PROTECTED] wrote:
Does Postgresql have a similar pseudo-column "ROWNUM" as Oracle? If
so, we can write the following query:

No.  What is the purpose of your query?  You could use ORDER BY and
LIMIT..OFFSET to do what you want. I think.
The problem is probably speed. I have done a lot of tests, and when 
OFFSET gets to a few thousands on a multi-million-record database, it gets 
very, very slow... Is there any other way to work around that?
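
For concreteness, the pattern under discussion (table and column names are illustrative): the server still has to fetch and discard every skipped row, so the cost grows linearly with the offset.

select name, phone
from listing
order by name
limit 100 offset 50000;  -- reads and throws away 50,000 rows first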

Alain
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [SQL] Does Postgresql have a similar pseudo-column "ROWNUM" as Oracle?

2005-05-17 Thread Alain
No.  What is the purpose of your query?  You could use ORDER BY and
LIMIT..OFFSET to do what you want. I think.
The problem is probably speed. I have done a lot of tests, and when 
OFFSET gets to a few thousands on a multi-million-record database, it gets 
very, very slow... 

Is there not a similar loss of speed using ROWNUM on Oracle?

... Is there any other way to work around that?

if you are ordering by a unique key, you can use the key value
in a WHERE clause.
select ... where ukey>? order by ukey limit 100 offset 100;
(the ? is a placeholder for the last value of ukey returned
by the previous select)
I tried that. It does not work in the generic case: 6 MegaRec, telephone 
listing, alphabetical order. The problem is that somewhere there is a 
single user with too many entries (over 1000). I even tried to filter 
the repetitions, but somewhere I get stuck if one guy has too many 
entries (one for each phone number).

I tried using both the name and the primary key (with a combined index) 
to get to the record I want faster, but I was not successful in building 
a where clause.

I would appreciate any help, in fact this is my primary reason for 
joining this list ;-)

Alain
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [SQL] Does Postgresql have a similar pseudo-column "ROWNUM" as Oracle?

2005-05-17 Thread Alain

Ragnar Hafstað wrote:
[how to solve the get next 100 records problem]

I tried that. It does not work in the generic case: 6 MegaRec, telephone 
listing, alphabetical order.
lets say pkey is your primary key and skey is your sort key, and 
there exists an index on (skey,pkey)

your first select is
select ... from tab ORDER by skey,pkey LIMIT 100;
your subsequent selects are
select ... from tab WHERE skey>skey_last 
   OR (skey=skey_last AND pkey>pkey_last) 
ORDER BY skey,pkey
LIMIT 100 OFFSET 100;
I tried that; it is veeery slow, probably due to the OR operand :(
BUT, I think that this is close to a final solution; I made some 
preliminary tests and they look OK. Please tell me what you think about this.

First, let's state that I am reading records to put on a screen (in a 
table/grid). I separated the problem into *3* parts:

-first select is as above:
select ... from tab ORDER by skey,pkey LIMIT 100;
-second method for next 100:
select ... from tab WHERE skey>=skey_last
ORDER BY skey,pkey
LIMIT 100;
but here I test for repetitions using pkey and discard them
-now, if I get all repetitions, or the last 100 have the same skey with 
the second method, I use
select ... from tab WHERE skey=skey_last AND pkey>pkey_last
ORDER BY skey,pkey
LIMIT 100;
until I get an empty response, then I go back to the second method.
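
A note for later readers: PostgreSQL 8.2 and later can express this whole keyset condition as a single indexable row-value comparison, which avoids both the slow OR and the bookkeeping above. A sketch, not available on the 7.4 discussed here:

select ... from tab WHERE (skey, pkey) > (skey_last, pkey_last)
ORDER BY skey, pkey
LIMIT 100;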

All queries are extremely fast with 600 records and it looks like 
the few redundant or empty queries (but very fast) will not be a problem.

What is your opinion about this (apart from it being a bit complex :) ?
Alain
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


[SQL] Changed to: how to solve the get next 100 records problem

2005-05-18 Thread Alain
Hi Ragnar (and others),
I found something that is both fast and simple (program side):
Always assuming that pkey is a primary key, skey is a sort key, and
there exists an index on (skey,pkey).
first select is
select ... from tab ORDER by skey,pkey LIMIT 100;
subsequent selects are
(select ... from tab WHERE skey=skey_last AND pkey>pkey_last
 ORDER BY skey,pkey LIMIT 100)
UNION
(select ... from tab WHERE skey>skey_last
 ORDER BY skey,pkey LIMIT 100)
ORDER BY skey,pkey LIMIT 100;
The catch is that if the first select would have more than 100 records 
and was limited to 100, the second select's data is completely 
discarded by the 3rd LIMIT!

The only strange thing is that without the 3rd ORDER BY, the order is 
wrong. I didn't expect that, because each select is created ordered. Is it 
expected that UNION mixes it all up? (using PostgreSQL 7.4.1)

The 3rd ORDER BY is not indexed, but it operates on an in-memory table of no 
more than 200 rows, so it is fast too.
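
On the ordering question: UNION, unlike UNION ALL, removes duplicates, and 7.4 typically does that by re-sorting or hashing the combined rows, so the per-branch order is lost; only the outer ORDER BY guarantees the final order. Since the two branches here cannot overlap (skey=skey_last versus skey>skey_last), a sketch with UNION ALL would skip the duplicate-elimination work entirely:

(select ... from tab WHERE skey=skey_last AND pkey>pkey_last
 ORDER BY skey,pkey LIMIT 100)
UNION ALL
(select ... from tab WHERE skey>skey_last
 ORDER BY skey,pkey LIMIT 100)
ORDER BY skey,pkey LIMIT 100;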

Please comment on this. I tested it and it worked, but I am really new to SQL 
and I feel insecure...

Thanks,
Alain



[how to solve the get next 100 records problem]

BUT, I think that this is close to a final solution; I made some 
preliminary tests and they look OK. Please tell me what you think about this.

First, let's state that I am reading records to put on a screen (in a 
table/grid). I separated the problem into *3* parts:

-first select is as above:
select ... from tab ORDER by skey,pkey LIMIT 100;
-second method for next 100:
select ... from tab WHERE skey>=skey_last
ORDER BY skey,pkey
LIMIT 100;
but here I test for repetitions using pkey and discard them
-now, if I get all repetitions, or the last 100 have the same skey with 
the second method, I use
select ... from tab WHERE skey=skey_last AND pkey>pkey_last
ORDER BY skey,pkey
LIMIT 100;
until I get an empty response, then I go back to the second method.

if your distribution is such that those skeys that have > 100 records
tend to have a lot more, you might have a higher limit for this case.

All queries are extremely fast with 600 records and it looks like 
the few redundant or empty queries (but very fast) will not be a problem.

What is your opinion about this (apart from it being a bit complex :) ?

looks fine
gnari


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[SQL] Tip ?

2005-05-24 Thread Alain


This tip was at the end of a message (from Szűcs Gábor).


TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


It looks very important, but I cannot understand it. It sounds like a small 
and easy mistake that can make things go sour...


Can someone explain it please?
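
An illustration with hypothetical tables: if customers.id is bigint (and indexed) while orders.customer_id is integer, older releases (before 8.0) could not use the index for the cross-type join and fell back to a sequential scan.

select c.name, o.total
from customers c
join orders o on c.id = o.customer_id;          -- int8 = int4: index ignored

select c.name, o.total
from customers c
join orders o on c.id = o.customer_id::bigint;  -- matching types: index usable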

thanks,
Alain

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [SQL] SELECT very slow

2005-06-08 Thread Alain



Tom Lane wrote:

Thomas Kellerer <[EMAIL PROTECTED]> writes:

Is there anything I can do, to convince PG to return the first row more 
quickly?


Are you now looking for the LIMIT?

SELECT * FROM table LIMIT 1;

and when you want the rest of it:

SELECT * FROM table OFFSET 1;
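
A caveat and an alternative sketch: without an ORDER BY, the OFFSET 1 query is not guaranteed to return "the rest" in the same order as the first query. A cursor plans the query once and streams rows on demand, inside a transaction block ("table" stands for your table name, as above):

BEGIN;
DECLARE c CURSOR FOR SELECT * FROM table;
FETCH 1 FROM c;    -- the first row, quickly
FETCH ALL FROM c;  -- the remaining rows, from the same snapshot
COMMIT;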

Alain

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: Re(2): Re(2): [SQL] 7.0.3 BUG

2000-11-27 Thread Alain Toussaint

> >That's a rather interesting version report, seeing as how there is
> >no such gcc release as 2.95.3 according to the GCC project's homepage.

Quick tidbit: there's no gcc 2.95.3 from the GCC project, but there is one from
Pentium gcc (see <http://goof.com/pcg/>). The Pentium gcc group in this
case took gcc 2.95.2, applied their Pentium patches, and released the result
as pgcc 2.95.3; that's the stock compiler used by Mandrake.

Alain Toussaint




[SQL] SQL question

2001-01-14 Thread Alain Lavigne

I'm trying to extract references (relationships) between tables for the
purpose of reverse/forward engineering from a modeling tool called
PowerDesigner.  

Here is the sql:

select u.usename,
   p.relname,
   v.usename,
   c.relname,
   t.tgconstrname,
   dumpref(t.tgargs, 4),
   dumpref(t.tgargs, 5)
from   pg_trigger t,
   pg_proc f,
   pg_class p,
   pg_class c,
   pg_user u,
   pg_user v 
where  1=1 
and    f.proname='RI_FKey_check_ins'
and    t.tgfoid=f.oid 
and    c.oid=t.tgrelid 
and    p.oid=t.tgconstrrelid 
and    u.usesysid=p.relowner 
and    v.usesysid=c.relowner; 

I always get the following message: 

<< Error while executing the query; ERROR: Function 'dumpref(bytea, int4)'
does not exist Unable to identify a function that satisfies the given
argument types You may need to add explicit typecasts >>

What am I doing wrong ?? 


Alain Lavigne - Data Administrator - ZAQ.iTv  - E-Mail: [EMAIL PROTECTED]
297 St-Paul, West - Montreal, Quebec, Canada  - H2Y 2A5
Phone: 514-282-7073 ext: 371
Fax: 514-282-8011




[SQL] SQL question

2001-02-06 Thread Alain Lavigne

I'm trying to extract references (relationships) between tables for the
purpose of reverse/forward engineering from a modeling tool called
PowerDesigner.

Here is the sql:

select u.usename,
   p.relname,
   v.usename,
   c.relname,
   t.tgconstrname,
   dumpref(t.tgargs, 4),  ** (I know this function does not exist in
postgresql)
   dumpref(t.tgargs, 5)
from   pg_trigger t,
   pg_proc f,
   pg_class p,
   pg_class c,
   pg_user u,
   pg_user v
where  1=1
and    f.proname='RI_FKey_check_ins'
and    t.tgfoid=f.oid
and    c.oid=t.tgrelid
and    p.oid=t.tgconstrrelid
and    u.usesysid=p.relowner
and    v.usesysid=c.relowner;

I always get the following message:

** << Error while executing the query; ERROR: Function 'dumpref(bytea,
int4)'
does not exist Unable to identify a function that satisfies the given
argument types You may need to add explicit typecasts >>

Since integrity constraints are done using triggers, it makes sense that I
can reverse engineer those FK constraints from the pg_trigger table.
Unfortunately I don't know how to extract the information from the "tgargs"
field.

CAN ANYONE HELP ??
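
No dumpref() ships with PostgreSQL; it appears to be a helper that PowerDesigner expects on the server. On later releases (7.3 and up, where encode() and split_part() both exist), one commonly cited workaround is to render tgargs as text and split it on the escaped NUL separators. A sketch, untested here, with split_part's 1-based positions standing in for dumpref's offsets:

select t.tgconstrname,
       split_part(encode(t.tgargs, 'escape'), '\\000', 5) as fk_column,
       split_part(encode(t.tgargs, 'escape'), '\\000', 6) as pk_column
from   pg_trigger t
where  t.tgisconstraint;

From 7.3 on, foreign keys are also recorded in pg_constraint, which is the cleaner source for this information.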





[SQL] Need help on a troublesome query plan

2002-07-16 Thread Alain Lavigne

On PostgreSQL Version 7.2.1 on Redhat Linux 7.1

Table "bld_tb_bus_fact" definition:

     Column      |           Type           | Modifiers 
-----------------+--------------------------+-----------
 bus_fact_id     | bigint                   | not null
 bus_fact_ts     | timestamp with time zone | not null
 party_id        | bigint                   | 
 svc_id          | bigint                   | not null
 bus_fact_data   | text                     | not null
 bus_fact_typ_cd | character(10)            | not null
 bus_fact_kywrd  | character varying(300)   | 
 cont_id         | bigint                   | 
 perfby_id       | bigint                   | 

Index "bld_x1_tb_bus_fact" (btree):

     Column      |     Type      
-----------------+---------------
 party_id        | bigint
 bus_fact_typ_cd | character(10)
 cont_id         | bigint

With the following query on 5 records:

explain
SELECT  bld_TB_BUS_FACT.BUS_FACT_ID AS id
FROM    bld_TB_BUS_FACT
WHERE   bld_TB_BUS_FACT.PARTY_ID=1320677
AND bld_TB_BUS_FACT.BUS_FACT_TYP_CD='MSG_SENT'
AND bld_TB_BUS_FACT.CONT_ID=786448
AND bld_TB_BUS_FACT.BUS_FACT_KYWRD ILIKE '%MT-ID=3407979%'
AND bld_TB_BUS_FACT.BUS_FACT_KYWRD ILIKE '%S-ID=1310723%'
limit 1;

psql:test.sql:9: NOTICE:  QUERY PLAN:

Limit  (cost=0.00..2264.16 rows=1 width=8)
  ->  Seq Scan on bld_tb_bus_fact  (cost=0.00..2264.16 rows=1 width=8)

EXPLAIN

I don't understand why it's not using the defined index, even after performing VACUUM 
FULL ANALYZE on the table.
I tried disabling seqscan but that didn't change anything.

I'm open to suggestions, anyone.

Thanks!

----
Alain Lavigne - Data Administrator - ZAQ Interactive Solutions  E-Mail: 
[EMAIL PROTECTED]
297 St-Paul, West - Montreal, Quebec, Canada  - H2Y 2A5
Phone: 514-282-7073 ext: 371 - Fax: 514-282-8011


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [SQL] Need help on a troublesome query plan

2002-07-17 Thread Alain Lavigne

Thanks, that worked, but why does that happen? Or maybe you could point me to the 
proper thread so I can read up on it.




Alain Lavigne - Data Administrator - ZAQ Interactive Solutions  E-Mail: 
[EMAIL PROTECTED]
297 St-Paul, West - Montreal, Quebec, Canada  - H2Y 2A5
Phone: 514-282-7073 ext: 371 - Fax: 514-282-8011


-Original Message-
From: Stephan Szabo [mailto:[EMAIL PROTECTED]]
Sent: July 16, 2002 14:52
To: Alain Lavigne
Cc: [EMAIL PROTECTED]
Subject: Re: [SQL] Need help on a troublesome query plan



On Tue, 16 Jul 2002, Alain Lavigne wrote:

> Index "bld_x1_tb_bus_fact"
>  Column  | Type
> -+---
>  party_id| bigint
>  bus_fact_typ_cd | character(10)
>  cont_id | bigint
> btree
>
> With the following query on 5 records:
>
> explain
> SELECT  bld_TB_BUS_FACT.BUS_FACT_ID AS id
> FROM    bld_TB_BUS_FACT
> WHERE   bld_TB_BUS_FACT.PARTY_ID=1320677
> AND bld_TB_BUS_FACT.BUS_FACT_TYP_CD='MSG_SENT'
> AND bld_TB_BUS_FACT.CONT_ID=786448
> AND bld_TB_BUS_FACT.BUS_FACT_KYWRD ILIKE '%MT-ID=3407979%'
> AND bld_TB_BUS_FACT.BUS_FACT_KYWRD ILIKE '%S-ID=1310723%'
> limit 1;

You'll need to either single quote or explicitly cast the
constants you're comparing to the bigint columns.
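
To make that concrete: before PostgreSQL 8.0, the planner would not use an index when a bigint column was compared to an unadorned integer literal (which parses as int4). A sketch of the rewritten query:

explain
SELECT bus_fact_id AS id
FROM   bld_tb_bus_fact
WHERE  party_id = 1320677::bigint      -- explicit cast
AND    bus_fact_typ_cd = 'MSG_SENT'
AND    cont_id = '786448'              -- quoted literal, coerced to bigint
AND    bus_fact_kywrd ILIKE '%MT-ID=3407979%'
AND    bus_fact_kywrd ILIKE '%S-ID=1310723%'
LIMIT 1;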


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



[SQL] Database structure

2004-05-04 Thread Alain Reymond
Hello,

I would like an advise on the following problem :

I have a table of patients. 
Each patient can undergo different biological assessments. 
Each assessment is always decomposed into different laboratory tests.
A laboratory test is made of a test number and two values coming from analysers.

The schema is :
Patients(#patient_nr,name,etc...)
Assessment(#assessment_nr, #patient_nr, assessment_type, other useful values).
Assessment_types(assessment_type, labtest_nr)
An assessment is composed of different tests; let's say assessment type 1 is 
composed of lab tests 1, 2, 3, 5 and assessment type 2 of lab test numbers 10 to 70.

I have an assessment with 60 different lab tests (always the same). I have two ways 
of storing the values:

1 - a table with 120 columns for the two values.
results(#assessment_nr, p10,d10, p11,d11, ..., p70,d70).
where 10 to 70 represents the lab test number.

2 - a table with 60 rows for one assessment :
results(#assessment_nr, labtest_nr, p, d) where p and d are my two results.

Here comes my question. Which of the two would you choose?

The first solution has the advantage of returning one single row for one complete 
assessment. If I have to make statistics, it is easy. But if I have to modify the 
composition of an assessment (which occurs very rarely), I shall have to use an ALTER 
TABLE instruction. And as I have 4 different assessment types, I have to create five 
different tables, one per assessment type.

The second solution is normalized and more elegant. But I am concerned about the 
size of the table. For one assessment, I'll store 60 rows with only two useful 
integers in it. And you must add the size of the index. With 25.000 assessments a year, that 
makes 1.500.000 rows with only 4 columns, among them 2 for the results and 2 
for identification. I would like to store 10 years online, so 15.000.000 rows. What 
about the size of the index?
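
For concreteness, a minimal sketch of option 2 (the column types are assumptions; the composite primary key doubles as the index used to fetch one complete assessment):

create table results (
    assessment_nr integer not null,  -- references Assessment
    labtest_nr    integer not null,
    p             real    not null,  -- assumed numeric result
    d             real    not null,
    primary key (assessment_nr, labtest_nr)
);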

Any advice? I thank you in advance.


Alain Reymond

(I hope that it is clear enough with my bad English).


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [SQL] Database structure

2004-05-04 Thread Alain Reymond
I thank you for your answer. 

The more I think about it, the more I find the second option better. Just one 
precision: all tests are always done, so I always have all columns filled with a result.

My only worry was about size and performance. I store only a few bytes with a lot of 
overhead (#assessment_nr, labtest_nr) for only one integer and one real per row. And I 
can have up to 1.500.000 rows per year with at least 10 years online... It means big 
indexes.

Regards.

Alain

> I would go for the second one. I think the size of the table is not a
> problem. You will just have to write the right indexes for easy joins.
> 
> OBS: "For one assessment, I'll store 60 rows with only two useful
> integers in it" ... why? Better make a "lab_test" table where you keep
> the lab tests, and write into results(#assessment_nr, labtest_nr,
> p, d) only the data that you have. For example, if you have
> assessment no. 3000 and you only have results for lab_tests 10->40,
> then why also write the lab_tests from 40->70 into the DB (if you don't
> have them)??? (If I didn't understand this clearly, sorry for the
> observation.)
> 
> 
> The second option is better if you ever change the lab_test
> list (you also have to think about this, if you are making the database
> for at least 10 years). Because with the first solution you will always
> have to add a new column... and that is not the "best" option. With the
> second you just add a new ID to the lab_test list and you are done. No
> problems.
> 
> If you go for the first option and you have to change something in the
> result table... it won't be easy.
> 
> The ALTER TABLE is not as tragic as it seems... use
> constraints... don't ever erase from the DB.
> 
> So... my final answer: the second option.
Alain Reymond
CEIA
Bd Saint-Michel 119
1040 Bruxelles
Tel: +32 2 736 04 58
Fax: +32 2 736 58 02
[EMAIL PROTECTED]
PGP key sur http://pgpkeys.mit.edu:11371



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[SQL] degradation in performance

2004-09-21 Thread Alain Reymond
Good afternoon,

I created a database with Postgres 7.3.4 under Linux RedHat 7.3 on a 
Dell PowerEdge server. 

One of the tables is 
resultats(numbil, numpara, mesure, deviation)
with an index on numbil.

Each select on numbil returns up to 60 rows (that means 60 rows for 
one numbil with 60 different numpara), for example
(20,1,500,3.5)
(20,2,852,4.2)
(20,12,325,2.8)
(21,1,750,1.5)
(21,2,325,-1.5)
(21,8,328,1.2)
etc..

This table now contains more than 6.500.000 rows and grows by 
6000 rows a day. I have approximately 1.250.000 rows a year. So I 
have 5 years of data online.
Now, an insertion of 6000 rows takes very long, up to one hour...
I tried to insert 100.000 yesterday evening and it was not done in 8 
hours.

Do you have any idea how I can improve speed, apart from splitting 
the table every 2 or 3 years, which rather defeats the aim of a database!

I thank you for your suggestions.

Regards.

Alain Reymond
CEIA
Bd Saint-Michel 119
1040 Bruxelles
Tel: +32 2 736 04 58
Fax: +32 2 736 58 02
[EMAIL PROTECTED]
PGP key sur http://pgpkeys.mit.edu:11371



---(end of broadcast)---
TIP 8: explain analyze is your friend


[SQL] How do I write a query...

2005-06-05 Thread Alain Reymond
Hello,

I have the following problem :

I have a table like

Id  Num  Date        AValue
1   10   01/01/2005  50
2   10   31/05/2005  60
3   25   02/02/2005  55
4   25   15/03/2005  43
5   25   28/05/2005  62
etc..

Id is unique, Num is an identification number with duplicates possible,
Date is a ... date, and AValue... a value!

If we have

Id   Num   Date   AValue
Id1  Num1  Date1  AValue1
Id2  Num1  Date2  AValue2

The table is ordered on Num+Date.
What I would like to calculate is (AValue2-AValue1) for a given Num
(here Num1).

In this case, I would have to calculate
60-50 for Num 10
and
43-55, 62-43 for Num 25.

Do you have any idea if it can be done simply with a query...
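
For reference, on releases with window functions (PostgreSQL 8.4 and later) this can be done in a single query. A sketch, assuming the table is named t:

select num, date, avalue,
       avalue - lag(avalue) over (partition by num order by date) as delta
from t
order by num, date;

lag() returns NULL on the first row of each num, so a difference appears only for consecutive pairs, exactly as above.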

I thank you

Regards.

Alain Reymond



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])