Re: [SQL] Check a value in array
hi, Marco Lazzeri wrote:

> On Thu, 2004-04-29 at 19:56, Rod Taylor wrote:
>> On Thu, 2004-04-29 at 13:37, Marco Lazzeri wrote:
>>> Hi all. I have to check if a value is in an array. I've got a date array
>>> in a table and I would like to perform queries like:
>>>
>>>   SELECT * FROM table WHERE date IN dates_array;
>>
>> If you're using 7.4 or later, try:
>>
>>   SELECT * FROM table WHERE date = ANY(dates_array);
>>
>> This will work without the contrib package.
>
> Thank you Rod, I forgot to specify I'm using PostgreSQL 7.3.5.

Or you can write a function and make it immutable :)

CREATE OR REPLACE FUNCTION in_array (numeric[], numeric) RETURNS boolean AS '
declare
    array_to    alias for $1;
    array_value alias for $2;
    i integer default 1;
begin
    while array_to[i] is not null loop
        if array_value = array_to[i] then
            return true;
        end if;
        i := i + 1;
    end loop;
    return false;
end;
' LANGUAGE 'plpgsql' IMMUTABLE RETURNS NULL ON NULL INPUT SECURITY INVOKER;

Change the numeric types to your specific type and use:

    WHERE in_array(dates_array, date) = true

C.
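For illustration, a small usage sketch (the table and values below are
hypothetical; as noted above, the argument types of in_array would be changed
from numeric to date for a date array):

    -- hypothetical table with a numeric[] column, matching the function as written
    CREATE TABLE measurements (id serial, vals numeric[]);
    INSERT INTO measurements (vals) VALUES ('{1.5, 2.0, 3.25}');

    -- 7.3-style lookup through the helper function
    SELECT * FROM measurements WHERE in_array(vals, 2.0) = true;

    -- on 7.4 or later the helper is unnecessary:
    SELECT * FROM measurements WHERE 2.0 = ANY (vals);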
Re: [SQL] Use arrays or not?
Hi, Thanks for your fast response. But I'm afraid I stated some things unclearly.

>>> The data consists of XML-records, with a lot of XML-fields. I want to store
>>> the XML as it is, so taking the information from the XML-records and then
>>> storing it in a different-from-XML format is not an option.
>
> Actually, your data consists of data. The XML is just scaffolding to
> enable you to locate and understand your data. Since you are storing it
> in a relational database, that will use relational scaffolding for its
> organisation instead. At least partially, you will *have* to parse the
> values out and organise them differently.

I do, and I know. But I want to store the XML-records as they are. So given an
author, title and an XML-record that are related to that author and title, how
to store that? That's the question. I have good reasons to store the
XML-records as they are, without further parsing them.

>>> Each XML-record describes data about one book. If an update of bookdata
>>> comes, the XML itself is not changed, but a new XML-record is stored with
>>> the updated data. Via a complex scheme of combining a base record and its
>>> updates, the final dataset is produced that is used in the application.
>>>
>>> Searching is done via a simple table lookup on three different fields:
>>> title, author and subject. The data for these fields is extracted from the
>>> database. Each book has a unique identifier (EAN13, derivative of ISBN).
>>>
>>> Here is one way to organize the database:
>>> table title:   TITLE   | EAN13, indexing on TITLE
>>> table author:  AUTHOR  | EAN13, indexing on AUTHOR
>>> table subject: SUBJECT | EAN13, indexing on SUBJECT
>>
>> This is a *very* strange way of setting up your database. Are you new to
>> Relational Databases and SQL? If so, I'd recommend starting with a book
>> on relational database design.
>
> I agree with Josh - think about a book.

Thank you for the recommendations. But the above is just background
information; it will not be stored as such. The important question for me is
whether to use arrays or not. With "indexing" in the above examples I do not
mean an actual Postgres index; I mean that those are the fields that are used
in searching. One never searches on an EAN13 number, only on author, title and
subject. And one never, by the way, searches for a specific XML-record; only
the total of the stored XML-records per book should be retrieved.

>> If only one author, title and subject are allowed per book, you should have:
>>
>> table books
>>   EAN13 | TITLE | AUTHOR | SUBJECT
>
> If, on the other hand, you can have multiple authors (likely) you'll want
> something like:
>
> CREATE TABLE author (
>     ean13       varchar(13),  -- Guessing ean13 format
>     author_num  int4,
>     author_name text,
>     PRIMARY KEY (ean13, author_num)
> );
>
> Then you can have rows like:
>   ('my-ean-number-here', 1, 'Aaron Aardvark')
>   ('my-ean-number-here', 2, 'Betty Bee')
> etc.

Yes, I have such a thing. There can be multiple titles, multiple authors and
multiple keywords per book.

>>> Finally:
>>> table record:
>>>   EAN13 | ARRAY OF XML-records
>>>
>>> It's the last table that I am most curious (and worried) about, the
>>> question being mainly what the optimal way of structuring that table is.
>>> Option 1 is the given option: adding/deleting an XML-record for the same
>>> book requires adding/deleting it to/from the array of XML-records.
>>>
>>> Option 2 would be something like this:
>>>   EAN13 | XML-record
>>> where, if a book has several records describing it, there are multiple
>>> entries of the EAN13 | XML-record pair. Adding an XML-record for the same
>>> book requires adding a new entry to the table as a whole.
>>
>> In my mind, there is no question that this is the best way to do things.
>> It is a normalized data structure, as opposed to the arrays, which are not.
>
> Although your option 2 doesn't go quite far enough. You'll also want to
> know what order these come in. So, assuming you can't have two updates
> at the same time:
>
> CREATE TABLE book_history (
>     ean13 varchar(13),  -- Guessing ean13 format
>     ts    timestamp with time zone,
>     xml   text,
>     PRIMARY KEY (ean13, ts)
> );

The order is not important; the interpretation of the XML-records is done by
an external module. The order is determined from the content of the
XML-records, because they can come from different sources and can be combined
in different ways, depending on the application processing the XML-records.
Order is not determined at the moment the records are stored, but at the
moment the records are interpreted.

> As for your other concerns:
>>> Indexing:
>>> Database growth:
>>> Integrity:
> Just worry about the integrity - if you keep the design simple,
> PostgreSQL will manage quite large growth on quite small hardware.

What would be a situation in which one should use arrays?
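A minimal sketch of option 2 as described above (the table name book_xml is
made up; the XML is kept verbatim in a text column and ordering is left to the
external module):

    -- hypothetical normalized layout: one row per stored XML-record
    CREATE TABLE book_xml (
        ean13 varchar(13),   -- guessing the ean13 format, as above
        xml   text
    );

    -- adding an update for a book is a plain INSERT
    INSERT INTO book_xml VALUES ('my-ean-number-here', '<book>...</book>');

    -- all stored XML-records for one book, handed to the external module as-is
    SELECT xml FROM book_xml WHERE ean13 = 'my-ean-number-here';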
[SQL] Database structure
Hello,

I would like advice on the following problem:

I have a table of patients. Each patient can have different biological
assessments. Each assessment is always decomposed into different laboratory
tests. A laboratory test is made of a test number and two values coming from
analysers.

The schema is:
  Patients(#patient_nr, name, etc...)
  Assessment(#assessment_nr, #patient_nr, assessment_type, other useful values)
  Assessment_types(assessment_type, labtest_nr)

An assessment is composed of different tests; let's say assessment type 1 is
composed of lab tests 1, 2, 3, 5 and assessment type 2 of lab tests number 10
to 70.

I have an assessment with 60 different lab tests (always the same). I have two
ways of storing the values:

1 - a table with 120 columns for the two values:
    results(#assessment_nr, p10, d10, p11, d11, ..., p70, d70)
    where 10 to 70 represent the lab test numbers.

2 - a table with 60 rows for one assessment:
    results(#assessment_nr, labtest_nr, p, d)
    where p and d are my two results.

Here comes my question: which of the two would you choose?

The first solution has the advantage of returning one single row for one
complete assessment. If I have to make statistics, it is easy. But if I have
to modify the composition of an assessment (which occurs very rarely), I shall
have to use an ALTER TABLE instruction. As I have 4 different assessment
types, I have to create five different tables, one per assessment type.

The second solution is normalized and more elegant. But I am preoccupied by
the size of the table. For one assessment, I'll store 60 rows with only two
useful integers in each. And you must add the size of the index. With 25,000
assessments a year, that makes 1,500,000 rows with only 4 columns, among them
2 only for the results and 2 for identification. I would like to store 10
years online, so 15,000,000 rows. What about the size of the index?

Any advice? I thank you in advance.

Alain Reymond

(I hope that it is clear enough with my bad English.)
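For clarity, a rough SQL rendering of the two layouts being compared (column
names and types here are assumptions, not from the original post):

    -- option 1: one wide row per assessment, two columns per lab test
    CREATE TABLE results_wide (
        assessment_nr integer PRIMARY KEY,
        p10 real, d10 real,
        p11 real, d11 real
        -- ... repeated up to p70, d70
    );

    -- option 2: one row per lab test, normalized
    CREATE TABLE results (
        assessment_nr integer,
        labtest_nr    integer,
        p             real,
        d             real,
        PRIMARY KEY (assessment_nr, labtest_nr)
    );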
[SQL] an indexing problem on postgreSQL
All,

I have a data-driven web application with Java and PostgreSQL. We have a few
tables which hold quite a large amount of data, so we put B-tree indexes on
them to make queries faster. In addition, we are also doing real-time data
synchronization on these tables with a Microsoft FoxPro application.

Normally the indexes work just fine without data synchronization. But with
data synchronization, after a period of time the indexes no longer take effect
on the table, and the application becomes extremely slow. Does anyone have an
idea why, and a solution to the problem? Your help is greatly appreciated.

Sincerely,
-John
[SQL] Bacula
Hi Kent,

Did you have a look at Bacula (www.bacula.org) or at Amanda? They store the
catalog of what they did in a relational database, and Bacula now also works
with PostgreSQL. It still needs to be optimized for speed, but it works well.
It is possible to back up Linux, Solaris, BSD and Windows clients to tape or
disk (which can be external; volume files can be copied to DVD±R or CD-RW as
well).

Jo
[SQL] invalid input syntax for integer: ""
Hello :-)

Got a little problem...

My table:

create table warengruppen ( kennung char (1),
                            holder int,
                            HauptWarenGruppe int,
                            BezHWG varchar (50),
                            WarenGruppe int,
                            BezWG varchar (50));

The copy syntax:

copy warengruppen FROM '/usr/pgsql/datanorm.wrg' with DELIMITER ';';

And a small part of my 'datanorm.wrg':

S;;011;Dachsteine+Formst.;;;
S;;011;;0111;Dachst.glatt(Biber);
S;;011;;0112;Dachst.glatt(Tegal.);
S;;011;;0114;Dachst.mulde(Donau);
S;;011;;0116;Dachst.symm.(Frankf);
S;;011;;0118;Dachst.asym.(Dop.-S);
S;;011;;0119;Dachst.Sonstige;
S;;012;Dachziegel+Formst.;;;
S;;012;;0121;Biberschwanzziegel;
S;;012;;0122;Hohlz.+Krempz.;
S;;012;;0123;Mönch + Nonne;
S;;012;;0124;Strangfalzziegel;
S;;012;;0125;Falzz.(Doppelmulde);
S;;012;;0126;Flachdachziegel;
S;;012;;0127;Verschiebeziegel;
S;;012;;0129;Ziegel Sonstige;
S;;013;Faserzementplatten;;;
S;;013;;0131;Dach+Fassadenplatten;
S;;013;;0133;Wellplatten;
S;;013;;0135;Tafeln;
S;;013;;0137;Elemente;
S;;014;Fassadenpl.speziell;;;
S;;014;;0141;Asphalt;

And the error message:

ERROR:  invalid input syntax for integer: ""
CONTEXT:  COPY warengruppen, line 1, column holder: ""

I know it's hard to fill the integer column "holder" with nothing out of a CSV
file... but it's the same with the column WarenGruppe, so I can't use a char
type... So maybe someone knows a little trick to save me?? I don't want to
reformat a 10MB data source...

Thanks,
Sebastian
[SQL] Rank
Dear All,

I have the table below:

 id | site_name | point
----+-----------+-------
  1 | Site A    |    40
  2 | Site B    |    90
  3 | Site D    |    22
  4 | Site X    |    98

I would like to calculate the rank for each site, so the result looks like:

 id | site_name | point | rank
----+-----------+-------+------
  1 | Site A    |    40 |    3
  2 | Site B    |    90 |    2
  3 | Site D    |    22 |    4
  4 | Site X    |    98 |    1

What can I do to get a result like that?
Re: [SQL] isnumeric() function?
hi, Yudie wrote:

> What is isnumeric function in postgresql?
> I'm using psql version 7.2.2
> thanks

Probably somebody wrote a function called isnumeric for you :) So it must be a
user-defined function.

C.
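There is no built-in isnumeric() in PostgreSQL, so the original function was
user-defined. A minimal sketch of what such a function could look like (this
is an assumption about its behaviour, using a regular-expression check; not
the original code):

    -- hypothetical reimplementation: true if the text looks like a plain number
    CREATE OR REPLACE FUNCTION isnumeric(text) RETURNS boolean AS '
        SELECT $1 ~ ''^[-+]?[0-9]+([.][0-9]+)?$''
    ' LANGUAGE 'sql';

    -- usage
    SELECT isnumeric('123.45');   -- true
    SELECT isnumeric('12a');      -- false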
[SQL] typecasting numeric(18,4) to varchar/text
Hi,

How can I typecast a numeric(18,4) value (that I'm extracting from a table)
into a varchar (which I'll be inserting into another table)?

That is:

create table a (a numeric(18,4));
create table b (b varchar(25));
insert into a values(12000.43);
insert into b select (a.a)::varchar;

The above gives the following error:

ERROR:  Cannot cast type 'numeric' to 'character varying'

Alternatively (this is my real requirement):

insert into b select 'ID'||((a.a)::varchar);

There have been no results on Google.

Thanks in advance,
Ashok
Re: [SQL] Rank
On Sun, 2 May 2004 02:22:37 +0800
"Muhyiddin A.M Hayat" <[EMAIL PROTECTED]> threw this fish to the penguins:

> I have the table below
>
>  id | site_name | point
> ----+-----------+-------
>   1 | Site A    |    40
>   2 | Site B    |    90
>   3 | Site D    |    22
>   4 | Site X    |    98
>
> Would like to calc that rank for each site, and look like
>
>  id | site_name | point | rank
> ----+-----------+-------+------
>   1 | Site A    |    40 |    3
>   2 | Site B    |    90 |    2
>   3 | Site D    |    22 |    4
>   4 | Site X    |    98 |    1

Well, a simple-minded solution would be:

select id, site_name, point,
       (select count(*) from mytable t2 where t2.point >= t1.point) as rank
from mytable t1;

 id | site_name | point | rank
----+-----------+-------+------
  4 | Site X    |    98 |    1
  2 | Site B    |    90 |    2
  1 | Site A    |    40 |    3
  3 | Site D    |    22 |    4
(4 rows)

If mytable is huge this may be prohibitively slow, but it's worth a try.
There's probably a self join that would be faster. Hmm... in fact:

select t1.id, t1.site_name, t1.point, count(t2.point) as rank
from mytable t1, mytable t2
where t2.point >= t1.point
group by t1.id, t1.site_name, t1.point;

 id | site_name | point | rank
----+-----------+-------+------
  3 | Site D    |    22 |    4
  2 | Site B    |    90 |    2
  4 | Site X    |    98 |    1
  1 | Site A    |    40 |    3
(4 rows)

-- George Young

"Are the gods not just?" "Oh no, child. What would become of us if they were?" (CSL)
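If the table grows large, it may also be worth checking whether an index on
point helps either form (this is an assumption, not part of George's reply;
EXPLAIN ANALYZE will show whether the planner actually uses it):

    CREATE INDEX mytable_point_idx ON mytable (point);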
Re: [SQL] start
You say that you're running SuSE 8. You can also start the process with YaST.
Start YaST2 -> System -> Runlevel Editor -> and then you can select the
runlevel at which you want Postgres to start. I start it at runlevel 3.

If you installed Postgres manually you have a little work to do; you will
probably have to make your own start script. In my situation (also a manual
installation) I just copied the script which came with SuSE and edited it for
my configuration... it worked for me.

Good luck!

----- Original Message -----
From: "H.J. Sanders" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, May 03, 2004 10:36 PM
Subject: [SQL] start

> Hello list.
>
> I have difficulties starting the postmaster automatically at boot time
> (everything I tried is done by 'root').
>
> Can someone give me an example for Linux (SuSE 8)?
>
> Many thanks.
>
> Henk Sanders
Re: [SQL] typecasting numeric(18,4) to varchar/text
On Tue, May 04, 2004 at 09:35:31AM -0400, [EMAIL PROTECTED] wrote:

> create table a (a numeric(18,4));
> create table b (b varchar(25));
> insert into a values(12000.43);
> insert into b select (a.a)::varchar;

Which version is that? Here's my session:

andrewtest=# create table a (a numeric(18,4));
CREATE TABLE
andrewtest=# create table b (b varchar(25));
CREATE TABLE
andrewtest=# insert into a values(12000.43);
INSERT 17168 1
andrewtest=# insert into b select (a.a)::varchar;
INSERT 17169 1

That's on 7.4.2. You might want to try casting to text first.

A

-- Andrew Sullivan | [EMAIL PROTECTED]
Re: [SQL] Database structure
I would go for the second one. I think the size of the table is not a problem.
You will just have to create the right indexes for easy joins.

One observation on "For one assessment, I'll store 60 rows with only two
useful integers in it"... why? Better to make a "lab_test" table where you
keep the lab tests, and write into results(#assessment_nr, labtest_nr, p, d)
only the data that you actually have. For example, if you have assessment
no. 3000 and you only have the results for lab tests 10->40, why also write
into the DB the lab tests from 40->70 (if you don't have them)? (If I didn't
understand this correctly, sorry for the observation.)

The second option is also better if you ever change the lab_test list (you
have to consider that too if the database is meant to last at least 10 years).
With the first solution you would always have to add a new column... and that
is not the "best" option. With the second you just add a new ID to the
lab_test list and you're finished. No problems.

If you go for the first option and you have to change something in the result
table... it won't be easy.

The ALTER TABLE is not as tragic as it seems... use constraints... don't ever
erase from the DB.

So... my final answer: the second option.

Best regards,
Andy.

----- Original Message -----
From: "Alain Reymond" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, April 30, 2004 6:06 PM
Subject: [SQL] Database structure

> [...]
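A sketch of the reference-table idea above (names and types are assumptions;
the results table is assumed to be the normalized one from option 2):

    -- list of known lab tests; adding a test later is an INSERT, not an ALTER TABLE
    CREATE TABLE lab_test (
        labtest_nr  integer PRIMARY KEY,
        description text
    );

    -- tie each result row to a known lab test
    ALTER TABLE results
        ADD CONSTRAINT results_labtest_fk
        FOREIGN KEY (labtest_nr) REFERENCES lab_test (labtest_nr);

    -- only tests that were actually performed get a row
    INSERT INTO results VALUES (3000, 10, 1.2, 0.4);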
Re: [SQL] invalid input syntax for integer: ""
On Sat, 1 May 2004, Sebastian Tewes wrote:

> Hello :-)
>
> Got a little problem...
>
> My table:
>
> create table warengruppen ( kennung char (1),
>                             holder int,
>                             HauptWarenGruppe int,
>                             BezHWG varchar (50),
>                             WarenGruppe int,
>                             BezWG varchar (50));
>
> The copy syntax:
>
> copy warengruppen FROM '/usr/pgsql/datanorm.wrg' with DELIMITER ';';
>
> And a small part of my 'datanorm.wrg':
>
> S;;011;Dachsteine+Formst.;;;
> S;;011;;0111;Dachst.glatt(Biber);
> S;;011;;0112;Dachst.glatt(Tegal.);

The problem here is that "" is not an integer. 0 is. If you write a simple
script to process the input file and turn the empty fields into 0s where they
should be integers, you're set.

OR, you can build a temp table with all text columns, import into that, then
use a select query to change the blank text columns to the character 0, and
then import that into the target table.
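A rough sketch of the temp-table approach described above (column handling is
an assumption, and it presumes the file's field count matches the table):

    -- staging table: everything as text, so COPY accepts empty fields
    CREATE TEMP TABLE warengruppen_raw (
        kennung          text,
        holder           text,
        HauptWarenGruppe text,
        BezHWG           text,
        WarenGruppe      text,
        BezWG            text
    );

    COPY warengruppen_raw FROM '/usr/pgsql/datanorm.wrg' WITH DELIMITER ';';

    -- empty strings become 0 (or NULL, if preferred) before the cast to int
    INSERT INTO warengruppen
    SELECT kennung,
           CASE WHEN holder = '' THEN 0 ELSE holder::int END,
           CASE WHEN HauptWarenGruppe = '' THEN 0 ELSE HauptWarenGruppe::int END,
           BezHWG,
           CASE WHEN WarenGruppe = '' THEN 0 ELSE WarenGruppe::int END,
           BezWG
    FROM warengruppen_raw;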
Re: [SQL] typecasting numeric(18,4) to varchar/text
On 04.05.2004 16:37, Andrew Sullivan wrote:

> On Tue, May 04, 2004 at 09:35:31AM -0400, [EMAIL PROTECTED] wrote:
>
>> create table a (a numeric(18,4));
>> create table b (b varchar(25));
>> insert into a values(12000.43);
>> insert into b select (a.a)::varchar;
>
> Which version is that? Here's my session:
>
> andrewtest=# create table a (a numeric(18,4));
> CREATE TABLE
> andrewtest=# create table b (b varchar(25));
> CREATE TABLE
> andrewtest=# insert into a values(12000.43);
> INSERT 17168 1
> andrewtest=# insert into b select (a.a)::varchar;
> INSERT 17169 1
>
> That's on 7.4.2. You might want to try casting to text first.

Yes, it also works for me with version 7.4.2. I found an older Postgres DB
with version 7.2 and I get the same errors as Ashok:

demo=# select a::varchar from a;
ERROR:  Cannot cast type 'numeric' to 'character varying'
demo=# select a::text from a;
ERROR:  Cannot cast type 'numeric' to 'text'
demo=# select '\'' || a || '\'' from a;
ERROR:  Unable to identify an operator '||' for types 'unknown' and 'numeric'
        You will have to retype this query using an explicit cast

Greetings,
Martin

-- Martin Knipper
   www : http://www.mk-os.de
   Mail: [EMAIL PROTECTED]
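On 7.2, where that cast is missing, to_char() can serve as a workaround (a
sketch, not from the thread; the format mask is an assumption matching
numeric(18,4)):

    -- numeric -> text on versions without the numeric::varchar cast
    insert into b select 'ID' || trim(to_char(a.a, '99999999999999.9999')) from a;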
Re: [SQL] Rank
On 04.05.2004 16:11, george young wrote:

> On Sun, 2 May 2004 02:22:37 +0800
> "Muhyiddin A.M Hayat" <[EMAIL PROTECTED]> threw this fish to the penguins:
>
>> I have the table below
>>
>>  id | site_name | point
>> ----+-----------+-------
>>   1 | Site A    |    40
>>   2 | Site B    |    90
>>   3 | Site D    |    22
>>   4 | Site X    |    98
>>
>> Would like to calc that rank for each site, and look like
>>
>>  id | site_name | point | rank
>> ----+-----------+-------+------
>>   1 | Site A    |    40 |    3
>>   2 | Site B    |    90 |    2
>>   3 | Site D    |    22 |    4
>>   4 | Site X    |    98 |    1
>
> [...]

Another possibility is to use a sequence:

demo=# create temporary sequence ranking;
demo=# select *, nextval('ranking') as rank from yourTable order by site_name asc;

Greetings,
Martin

-- Martin Knipper
   www : http://www.mk-os.de
   Mail: [EMAIL PROTECTED]
Re: [SQL] Database structure
I thank you for your answer. The more I think about it, the more I find the
second option better. Just one precision: all tests are always done, so I
always have all columns filled with a result.

My only concern was about size and performance. I store only a few bytes with
a lot of overhead (#assessment_nr, labtest_nr) for only one integer and one
real per row. And I can have up to 1,500,000 rows per year with at least 10
years online... It means big indexes.

Regards.

Alain

> I would go for the second one. I think the size of the table is not a
> problem. You will just have to create the right indexes for easy joins.
> [...]
> So... my final answer: the second option.

Alain Reymond
CEIA
Bd Saint-Michel 119
1040 Bruxelles
Tel: +32 2 736 04 58
Fax: +32 2 736 58 02
[EMAIL PROTECTED]
PGP key at http://pgpkeys.mit.edu:11371
Re: [SQL] Database structure
I would definitely say solution two.

As you point out yourself, there are only four int4s (probably even int2s);
that is 8 bytes each for the int4s (if I remember correctly), which equals
something in the 40-50 byte range per row without the index. For 15M rows,
that's not much more than 750 megabytes without the indexes, which I believe
take up roughly the same amount of space. That might be around 1.5 GB of data,
which I still consider a reasonably sized database. What I work on daily is in
the 1 GB range already and grows 25-30 megabytes/day, and we use large (3000
bytes or longer), complex (more than 10 subselects, utilizing inner joins,
outer joins and cross joins) queries which return around 3000 rows each. This
runs in a matter of 2 minutes on a single-CPU 2 GHz system with an IDE disk
(no RAID) and only half a gig of memory.

A good starting point for help would be the data sizes of each table (in my
opinion the number of digits is usually close enough), the complexity of the
queries (how many tables, subselects, types of joins, uses of aggregates and
so on), and finally what is considered "fast enough" - for a website 2 seconds
may be fast enough; for a cron job run once a month, the same might be true
for 3 days.

In the long run, being correct is usually better than being fast (at the point
of the implementation), as new hardware easily solves bottlenecks for problems
not scaling exponentially.

Svenne

Alain Reymond wrote:
> Hello,
> I would like advice on the following problem:
> [...]
> (I hope that it is clear enough with my bad English.)
Re: [SQL] Database structure
On Tue, May 04, 2004 at 18:10:13 +0200, Svenne Krap <[EMAIL PROTECTED]> wrote:
>
> In the long run, being correct is usually better than being fast (at the
> point of the implementation), as new hardware easily solves bottlenecks
> for problems not scaling exponentially.

And it isn't even clear that denormalizing the schema will result in an
increase in speed.

If at some point the tests in various assessments can overlap, you may not
want a table for each assessment type.

I also noticed that the schema isn't enforcing consistency between the tests
done and the assessment type. This may not really be a business rule so much
as something that might be flagged by the application for attention, as I can
see cases where in reality the wrong test is done, and recording its results
might be better than throwing the data away.
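For what it's worth, one way such a consistency rule could be enforced
declaratively, if desired (a sketch under assumptions about the tables from
the earlier posts; as noted above, flagging mismatches in the application
instead may be preferable):

    -- assumption: assessment_types(assessment_type, labtest_nr) can be made unique
    ALTER TABLE assessment_types
        ADD CONSTRAINT assessment_types_uniq UNIQUE (assessment_type, labtest_nr);

    -- carrying the assessment_type on each result row lets a composite foreign
    -- key reject lab tests that do not belong to that assessment type
    ALTER TABLE results ADD COLUMN assessment_type integer;
    ALTER TABLE results
        ADD CONSTRAINT results_type_labtest_fk
        FOREIGN KEY (assessment_type, labtest_nr)
        REFERENCES assessment_types (assessment_type, labtest_nr);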
Re: [SQL] Rank
Martin Knipper wrote:

> On 04.05.2004 16:11, george young wrote:
>> [...]
>
> Another possibility is to use a sequence:
>
> demo=# create temporary sequence ranking;
> demo=# select *, nextval('ranking') as rank from yourTable order by site_name asc;

Wouldn't it have to be:

select *, nextval('ranking') as rank
  from yourTable
 order by point desc;

for the ranking to work?
Re: [SQL] Rank
On 04.05.2004 18:47, Rob wrote:

> Martin Knipper wrote:
>> On 04.05.2004 16:11, george young wrote:
>>> [...]
>>
>> Another possibility is to use a sequence:
>>
>> demo=# create temporary sequence ranking;
>> demo=# select *, nextval('ranking') as rank from yourTable order by site_name asc;
>
> Wouldn't it have to be:
>
> select *, nextval('ranking') as rank
>   from yourTable
>  order by point desc;
>
> for the ranking to work?

Yes, you are right. I guess I was a little confused by the ordering in the
example table Muhyiddin provided.

Greetings,
Martin

-- Martin Knipper
   www : http://www.mk-os.de
   Mail: [EMAIL PROTECTED]
Re: [SQL] returning multiple values and ref cursors
Dennis writes:

What are your thoughts on the best way to approach this? Break up the stored
procedure into ten different stored procedures that return ref cursors? I
guess another question here is whether I can return a row type or record type
with (integer, varchar, refcursor) and then make use of the refcursor from
JDBC. I'm not the one doing the Java development on this project, and it's
been a while since I've used JDBC.

Tom, Bruce, I still owe you guys a test program that reproduces the error
where it looks like the temp schema for the connection is getting destroyed
before the connection is closed.

Dennis
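One way such a composite return could look on 7.3/7.4-era PostgreSQL (a sketch
under assumptions, not Dennis's actual procedure; all names below are
hypothetical, and from JDBC the refcursor column comes back as the cursor's
name, to be fetched within the same transaction):

    -- composite type bundling a status code, a message and a cursor
    CREATE TYPE proc_result AS (code integer, message varchar, rows refcursor);

    CREATE OR REPLACE FUNCTION get_report() RETURNS proc_result AS '
    DECLARE
        c refcursor;
        r proc_result;
    BEGIN
        c := ''report_cursor'';              -- give the portal a predictable name
        OPEN c FOR SELECT * FROM some_table; -- some_table is hypothetical
        r.code := 0;
        r.message := ''ok'';
        r.rows := c;
        RETURN r;
    END;
    ' LANGUAGE 'plpgsql';

    -- usage: the cursor is only valid inside the surrounding transaction
    BEGIN;
    SELECT * FROM get_report();
    FETCH ALL FROM report_cursor;
    COMMIT;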