Re: [SQL] index problem
I forgot: select version();

 version
---------------------------------------------------------------------
 PostgreSQL 7.1.3 on i386-unknown-freebsd4.3, compiled by GCC 2.95.3

It seems that there are index-usage problems in 7.1.3? (Checking: the same
problem is in a comp.databases.postgresql.bugs msg from Orion.)

thx
CoL

Szabo Zoltan wrote:

> Hi,
>
> I have this:
>
> 1)
> db=> explain select pxygy_pid from prog_dgy_xy where pxygy_pid=12121;
> NOTICE: QUERY PLAN:
>
> Group (cost=0.00..29970.34 rows=921 width=4)
>   -> Index Scan using progdgyxy_idx2 on prog_dgy_xy
>      (cost=0.00..29947.32 rows=9210 width=4)
>
> then:
> 2)
> db=> explain select pxygy_pid from prog_dgy_xy where pxygy_pid>12121;
> NOTICE: QUERY PLAN:
>
> Group (cost=66927.88..67695.39 rows=30700 width=4)
>   -> Sort (cost=66927.88..66927.88 rows=307004 width=4)
>     -> Seq Scan on prog_dgy_xy (cost=0.00..32447.66 rows=307004 width=4)
>
> I am making some benchmarks: Oracle vs. Postgres vs. MySQL. And this is
> breaking me now ;) MySQL and Oracle, with the same table and index, use
> the index on pxygy_pid.
> I had vacuumed before.
>
> Time with MySQL:
>
> bash-2.04$ time echo "select count(*) from PROG_DGY_XY where
> pxygy_pid>12121;" | mysql -uuser -ppasswd db
> count(*)
> 484984
>
> real 0m13.761s
> user 0m0.008s
> sys  0m0.019s
>
> Time with Postgres:
>
> bash-2.04$ time echo "select count(*) from PROG_DGY_XY where
> pxygy_pid>12121" | psql -Uuser db
>  count
> --------
>  484984
> (1 row)
>
> real 0m22.480s
> user 0m0.011s
> sys  0m0.021s
>
> And this is just a small part of other selects joining tables; because
> this index is not used, selecting from 2 tables (which have indexes and
> keys on the joining columns) takes extreme time for Postgres: 2m14.978s,
> while for MySQL it takes 0m0.578s !!!
>
> This select is:
>
> select distinct PROG_ID,PROG_FTYPE,PROG_FCASTHOUR,PROG_DATE
> from PROG_DATA, PROG_DGY_XY
> where prog_id=pxygy_pid
> order by prog_date,prog_ftype,prog_fcasthour
>
> Indexes:
>
> PROG_DATA:
> create index prod_data_idx1 on prog_data
> (prog_date,prog_ftype,prog_fcasthour);
> prog_id is primary key
>
> PROG_DGY_XY:
> create unique index progdgyxy_idx1 on PROG_DGY_XY
> (PXYGY_PID,PXYGY_X,PXYGY_Y);
> create index progdgyxy_idx2 on PROG_DGY_XY (PXYGY_PID);
>
> Thx
> CoL
> --
> [ Szabo Zoltan ] [ software developer ] [ econet.hu Informatikai Rt. ]
> [ 1117 Budapest, Hauszmann A. u. 3. ] [ tel.: 371 2100 fax: 371 2101 ]

---(end of broadcast)---
TIP 6: Have you searched our list archives?
http://archives.postgresql.org
Re: [SQL] index problem
Hi,

I did not do vacuum analyze ;), only a plain vacuum until now:

vacuumdb -Uuser -ddb -v -tprog_dgy_xy
NOTICE: --Relation prog_dgy_xy--
NOTICE: Pages 20935: Changed 0, reaped 0, Empty 0, New 0; Tup 921013:
Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 178, MaxLen 184;
Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0.
CPU 2.71s/0.32u sec.
NOTICE: Index progdgyxy_idx1: Pages 6679; Tuples 921013. CPU 1.41s/1.40u sec.
NOTICE: Index progdgyxy_idx2: Pages 2019; Tuples 921013. CPU 0.28s/1.28u sec.

I ran it with -z too. So this table has 921013 rows. The query plan shows
the same as below. The version is 7.1.3.

One more interesting thing: the inserting of these rows.

Postgres:
bash-2.04$ time psql -q -Uuser -f prog_dgy_xy.dump db
real 131m50.006s
user 3m21.838s
sys  1m20.963s

Mysql:
bash-2.04$ time cat prog_dgy_xy.dump | mysql -uuser -ppass db
real 24m50.137s
user 2m6.629s
sys  1m37.757s

The dump file was: insert into table (...) values (...); I tried with
copy, and with adding begin; inserts; commit;, but the result was the same
time :( [For Oracle 8.1.6 sqlloader it takes 450 sec ;)]

The two-table query, where prog_data has ~8800 rows and an index on
prog_id:

bash-2.04$ time echo "explain select distinct
prog_id,prog_ftype,prog_fcasthour,prog_date from prog_dgy_xy,prog_data
where pxygy_pid=prog_id" | psql -Uuser db
NOTICE: QUERY PLAN:

Unique (cost=7432549.69..7680455.07 rows=2479054 width=32)
  -> Sort (cost=7432549.69..7432549.69 rows=24790538 width=32)
    -> Merge Join (cost=148864.65..161189.33 rows=24790538 width=32)
      -> Index Scan using prog_data_pkey on prog_data
         (cost=0.00..701.12 rows=8872 width=28)
      -> Sort (cost=148864.65..148864.65 rows=921013 width=4)
        -> Seq Scan on prog_dgy_xy
           (cost=0.00..30145.13 rows=921013 width=4)

Time: !!! real 2m3.620s

The same query with MySQL (I did explain in MySQL, and it says it uses the
indexes): real 0m1.998s !!!

I am just asking why? And why is the index used only for the "=" relation?
(Same table, same index, vacuumed; I ran the test more than twice.) It
seems to be a 7.1.3 bug? I have not tested with 7.1.2 yet, but tomorrow I
will.

CoL

Stephan Szabo wrote:

> I assume you mean you did a vacuum analyze (a plain vacuum isn't
> sufficient). If you did just do a regular vacuum, do a vacuum analyze
> to get the updated statistics.
>
> How many rows actually match pxygy_pid>12121? Is 307000 rows a
> reasonable estimate? How many rows are in the table?
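As Stephan's reply notes, the planner's row estimates come from statistics that only VACUUM ANALYZE gathers; a plain VACUUM reclaims space but does not update them. A minimal sketch, using the table name from this thread:

```sql
-- Gather planner statistics (plain VACUUM alone is not sufficient).
VACUUM ANALYZE prog_dgy_xy;

-- With fresh statistics, re-check whether the planner still prefers
-- a sequential scan for the range condition.
EXPLAIN SELECT pxygy_pid FROM prog_dgy_xy WHERE pxygy_pid > 12121;
```

Note that if the range condition really does match most of the table, a sequential scan may legitimately remain the cheaper plan.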
Re: [SQL] nvl() function
On Tue, 16 Oct 2001, Steven Dahlin wrote:

> I am trying to find the equivalent in Postgresql to the Oracle sql
> function nvl(). With nvl() you give two parameters. The first may be a
> field/column or variable. If the value is not null then it is returned
> by the function.
>
> select clientid,
>        clientdesc
> from   clients
> where  ( clientdesc = nvl( :ClientParameter, clientdesc ) )
>
> I have looked thru all the documentation I could find but nowhere were
> any built-in SQL functions delineated. Does anyone know where the
> documentation can be found?

COALESCE is the SQL-standard name for this. You'll find details in the
documentation, in the Conditional Expressions section (4.10). Copy at:

http://candle.pha.pa.us/main/writings/pgsql/sgml/functions-conditional.html

--
Joel BURTON | [EMAIL PROTECTED] | joelburton.com | aim: wjoelburton
Independent Knowledge Management Consultant
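As a sketch of the substitution, the query from the original question carries over directly with COALESCE in place of nvl(); COALESCE returns its first non-null argument:

```sql
-- Oracle:   clientdesc = nvl( :ClientParameter, clientdesc )
-- Postgres: COALESCE returns its first non-null argument, so a null
--           parameter degenerates to clientdesc = clientdesc (all rows).
SELECT clientid,
       clientdesc
FROM   clients
WHERE  clientdesc = COALESCE(:ClientParameter, clientdesc);
```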
Re: [SQL] Indexes on functions
I'm going to write a function in C that parses XML. Is there any way to
index the output of this function?

I've got 10,000,000 rows that contain XML data and I need to efficiently
find the ones that contain the proper keys. I tried pulling the values I
want out of the data and putting them in their own table, but with an
average of 20 items per row that just grows to an unmanageable size.

-Michael

_________________________________________________________________
http://fastmail.ca/ - Fast Free Web Email for Canadians
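PostgreSQL does support indexes on function results (functional indexes), provided the function is deterministic; in the 7.x series that property is declared with `iscachable`. A hypothetical sketch — the function name `xml_key`, its shared-library path, and the table/column names are illustrative, not taken from the original message:

```sql
-- The function must always return the same output for the same input,
-- or its result cannot be meaningfully indexed.
CREATE FUNCTION xml_key(text) RETURNS text
    AS '/usr/local/lib/xmlfuncs.so' LANGUAGE 'C' WITH (iscachable);

-- Index the extracted key rather than the raw XML document.
CREATE INDEX xmldata_key_idx ON xmldata (xml_key(doc));

-- A query of this exact form can then use the index:
SELECT * FROM xmldata WHERE xml_key(doc) = 'wanted-key';
```

This avoids maintaining a separate key table, since the index itself stores only the extracted key per row.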
Re: [SQL] referential column constraints and inheritance
Stephan Szabo and Josh Berkus,

Thanks for your input. The question at hand seemed to provide an
appealing argument for adding inheritance support to referential
constraints, but if such support is as rare as it seems to be, that could
produce problems if I want to be able to manage the data with tools on
other platforms. I guess I will have to give the potential alternatives
more consideration.

Thanks again,
Stuart
Re: [SQL] Performance problems - Indexes and VACUUM
Who is this?

- Original Message -
From: Josh Berkus <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, October 17, 2001 8:59 AM
Subject: [SQL] Performance problems - Indexes and VACUUM

> Tom, Folks:
>
> I am having a rather interesting time getting performance out of my
> database. I'd really appreciate some feedback from the list on this.
>
> As you may recall, I've gotten around Postgres' lack of rowset-returning
> stored procedures by constructing "pointer tables" which simply hold
> lists of primary keys related to the user's current search. This is an
> excellent approach for a browser-based application, and I have since
> used this idea on other databases, even one that supports stored
> procedures.
>
> However, this means that I clear all of these pointer tables on a
> periodic basis (how frequently depends on usage). Just clearing the
> records didn't work, because of the Postgres "padded index" problem
> where eventually the indexes on these tables become full of deleted
> rows. Which gives me problem 1:
>
> 1. INDEXES: I discovered, the hard way, a peculiar problem. If you drop
> and re-create a table within the same transaction (in a function, for
> example) the indexes do not get dropped completely. Doing this to
> several tables, I had the disturbing experience of seeing incorrect rows
> in response to some queries. Specifically dropping each of the indexes,
> dropping the tables, re-creating the tables, and re-creating the indexes
> seems to work. However, this seems to me to indicate a potential
> problem with DDL commands within transactions.
>
> The second problem is giving me severe grief right now:
>
> 2. I have a very complex view designed for browsing client information.
> This view involves 2 other views, and two custom aggregates which are
> based on sub-queries (could only do it in Postgres!). The query plan is
> as long as this e-mail, but thanks to optimization and good indexing it
> runs in about 2 seconds right after a VACUUM.
>
> Unfortunately, 6 hours after a VACUUM, the query bogs down. The query
> plan does not seem to have changed much, but somehow what took 50% of
> the processor for 2 seconds at 8:30 AM flattens the processor for a
> full 45 seconds at 3:30 PM.
>
> Once VACUUM can be run in the background, I suppose that this can be
> dealt with, but until then does anyone have any suggestions?
>
> -Josh Berkus
>
> __AGLIO DATABASE SOLUTIONS___
> Josh Berkus
> Complete information technology    [EMAIL PROTECTED]
> and data management solutions      (415) 565-7293
> for law firms, small businesses    fax 621-2533
> and non-profit organizations.      San Francisco
[SQL] nvl() function
I am trying to find the equivalent in Postgresql to the Oracle sql
function nvl(). With nvl() you give two parameters. The first may be a
field/column or variable. If the value is not null then it is returned by
the function.

For example, with the query below, if the :ClientParameter is passed then
only those rows which have a clientdesc matching the parameter are
returned. If the :ClientParameter is null then those rows which have
clientdesc = clientdesc are returned (all rows):

select clientid,
       clientdesc
from   clients
where  ( clientdesc = nvl( :ClientParameter, clientdesc ) )

I have looked through all the documentation I could find, but nowhere
were any built-in SQL functions delineated. Does anyone know where the
documentation can be found?

Thanks,
Steve
[SQL] Multiple Parameters to an Aggregate Function
I have a sub in a Perl script that loops, executing the following
statement:

SELECT url,name FROM links

and formats each row like so:

<a href="http://www.domain.com/$url" class="body_link">$name</a>

A variable ($link_list) in the loop holds the previously concatenated
results plus the new one.

I would rather do this with FUNCTIONS (and/or AGGREGATES). So, I CREATEd
a FUNCTION:

CREATE FUNCTION format_link(text,text)
RETURNS text AS '
  return qq[<a href="http://www.domain.com/$_[0]" class="body_link">$_[1]</a>];
' LANGUAGE 'plperl';

So in my Perl script, the SQL statement becomes:

SELECT format_link(url,name) FROM links

However, I still have to loop with Perl -- it would be nice to use an
AGGREGATE to do some of this for me.

I can create an AGGREGATE, but from what I can tell, the format of the
sfunc can only have two parameters like so:

sfunc_name(type,type)

where the first parameter is what was passed before, and the second
parameter is the 'new' info to do stuff to. Is it not possible to do
something similar to:

sfunc_name(type,type,type)

so that I can pass the url and name to the AGGREGATE (so it can in turn
pass them to the sfunc)? The sfunc could then be something like:

CREATE FUNCTION link_agg (text,text,text)
RETURNS text AS '
  return $_[0] . qq[<a href="http://www.domain.com/$_[1]" class="body_link">$_[2]</a>];
' LANGUAGE 'plperl';

Because then I gain the benefit of a stored procedure and cut the SQL in
the script down to:

SELECT link_agg(url,name) FROM links;

which would return the entire list at once, instead of needing the script
to loop through multiple fetches.

... Of course, I may be going about this in an entirely incorrect manner.
Telling me so, with a bit of direction, would also be greatly
appreciated.

BTW: I tried searching the archives, but there is a database error
("PQconnectPoll() -- connect() failed: Connection refused Is the
postmaster running (with -i) at 'db.hub.org' and accepting connections on
TCP/IP port 5439?"), just so someone knows.

Thanks In Advance,
Anthony Bouvier
Re: [SQL] Performance problems - Indexes and VACUUM
"Josh Berkus" <[EMAIL PROTECTED]> writes:
> 1. INDEXES: I discovered, the hard way, a peculiar problem. If you drop
> and re-create a table within the same transaction (in a function, for
> example) the indexes do not get dropped completely.

Kinda hard to believe; even if the old indexes were still around, they
wouldn't be considered to apply to the new table. I think the problem is
something else. Can you provide a reproducible example of what you're
seeing?

> runs in about 2 seconds right after a VACUUM.
> Unfortunately, 6 hours after a VACUUM, the query bogs down.

What has been changing in the meantime?

regards, tom lane
Re: [SQL] Why would this slow the query down so much?
On Tue, 16 Oct 2001 17:58:32 +0100, Stuart Grimshaw wrote:
> On Monday 15 October 2001 16:12 pm, Tom Lane wrote:
> > Stuart Grimshaw <[EMAIL PROTECTED]> writes:
> > > SELECT a.category, b.headline, b.added, c.friendlyname
> > > FROM caturljoin as a
> > > INNER JOIN stories as b ON (a.url = b.source)
> > > INNER JOIN urllist as c ON (a.url = d.urn)
> > > WHERE a.category = 93 ORDER BY b.added DESC LIMIT 1;
> >
> > (I assume "d.urn" is a typo for "c.urn"...)
> >
> > The query plan you show looks pretty reasonable if the planner's row
> > count estimates are in the right ballpark. How many caturljoin rows
> > have category = 93? How many stories rows will match each caturljoin
> > row? How many urllist rows ditto?
>
> There are 194 rows in caturljoin where url = 93, 29806 rows in stories
> will match those 194 rows and only 1 row in urllist will match.

If it's convenient, would you try deleting some indices of the "stories"
table? The total number of sorts in the QUERY PLAN might decrease.
However, this trial may be a vain effort; I can't predict the result of
the QUERY PLAN. :-)

The indices: "stories_source", "stories_unique_story", and
"stories_urn_key"

Regards,
Masaru Sugawara
Re: [SQL] Triggers do not fire
Reiner Dassing <[EMAIL PROTECTED]> writes:
> I have a table which has a lot of entries (some x millions) of the kind
> (id, timestamp, value)
> The access (selects) is concentrated to the timely last some thousands
> entries.
> To adapt this fact I want to setup a "virtual" table - test in my
> example - which is accessed by the clients but in reality the entries
> are separated to different small tables. These table are dynamically
> created to hold the values distinguished by years.

Why bother? Seems like you are just making life complicated for
yourself. One big table with a suitable index ought to work fine.

regards, tom lane
Re: [SQL] Triggers do not fire
Hello Tom!

Thank you for your help! Your hints solved the problem as asked: the
update trigger is fired if there is a row in the table test to be
updated.

But behind my question there was another design (philosophy) which I am
trying to implement by means of different triggers. This is off topic,
but maybe you or somebody on the list can give a hint how to solve the
following task:

I have a table which has a lot of entries (some x millions) of the kind
(id, timestamp, value). The access (selects) is concentrated on the most
recent few thousand entries. To adapt to this fact I want to set up a
"virtual" table - test in my example - which is accessed by the clients,
but in reality the entries are separated into different small tables.
These tables are dynamically created to hold the values distinguished by
years.

By the use of triggers I can redirect inserts on the table test to the
small tables, called test_x_y with x = id, y = year. To update an entry,
though, the value must exist in the main table; therefore this approach
does not work.

Rules do not work either, as I must dynamically build table names, and
the execution of plpgsql functions is not possible in rules. Correct?

Another possible approach would be to make selects which give back
results to be used as table names. I.e., it would be necessary to have
something like

    create table f(NEW.val) as select ...

where f(...) gives back the name of a table.

Tom Lane wrote:
>
> Reiner Dassing <[EMAIL PROTECTED]> writes:
> > I have written a very small test procedure to show a possible error
> > on PostgreSQL V7.1.1.
>
> The error is yours: you set up the trigger function to return NULL,
> which means it's telling the system not to allow the INSERT or UPDATE.
>
> > INSERT INTO test VALUES(1,'2000-10-11 12:00:00',-20.2);
> > NOTICE: Fired INSERT
> > INSERT 0 0
>
> Note the summary line saying that zero rows were inserted.
>
> > UPDATE test SET value = 1000.0 WHERE epoch = '2000-10-11 12:10:00'
> > AND sensor_id = 1;
> > UPDATE 0
>
> Here, zero rows were updated, so of course there was nothing to fire
> the trigger on.
>
> regards, tom lane

--
Mit freundlichen Gruessen / With best regards
Reiner Dassing
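One possible way around the dynamic-table-name limitation is PL/pgSQL's EXECUTE, which runs a query string assembled at run time. A rough sketch using the column names from this thread; it assumes `quote_literal()` is available in your version (otherwise the timestamp must be quoted by hand), and the per-year tables are assumed to already exist:

```sql
CREATE FUNCTION test_redirect() RETURNS opaque AS '
DECLARE
    tbl text;
BEGIN
    -- Build the per-sensor, per-year table name at run time.
    tbl := ''test_'' || NEW.sensor_id || ''_''
                     || date_part(''year'', NEW.epoch);

    -- Insert into the small table instead of the virtual one.
    EXECUTE ''INSERT INTO '' || tbl || '' VALUES (''
         || NEW.sensor_id || '', ''
         || quote_literal(NEW.epoch) || '', ''
         || NEW.value || '')'';

    RETURN NULL;  -- suppress the insert into the virtual table itself
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER test_redirect_trig BEFORE INSERT ON test
    FOR EACH ROW EXECUTE PROCEDURE test_redirect();
```

This only handles INSERT redirection; as noted above, UPDATEs against the virtual table would still find no row to fire on.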
Re: [SQL] Performance problems - Indexes and VACUUM
"Josh Berkus" <[EMAIL PROTECTED]> writes:
> This is on 7.1.2 (SuSE 7.2, ReiserFS, PG built from source). Explicitly
> dropping the indexes before dropping the tables seems to have solved
> the problem. My guess, without understanding the guts of the thing at
> all, is that the transactional nature of the drop and re-create causes
> the indexes not to be fully cleared before they are re-built. Maybe
> it's even a reaction to the journaling file system.

I don't believe a single word of that explanation ... whatever is going
on here, that ain't it. A new table is going to have a new OID, and so
will its indexes; there is no way that Postgres will confuse it with the
old one, even if bits of the old one were still hanging around somehow
(which I don't believe either).

One thing to think about, if you are dropping and recreating tables in a
plpgsql function, is that you probably need to do it with EXECUTE to
avoid plan caching.

> BTW, any issues with PostgreSQL and DMA disk access?

Not unless your kernel or hardware are broken. But I have seen cases
where hardware problems (eg, bogus DMA controller) manifested themselves
only as database errors. Evidently Postgres was pushing the disk harder
than anything else on the system, so it was more likely to get bit by a
sporadic hardware booboo.

regards, tom lane
Re: [SQL] Variables.
Aasmund,

> I do not know whether it is part of the SQL standard.
>
> Variables...

Nope.

> CREATE VARIABLE foobar INTEGER DEFAULT 1 NOT NULL;
>
> SELECT * FROM thebar WHERE id = foobar;
>
> CREATE TEMPORARY VARIABLE...
> CREATE CONSTANT

Extensive support for a variety of variable types and constructs is
available in FUNCTIONS: PL/pgSQL, PL/Tcl, PL/Perl ... take your pick.

For raw command-line SQL, variables, constants, and other procedural
language elements are not appropriate. This is best done in procedures,
functions, and middleware.

Of course, it is an Open Source project, so if you hire your own
programmer, you can do anything you want.

-Josh Berkus

__AGLIO DATABASE SOLUTIONS___
Josh Berkus
Complete information technology    [EMAIL PROTECTED]
and data management solutions      (415) 565-7293
for law firms, small businesses    fax 621-2533
and non-profit organizations.      San Francisco
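As a sketch of the FUNCTIONS route, a PL/pgSQL function gives you declared variables with defaults; the table name `thebar` is borrowed from the question, and the function name is illustrative:

```sql
-- A variable with a default, usable inside queries; the parameter
-- overrides the default when non-null.
CREATE FUNCTION bars_from(integer) RETURNS integer AS '
DECLARE
    foobar integer := 1;   -- plays the role of the proposed VARIABLE
    result integer;
BEGIN
    IF $1 IS NOT NULL THEN
        foobar := $1;
    END IF;
    SELECT count(*) INTO result FROM thebar WHERE id >= foobar;
    RETURN result;
END;
' LANGUAGE 'plpgsql';

-- SELECT bars_from(NULL);  -- uses the default of 1
-- SELECT bars_from(42);    -- overrides it
```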
Re: [SQL] Performance problems - Indexes and VACUUM
Tom,

> I don't believe a single word of that explanation ... whatever is going
> on here, that ain't it. A new table is going to have a new OID, and so
> will its indexes; there is no way that Postgres will confuse it with
> the old one, even if bits of the old one were still hanging around
> somehow (which I don't believe either).

You're the expert. All I know for a fact is that when I didn't explicitly
drop the indexes, I got weird results; when I did explicitly drop them, I
didn't. The whole system is complex enough that the problem is hard to
reproduce without reproducing the whole system (which I'd be happy to do
for you, only it contains confidential data). From the sound of it, I'm
the only one who's encountered this.

> One thing to think about, if you are dropping and recreating tables in
> a plpgsql function, is that you probably need to do it with EXECUTE
> to avoid plan caching.

OK. Will do. Thanks. This may also cure the intermittent index/whatever
issue.

> Not unless your kernel or hardware are broken. But I have seen cases
> where hardware problems (eg, bogus DMA controller) manifested
> themselves only as database errors. Evidently Postgres was pushing the
> disk harder than anything else on the system, so it was more likely to
> get bit by a sporadic hardware booboo.

Thanks!

-Josh
[SQL] Identifying obsolete values
First of all, thanks to Philip Hallstrom for the quick reply.

Consider the following tables:

CREATE TABLE advncd_onfvalue (
    timepoint    DATETIME NOT NULL,
    mid          INTEGER  NOT NULL,  /* measurement id */
    lid          INTEGER  NOT NULL,  /* location id */
    sid          INTEGER  NOT NULL,  /* source id */
    entrancetime DATETIME NOT NULL DEFAULT NOW(),
    value        FLOAT    NOT NULL   /* float value, not unique */
);

CREATE TABLE advncd_tempreftime (
    timepoint    DATETIME NOT NULL,
    mid          INTEGER  NOT NULL,  /* measurement id */
    lid          INTEGER  NOT NULL,  /* location id */
    sid          INTEGER  NOT NULL,  /* source id */
    entrancetime DATETIME NOT NULL
);

I use the second table to identify the current resp. obsolete rows within
the first table:

DELETE FROM advncd_tempreftime;

INSERT INTO advncd_tempreftime
SELECT timepoint,mid,lid,sid,MAX(entrancetime)
FROM advncd_onfvalue
GROUP BY timepoint,mid,lid,sid;

SELECT o.sid,o.timepoint,o.lid,o.mid,o.value
FROM advncd_onfvalue o
WHERE EXISTS
  (SELECT * FROM advncd_tempreftime t
   WHERE o.timepoint    = t.timepoint AND
         o.mid          = t.mid AND
         o.lid          = t.lid AND
         o.sid          = t.sid AND
         o.entrancetime = t.entrancetime);

SELECT o.sid,o.timepoint,o.lid,o.mid,o.value
FROM advncd_onfvalue o
WHERE NOT EXISTS
  (SELECT * FROM advncd_tempreftime t
   WHERE o.timepoint    = t.timepoint AND
         o.mid          = t.mid AND
         o.lid          = t.lid AND
         o.sid          = t.sid AND
         o.entrancetime = t.entrancetime);

It works fine, but it's a pain how long it takes. I tried to improve the
speed with:

CREATE /* NOT UNIQUE */ INDEX advncd_onfvalue_idx_tmlse
ON advncd_onfvalue (timepoint, mid, lid, sid, entrancetime);

CREATE /* NOT UNIQUE */ INDEX advncd_tempreftime_idx_tmlse
ON advncd_tempreftime (timepoint, mid, lid, sid, entrancetime);

vacuum advncd_onfvalue \g
vacuum advncd_tempreftime \g

Some effect, but still too slow. Does anybody know alternatives? What
about:

SELECT DISTINCT ON (sid,timepoint,lid,mid)
       sid,timepoint,lid,mid,value
FROM advncd_onfvalue
ORDER BY sid,timepoint,lid,mid,entrancetime DESC;

My bad luck is, I cannot test DISTINCT ON (multiple columns) at the
moment, because my system admin has not yet installed the up-to-date
postgres version.

Regards,
Christoph
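Another alternative that avoids the temporary table entirely is a correlated subquery keeping only the newest entrancetime per key; a sketch against the tables above (the composite index on (timepoint, mid, lid, sid, entrancetime) should help the inner lookup):

```sql
-- Current rows: no newer entry exists for the same key.
SELECT o.sid, o.timepoint, o.lid, o.mid, o.value
FROM   advncd_onfvalue o
WHERE  NOT EXISTS (
         SELECT 1 FROM advncd_onfvalue n
         WHERE  n.timepoint = o.timepoint
           AND  n.mid = o.mid
           AND  n.lid = o.lid
           AND  n.sid = o.sid
           AND  n.entrancetime > o.entrancetime
       );

-- Obsolete rows: flip NOT EXISTS to EXISTS.
```

Whether this beats the temp-table approach depends on how well the planner uses the index for the inner query, so it is worth comparing with EXPLAIN.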
Re: [SQL] Performance problems - Indexes and VACUUM
Tom,

> Kinda hard to believe; even if the old indexes were still around, they
> wouldn't be considered to apply to the new table. I think the problem
> is something else. Can you provide a reproducible example of what
> you're seeing?

Wish I could; it only seems to happen on the production machine ... that
is, you need a heavy load of daily use to make it happen. But, to reduce
the issue to its essentials:

1. Create a "pointer table" as I discussed in the last e-mail.
2. Run a bunch of queries that will store several thousand records in
   this pointer table, referencing the PKs of more than one data table.
3. In a function, drop the table and re-create it and its indexes.
4. In the same function, reset the sequence you use to identify each
   unique user-query to 1.
5. Performing some queries using the pointer tables, some of the
   references will mysteriously point to the wrong rows in the data
   tables. Some will work correctly.

This is on 7.1.2 (SuSE 7.2, ReiserFS, PG built from source). Explicitly
dropping the indexes before dropping the tables seems to have solved the
problem. My guess, without understanding the guts of the thing at all, is
that the transactional nature of the drop and re-create causes the
indexes not to be fully cleared before they are re-built. Maybe it's even
a reaction to the journaling file system.

BTW, any issues with PostgreSQL and DMA disk access?

> > runs in about 2 seconds right after a VACUUM.
> > Unfortunately, 6 hours after a VACUUM, the query bogs down.
>
> What has been changing in the meantime?

Lots of data edits and adds. This particularly seems to happen on days
where the users are changing dozens to hundreds of records that affect
one of the custom aggregate subqueries. I'm not surprised things slow
down in these circumstances; it's just the amount of slowdown -- 25 to 1
over a mere 6 hours -- that surprised me.

But the more we talk about this, the more I think I should stop bugging
you and let you finish 7.2 so I can just do background VACUUMing.

-Josh Berkus

__AGLIO DATABASE SOLUTIONS___
Josh Berkus
Complete information technology    [EMAIL PROTECTED]
and data management solutions      (415) 565-7293
for law firms, small businesses    fax 621-2533
and non-profit organizations.      San Francisco
Re: [SQL] Multiple Parameters to an Aggregate Function
On Tue, 16 Oct 2001, Anthony Bouvier wrote:

> I can create an AGGREGATE but from what I can tell, the format of the
> sfunc can only have two parameters like so:
>
> sfunc_name(type,type)
>
> Where the first parameter is what was passed before, and the second
> parameter is the 'new' info to do stuff to. Is it not possible to do
> something similar to?:
>
> sfunc_name(type,type,type)
>
> So that I can pass the url and name to the AGGREGATE (so it can in
> turn pass it to the sfunc)?

I have a techdoc about using aggregate functions to create faster web
looping at

http://www.zope.org/Members/pupq/pg_in_aggregates

It was written w/examples in DTML, Zope's scripting language, rather than
in Perl/DBI, but you should be able to easily follow it.

Essentially, what you want to end up with is something like this:

SELECT make_into_li ( make_into_text ( url, name ) );

where make_into_li is the aggregate, and make_into_text is the formatting
function.

--
Joel BURTON | [EMAIL PROTECTED] | joelburton.com | aim: wjoelburton
Independent Knowledge Management Consultant
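Concretely, that composition could look like the following sketch: the two-argument `format_link` from the original message collapses each row into a single text value, which a one-argument concatenation aggregate then accumulates. The names `text_cat` and `concat_text` are illustrative:

```sql
-- Format one row's url/name pair into a single text value (from the
-- original message, with the anchor tag restored).
CREATE FUNCTION format_link(text, text) RETURNS text AS '
  return qq[<a href="http://www.domain.com/$_[0]" class="body_link">$_[1]</a>];
' LANGUAGE 'plperl';

-- State-transition function: append the new link to the accumulated text.
CREATE FUNCTION text_cat(text, text) RETURNS text AS '
  SELECT $1 || $2
' LANGUAGE 'sql';

-- One-argument aggregate built on the two-argument sfunc.
CREATE AGGREGATE concat_text (
    basetype = text,
    sfunc    = text_cat,
    stype    = text,
    initcond = ''
);

-- The whole list in a single fetch:
SELECT concat_text(format_link(url, name)) FROM links;
```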
Re: [SQL] index problem
> > Hmm, does the explain change if you vacuum analyze the other table
> > (prog_data)? If not, what does explain show if you do a
> > set enable_seqscan='off';
> > before it?

Did you do the vacuum analyze on the other table (prog_data) as well? It
seems to be overestimating the number of joined rows, and I wonder if it
would choose a different plan if it had the correct number.

> The result:
> db=> set enable_seqscan='off';
> db=> explain select distinct prog_id,prog_ftype,prog_fcasthour,prog_date
> from prog_dgy_xy,prog_data where pxygy_pid=prog_id;
> NOTICE: QUERY PLAN:
>
> Unique (cost=7606982.10..7854887.48 rows=2479054 width=32)
>   -> Sort (cost=7606982.10..7606982.10 rows=24790538 width=32)
>     -> Merge Join (cost=0.00..335621.73 rows=24790538 width=32)
>       -> Index Scan using progdgyxy_idx2 on prog_dgy_xy
>          (cost=0.00..323297.05 rows=921013 width=4)
>       -> Index Scan using prog_data_pkey on prog_data
>          (cost=0.00..701.12 rows=8872 width=28)
>
> It "seems" the index is used, but the same result :(((, and a bigger
> execution time: real 3m41.830s

Well, that means the plan it chose before was better, so enable_seqscan
isn't a win here.

> And why:
> POSTGRES:
> set enable_seqscan='off'; select count(*) from prog_dgy_xy where
> pxygy_pid<13161;
>  count
> --------
>  900029
> real 2m34.340s
> explain:
> Aggregate (cost=327896.89..327896.89 rows=1 width=0)
>   -> Index Scan using progdgyxy_idx2 on prog_dgy_xy
>      (cost=0.00..325594.54 rows=920940 width=0)

It's estimating that the entire table (or most of it, anyway) will be
seen, so it would choose a Seq Scan as faster, but you've basically
disallowed that with enable_seqscan='off'. Is it faster without the
explicit hint? It probably will be. Index Scans are not always better
than Sequential Scans (especially when traversing most of the table, as
above), and you don't want to use the enable_* settings unless they
actually give you a performance increase.
Re: [SQL] Triggers do not fire
I can vouch for that. I have several tables with 10 to 16 million entries
in much the same sort of setup as you are describing (primary key,
timestamp, value). PostgreSQL will quite happily use the timestamp
indexes when accessing these tables, and it doesn't default to a
sequential scan until a considerable number of tuples would be searched.
For example:

processdata=> explain select count(*) from caseweights1 where dt > '2001-10-11';
NOTICE: QUERY PLAN:

Aggregate (cost=255053.37..255053.37 rows=1 width=0)
  -> Index Scan using caseweights1_dt_idx on caseweights1
     (cost=0.00..254827.01 rows=90544 width=0)

EXPLAIN
processdata=> select count(*) from caseweights1 where dt > '2001-10-11';
 count
--------
 146773
(1 row)

processdata=> select count(*) from caseweights1;
  count
----------
 14984087
(1 row)

As you can see, even though my table is fairly large, PostgreSQL will
happily use indexes for queries even when a significant number of tuples
is to be accessed. The count command with the index took perhaps a second
on my 400 MHz, 128 MB RAM, ordinary-IDE-disk test server. The count of
all the records, on the other hand, triggered a sequential scan that took
a long time to complete.

In other words, chances are good that PostgreSQL will handle your data
without special modification.

Jason

--- Tom Lane <[EMAIL PROTECTED]> wrote:
> Reiner Dassing <[EMAIL PROTECTED]> writes:
> > I have a table which has a lot of entries (some x millions) of the
> > kind (id, timestamp, value) ...
>
> Why bother? Seems like you are just making life complicated for
> yourself. One big table with a suitable index ought to work fine.
>
> regards, tom lane
[SQL] SQL parser and/or optimizer
Hi everybody,

I have a question, the answer to which may not directly relate to
PostgreSQL. Sorry about that.

I am working on the problem of incremental view maintenance and need to
implement the strategies I came up with. I am looking for an existing
implementation of an SQL parser and/or optimizer which I can modify to
suit my needs. It's preferable if the code is written in Java, though C
and C++ implementations are also welcome.

Please let me know if you can think of anything that relates to the
subject.

Thanks a lot,
Oleg