Re: [SQL] index problem
I forgot: select version();

 version
---------------------------------------------------------------------
 PostgreSQL 7.1.3 on i386-unknown-freebsd4.3, compiled by GCC 2.95.3

It seems that there are index-usage problems in 7.1.3? (Checking: the same
problem is in a comp.databases.postgresql.bugs msg from Orion.)

thx
CoL

Szabo Zoltan wrote:

> Hi,
>
> I have this:
>
> 1)
> db=> explain select pxygy_pid from prog_dgy_xy where pxygy_pid=12121;
> NOTICE: QUERY PLAN:
>
> Group (cost=0.00..29970.34 rows=921 width=4)
>   -> Index Scan using progdgyxy_idx2 on prog_dgy_xy
>      (cost=0.00..29947.32 rows=9210 width=4)
>
> then:
> 2)
> db=> explain select pxygy_pid from prog_dgy_xy where pxygy_pid>12121;
> NOTICE: QUERY PLAN:
>
> Group (cost=66927.88..67695.39 rows=30700 width=4)
>   -> Sort (cost=66927.88..66927.88 rows=307004 width=4)
>     -> Seq Scan on prog_dgy_xy (cost=0.00..32447.66 rows=307004 width=4)
>
> I am making some benchmarks: Oracle vs. Postgres vs. MySQL. And this is
> breaking me now ;) MySQL and Oracle, with the same table and index, use
> the index on pxygy_pid.
> I had vacuumed before.
>
> Time with MySQL:
>
> bash-2.04$ time echo "select count(*) from PROG_DGY_XY where
> pxygy_pid>12121;" | mysql -uuser -ppasswd db
> count(*)
> 484984
>
> real 0m13.761s
> user 0m0.008s
> sys  0m0.019s
>
> Time with Postgres:
>
> bash-2.04$ time echo "select count(*) from PROG_DGY_XY where
> pxygy_pid>12121" | psql -Uuser db
>  count
> --------
>  484984
> (1 row)
>
> real 0m22.480s
> user 0m0.011s
> sys  0m0.021s
>
> And this is just a small part of other selects joining tables; because
> this index is not used, selecting from 2 tables (which have indexes and
> keys on the joining columns) takes extreme time for Postgres: 2m14.978s,
> while for MySQL it takes 0m0.578s !!!
>
> This select is:
>
> select distinct PROG_ID,PROG_FTYPE,PROG_FCASTHOUR,PROG_DATE
> from PROG_DATA, PROG_DGY_XY
> where prog_id=pxygy_pid
> order by prog_date,prog_ftype,prog_fcasthour
>
> Indexes:
>
> PROG_DATA:
> create index prod_data_idx1 on prog_data
> (prog_date,prog_ftype,prog_fcasthour);
> prog_id is primary key
>
> PROG_DGY_XY:
> create unique index progdgyxy_idx1 on PROG_DGY_XY
> (PXYGY_PID,PXYGY_X,PXYGY_Y);
> create index progdgyxy_idx2 on PROG_DGY_XY (PXYGY_PID);
>
> Thx
> CoL
> --
> [ Szabo Zoltan ] [ software developer ] [ econet.hu Informatikai Rt. ]
> [ 1117 Budapest, Hauszmann A. u. 3. ] [ tel.: 371 2100 fax: 371 2101 ]

---(end of broadcast)---
TIP 6: Have you searched our list archives?
http://archives.postgresql.org
Re: [SQL] index problem
Hi,

I did not do vacuum analyze ;), only a plain vacuum until now:

vacuumdb -Uuser -ddb -v -tprog_dgy_xy
NOTICE: --Relation prog_dgy_xy--
NOTICE: Pages 20935: Changed 0, reaped 0, Empty 0, New 0; Tup 921013:
Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 178, MaxLen 184;
Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0.
CPU 2.71s/0.32u sec.
NOTICE: Index progdgyxy_idx1: Pages 6679; Tuples 921013. CPU 1.41s/1.40u sec.
NOTICE: Index progdgyxy_idx2: Pages 2019; Tuples 921013. CPU 0.28s/1.28u sec.

I ran it with -z too. So this table has 921013 rows. The query plan shows
the same as below. The version is 7.1.3.

One more interesting thing: the inserting of these rows.

Postgres:
bash-2.04$ time psql -q -Uuser -f prog_dgy_xy.dump db
real 131m50.006s
user 3m21.838s
sys  1m20.963s

Mysql:
bash-2.04$ time cat prog_dgy_xy.dump | mysql -uuser -ppass db
real 24m50.137s
user 2m6.629s
sys  1m37.757s

The dump file was: insert into table (...) values (...); I tried with
copy, and with adding begin; inserts; commit;, but the result was the same
time :( [For Oracle 8.1.6 sqlloader it takes 450 sec ;)]

The two-table query, where prog_data has ~8800 rows and an index on
prog_id:

bash-2.04$ time echo "explain select distinct
prog_id,prog_ftype,prog_fcasthour,prog_date from prog_dgy_xy,prog_data
where pxygy_pid=prog_id" | psql -Uuser db
NOTICE: QUERY PLAN:

Unique (cost=7432549.69..7680455.07 rows=2479054 width=32)
  -> Sort (cost=7432549.69..7432549.69 rows=24790538 width=32)
    -> Merge Join (cost=148864.65..161189.33 rows=24790538 width=32)
      -> Index Scan using prog_data_pkey on prog_data
         (cost=0.00..701.12 rows=8872 width=28)
      -> Sort (cost=148864.65..148864.65 rows=921013 width=4)
        -> Seq Scan on prog_dgy_xy
           (cost=0.00..30145.13 rows=921013 width=4)

Time: !!! real 2m3.620s

The same query with MySQL (I did explain in MySQL, and it says it uses the
indexes): real 0m1.998s !!!

I am just asking why? And why is the index used only for the "=" relation?
(Same table, same index, vacuumed; I ran the test more than twice.) It
seems to be a 7.1.3 bug? I have not tested with 7.1.2 yet, but tomorrow I
will.

CoL

Stephan Szabo wrote:

> I assume you mean you did a vacuum analyze (a plain vacuum isn't
> sufficient). If you did just do a regular vacuum, do a vacuum analyze
> to get the updated statistics.
>
> How many rows actually match pxygy_pid>12121? Is 307000 rows a
> reasonable estimate? How many rows are in the table?
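As Stephan's reply notes, the planner's row estimates come from statistics that only VACUUM ANALYZE gathers; a plain VACUUM reclaims space but does not update them. A minimal sketch, using the table name from this thread:

```sql
-- Gather planner statistics (plain VACUUM alone is not sufficient).
VACUUM ANALYZE prog_dgy_xy;

-- With fresh statistics, re-check whether the planner still prefers
-- a sequential scan for the range condition.
EXPLAIN SELECT pxygy_pid FROM prog_dgy_xy WHERE pxygy_pid > 12121;
```

Note that if the range condition really does match most of the table, a sequential scan may legitimately remain the cheaper plan.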
Re: [SQL] nvl() function
On Tue, 16 Oct 2001, Steven Dahlin wrote:

> I am trying to find the equivalent in Postgresql to the Oracle sql
> function nvl(). With nvl() you give two parameters. The first may be a
> field/column or variable. If the value is not null then it is returned
> by the function.
>
> select clientid,
>        clientdesc
> from   clients
> where  ( clientdesc = nvl( :ClientParameter, clientdesc ) )
>
> I have looked thru all the documentation I could find but nowhere were
> any built-in SQL functions delineated. Does anyone know where the
> documentation can be found?

COALESCE is the SQL-standard name for this. You'll find details in the
documentation, in the Conditional Expressions section (4.10). Copy at:

http://candle.pha.pa.us/main/writings/pgsql/sgml/functions-conditional.html

--
Joel BURTON | [EMAIL PROTECTED] | joelburton.com | aim: wjoelburton
Independent Knowledge Management Consultant
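As a sketch of the substitution, the query from the original question carries over directly with COALESCE in place of nvl(); COALESCE returns its first non-null argument:

```sql
-- Oracle:   clientdesc = nvl( :ClientParameter, clientdesc )
-- Postgres: COALESCE returns its first non-null argument, so a null
--           parameter degenerates to clientdesc = clientdesc (all rows).
SELECT clientid,
       clientdesc
FROM   clients
WHERE  clientdesc = COALESCE(:ClientParameter, clientdesc);
```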
Re: [SQL] Indexes on functions
I'm going to write a function in C that parses XML. Is there any way to
index the output of this function?

I've got 10,000,000 rows that contain XML data and I need to efficiently
find the ones that contain the proper keys. I tried pulling the values I
want out of the data and putting them in their own table, but with an
average of 20 items per row that just grows to an unmanageable size.

-Michael

_________________________________________________________________
http://fastmail.ca/ - Fast Free Web Email for Canadians
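PostgreSQL does support indexes on function results (functional indexes), provided the function is deterministic; in the 7.x series that property is declared with `iscachable`. A hypothetical sketch — the function name `xml_key`, its shared-library path, and the table/column names are illustrative, not taken from the original message:

```sql
-- The function must always return the same output for the same input,
-- or its result cannot be meaningfully indexed.
CREATE FUNCTION xml_key(text) RETURNS text
    AS '/usr/local/lib/xmlfuncs.so' LANGUAGE 'C' WITH (iscachable);

-- Index the extracted key rather than the raw XML document.
CREATE INDEX xmldata_key_idx ON xmldata (xml_key(doc));

-- A query of this exact form can then use the index:
SELECT * FROM xmldata WHERE xml_key(doc) = 'wanted-key';
```

This avoids maintaining a separate key table, since the index itself stores only the extracted key per row.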
Re: [SQL] referential column constraints and inheritance
Stephan Szabo and Josh Berkus,

Thanks for your input. The question at hand seemed to provide an
appealing argument for adding inheritance support to referential
constraints, but if such support is as rare as it seems to be, that could
produce problems if I want to be able to manage the data with tools on
other platforms. I guess I will have to give the potential alternatives
more consideration.

Thanks again,
Stuart
Re: [SQL] Performance problems - Indexes and VACUUM
Who is this?

- Original Message -
From: Josh Berkus <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, October 17, 2001 8:59 AM
Subject: [SQL] Performance problems - Indexes and VACUUM

> Tom, Folks:
>
> I am having a rather interesting time getting performance out of my
> database. I'd really appreciate some feedback from the list on this.
>
> As you may recall, I've gotten around Postgres' lack of rowset-returning
> stored procedures by constructing "pointer tables" which simply hold
> lists of primary keys related to the user's current search. This is an
> excellent approach for a browser-based application, and I have since
> used this idea on other databases, even one that supports stored
> procedures.
>
> However, this means that I clear all of these pointer tables on a
> periodic basis (how frequently depends on usage). Just clearing the
> records didn't work, because of the Postgres "padded index" problem
> where eventually the indexes on these tables become full of deleted
> rows. Which gives me problem 1:
>
> 1. INDEXES: I discovered, the hard way, a peculiar problem. If you drop
> and re-create a table within the same transaction (in a function, for
> example) the indexes do not get dropped completely. Doing this to
> several tables, I had the disturbing experience of seeing incorrect rows
> in response to some queries. Specifically dropping each of the indexes,
> dropping the tables, re-creating the tables, and re-creating the indexes
> seems to work. However, this seems to me to indicate a potential
> problem with DDL commands within transactions.
>
> The second problem is giving me severe grief right now:
>
> 2. I have a very complex view designed for browsing client information.
> This view involves 2 other views, and two custom aggregates which are
> based on sub-queries (could only do it in Postgres!). The query plan is
> as long as this e-mail, but thanks to optimization and good indexing it
> runs in about 2 seconds right after a VACUUM.
>
> Unfortunately, 6 hours after a VACUUM, the query bogs down. The query
> plan does not seem to have changed much, but somehow what took 50% of
> the processor for 2 seconds at 8:30 AM flattens the processor for a
> full 45 seconds at 3:30 PM.
>
> Once VACUUM can be run in the background, I suppose that this can be
> dealt with, but until then does anyone have any suggestions?
>
> -Josh Berkus
>
> __AGLIO DATABASE SOLUTIONS___
> Josh Berkus
> Complete information technology    [EMAIL PROTECTED]
> and data management solutions      (415) 565-7293
> for law firms, small businesses    fax 621-2533
> and non-profit organizations.      San Francisco
[SQL] nvl() function
I am trying to find the equivalent in Postgresql to the Oracle sql
function nvl(). With nvl() you give two parameters. The first may be a
field/column or variable. If the value is not null then it is returned by
the function.

For example, with the query below, if the :ClientParameter is passed then
only those rows which have a clientdesc matching the parameter are
returned. If the :ClientParameter is null then those rows which have
clientdesc = clientdesc are returned (all rows):

select clientid,
       clientdesc
from   clients
where  ( clientdesc = nvl( :ClientParameter, clientdesc ) )

I have looked through all the documentation I could find, but nowhere
were any built-in SQL functions delineated. Does anyone know where the
documentation can be found?

Thanks,
Steve
[SQL] Multiple Parameters to an Aggregate Function
I have a sub in a Perl script that loops, executing the following
statement:

SELECT url,name FROM links

and formats each row like so:

<a href="http://www.domain.com/$url" class="body_link">$name</a>

A variable ($link_list) in the loop holds the previously concatenated
results plus the new one.

I would rather do this with FUNCTIONS (and/or AGGREGATES). So, I CREATEd
a FUNCTION:

CREATE FUNCTION format_link(text,text)
RETURNS text AS '
  return qq[<a href="http://www.domain.com/$_[0]" class="body_link">$_[1]</a>];
' LANGUAGE 'plperl';

So in my Perl script, the SQL statement becomes:

SELECT format_link(url,name) FROM links

However, I still have to loop with Perl -- it would be nice to use an
AGGREGATE to do some of this for me.

I can create an AGGREGATE, but from what I can tell, the format of the
sfunc can only have two parameters like so:

sfunc_name(type,type)

where the first parameter is what was passed before, and the second
parameter is the 'new' info to do stuff to. Is it not possible to do
something similar to:

sfunc_name(type,type,type)

so that I can pass the url and name to the AGGREGATE (so it can in turn
pass them to the sfunc)? The sfunc could then be something like:

CREATE FUNCTION link_agg (text,text,text)
RETURNS text AS '
  return $_[0] . qq[<a href="http://www.domain.com/$_[1]" class="body_link">$_[2]</a>];
' LANGUAGE 'plperl';

Because then I gain the benefit of a stored procedure and cut the SQL in
the script down to:

SELECT link_agg(url,name) FROM links;

which would return the entire list at once, instead of needing the script
to loop through multiple fetches.

... Of course, I may be going about this in an entirely incorrect manner.
Telling me so, with a bit of direction, would also be greatly
appreciated.

BTW: I tried searching the archives, but there is a database error
("PQconnectPoll() -- connect() failed: Connection refused Is the
postmaster running (with -i) at 'db.hub.org' and accepting connections on
TCP/IP port 5439?"), just so someone knows.

Thanks In Advance,
Anthony Bouvier
Re: [SQL] Performance problems - Indexes and VACUUM
"Josh Berkus" <[EMAIL PROTECTED]> writes:
> 1. INDEXES: I discovered, the hard way, a peculiar problem. If you drop
> and re-create a table within the same transaction (in a function, for
> example) the indexes do not get dropped completely.

Kinda hard to believe; even if the old indexes were still around, they
wouldn't be considered to apply to the new table. I think the problem is
something else. Can you provide a reproducible example of what you're
seeing?

> runs in about 2 seconds right after a VACUUM.
> Unfortunately, 6 hours after a VACUUM, the query bogs down.

What has been changing in the meantime?

regards, tom lane
Re: [SQL] Why would this slow the query down so much?
On Tue, 16 Oct 2001 17:58:32 +0100, Stuart Grimshaw wrote:
> On Monday 15 October 2001 16:12 pm, Tom Lane wrote:
> > Stuart Grimshaw <[EMAIL PROTECTED]> writes:
> > > SELECT a.category, b.headline, b.added, c.friendlyname
> > > FROM caturljoin as a
> > > INNER JOIN stories as b ON (a.url = b.source)
> > > INNER JOIN urllist as c ON (a.url = d.urn)
> > > WHERE a.category = 93 ORDER BY b.added DESC LIMIT 1;
> >
> > (I assume "d.urn" is a typo for "c.urn"...)
> >
> > The query plan you show looks pretty reasonable if the planner's row
> > count estimates are in the right ballpark. How many caturljoin rows
> > have category = 93? How many stories rows will match each caturljoin
> > row? How many urllist rows ditto?
>
> There are 194 rows in caturljoin where url = 93, 29806 rows in stories
> will match those 194 rows and only 1 row in urllist will match.

If it's convenient, would you try deleting some indices of the "stories"
table? The total number of sorts in the QUERY PLAN might decrease.
However, this trial may be a vain effort; I can't predict the result of
the QUERY PLAN. :-)

The indices: "stories_source", "stories_unique_story", and
"stories_urn_key"

Regards,
Masaru Sugawara
Re: [SQL] Triggers do not fire
Reiner Dassing <[EMAIL PROTECTED]> writes:
> I have a table which has a lot of entries (some x millions) of the kind
> (id, timestamp, value)
> The access (selects) is concentrated to the timely last some thousands
> entries.
> To adapt this fact I want to setup a "virtual" table - test in my
> example - which is accessed by the clients but in reality the entries
> are separated to different small tables. These table are dynamically
> created to hold the values distinguished by years.

Why bother? Seems like you are just making life complicated for
yourself. One big table with a suitable index ought to work fine.

regards, tom lane
Re: [SQL] Triggers do not fire
Hello Tom!

Thank you for your help! Your hints solved the problem as asked: the
update trigger is fired if there is a row in the table test to be
updated.

But behind my question there was another design (philosophy) which I am
trying to implement by means of different triggers. This is off topic,
but maybe you or somebody on the list can give a hint how to solve the
following task:

I have a table which has a lot of entries (some x millions) of the kind
(id, timestamp, value). The access (selects) is concentrated on the most
recent few thousand entries. To adapt to this fact I want to set up a
"virtual" table - test in my example - which is accessed by the clients,
but in reality the entries are separated into different small tables.
These tables are dynamically created to hold the values distinguished by
years.

By the use of triggers I can redirect inserts on the table test to the
small tables, called test_x_y with x = id, y = year. To update an entry,
though, the value must exist in the main table; therefore this approach
does not work.

Rules do not work either, as I must dynamically build table names, and
the execution of plpgsql functions is not possible in rules. Correct?

Another possible approach would be to make selects which give back
results to be used as table names. I.e., it would be necessary to have
something like

    create table f(NEW.val) as select ...

where f(...) gives back the name of a table.

Tom Lane wrote:
>
> Reiner Dassing <[EMAIL PROTECTED]> writes:
> > I have written a very small test procedure to show a possible error
> > on PostgreSQL V7.1.1.
>
> The error is yours: you set up the trigger function to return NULL,
> which means it's telling the system not to allow the INSERT or UPDATE.
>
> > INSERT INTO test VALUES(1,'2000-10-11 12:00:00',-20.2);
> > NOTICE: Fired INSERT
> > INSERT 0 0
>
> Note the summary line saying that zero rows were inserted.
>
> > UPDATE test SET value = 1000.0 WHERE epoch = '2000-10-11 12:10:00'
> > AND sensor_id = 1;
> > UPDATE 0
>
> Here, zero rows were updated, so of course there was nothing to fire
> the trigger on.
>
> regards, tom lane

--
Mit freundlichen Gruessen / With best regards
Reiner Dassing
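One possible way around the dynamic-table-name limitation is PL/pgSQL's EXECUTE, which runs a query string assembled at run time. A rough sketch using the column names from this thread; it assumes `quote_literal()` is available in your version (otherwise the timestamp must be quoted by hand), and the per-year tables are assumed to already exist:

```sql
CREATE FUNCTION test_redirect() RETURNS opaque AS '
DECLARE
    tbl text;
BEGIN
    -- Build the per-sensor, per-year table name at run time.
    tbl := ''test_'' || NEW.sensor_id || ''_''
                     || date_part(''year'', NEW.epoch);

    -- Insert into the small table instead of the virtual one.
    EXECUTE ''INSERT INTO '' || tbl || '' VALUES (''
         || NEW.sensor_id || '', ''
         || quote_literal(NEW.epoch) || '', ''
         || NEW.value || '')'';

    RETURN NULL;  -- suppress the insert into the virtual table itself
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER test_redirect_trig BEFORE INSERT ON test
    FOR EACH ROW EXECUTE PROCEDURE test_redirect();
```

This only handles INSERT redirection; as noted above, UPDATEs against the virtual table would still find no row to fire on.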
Re: [SQL] Performance problems - Indexes and VACUUM
"Josh Berkus" <[EMAIL PROTECTED]> writes:
> This is on 7.1.2 (SuSE 7.2, ReiserFS, PG built from source). Explicitly
> dropping the indexes before dropping the tables seems to have solved
> the problem. My guess, without understanding the guts of the thing at
> all, is that the transactional nature of the drop and re-create causes
> the indexes not to be fully cleared before they are re-built. Maybe
> it's even a reaction to the journaling file system.

I don't believe a single word of that explanation ... whatever is going
on here, that ain't it. A new table is going to have a new OID, and so
will its indexes; there is no way that Postgres will confuse it with the
old one, even if bits of the old one were still hanging around somehow
(which I don't believe either).

One thing to think about, if you are dropping and recreating tables in a
plpgsql function, is that you probably need to do it with EXECUTE to
avoid plan caching.

> BTW, any issues with PostgreSQL and DMA disk access?

Not unless your kernel or hardware are broken. But I have seen cases
where hardware problems (eg, bogus DMA controller) manifested themselves
only as database errors. Evidently Postgres was pushing the disk harder
than anything else on the system, so it was more likely to get bit by a
sporadic hardware booboo.

regards, tom lane
Re: [SQL] Variables.
Aasmund,

> I do not know whether it is part of the SQL standard.
>
> Variables...

Nope.

> CREATE VARIABLE foobar INTEGER DEFAULT 1 NOT NULL;
>
> SELECT * FROM thebar WHERE id = foobar;
>
> CREATE TEMPORARY VARIABLE...
> CREATE CONSTANT

Extensive support for a variety of variable types and constructs is
available in FUNCTIONS: PL/pgSQL, PL/Tcl, PL/Perl ... take your pick.

For raw command-line SQL, variables, constants, and other procedural
language elements are not appropriate. This is best done in procedures,
functions, and middleware.

Of course, it is an Open Source project, so if you hire your own
programmer, you can do anything you want.

-Josh Berkus

__AGLIO DATABASE SOLUTIONS___
Josh Berkus
Complete information technology    [EMAIL PROTECTED]
and data management solutions      (415) 565-7293
for law firms, small businesses    fax 621-2533
and non-profit organizations.      San Francisco
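As a sketch of the FUNCTIONS route, a PL/pgSQL function gives you declared variables with defaults; the table name `thebar` is borrowed from the question, and the function name is illustrative:

```sql
-- A variable with a default, usable inside queries; the parameter
-- overrides the default when non-null.
CREATE FUNCTION bars_from(integer) RETURNS integer AS '
DECLARE
    foobar integer := 1;   -- plays the role of the proposed VARIABLE
    result integer;
BEGIN
    IF $1 IS NOT NULL THEN
        foobar := $1;
    END IF;
    SELECT count(*) INTO result FROM thebar WHERE id >= foobar;
    RETURN result;
END;
' LANGUAGE 'plpgsql';

-- SELECT bars_from(NULL);  -- uses the default of 1
-- SELECT bars_from(42);    -- overrides it
```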
Re: [SQL] Performance problems - Indexes and VACUUM
Tom,

> I don't believe a single word of that explanation ... whatever is going
> on here, that ain't it. A new table is going to have a new OID, and so
> will its indexes; there is no way that Postgres will confuse it with
> the old one, even if bits of the old one were still hanging around
> somehow (which I don't believe either).

You're the expert. All I know for a fact is that when I didn't explicitly
drop the indexes, I got weird results; when I did explicitly drop them, I
didn't. The whole system is complex enough that the problem is hard to
reproduce without reproducing the whole system (which I'd be happy to do
for you, only it contains confidential data). From the sound of it, I'm
the only one who's encountered this.

> One thing to think about, if you are dropping and recreating tables in
> a plpgsql function, is that you probably need to do it with EXECUTE
> to avoid plan caching.

OK. Will do. Thanks. This may also cure the intermittent index/whatever
issue.

> Not unless your kernel or hardware are broken. But I have seen cases
> where hardware problems (eg, bogus DMA controller) manifested
> themselves only as database errors. Evidently Postgres was pushing the
> disk harder than anything else on the system, so it was more likely to
> get bit by a sporadic hardware booboo.

Thanks!

-Josh
[SQL] Identifying obsolete values
First of all, thanks to Philip Hallstrom for the quick reply.

Consider the following tables:

CREATE TABLE advncd_onfvalue (
    timepoint    DATETIME NOT NULL,
    mid          INTEGER  NOT NULL,  /* measurement id */
    lid          INTEGER  NOT NULL,  /* location id */
    sid          INTEGER  NOT NULL,  /* source id */
    entrancetime DATETIME NOT NULL DEFAULT NOW(),
    value        FLOAT    NOT NULL   /* float value, not unique */
);

CREATE TABLE advncd_tempreftime (
    timepoint    DATETIME NOT NULL,
    mid          INTEGER  NOT NULL,  /* measurement id */
    lid          INTEGER  NOT NULL,  /* location id */
    sid          INTEGER  NOT NULL,  /* source id */
    entrancetime DATETIME NOT NULL
);

I use the second table to identify the current resp. obsolete rows within
the first table:

DELETE FROM advncd_tempreftime;

INSERT INTO advncd_tempreftime
SELECT timepoint,mid,lid,sid,MAX(entrancetime)
FROM advncd_onfvalue
GROUP BY timepoint,mid,lid,sid;

SELECT o.sid,o.timepoint,o.lid,o.mid,o.value
FROM advncd_onfvalue o
WHERE EXISTS
  (SELECT * FROM advncd_tempreftime t
   WHERE o.timepoint    = t.timepoint AND
         o.mid          = t.mid AND
         o.lid          = t.lid AND
         o.sid          = t.sid AND
         o.entrancetime = t.entrancetime);

SELECT o.sid,o.timepoint,o.lid,o.mid,o.value
FROM advncd_onfvalue o
WHERE NOT EXISTS
  (SELECT * FROM advncd_tempreftime t
   WHERE o.timepoint    = t.timepoint AND
         o.mid          = t.mid AND
         o.lid          = t.lid AND
         o.sid          = t.sid AND
         o.entrancetime = t.entrancetime);

It works fine, but it's a pain how long it takes. I tried to improve the
speed with:

CREATE /* NOT UNIQUE */ INDEX advncd_onfvalue_idx_tmlse
ON advncd_onfvalue (timepoint, mid, lid, sid, entrancetime);

CREATE /* NOT UNIQUE */ INDEX advncd_tempreftime_idx_tmlse
ON advncd_tempreftime (timepoint, mid, lid, sid, entrancetime);

vacuum advncd_onfvalue \g
vacuum advncd_tempreftime \g

Some effect, but still too slow. Does anybody know alternatives? What
about:

SELECT DISTINCT ON (sid,timepoint,lid,mid)
       sid,timepoint,lid,mid,value
FROM advncd_onfvalue
ORDER BY sid,timepoint,lid,mid,entrancetime DESC;

My bad luck is, I cannot test DISTINCT ON (multiple columns) at the
moment, because my system admin has not yet installed the up-to-date
postgres version.

Regards,
Christoph
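Another alternative that avoids the temporary table entirely is a correlated subquery keeping only the newest entrancetime per key; a sketch against the tables above (the composite index on (timepoint, mid, lid, sid, entrancetime) should help the inner lookup):

```sql
-- Current rows: no newer entry exists for the same key.
SELECT o.sid, o.timepoint, o.lid, o.mid, o.value
FROM   advncd_onfvalue o
WHERE  NOT EXISTS (
         SELECT 1 FROM advncd_onfvalue n
         WHERE  n.timepoint = o.timepoint
           AND  n.mid = o.mid
           AND  n.lid = o.lid
           AND  n.sid = o.sid
           AND  n.entrancetime > o.entrancetime
       );

-- Obsolete rows: flip NOT EXISTS to EXISTS.
```

Whether this beats the temp-table approach depends on how well the planner uses the index for the inner query, so it is worth comparing with EXPLAIN.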
Re: [SQL] Performance problems - Indexes and VACUUM
Tom,

> Kinda hard to believe; even if the old indexes were still around, they
> wouldn't be considered to apply to the new table. I think the problem
> is something else. Can you provide a reproducible example of what
> you're seeing?

Wish I could; it only seems to happen on the production machine ... that
is, you need a heavy load of daily use to make it happen. But, to reduce
the issue to its essentials:

1. Create a "pointer table" as I discussed in the last e-mail.
2. Run a bunch of queries that will store several thousand records in
   this pointer table, referencing the PKs of more than one data table.
3. In a function, drop the table and re-create it and its indexes.
4. In the same function, reset the sequence you use to identify each
   unique user-query to 1.
5. Performing some queries using the pointer tables, some of the
   references will mysteriously point to the wrong rows in the data
   tables. Some will work correctly.

This is on 7.1.2 (SuSE 7.2, ReiserFS, PG built from source). Explicitly
dropping the indexes before dropping the tables seems to have solved the
problem. My guess, without understanding the guts of the thing at all, is
that the transactional nature of the drop and re-create causes the
indexes not to be fully cleared before they are re-built. Maybe it's even
a reaction to the journaling file system.

BTW, any issues with PostgreSQL and DMA disk access?

> > runs in about 2 seconds right after a VACUUM.
> > Unfortunately, 6 hours after a VACUUM, the query bogs down.
>
> What has been changing in the meantime?

Lots of data edits and adds. This particularly seems to happen on days
where the users are changing dozens to hundreds of records that affect
one of the custom aggregate subqueries. I'm not surprised things slow
down in these circumstances; it's just the amount of slowdown -- 25 to 1
over a mere 6 hours -- that surprised me.

But the more we talk about this, the more I think I should stop bugging
you and let you finish 7.2 so I can just do background VACUUMing.

-Josh Berkus

__AGLIO DATABASE SOLUTIONS___
Josh Berkus
Complete information technology    [EMAIL PROTECTED]
and data management solutions      (415) 565-7293
for law firms, small businesses    fax 621-2533
and non-profit organizations.      San Francisco
Re: [SQL] Multiple Parameters to an Aggregate Function
On Tue, 16 Oct 2001, Anthony Bouvier wrote:

> I can create an AGGREGATE but from what I can tell, the format of the
> sfunc can only have two parameters like so:
>
> sfunc_name(type,type)
>
> Where the first parameter is what was passed before, and the second
> parameter is the 'new' info to do stuff to. Is it not possible to do
> something similar to?:
>
> sfunc_name(type,type,type)
>
> So that I can pass the url and name to the AGGREGATE (so it can in
> turn pass it to the sfunc)?

I have a techdoc about using aggregate functions to create faster web
looping at

http://www.zope.org/Members/pupq/pg_in_aggregates

It was written w/examples in DTML, Zope's scripting language, rather than
in Perl/DBI, but you should be able to easily follow it.

Essentially, what you want to end up with is something like this:

SELECT make_into_li ( make_into_text ( url, name ) );

where make_into_li is the aggregate, and make_into_text is the formatting
function.

--
Joel BURTON | [EMAIL PROTECTED] | joelburton.com | aim: wjoelburton
Independent Knowledge Management Consultant
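Concretely, that composition could look like the following sketch: the two-argument `format_link` from the original message collapses each row into a single text value, which a one-argument concatenation aggregate then accumulates. The names `text_cat` and `concat_text` are illustrative:

```sql
-- Format one row's url/name pair into a single text value (from the
-- original message, with the anchor tag restored).
CREATE FUNCTION format_link(text, text) RETURNS text AS '
  return qq[<a href="http://www.domain.com/$_[0]" class="body_link">$_[1]</a>];
' LANGUAGE 'plperl';

-- State-transition function: append the new link to the accumulated text.
CREATE FUNCTION text_cat(text, text) RETURNS text AS '
  SELECT $1 || $2
' LANGUAGE 'sql';

-- One-argument aggregate built on the two-argument sfunc.
CREATE AGGREGATE concat_text (
    basetype = text,
    sfunc    = text_cat,
    stype    = text,
    initcond = ''
);

-- The whole list in a single fetch:
SELECT concat_text(format_link(url, name)) FROM links;
```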
Re: [SQL] index problem
> > Hmm, does the explain change if you vacuum analyze the other table
> > (prog_data)? If not, what does explain show if you do a
> > set enable_seqscan='off';
> > before it?

Did you do the vacuum analyze on the other table (prog_data) as well? It
seems to be overestimating the number of joined rows, and I wonder if it
would choose a different plan if it had the correct number.

> The result:
> db=> set enable_seqscan='off';
> db=> explain select distinct prog_id,prog_ftype,prog_fcasthour,prog_date
> from prog_dgy_xy,prog_data where pxygy_pid=prog_id;
> NOTICE: QUERY PLAN:
>
> Unique (cost=7606982.10..7854887.48 rows=2479054 width=32)
>   -> Sort (cost=7606982.10..7606982.10 rows=24790538 width=32)
>     -> Merge Join (cost=0.00..335621.73 rows=24790538 width=32)
>       -> Index Scan using progdgyxy_idx2 on prog_dgy_xy
>          (cost=0.00..323297.05 rows=921013 width=4)
>       -> Index Scan using prog_data_pkey on prog_data
>          (cost=0.00..701.12 rows=8872 width=28)
>
> It "seems" the index is used, but the same result :(((, and a bigger
> execution time: real 3m41.830s

Well, that means the plan it chose before was better, so enable_seqscan
isn't a win here.

> And why:
> POSTGRES:
> set enable_seqscan='off'; select count(*) from prog_dgy_xy where
> pxygy_pid<13161;
>  count
> --------
>  900029
> real 2m34.340s
> explain:
> Aggregate (cost=327896.89..327896.89 rows=1 width=0)
>   -> Index Scan using progdgyxy_idx2 on prog_dgy_xy
>      (cost=0.00..325594.54 rows=920940 width=0)

It's estimating that the entire table (or most of it, anyway) will be
seen, so it would choose a Seq Scan as faster, but you've basically
disallowed that with enable_seqscan='off'. Is it faster without the
explicit hint? It probably will be. Index Scans are not always better
than Sequential Scans (especially when traversing most of the table, as
above), and you don't want to use the enable_* settings unless they
actually give you a performance increase.
Re: [SQL] Triggers do not fire
I can vouch for that. I have several tables with 10 to 16 million entries
in much the same sort of setup as you are describing (primary key,
timestamp, value). PostgreSQL will quite happily use the timestamp
indexes when accessing these tables, and it doesn't default to a
sequential scan until a considerable number of tuples would be searched.
For example:

processdata=> explain select count(*) from caseweights1 where dt > '2001-10-11';
NOTICE: QUERY PLAN:

Aggregate (cost=255053.37..255053.37 rows=1 width=0)
  -> Index Scan using caseweights1_dt_idx on caseweights1
     (cost=0.00..254827.01 rows=90544 width=0)

EXPLAIN
processdata=> select count(*) from caseweights1 where dt > '2001-10-11';
 count
--------
 146773
(1 row)

processdata=> select count(*) from caseweights1;
  count
----------
 14984087
(1 row)

As you can see, even though my table is fairly large, PostgreSQL will
happily use indexes for queries even when a significant number of tuples
is to be accessed. The count command with the index took perhaps a second
on my 400 MHz, 128 MB RAM, ordinary-IDE-disk test server. The count of
all the records, on the other hand, triggered a sequential scan that took
a long time to complete.

In other words, chances are good that PostgreSQL will handle your data
without special modification.

Jason

--- Tom Lane <[EMAIL PROTECTED]> wrote:
> Reiner Dassing <[EMAIL PROTECTED]> writes:
> > I have a table which has a lot of entries (some x millions) of the
> > kind (id, timestamp, value) ...
>
> Why bother? Seems like you are just making life complicated for
> yourself. One big table with a suitable index ought to work fine.
>
> regards, tom lane
[SQL] SQL parser and/or optimizer
Hi everybody,

I have a question, the answer to which may not directly relate to
PostgreSQL. Sorry about that.

I am working on the problem of incremental view maintenance and need to
implement the strategies I came up with. I am looking for an existing
implementation of an SQL parser and/or optimizer which I can modify to
suit my needs. It's preferable if the code is written in Java, though C
and C++ implementations are also welcome.

Please let me know if you can think of anything that relates to the
subject.

Thanks a lot,
Oleg