Re: [PERFORM] Is This My Speed Limit?

2003-10-03 Thread Tomasz Myrta
 Hi!
 
 It's just my curiosity. I wonder if there is any way to break my speed
 limit on an AMD 450MHz:

  Hash Join  (cost=189.79..1508.67 rows=11203 width=48) (actual
  time=129.20..1780.53 rows=9912 loops=1)

  Hash Join  (cost=208.74..1751.68 rows=11203 width=58) (actual
  time=135.87..1113.69 rows=9912 loops=1)

Well, it looks like a speed limit. I wouldn't expect better speed for
queries returning ~10,000 rows.

Regards,
Tomasz Myrta


---(end of broadcast)---
TIP 8: explain analyze is your friend


[PERFORM] A Basic Question

2003-10-03 Thread shyamperi
Dear All,
This question is regarding the performance of queries in general.
The performance of a query will vary depending on the number of tuples it
returns and on the algorithm chosen for the retrieval. If the relation
returns zero tuples, the seq and the index scan should be the best options;
if it returns 1 or more, the rest of the PG-supported scans should do well.
Now here is where I have some concerns: my relation works fast when it
returns more than one tuple, but gets slow when it returns zero tuples.
How should I go about this?

-
Warm Regards
Shÿam Peri

II Floor, Punja Building,
M.G.Road,
Ballalbagh,
Mangalore-575003 
Ph : 91-824-2451001/5
Fax : 91-824-2451050 


DISCLAIMER: This message contains privileged and confidential information and is
intended only for the individual named. If you are not the intended
recipient you should not disseminate, distribute, store, print, copy or
deliver this message. Please notify the sender immediately by e-mail if
you have received this e-mail by mistake and delete this e-mail from
your system.
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] A Basic Question

2003-10-03 Thread Richard Huxton
On Friday 03 October 2003 07:34, [EMAIL PROTECTED] wrote:
 Dear All,
 This question is regarding the performance of queries in general.
 The performance of a query will vary depending on the number of tuples it
 returns and on the algorithm chosen for the retrieval. If the relation
 returns zero tuples, the seq and the index scan should be the best options;
 if it returns 1 or more, the rest of the PG-supported scans should do well.
 Now here is where I have some concerns: my relation works fast when it
 returns more than one tuple, but gets slow when it returns zero tuples.
 How should I go about this?

If PG has to examine a lot of tuples to rule them out, then returning no rows 
can take longer.

If you post EXPLAIN ANALYSE output for both queries, someone will be able to 
explain why in your case.
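
A sketch of the sort of thing to post (table and column names are
hypothetical; compare the actual time figures in the two outputs):

  EXPLAIN ANALYZE SELECT * FROM mytable WHERE somecol = 'value with matches';
  EXPLAIN ANALYZE SELECT * FROM mytable WHERE somecol = 'value with no match';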

-- 
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] count(*) slow on large tables

2003-10-03 Thread Christopher Browne
Oops! [EMAIL PROTECTED] (Dror Matalon) was seen spray-painting on a wall:
 I smell a religious war in the air :-). 
 Can you go several days in a row without doing select count(*) on any
 of your tables? 

I would be more likely, personally, to run VACUUM VERBOSE ANALYZE,
which has useful side-effects :-).

 I suspect that this is somewhat a domain specific issue. In some
 areas you don't need to know the total number of rows in your
 tables, in others you do.

Relationship tables, which don't contain data in their own right,
but which, instead, link together records in other tables, are likely
to have particularly useless COUNT(*)'s.

 I also suspect that you're right, that end user applications don't
 use this information as often as DBAs would. On the other hand, it
 seems whenever you want to optimize your app (something relevant to
 this list), one of the things you do need to know is the number of
 rows in your table.

Ah, but in the case of optimization, there's little need for
transactionally safe, MVCC-managed, known-to-be-exact values.
Approximations are plenty good enough to get the right plan.

Furthermore, it's not the number of rows that is most important when
optimizing queries; the number of pages is more relevant to the
matter, as that's what the database is slinging around.
-- 
(reverse (concatenate 'string ac.notelrac.teneerf @ 454aa))
http://www3.sympatico.ca/cbbrowne/multiplexor.html
Rules of  the Evil Overlord #134. If  I am escaping in  a large truck
and the hero is pursuing me in  a small Italian sports car, I will not
wait for the hero to pull up along side of me and try to force him off
the road  as he attempts to climb  aboard. Instead I will  slam on the
brakes  when he's  directly behind  me.  (A  rudimentary  knowledge of
physics can prove quite useful.) http://www.eviloverlord.com/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] count(*) slow on large tables

2003-10-03 Thread Jeff
On Thu, 2 Oct 2003, Christopher Browne wrote:

 I can't imagine why the raw number of tuples in a relation would be
 expected to necessarily be terribly useful.


We use stuff like that for reporting queries.

example:
On our message boards each post is a row.  The powers that be like to know
how many posts there are in total (in addition to 'today'):
select count(*) from posts is how it has been
done on our informix db.  With our port to PG I instead select reltuples
from pg_class.
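
For example (a sketch; reltuples is only an estimate, kept current by
VACUUM/ANALYZE):

  SELECT reltuples FROM pg_class WHERE relname = 'posts';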

I know when I log in to a new db (or a db unknown to me) the first thing I do
is look at tables and see what sort of data there is... but in code I'd
rarely do that.

I know some monitoring things around here also do a select count(*) on
sometable to ensure it is growing, but like you said, this is easily done
with the number of pages as well.

yes. Informix caches this data. I believe Oracle does too.

Mysql with InnoDB does the same thing PG does. (MyISAM caches it)

--
Jeff Trout [EMAIL PROTECTED]
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] low cardinality column

2003-10-03 Thread rwu . cbnco . com
Hi,

I have a select like this:

SELECT MAX(transactionid) FROM cbntransaction WHERE transactiontypeid=0;

In the query:
transactionid is the primary key of the cbntransaction table,
but transactiontypeid is a low-cardinality column; there are over 100,000
records with the same transactiontypeid.
I was trying to create an index on (transactiontypeid, transactionid), but
no luck on that; postgresql will still scan the table.
I'm wondering if there's a solution for this query.
Maybe something like this: partition the table using transactiontypeid
and do a local index on transactionid on each partition, but I couldn't
find any doc on postgresql for doing that.

Thanks in advance,
rong :-)
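
A sketch of the usual workaround on 7.x (not from the thread; the index
name is hypothetical): MAX() can't use an index on these releases, but the
equivalent ORDER BY ... LIMIT 1 can walk a (transactiontypeid,
transactionid) index from the top:

  CREATE INDEX cbntransaction_type_id_idx
      ON cbntransaction (transactiontypeid, transactionid);

  SELECT transactionid
    FROM cbntransaction
   WHERE transactiontypeid = 0
   ORDER BY transactiontypeid DESC, transactionid DESC
   LIMIT 1;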



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Postgres low end processing.

2003-10-03 Thread Tom Lane
Stef [EMAIL PROTECTED] writes:
 I've been trying to find out if some guidelines
 exist, somewhere, describing how postgres
 can possibly run on less than 8MB of RAM.

Are you sure you want Postgres, and not something smaller?  BDB,
or SQL Lite, for example?

Postgres is bloatware by design: it was built to house PhD theses.
-- J. Hellerstein (who ought to know)

But having said that ... given virtual memory and cramped configuration
settings, Postgres would certainly run in an 8M machine.  Maybe crawl
would be a more applicable verb than run, but you could execute it.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Postgres low end processing.

2003-10-03 Thread Stef
On Fri, 03 Oct 2003 11:42:54 -0400
Tom Lane [EMAIL PROTECTED] wrote:

= Are you sure you want Postgres, and not something smaller?  BDB,
= or SQL Lite, for example?
I have considered various options, including BDB and SQL Lite, but
alas, it will have to be postgres if it's going to be a database. Otherwise
it will be back to the original idea of flat .idx files :(
 
= Postgres is bloatware by design: it was built to house PhD theses.
= -- J. Hellerstein (who ought to know)
 :o)  Believe me, I've been amazed since I encountered postgres v6.3.2
in '98

= But having said that ... given virtual memory and cramped configuration
= settings, Postgres would certainly run in an 8M machine.  Maybe crawl
= would be a more applicable verb than run, but you could execute it.

Crawling is ok. It won't differ much from normal operation on a machine like that.
Any tips on how to achieve the most diminutive vmem and conf settings?
I tried to figure this out from the docs, and played around with
backend/storage, but I'm not really winning.

Regards
Stef


pgp0.pgp
Description: PGP signature
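
Not advice from the thread, but a hypothetical starting point for shrinking
the footprint (7.3-era parameter names; the values are illustrative and
untested):

  # postgresql.conf -- small-memory sketch
  max_connections = 4
  shared_buffers = 16     # 8kB pages; the main shared-memory consumer
  sort_mem = 64           # kB per sort/hash operation
  vacuum_mem = 256        # kB used by VACUUM
  wal_buffers = 4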


Re: [PERFORM] TPC-R benchmarks

2003-10-03 Thread Oleg Lebedev
Josh,

I declared all the indexes that you suggested and ran vacuum full
analyze. The query plan has not changed and it's still trying to use
seqscan. I tried to disable seqscan, but the plan didn't change. Any
other suggestions?
I started explain analyze on the query, but I doubt it will finish any
time soon.
Thanks.

Oleg


-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 02, 2003 11:27 PM
To: Oleg Lebedev; scott.marlowe
Cc: [EMAIL PROTECTED]
Subject: Re: [PERFORM] TPC-R benchmarks


Oleg,

 I have another question. How do I optimize my indexes for the query 
 that contains a lot of ORed blocks, each of which contains a bunch of 
 ANDed expressions? The structure of each ORed block is the same except
 the right-hand-side values vary.

Given the example, I'd do a multicolumn index on p_brand, p_container,
p_size and a second multicolumn index on l_partkey, l_quantity, l_shipmode.
Hmmm ... or maybe separate indexes, one on l_partkey and one on l_quantity,
l_shipmode and l_instruct.   Test both configurations.

Mind you, if this is also an OLTP table, then you'd want to test those
multi-column indexes to determine the fewest columns you need for the
indexes still to be used, since more columns = more index maintenance.
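
As DDL, the two configurations might look like this (a sketch, assuming the
standard TPC-R part/lineitem tables and that l_instruct refers to
l_shipinstruct):

  -- configuration 1: two multicolumn indexes
  CREATE INDEX part_brand_cont_size_idx
      ON part (p_brand, p_container, p_size);
  CREATE INDEX li_partkey_qty_ship_idx
      ON lineitem (l_partkey, l_quantity, l_shipmode);

  -- configuration 2: separate indexes
  CREATE INDEX li_partkey_idx ON lineitem (l_partkey);
  CREATE INDEX li_qty_ship_instr_idx
      ON lineitem (l_quantity, l_shipmode, l_shipinstruct);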

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

*

This e-mail may contain privileged or confidential material intended for the named 
recipient only.
If you are not the named recipient, delete this message and all attachments.
Unauthorized reviewing, copying, printing, disclosing, or otherwise using information 
in this e-mail is prohibited.
We reserve the right to monitor e-mail sent through our network. 

*

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Postgres low end processing.

2003-10-03 Thread Stef
On Fri, 03 Oct 2003 12:32:00 -0400
Tom Lane [EMAIL PROTECTED] wrote:

= What exactly is failing?  And what's the platform, anyway?

Nothing is really failing atm, except the funds for better 
hardware. JBOSS and some other servers need to be 
run on these machines, along with linux, which will be 
a minimal RH >= 7.2 with kernel 2.4.21
(Any better suggestions here?)

In this case, whatever is the least amount of memory
postgres can run on, is what is needed. So this is still
a kind of feasibility study. Of course, it will still be thoroughly
tested, if it turns out to be possible. (Which I know it is, but not how)

Regards
Stef


pgp0.pgp
Description: PGP signature


Re: [PERFORM] Speeding up Aggregates

2003-10-03 Thread Josh Berkus
Dror,

 select articlenumber, channel, description, title, link, dtstamp  from
   items, my_channels where items.channel = '2' and my_channels.id =
   '2' and owner = 'drormata'  and dtstamp > last_viewed and
   articlenumber not in (select item from viewed_items where channel
   ='2' and owner = 'drormata');

the NOT IN is a bad idea unless the subselect never returns more than a
handful of rows.  If viewed_items can grow to dozens of rows, you should
use WHERE NOT EXISTS instead.  Unless you're using 7.4.

 item_max_date() looks like this:
select max(dtstamp) from items where channel = $1 and link = $2;

Change it to 

SELECT dtstamp from items where channel = $1 and link = $2
ORDER BY dtstamp DESC LIMIT 1

and possibly build an index on channel, link, dtstamp
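
Spelled out, the two suggestions might look like this (a sketch against the
schema implied by the query above; untested):

  -- NOT EXISTS form (mainly a win before 7.4)
  select articlenumber, channel, description, title, link, dtstamp
    from items, my_channels
   where items.channel = '2' and my_channels.id = '2'
     and owner = 'drormata' and dtstamp > last_viewed
     and not exists (select 1 from viewed_items v
                      where v.item = items.articlenumber
                        and v.channel = '2' and v.owner = 'drormata');

  -- supporting index for the dtstamp lookup
  CREATE INDEX items_chan_link_dts_idx ON items (channel, link, dtstamp);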

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Speeding up Aggregates

2003-10-03 Thread Dror Matalon

Hi Josh,

On Fri, Oct 03, 2003 at 02:07:10PM -0700, Josh Berkus wrote:
 Dror,
 
  select articlenumber, channel, description, title, link, dtstamp  from
  items, my_channels where items.channel = '2' and my_channels.id =
  '2' and owner = 'drormata'  and dtstamp > last_viewed and
  articlenumber not in (select item from viewed_items where channel
  ='2' and owner = 'drormata');
 
 the NOT IN is a bad idea unless the subselect never returns more than a
 handful of rows.  If viewed_items can grow to dozens of rows, you should
 use WHERE NOT EXISTS instead.  Unless you're using 7.4.
 

I am using 7.4, and had tried NOT EXISTS and didn't see any
improvements.

  item_max_date() looks like this:
 select max(dtstamp) from items where channel = $1 and link = $2;
 
 Change it to 
 
 SELECT dtstamp from items where channel = $1 and link = $2
 ORDER BY dtstamp DESC LIMIT 1
 

Didn't make a difference. And plugging real values into this query as
well as into  the original 
  select max(dtstamp) from items where channel = $1 and link = $2;

and doing an explain analyze shows that the cost is the same. The
strange thing is that when I run the above queries by hand they take
about .5 msec. Yet on a resultset that fetches 5 rows, I go up from 15
msec to 300 msec. It would seem like it should be something like 15 +
(0.5 * 5) + small overhead, i.e. 30 msec or so, rather than the 300 I'm
seeing.

 and possibly build an index on channel, link, dtstamp

Didn't make a difference either. Explain analyze shows that it didn't
use it.

 
 -- 
 -Josh Berkus
  Aglio Database Solutions
  San Francisco
 

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Speeding up Aggregates

2003-10-03 Thread Josh Berkus
Dror,

 I am using 7.4, and had tried NOT EXISTS and didn't see any
 improvements.

It wouldn't if you're using 7.4, which has improved IN performance immensely.

What happens if you stop using a function and instead use a subselect?

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Speeding up Aggregates

2003-10-03 Thread Rod Taylor
 item_max_date() looks like this:
select max(dtstamp) from items where channel = $1 and link = $2;

It is too bad the (channel, link) index doesn't have dtstamp at the end
of it, otherwise the below query would be a gain (might be a small one
anyway).

  select dtstamp
from items
   where channel = $1
 and link = $2
ORDER BY dtstamp DESC
   LIMIT 1;


Could you show us the exact specification of the function?  In
particular, did you mark it VOLATILE, IMMUTABLE, or STABLE?

I hope it isn't the first or second one ;)


signature.asc
Description: This is a digitally signed message part


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Bruno Wolff III
On Fri, Oct 03, 2003 at 15:47:01 -0600,
  Rob Nagler [EMAIL PROTECTED] wrote:
  vacuum full does require exclusive lock, plain vacuum does not.
 
 I think I need full, because there are updates on the table.  As I
 understand it, an update in pg is an insert/delete, so it needs
 to be garbage collected.

Plain vacuum will mark the space used by deleted tuples as reusable.
Most of the time this is good enough and you don't need to run vacuum full.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Speeding up Aggregates

2003-10-03 Thread Dror Matalon
On Fri, Oct 03, 2003 at 05:44:49PM -0400, Rod Taylor wrote:
  item_max_date() looks like this:
 select max(dtstamp) from items where channel = $1 and link = $2;
 
 It is too bad the (channel, link) index doesn't have dtstamp at the end
 of it, otherwise the below query would be a gain (might be a small one
 anyway).
 
   select dtstamp
 from items
where channel = $1
  and link = $2
 ORDER BY dtstamp DESC
LIMIT 1;

Similar idea to what Josh suggested. I did create an additional index
with dtstamp at the end and it doesn't look like the planner used it.
Using the above query instead of max() didn't improve things either.

 
 
 Could you show us the exact specification of the function?  In
 particular, did you mark it VOLATILE, IMMUTABLE, or STABLE?
 
 I hope it isn't the first or second one ;)

CREATE or REPLACE FUNCTION item_max_date (int4, varchar) RETURNS
timestamptz AS '
select max(dtstamp) from items where channel = $1 and link = $2;
' LANGUAGE 'sql';




-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Speeding up Aggregates

2003-10-03 Thread Dror Matalon
On Fri, Oct 03, 2003 at 02:35:46PM -0700, Josh Berkus wrote:
 Dror,
 
  I am using 7.4, and had tried NOT EXISTS and didn't see any
  improvements.
 
 It wouldn't if you're using 7.4, which has improved IN performance immensely.
 
 What happens if you stop using a function and instead use a subselect?

An improvement. Now I'm getting response times in the 200 msec range.

And by the way, I tried NOT EXISTS again and it actually runs slower
than NOT IN.

 
 -- 
 -Josh Berkus
  Aglio Database Solutions
  San Francisco
 
 
 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Neil Conway
On Fri, 2003-10-03 at 17:47, Rob Nagler wrote:
 They don't deadlock normally,
 only with reindex and vacuum did I see this behavior.

If you can provide a reproducible example of a deadlock induced by
REINDEX + VACUUM, that would be interesting.

(FWIW, I remember noticing a potential deadlock in the REINDEX code and
posting to -hackers about it, but I've never seen it occur in a
real-world situation...)

 One other question: The reindex seems to lock the table for the entire
 process as opposed to freeing the lock between index rebuilds.

Yeah, I wouldn't be surprised if there is some room for optimizing the
locks that are acquired by REINDEX.

-Neil



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Speeding up Aggregates

2003-10-03 Thread Dror Matalon
On Fri, Oct 03, 2003 at 06:10:29PM -0400, Rod Taylor wrote:
 On Fri, 2003-10-03 at 17:53, Dror Matalon wrote:
  On Fri, Oct 03, 2003 at 05:44:49PM -0400, Rod Taylor wrote:
item_max_date() looks like this:
   select max(dtstamp) from items where channel = $1 and link = $2;
   
   It is too bad the (channel, link) index doesn't have dtstamp at the end
   of it, otherwise the below query would be a gain (might be a small one
   anyway).
   
 select dtstamp
   from items
  where channel = $1
and link = $2
   ORDER BY dtstamp DESC
  LIMIT 1;
 
 It didn't make a difference even with the 3 term index? I guess you
 don't have very many common values for channel / link combination.

There's no noticeable difference between two term and three term
indexes.

 
 
 
 How about the below? Note the word STABLE on the end.
 
 CREATE or REPLACE FUNCTION item_max_date (int4, varchar) RETURNS
 timestamptz AS '
 select max(dtstamp) from items where channel = $1 and link = $2;
 ' LANGUAGE 'sql' STABLE;

Made no difference.



-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Christopher Browne
[EMAIL PROTECTED] (Rob Nagler) writes:
 I've read some posts that says vacuum doesn't lock, but my experience
 today indicates the opposite.  It seemed that vacuum full analyze
 was locked waiting and so were other postmaster processes.  It
 appeared to be deadlock, because all were in WAITING state according
 to ps.  I let this go for about a 1/2 hour, and then killed the vacuum
 at which point all other processes completed normally.

VACUUM FULL certainly does lock.

See the man page:

   INPUTS
   FULL   Selects ``full'' vacuum, which may reclaim more space, but takes
  much longer and exclusively locks the table.

The usual answer is that you probably _didn't_ want to VACUUM FULL.

VACUUM ('no full') does NOT block updates.

 The same thing seemed to be happening with reindex on a table.  It
 seems that the reindex locks the table and some other resource which
 then causes deadlock with other active processes.

Not surprising either.  While the reindex takes place, updates to that
table have to be deferred.

 Another issue seems to be performance.  A reindex on some indexes is
 taking 12 minutes or so.  Vacuum seems to be slow, too.  Way longer
 than the time it takes to reimport the entire database (30 mins).

That seems a little surprising.

 In summary, I suspect that it is better from a UI perspective to
 bring down the app on Sat at 3 a.m and reimport with a fixed time
 period than to live through reindexing/vacuuming which may deadlock.
 Am I missing something?

Consider running pg_autovacuum, and thereby do a little bit of
vacuuming here and there all the time.  It DOESN'T block, so unless
your system is really busy, it shouldn't slow things down to a major
degree.
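
If pg_autovacuum isn't an option, a cron-driven sketch captures much of the
same idea (database name and schedule are hypothetical):

  # crontab entry: plain VACUUM ANALYZE every hour -- takes no exclusive locks
  0 * * * *  vacuumdb --analyze --quiet mydb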
-- 
cbbrowne,@,libertyrms.info
http://dev6.int.libertyrms.com/
Christopher Browne
(416) 646 3304 x124 (land)

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Matt Clark
  vacuum full does require exclusive lock, plain vacuum does not.

 I think I need full, because there are updates on the table.  As I
 understand it, an update in pg is an insert/delete, so it needs
 to be garbage collected.

Yes and no.  You only need a plain VACUUM that is run often enough to
recover space as fast as you need to grab it.  For heavily updated tables
run it often - I run it every 5 minutes on some tables.  A VACUUM FULL is
only needed if you haven't been running VACUUM often enough in the first
place.

 The description of vacuum full implies that it is required if the db
 is updated frequently.   This db gets about 1 txn a second, possibly
 more at peak load.

Assuming you mean 1 update/insert per second that is an absolutely _trivial_
load on any reasonable hardware.  You can do thousands of updates/second on
hardware costing less than $2000.  If you vacuum every hour then you will be
fine.

 IOW, vacuum+reindex is faster than dump+restore?  I didn't see this,
 then again, I had this locking problem, so the stats are distorted.

REINDEX also locks tables like VACUUM FULL.  Either is terribly slow, but
unless you turn off fsync during the restore it's unlikely to be slower than
dump and restore.

Matt


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Matt Clark
  In summary, I suspect that it is better from a UI perspective to
  bring down the app on Sat at 3 a.m and reimport with a fixed time
  period than to live through reindexing/vacuuming which may deadlock.
  Am I missing something?

 Consider running pg_autovacuum, and thereby do a little bit of
 vacuuming here and there all the time.  It DOESN'T block, so unless
 your system is really busy, it shouldn't slow things down to a major
 degree.

My real world experience on a *very* heavily updated OLTP type DB, following
advice from this list (thanks guys!), is that there is essentially zero cost
to going ahead and vacuuming as often as you feel like it.  Go crazy, and
speed up your DB!

OK, that's on a quad CPU box with goodish IO, so maybe there are issues on
very slow boxen, but in a heavy-update environment the advantages seem to
easily wipe out the costs.

Matt

p.s.  Sorry to sound like a Shake'n'Vac advert.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Speeding up Aggregates

2003-10-03 Thread Rod Taylor

  I hope it isn't the first or second one ;)
 
 CREATE or REPLACE FUNCTION item_max_date (int4, varchar) RETURNS
 timestamptz AS '
 select max(dtstamp) from items where channel = $1 and link = $2;
 ' LANGUAGE 'sql';


How about the below?

CREATE or REPLACE FUNCTION item_max_date (int4, varchar) RETURNS
timestamptz AS '
select max(dtstamp) from items where channel = $1 and link = $2;
' LANGUAGE 'sql' STABLE;


signature.asc
Description: This is a digitally signed message part


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Matt Clark
 Also, if you find that you need to run VACUUM FULL often, then
 you need to
 raise your max_fsm_pages.

Yes and no.  If it's run often enough then the number of tracked pages
shouldn't need to be raised, but then again...

...max_fsm_pages should be raised anyway.  I'm about to reclaim a Pentium
166 w/ 64MB of RAM from a friend I lent it to _many_ years ago, and I
suspect PG would run happily on it as configured by default.  Set it to at
least 50,000 I say.  What do you have to lose, I mean if they're not free
then they're not tracked in the FSM right?
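
In postgresql.conf that's a one-line change (the 50,000 figure is the
suggestion above, not a universal rule):

  max_fsm_pages = 50000
  max_fsm_relations = 1000    # often worth raising alongside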

Of course if anyone knows a reason _not_ to raise it then I'm all ears!

Matt



 --
 -Josh Berkus
  Aglio Database Solutions
  San Francisco


 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Neil Conway
On Fri, 2003-10-03 at 17:34, Christopher Browne wrote:
 Not surprising either.  While the reindex takes place, updates to that
 table have to be deferred.

Right, but that's no reason not to let SELECTs proceed, for example.
(Whether that would actually be *useful* is another question...)

-Neil



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Matt Clark
  Also, if you find that you need to run VACUUM FULL often, then
  you need to
  raise your max_fsm_pages.

 Yes and no.  If it's run often enough then the number of tracked pages
 shouldn't need to be raised, but then again...

Oops, sorry, didn't pay attention and missed the mention of FULL.  My bad,
ignore my OT useless response.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Tuning/performance issue...

2003-10-03 Thread Bruce Momjian

I have updated the FAQ to be:

  In comparison to MySQL or leaner database systems, we are
  faster for multiple users, complex queries, and a read/write query
  load.  MySQL is faster for SELECT queries done by a few users. 

Is this accurate?  It seems so.

---

Oleg Lebedev wrote:
 Jeff,
 I would really appreciate if you could send me that lengthy presentation
 that you've written on pg/other dbs comparison.
 Thanks.
 
 Oleg
 
 -Original Message-
 From: Jeff [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, October 01, 2003 6:23 AM
 To: David Griffiths
 Cc: [EMAIL PROTECTED]
 Subject: Re: [PERFORM] Tuning/performance issue...
 Importance: Low
 
 
 On Tue, 30 Sep 2003, David Griffiths wrote:
 
 
  This is all part of a migrate-away-from-Oracle project. We are
  looking at 3 databases - MySQL (InnoDB), Postgres and Matisse (object
  oriented). We have a lot of queries like this or worse, and I'm worried
  that many of them would need to be re-written. The developers know SQL,
  but nothing about tuning, etc.
 
 
 There's a movement at my company to ditch several commercial db's in
 favor of a free one.  I'm currently the big pg fan around here and I've
 actually written a rather lengthy presentation about pg features, why,
 tuning, etc. but another part was some comparisons to other db's..
 
 I decided, so I wouldn't be blindly flaming mysql, to give it a whirl and
 loaded it up with the same dataset as pg.  First thing I hit was lack of
 stored procedures.   But I decided to code around that, giving mysql the
 benefit of the doubt.  What I found was interesting.
 
 For 1-2 concurrent
 'beaters' it screamed. Ultra-fast.  But if you increase the concurrent
 beaters up to, say, 20, Mysql comes to a grinding halt; Mysql and the
 machine itself become fairly unresponsive.  And if you do cache-unfriendly
 queries it becomes even worse.   On PG - no problems at all. Scaled fine
 and dandy up.  And with 40 concurrent beaters the machine was still
 responsive.  (The numbers for 20 clients were 220 seconds (pg) and 650
 seconds (mysql).)
 
 So that is another test to try out - Given your configuration I expect
 you have lots of concurrent activity.
 
 --
 Jeff Trout [EMAIL PROTECTED]
 http://www.jefftrout.com/
 http://www.stuarthamm.net/
 
 
 
 ---(end of broadcast)---
 TIP 9: the planner will ignore your desire to choose an index scan if
 your
   joining column's datatypes do not match
 
 *
 
 This e-mail may contain privileged or confidential material intended for the named 
 recipient only.
 If you are not the named recipient, delete this message and all attachments.
 Unauthorized reviewing, copying, printing, disclosing, or otherwise using 
 information in this e-mail is prohibited.
 We reserve the right to monitor e-mail sent through our network. 
 
 *
 
 ---(end of broadcast)---
 TIP 8: explain analyze is your friend
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Tuning/performance issue...

2003-10-03 Thread Rod Taylor
On Fri, 2003-10-03 at 21:39, Bruce Momjian wrote:
 I have updated the FAQ to be:
 
   In comparison to MySQL or leaner database systems, we are
   faster for multiple users, complex queries, and a read/write query
   load.  MySQL is faster for SELECT queries done by a few users. 
 
 Is this accurate?  It seems so.

May wish to say ... for simple SELECT queries ...

Several left outer joins, subselects and a large number of joins are
regularly performed faster in PostgreSQL due to a more mature optimizer.

But MySQL can pump out SELECT * FROM table WHERE key = value; queries in
a hurry.


I've often wondered if they win on those because they have a lighter
weight parser / optimizer with fewer "let's try simplifying this query"
steps, or if the MYISAM storage mechanism is simply quicker at pulling
data off the disk.



signature.asc
Description: This is a digitally signed message part


Re: [PERFORM] Tuning/performance issue...

2003-10-03 Thread Bruce Momjian
Rod Taylor wrote:
-- Start of PGP signed section.
 On Fri, 2003-10-03 at 21:39, Bruce Momjian wrote:
  I have updated the FAQ to be:
  
In comparison to MySQL or leaner database systems, we are
faster for multiple users, complex queries, and a read/write query
load.  MySQL is faster for SELECT queries done by a few users. 
  
  Is this accurate?  It seems so.
 
 May wish to say ... for simple SELECT queries ...

Updated.

 Several left outer joins, subselects and a large number of joins are
 regularly performed faster in PostgreSQL due to a more mature optimizer.
 
 But MySQL can pump out SELECT * FROM table WHERE key = value; queries in
 a hurry.
 
 
 I've often wondered if they win on those because they have a lighter
 weight parser / optimizer with fewer "let's try simplifying this query"

I think that is part of it.

 steps or if the MYISAM storage mechanism is simply quicker at pulling
 data off the disk.

And their heap is indexed by myisam, right?  I know with Ingres that ISAM
was usually faster than btree because you didn't have all those leaves
to traverse to get to the data.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Tuning/performance issue...

2003-10-03 Thread Christopher Browne
Centuries ago, Nostradamus foresaw when [EMAIL PROTECTED] (Bruce Momjian) would write:
 I have updated the FAQ to be:

   In comparison to MySQL or leaner database systems, we are
   faster for multiple users, complex queries, and a read/write query
   load.  MySQL is faster for SELECT queries done by a few users. 

 Is this accurate?  It seems so.

I would think it more accurate if you use the phrase faster for
simple SELECT queries.

MySQL uses a rule-based optimizer which, when the data fits the rules
well, can pump queries through lickety-split without any appreciable
pause for evaluation (or reflection :-).  That's _quite_ a successful
strategy when users are doing what loosely amounts to evaluating
association tables.

select * from table where key = value;

Which is just like tying a Perl variable to a hash table, and doing
   $value = $TABLE{$key};

In web applications where they wanted something a _little_ more
structured than hash tables, that may 'hit the spot.'

Anything hairier than that gets, of course, hairier.  If you want
something that's TRULY more structured, you may lose a lot of hair
:-).
-- 
output = reverse(gro.gultn @ enworbbc)
http://www.ntlug.org/~cbbrowne/oses.html
If you want to talk with some experts about something, go to the bar
where they hang out, buy a round of beers, and they'll surely talk
your ear off, leaving you wiser than before.

If you, a stranger, show up at the bar, walk up to the table, and ask
them to fax you a position paper, they'll tell you to call their
office in the morning and ask for a rate sheet. -- Miguel Cruz

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 On Fri, 2003-10-03 at 17:34, Christopher Browne wrote:
 Not surprising either.  While the reindex takes place, updates to that
 table have to be deferred.

 Right, but that's no reason not to let SELECTs proceed, for example.

What if said SELECTs are using the index in question?

I suspect it is true that REINDEX locks more than it needs to, but we
should tread carefully about loosening it.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster