Re: [PERFORM] bad performance on Solaris 10

2006-04-04 Thread Josh Berkus
Mark,

 I suspect that making a *separate* filesystem for the pg_xlog directory
 and mounting that logging + forcedirectio would be a nice way to also
 get performance while keeping the advantages of logging + file
 buffercache for the *rest* of the postgres components.
 Cheers

Yes, we tested this.  It makes a huge difference in WAL speed.
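For anyone wanting to try it, the setup amounts to a dedicated UFS filesystem mounted with logging + forcedirectio, with $PGDATA/pg_xlog symlinked onto it while the postmaster is stopped. A sketch of the /etc/vfstab entry (device names and mount point are illustrative assumptions, not from this thread):

```
# hypothetical /etc/vfstab entry: dedicated slice just for pg_xlog,
# mounted with logging + forcedirectio so WAL writes bypass the
# filesystem buffer cache while the rest of PGDATA keeps using it
/dev/dsk/c0t1d0s0  /dev/rdsk/c0t1d0s0  /pg_xlog  ufs  2  yes  logging,forcedirectio
```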

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] freebsd/softupdates for data dir

2006-04-04 Thread Vivek Khera


On Apr 3, 2006, at 10:10 PM, Mark Kirkwood wrote:

I've always left them on, and never had any issues...(even after  
unscheduled power loss - which happened here yesterday). As I  
understand it, the softupdate code reorders *metadata* operations,  
and does not alter data operations - so the effect of fsync(2) on a  
preexisting file is not changed by softupdates being on or off.


This is also my understanding, and I also leave softupdates on for  
the data partition.  Even if it doesn't improve performance, it will  
not reduce it, and otherwise does no harm with respect to postgres'  
disk usage.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[PERFORM] vacuum full seems to hang on very small table

2006-04-04 Thread Dan Harris
I have a table with 1 live row that I found has 115000 dead rows in it ( 
from a testing run ).  I'm trying to VACUUM FULL the table and it has 
run for over 18 hours without completion.  Considering the hardware on 
this box and the fact that performance seems reasonable in all other 
aspects, I'm confused as to why this would happen.  The database other 
than this table is quite large ( 70 gigs on disk ) and I would expect to 
take days to complete but I just did 'vacuum full table_stats'.  That 
should only do that table, correct?  I'm running 8.0.3.


       Table "public.table_stats"
       Column        |            Type             | Modifiers
---------------------+-----------------------------+-----------
 count_cfs           | integer                     |
 count_ncfs          | integer                     |
 count_unitactivity  | integer                     |
 count_eventactivity | integer                     |
 min_eventmain       | timestamp without time zone |
 max_eventmain       | timestamp without time zone |
 min_eventactivity   | timestamp without time zone |
 max_eventactivity   | timestamp without time zone |
 geocoding_hitrate   | double precision            |
 recent_load         | timestamp without time zone |
 count_eventmain     | integer                     |


This is the table structure.

Any ideas where to begin troubleshooting this?

Thanks.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] vacuum full seems to hang on very small table

2006-04-04 Thread Brad Nicholson
Dan Harris wrote:
 I have a table with 1 live row that I found has 115000 dead rows in it (
 from a testing run ).  I'm trying to VACUUM FULL the table and it has
 run for over 18 hours without completion.  Considering the hardware on
 this box and the fact that performance seems reasonable in all other
 aspects, I'm confused as to why this would happen.  The database other
 than this table is quite large ( 70 gigs on disk ) and I would expect to
 take days to complete but I just did 'vacuum full table_stats'.  That
 should only do that table, correct?  I'm running 8.0.3.

VACUUM FULL requires an exclusive lock on the table that it's vacuuming.
 Chances are something else holds a lock on the table and is blocking the
vacuum from obtaining the necessary lock.  Check pg_locks for ungranted
locks; you'll probably find that the request from the vacuum is ungranted.
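A quick way to check, sketched against the 8.0-era catalogs (table name from the thread; adjust the schema if table_stats isn't in "public"):

```sql
-- Show lock requests against table_stats; an ungranted (granted = f)
-- VACUUM FULL request waiting behind another pid is the usual culprit.
SELECT l.pid, l.mode, l.granted
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
WHERE c.relname = 'table_stats';
```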
-- 
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] vacuum full seems to hang on very small table

2006-04-04 Thread Simon Riggs
On Tue, 2006-04-04 at 08:59 -0600, Dan Harris wrote:
 I have a table with 1 live row that I found has 115000 dead rows in it ( 
 from a testing run ).  I'm trying to VACUUM FULL the table and it has 
 run for over 18 hours without completion.  Considering the hardware on 
 this box and the fact that performance seems reasonable in all other 
 aspects, I'm confused as to why this would happen.  The database other 
 than this table is quite large ( 70 gigs on disk ) and I would expect to 
 take days to complete but I just did 'vacuum full table_stats'.  That 
 should only do that table, correct?  I'm running 8.0.3.

Read this http://www.postgresql.org/docs/8.0/static/release-8-0-5.html
and you'll probably decide to upgrade.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] The order of fields around the = in the WHERE

2006-04-04 Thread Mike Quinn
The datatype of the join columns is a user defined type and there are no
commutators defined. I will fix that and retest. Thanks for the
insight.
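For the archives, declaring a commutator is done when the operator is created. A minimal sketch, where the type name "mytype" and support function "mytype_eq" are placeholders, not names from this thread:

```sql
-- Declaring = as its own commutator lets the planner rewrite
-- "a.col = b.col" as "b.col = a.col", so field order around the =
-- no longer constrains the available join plans.
CREATE OPERATOR = (
    LEFTARG    = mytype,
    RIGHTARG   = mytype,
    PROCEDURE  = mytype_eq,
    COMMUTATOR = =
);
```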

Mike Quinn

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] Query runs too long for indexed tables

2006-04-04 Thread Andrus
I have relatively small tables (toode and rid) in fast server.
Both tables are indexed on toode field.

Following query takes long time to run.
toode field type is char(20). It is difficult to change this field type.

Any idea how to speed up this query ?

UPDATE firma1.rid SET toode=NULL
   WHERE toode IS NOT NULL AND
   toode NOT IN (SELECT TOODE  FROM firma1.TOODE);

Query returned successfully: 0 rows affected, 594813 ms execution time.

explain window shows:

Seq Scan on rid  (cost=2581.07..20862553.77 rows=51848 width=1207)
  Filter: ((toode IS NOT NULL) AND (NOT (subplan)))
  SubPlan
    ->  Materialize  (cost=2581.07..2944.41 rows=14734 width=84)
          ->  Seq Scan on toode  (cost=0.00..2350.34 rows=14734 width=84)


Andrus. 



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Query runs too long for indexed tables

2006-04-04 Thread Scott Marlowe
On Tue, 2006-04-04 at 14:37, Andrus wrote:
 I have relatively small tables (toode and rid) in fast server.
 Both tables are indexed on toode field.
 
 Following query takes long time to run.
 toode field type is char(20). It is difficult to change this field type.
 
 Any idea how to speed up this query ?
 
 UPDATE firma1.rid SET toode=NULL
WHERE toode IS NOT NULL AND
toode NOT IN (SELECT TOODE  FROM firma1.TOODE);
 
 Query returned successfully: 0 rows affected, 594813 ms execution time.
 
 explain window shows:
 
 Seq Scan on rid  (cost=2581.07..20862553.77 rows=51848 width=1207)
   Filter: ((toode IS NOT NULL) AND (NOT (subplan)))
   SubPlan
     ->  Materialize  (cost=2581.07..2944.41 rows=14734 width=84)
           ->  Seq Scan on toode  (cost=0.00..2350.34 rows=14734 width=84)

Let me guess, you've updated it a lot and aren't familiar with Vacuum?

run a vacuum full on your database.  schedule a vacuum (plain one) to
run every so often (hours or days are a good interval for most folks)

If that's NOT your problem, then please, let us know.  
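Concretely, that advice amounts to something like the following (a sketch; run it against the affected database):

```sql
-- One-time cleanup: rewrites tables to reclaim dead-row space.
-- Note VACUUM FULL takes an exclusive lock on each table it processes.
VACUUM FULL;
ANALYZE;          -- refresh planner statistics afterwards

-- Then routinely (e.g. from cron, every few hours or nightly):
VACUUM ANALYZE;   -- plain vacuum, no exclusive lock needed
```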

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Query runs too long for indexed tables

2006-04-04 Thread Marc Morin
Wondering if

Update firma1.rid set toode=null where toode is not null and not
exists(select 1 from firma1.toode where toode=rid.toode); 

Would be faster... Problem appears to be the seqscan of seqscan... No?

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Scott Marlowe
 Sent: Tuesday, April 04, 2006 3:49 PM
 To: Andrus
 Cc: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Query runs too long for indexed tables
 
 On Tue, 2006-04-04 at 14:37, Andrus wrote:
  I have relatively small tables (toode and rid) in fast server.
  Both tables are indexed on toode field.
  
  Following query takes long time to run.
  toode field type is char(20). It is difficult to change 
 this field type.
  
  Any idea how to speed up this query ?
  
  UPDATE firma1.rid SET toode=NULL
 WHERE toode IS NOT NULL AND
 toode NOT IN (SELECT TOODE  FROM firma1.TOODE);
  
  Query returned successfully: 0 rows affected, 594813 ms 
 execution time.
  
  explain window shows:
  
  Seq Scan on rid  (cost=2581.07..20862553.77 rows=51848 width=1207)
    Filter: ((toode IS NOT NULL) AND (NOT (subplan)))
    SubPlan
      ->  Materialize  (cost=2581.07..2944.41 rows=14734 width=84)
            ->  Seq Scan on toode  (cost=0.00..2350.34 rows=14734 width=84)
 
 Let me guess, you've updated it a lot and aren't familiar with Vacuum?
 
 run a vacuum full on your database.  schedule a vacuum (plain 
 one) to run every so often (hours or days are a good interval 
 for most folks)
 
 If that's NOT your problem, then please, let us know.  
 

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Query runs too long for indexed tables

2006-04-04 Thread Marc Morin
Explain analyze would be nice ;-) 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Andrus
 Sent: Tuesday, April 04, 2006 3:37 PM
 To: pgsql-performance@postgresql.org
 Subject: [PERFORM] Query runs too long for indexed tables
 
 I have relatively small tables (toode and rid) in fast server.
 Both tables are indexed on toode field.
 
 Following query takes long time to run.
 toode field type is char(20). It is difficult to change this 
 field type.
 
 Any idea how to speed up this query ?
 
 UPDATE firma1.rid SET toode=NULL
WHERE toode IS NOT NULL AND
toode NOT IN (SELECT TOODE  FROM firma1.TOODE);
 
 Query returned successfully: 0 rows affected, 594813 ms 
 execution time.
 
 explain window shows:
 
 Seq Scan on rid  (cost=2581.07..20862553.77 rows=51848 width=1207)
   Filter: ((toode IS NOT NULL) AND (NOT (subplan)))
   SubPlan
     ->  Materialize  (cost=2581.07..2944.41 rows=14734 width=84)
           ->  Seq Scan on toode  (cost=0.00..2350.34 rows=14734 width=84)
 
 
 Andrus. 
 
 
 

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-04 Thread Jim Nasby

On Apr 2, 2006, at 6:30 PM, Josh Berkus wrote:

But just as a follow up question to your #1 suggestion, I have 8 GB
of ram in my production server. You're saying to set the
effective_cache_size then to 5 GB roughly? Somewhere around 655360?
Currently it is set to 65535. Is that something that's OS dependent?
I'm not sure how much memory my server sets aside for disk caching.


Yes, about.  It's really a judgement call; you're looking for the approximate
combined RAM available for disk caching and shared mem.  However, this is
just used as a way of estimating the probability that the data you want is
cached in memory, so you're just trying to be order-of-magnitude accurate,
not to-the-MB accurate.


FWIW, I typically set effective_cache_size to the amount of memory in the
machine minus 1G for the OS and various other daemons, etc. But as Josh
said, as long as you're somewhere in the ballpark it's probably good enough.
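Worked out for the 8 GB box in question, that rule of thumb gives something like this postgresql.conf fragment (the numbers are illustrative, not from the thread):

```
# 8 GB total, minus ~1 GB for OS and other daemons => ~7 GB likely
# available for OS disk cache + shared buffers.  In 8.0/8.1 the
# setting is in 8 kB pages: 7 GB = 7340032 kB; 7340032 / 8 = 917504.
effective_cache_size = 917504
```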

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-04 Thread Jim Nasby

On Apr 1, 2006, at 12:51 PM, Brendan Duddridge wrote:
from SELECT * FROM pg_stats WHERE tablename='table' AND  
attname='category_id'


I find correlation on category_product for category_id is 0.643703

Would setting the index on category_id to be clustered help with this?


It would absolutely help on the query in question. In my experience,  
a correlation of 0.64 is too low to allow an index scan to be used  
for anything but a tiny number of rows.
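Clustering the table on that index would look something like the following in 8.1 (the index name is an assumption; substitute your actual index):

```sql
-- Physically reorder category_product by the category_id index,
-- pushing correlation toward 1.0 so index scans touch far fewer pages.
-- CLUSTER takes an exclusive lock, and the ordering is not maintained
-- automatically as new rows arrive, so it may need periodic re-runs.
CLUSTER category_product_category_id_idx ON category_product;
ANALYZE category_product;   -- refresh stats, including correlation
```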

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] The order of fields around the = in the WHERE

2006-04-04 Thread Mike Quinn
                                 version
--------------------------------------------------------------------------
 PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.6
(1 row)

-- After commutator added to operators of user defined type,
-- the order of fields around the = in WHERE conditions
-- no longer affect the query plan.

-- previously the worst way --

EXPLAIN ANALYZE
SELECT
Locts.id,
Commtypes.name
FROM
Growers
,
Locts
,
Crops
,
Commtypes
WHERE
Growers.id = '0401606'
AND
-- Commtypes.number = Crops.Commtype
Crops.Commtype = Commtypes.number
AND
Locts.number = Crops.Loct
-- Crops.Loct = Locts.number
AND
Growers.number = Locts.Grower
-- Locts.Grower = Growers.number
;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..17467.00 rows=954 width=20) (actual time=0.253..1.155 rows=69 loops=1)
   ->  Nested Loop  (cost=0.00..12413.93 rows=1202 width=18) (actual time=0.191..0.472 rows=69 loops=1)
         ->  Nested Loop  (cost=0.00..32.51 rows=104 width=18) (actual time=0.142..0.171 rows=9 loops=1)
               ->  Index Scan using growers_id on growers  (cost=0.00..3.05 rows=4 width=12) (actual time=0.065..0.067 rows=1 loops=1)
                     Index Cond: ((id)::text = '0401606'::text)
               ->  Index Scan using locts_grower on locts  (cost=0.00..6.23 rows=91 width=30) (actual time=0.070..0.085 rows=9 loops=1)
                     Index Cond: ("outer".number = locts.grower)
         ->  Index Scan using crops_loct on crops  (cost=0.00..118.53 rows=42 width=24) (actual time=0.011..0.021 rows=8 loops=9)
               Index Cond: ("outer".number = crops.loct)
   ->  Index Scan using commtypes_number_key on commtypes  (cost=0.00..4.19 rows=1 width=26) (actual time=0.006..0.007 rows=1 loops=69)
         Index Cond: ("outer".commtype = commtypes.number)
 Total runtime: 1.299 ms
(12 rows)

-- previously the best way --

EXPLAIN ANALYZE
SELECT
Locts.id,
Commtypes.name
FROM
Growers
,
Locts
,
Crops
,
Commtypes
WHERE
Growers.id = '0401606'
AND
Commtypes.number = Crops.Commtype
-- Crops.Commtype = Commtypes.number
AND
-- Locts.number = Crops.Loct
Crops.Loct = Locts.number
AND
-- Growers.number = Locts.Grower
Locts.Grower = Growers.number
;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..17467.00 rows=954 width=20) (actual time=0.063..0.947 rows=69 loops=1)
   ->  Nested Loop  (cost=0.00..12413.93 rows=1202 width=18) (actual time=0.050..0.318 rows=69 loops=1)
         ->  Nested Loop  (cost=0.00..32.51 rows=104 width=18) (actual time=0.036..0.064 rows=9 loops=1)
               ->  Index Scan using growers_id on growers  (cost=0.00..3.05 rows=4 width=12) (actual time=0.018..0.020 rows=1 loops=1)
                     Index Cond: ((id)::text = '0401606'::text)
               ->  Index Scan using locts_grower on locts  (cost=0.00..6.23 rows=91 width=30) (actual time=0.012..0.023 rows=9 loops=1)
                     Index Cond: (locts.grower = "outer".number)
         ->  Index Scan using crops_loct on crops  (cost=0.00..118.53 rows=42 width=24) (actual time=0.007..0.018 rows=8 loops=9)
               Index Cond: (crops.loct = "outer".number)
   ->  Index Scan using commtypes_number_key on commtypes  (cost=0.00..4.19 rows=1 width=26) (actual time=0.005..0.006 rows=1 loops=69)
         Index Cond: (commtypes.number = "outer".commtype)
 Total runtime: 1.091 ms
(12 rows)



 Mike Quinn [EMAIL PROTECTED] 4/4/06 10:18:30 AM 
The datatype of the join columns is a user defined type and there are
no
commutators defined. I will fix that and retest. Thanks for the
insight.

Mike Quinn


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster