[ADMIN] performance problem
The problem is solved... once I reverted back to 9.1, things seem to be much better. I also revamped a lot of my code to optimize it for speed; I will post the results after I complete a sizable chunk of my data. The IN also seems to be working much better now. Thanks.
[ADMIN] performance problem
Ruud, thanks for your reply. I am narrowing down on the problem: I found that SuSE 9.2 has a lot of memory-related problems, so I reverted back to 9.1 and am rebuilding the app. I will post an update. Your suggestion will not work for me, though: I need to check existing records and create new records, which cannot be done in one table, unless you are going through hoops and doing a lot of text processing, at which point why do you need a database? :-)

Also, here is something interesting that I found regarding IN vs. = on a btree index:

    select col1, col2 from table1 where col1 in (1, 2, 3);   -- comes right back
    select col1, col2 from table2 where col1 = 1;            -- comes right back
    select col1, col2 from table2
      where col1 IN (select col1 from table2 where col1 between 1 and 2);  -- takes a while

which tells me IN is still not optimized... or am I missing something?

Lastly, please post to the forum and not directly to my email, so the community can see the whole picture.

Thanks,
Sanjay
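One way to see where the time actually goes is to compare the plans side by side; a minimal sketch, reusing the table and column names from the post above:

    EXPLAIN ANALYZE SELECT col1, col2 FROM table2 WHERE col1 = 1;

    EXPLAIN ANALYZE SELECT col1, col2 FROM table2
    WHERE col1 IN (SELECT col1 FROM table2 WHERE col1 BETWEEN 1 AND 2);

    -- A join against the deduplicated subquery sometimes gets a better plan
    -- than IN (subquery) on planners of this era:
    EXPLAIN ANALYZE SELECT t.col1, t.col2
    FROM table2 t
    JOIN (SELECT DISTINCT col1 FROM table2
          WHERE col1 BETWEEN 1 AND 2) AS s ON s.col1 = t.col1;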
[ADMIN] performance problem
Looks like reducing shared_buffers and turning off archiving might be helping... I will post a follow-up in a few days. Kind of disappointed that no one has bothered to reply yet, though.
[ADMIN] performance problem
I am having poor performance with Postgres 8.0.0. The config of my machine: single Xeon 3.4 GHz, 1 GB of memory, two IDE drives mounted using LVM with the JFS file system, on SuSE 9.2 Professional.

What I am trying to do is simple... at least in Oracle or DB2, which I am trying to stay away from. I have a bunch of records that I need to load: about 30 million every month, for the last 18 months. Each month comes in four parts of about 7 million. So I have a stage table that I load, check for dups against, and then load into the prod tables. I tried several ways, all using PL/pgSQL; details are below. What is it that I am doing wrong?

With method one, it finishes in around 8 hrs for 7.5 million records. With method two, it processes 5000 records in 5 mins, but the time increases with each pass: first pass 1 min, second pass 2.5 mins, and so on. Any input is greatly appreciated, other than that I need to upgrade the machine, which I am in the process of doing.

1. Tried to reduce the 7 million into 'manageable chunks' of about 500,000, 100,000 or even 5000, committing in between to keep transactions small. A sample SQL that I use:

    /* copy existing records with updates to history */
    INSERT INTO sms.sms_src_history (
        smsshist_id_n, smsshist_phone_s, status_cd_s, smsshist_effective_date_s,
        smsshist_effective_time_s, resporg_cd_s, dataset_cd_s, smsshist_phone_n,
        smsshist_npa_n, smsshist_nxx_n, smsshist_line_n, smsshist_create_dt,
        smsshist_create_user_s, smsshist_effective_dt
    )
    SELECT nextval('sms.sms_src_history_seq'), smsscurr_phone_s, status_cd_s,
           smsscurr_effective_date_s, smsscurr_effective_time_s, resporg_cd_s,
           dataset_cd_s, smsscurr_phone_n, smsscurr_npa_n, smsscurr_nxx_n,
           smsscurr_line_n, LOCALTIMESTAMP, 'PRCURR_LOAD', smsscurr_effective_dt
    FROM sms.sms_src_current
    WHERE smsscurr_phone_s IN (
        SELECT phone_s FROM (
            SELECT trim(phone_s) AS phone_s, trim(status_cd_s) AS status_cd_s,
                   trim(effective_date_s) AS effective_date_s,
                   trim(effective_time_s) AS effective_time_s,
                   trim(resporg_cd_s) AS resporg_cd_s
            FROM sms_stage.stg_sms_src_current
            WHERE sssc_id_n BETWEEN begin_rec AND end_rec
            EXCEPT
            SELECT smsscurr_phone_s, status_cd_s, smsscurr_effective_date_s,
                   smsscurr_effective_time_s, resporg_cd_s
            FROM sms.sms_src_current
        ) AS in_values
        EXCEPT (
            SELECT trim(phone_s)
            FROM sms_stage.stg_sms_src_current
            WHERE sssc_id_n BETWEEN begin_rec AND end_rec
            EXCEPT
            SELECT smsscurr_phone_s FROM sms.sms_src_current
        )
    )

2. Declare a cursor and go through the records one by one. This one I tried only 5000 at a time... still VERY slow.

    stg_curs CURSOR FOR
        SELECT sssc_id_n,
               trim(phone_s) AS phone_s,
               trim(status_cd_s) AS status_cd_s,
               trim(effective_date_s) AS effective_date_s,
               trim(effective_time_s) AS effective_time_s,
               trim(resporg_cd_s) AS resporg_cd_s,
               trim(dataset_cd_s) AS dataset_cd_s
        FROM sms_stage.stg_sms_src_current
        WHERE sssc_id_n BETWEEN begin_rec AND end_rec

    prod_curs CURSOR FOR
        SELECT * FROM sms.sms_src_current
        WHERE smsscurr_phone_n = to_number(substr(stg_rec.phone_s,1,3)
                                        || substr(stg_rec.phone_s,5,3)
                                        || substr(stg_rec.phone_s,9,4), '99')
        ORDER BY smsscurr_id_n DESC

    etc. etc...

3. The portion of the shell script that calls the above function:

    echo Getting Record Count At: `date`
    recnum=`psql -t -c "SELECT COALESCE(MAX(sssc_id_n),0) FROM sms_stage.stg_sms_src_current" sms`
    echo Done Getting Record Count At: `date`
    if [ $recnum -eq 0 ]
    then
        echo No Source Records To Process In Stage
    else
        echo Processing Source Data Started At: `date`
        for (( begin_rec = 1, end_rec = increment ; end_rec <= recnum ; begin_rec = end_rec + 1, end_rec = end_rec + increment ))
        do
            echo
            echo Calling process_src_data At: `date`
            echo With
            echo begin_rec: $begin_rec
            echo end_rec: $end_rec
            psql -e -v precs=$begin_rec,$end_rec -f pg_process_src_data.sh sms
            echo Completed process_src_data At: `date`
    etc.

4. The following is the start of my conf file:

    #---------------------------------------------------------------
    external_pid_file = 'postmaster'    # write an extra pid file
    #---------------------------------------------------------------
    # CONNECTIONS AND AUTHENTICATION
    #---------------------------------------------------------------
    # - Connection Settings -
    #listen_addresses = 'localhost'     # what IP interface(s) to
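For what it's worth, the nested EXCEPTs can usually be replaced by a correlated EXISTS test, which lets the planner probe an index on the phone column instead of materializing several set differences per chunk. A rough sketch of the shape only, not the poster's exact logic, and it assumes an index on sms_src_current(smsscurr_phone_s); NULL handling in the comparisons is glossed over:

    /* sketch: copy to history the current rows whose staged counterpart
       (same phone, id range begin_rec..end_rec) has changed attributes */
    INSERT INTO sms.sms_src_history (
        smsshist_id_n, smsshist_phone_s, status_cd_s, smsshist_effective_date_s,
        smsshist_effective_time_s, resporg_cd_s, dataset_cd_s, smsshist_phone_n,
        smsshist_npa_n, smsshist_nxx_n, smsshist_line_n, smsshist_create_dt,
        smsshist_create_user_s, smsshist_effective_dt
    )
    SELECT nextval('sms.sms_src_history_seq'), c.smsscurr_phone_s, c.status_cd_s,
           c.smsscurr_effective_date_s, c.smsscurr_effective_time_s, c.resporg_cd_s,
           c.dataset_cd_s, c.smsscurr_phone_n, c.smsscurr_npa_n, c.smsscurr_nxx_n,
           c.smsscurr_line_n, LOCALTIMESTAMP, 'PRCURR_LOAD', c.smsscurr_effective_dt
    FROM sms.sms_src_current c
    WHERE EXISTS (
        SELECT 1 FROM sms_stage.stg_sms_src_current s
        WHERE s.sssc_id_n BETWEEN begin_rec AND end_rec
          AND trim(s.phone_s) = c.smsscurr_phone_s
          AND (trim(s.status_cd_s)      <> c.status_cd_s
            OR trim(s.effective_date_s) <> c.smsscurr_effective_date_s
            OR trim(s.effective_time_s) <> c.smsscurr_effective_time_s
            OR trim(s.resporg_cd_s)     <> c.resporg_cd_s));

With the index in place this form typically does one index probe per staged row rather than full-table set arithmetic for every chunk.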
Re: [ADMIN] Performance problem...
On Tuesday, 15 March 2005 at 18:00, Scott Marlowe wrote: [...]

OK, now I know why my query lasts so long:

    SELECT DISTINCT t1.Id, t1.IdTypNazwa, t1.IdFirma, t1.Nazwa, t1.NazwaAscii,
           t1.MskNazwa, t3.Id, t3.numer, t3.MskNumer, t4.Id, t4.numer, t4.MskNumer,
           t5.Id, t5.numer, t5.MskNumer, t6.Id, t6.numer, t6.MskNumer,
           t7.Id, t7.numer, t7.MskNumer, t8.Id, t8.numer, t8.MskNumer,
           t9.Id, t9.numer, t9.MskNumer, t10.Id, t10.IdTypFormaPrawna,
           t10.MskFormaPrawna, t11.slowo
    FROM firmy.adres AS n
    INNER JOIN firmy.nazwa AS t1 ON t1.IdFirma = n.IdPodmiot AND t1.IdTypNazwa = 153 AND t1.Historia = 0
    LEFT JOIN firmy.numer AS t3 ON t3.IdPodmiot = t1.IdFirma AND t3.IdTypNumer = 156 AND t3.Historia = 0
    LEFT JOIN firmy.numer AS t4 ON t4.IdPodmiot = t1.IdFirma AND t4.IdTypNumer = 160 AND t4.Historia = 0
    LEFT JOIN firmy.numer AS t5 ON t5.IdPodmiot = t1.IdFirma AND t5.IdTypNumer = 155 AND t5.Historia = 0
    LEFT JOIN firmy.numer AS t6 ON t6.IdPodmiot = t1.IdFirma AND t6.IdTypNumer = 627 AND t6.Historia = 0
    LEFT JOIN firmy.numer AS t7 ON t7.IdPodmiot = t1.IdFirma AND t7.IdTypNumer = 848 AND t7.Historia = 0
    LEFT JOIN firmy.numer AS t8 ON t8.IdPodmiot = t1.IdFirma AND t8.IdTypNumer = 763 AND t8.Historia = 0
    LEFT JOIN firmy.numer AS t9 ON t9.IdPodmiot = t1.IdFirma AND t9.IdTypNumer = 762 AND t9.Historia = 0
    LEFT JOIN firmy.formaprawna AS t10 ON t10.IdFirma = t1.IdFirma
    LEFT JOIN slowniki.tslownik AS t11 ON t11.idslownik = t10.IdTypFormaPrawna AND t11.idjezyktyp = 2
    WHERE n.Miasto ILIKE 'warszawa%'
    LIMIT 25

...because of DISTINCT :D With DISTINCT it takes about half a minute to find several records; without DISTINCT the query takes 10 ms ;) (of course a few doubled records occur). On the other hand, the doubled records are annoying. Is it possible to avoid them without changing the data in the tables? Maybe some other query?

BR, Marcin
Re: [ADMIN] Performance problem...
On Wed, 2005-03-16 at 07:46, Marcin Giedz wrote:
> OK, now I know why my query lasts so long:
> SELECT DISTINCT t1.Id, t1.IdTypNazwa, ... [query snipped; see above]
> ...because of DISTINCT :D With DISTINCT it takes about half a minute to find several records; without DISTINCT the query takes 10 ms, but a few doubled records occur. Is it possible to avoid them without changing the data in the tables? Maybe some other query?

Can you enclose the whole query in something like:

    select distinct * from (your query here) as a

and see if that helps?
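Concretely, the suggested rewrite looks like this against Marcin's query (frame only; the select list and joins stay exactly as posted). Note that the LIMIT probably wants to move outside the wrapper, so DISTINCT filters the duplicates before the first 25 rows are taken:

    SELECT DISTINCT *
    FROM (
        SELECT t1.Id, t1.IdTypNazwa, /* ... same select list as above ... */ t11.slowo
        FROM firmy.adres AS n
        INNER JOIN firmy.nazwa AS t1
                ON t1.IdFirma = n.IdPodmiot AND t1.IdTypNazwa = 153 AND t1.Historia = 0
        /* ... same LEFT JOINs as above ... */
        WHERE n.Miasto ILIKE 'warszawa%'
    ) AS a
    LIMIT 25;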
Re: [ADMIN] Performance problem...
On Wednesday, 16 March 2005 at 15:05, Scott Marlowe wrote:
> Can you enclose the whole query in something like:
>     select distinct * from (your query here) as a
> and see if that helps?

Yes, it works excellently ;) Thanks so much.

BR, Marcin
Re: [ADMIN] Performance problem...
On Monday, 14 March 2005 at 19:32, Scott Marlowe wrote:
> On Mon, 2005-03-14 at 12:03, Marcin Giedz wrote:
> > Hello... Our company is going to change SQL engine from MySQL to PSQL. Of course some performance problems occurred. Our server is a dual Xeon 3.0 GHz + 8 GB RAM + RAID1 (software, two 146 GB SCSI 15k) for SQL data + RAID1 (software, two 146 GB SCSI 15k) for pg_xlog. postgresql.conf parameters are as follows:
> >     max_connections = 150
> >     shared_buffers = 50000   # min 16, at least max_connections*2, 8KB each
> >     work_mem = 2048          # min 64, size in KB
> 50,000 shared buffers may or may not be too much. Try it at different sizes from 5,000 or so up to 50,000 and find the knee. It's usually closer to 10,000 than 50,000, but YMMV...

Playing with shared_buffers from 10000 to 50000 doesn't change anything in the total time for this query :( But when I raise work_mem a little higher, total runtime decreases a little, about 10%. However, when I change random_page_cost to 0.2 (I know that almost all papers say it should be higher than 1.0), total runtime decreases almost 3 times and lasts about 900 ms; earlier, with random_page_cost = 1.2, it took 2.7 s. Is it possible to keep random_page_cost at this value?

> On the other hand, for a machine with 8 gigs of RAM, 2 megs of work_mem is pretty small. Try bumping it up to 8 or 16 megs. You can change this one on the fly for testing, so just do: set work_mem=16384; and then run the query again and see if that helps. The hash aggregate method uses sort/work mem to do its work, and if it doesn't think it can hold the result set in that space, the planner will pick another method, like the merge left join. In your explain analyze output, look for gross mismatches between estimated and actual rows. Most of yours here look pretty good in the areas where the data is being collected, but during the merges the numbers are WAY off, and I'm not sure what to do to change that.
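When estimates on a join column are that far off, one common remedy of this era, offered here only as a hedged guess at where to start, is to raise that column's statistics target and re-analyze. The choice of table/column (one of the join columns from the query) and the value 100 are purely illustrative:

    ALTER TABLE firmy.numer ALTER COLUMN idtypnumer SET STATISTICS 100;
    ANALYZE firmy.numer;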
Re: [ADMIN] Performance problem...
On Tue, 2005-03-15 at 02:59, Marcin Giedz wrote:
> Playing with shared_buffers from 10000 to 50000 doesn't change anything in the total time for this query :( But when I raise work_mem a little higher, total runtime decreases a little, about 10%. However, when I change random_page_cost to 0.2 (I know that almost all papers say it should be higher than 1.0), total runtime decreases almost 3 times and lasts about 900 ms; earlier, with random_page_cost = 1.2, it took 2.7 s. Is it possible to keep random_page_cost at this value?

If random_page_cost needs to be that low, then it's likely that the query planner is either getting bad statistics and making a poor decision, or that you've got a corner case that it just can't figure out. What does "explain analyze <your query here>" say with random_page_cost set to 1.2 and 0.2? Have you run analyze and vacuumed full lately?
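Spelled out, the requested experiment is two runs of the same query in one session, something like this sketch (SELECT 1 is a stand-in for the real query from the thread):

    SET random_page_cost = 1.2;
    EXPLAIN ANALYZE SELECT 1;    -- substitute the full query here
    SET random_page_cost = 0.2;
    EXPLAIN ANALYZE SELECT 1;    -- same query again, for comparison
    RESET random_page_cost;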
Re: [ADMIN] Performance problem...
On Tuesday, 15 March 2005 at 17:08, Scott Marlowe wrote:
> If random_page_cost needs to be that low, then it's likely that the query planner is either getting bad statistics and making a poor decision, or that you've got a corner case that it just can't figure out. What does "explain analyze <your query here>" say with random_page_cost set to 1.2 and 0.2? Have you run analyze and vacuumed full lately?

That can't be it - I've run VACUUM FULL ANALYZE and it didn't change anything ;)

Marcin
Re: [ADMIN] Performance problem...
On Tue, 2005-03-15 at 10:17, Marcin Giedz wrote:
> That can't be it - I've run VACUUM FULL ANALYZE and it didn't change anything ;)

You might want to try adjusting these values to see if you can get the query planner to choose the faster plan without dropping random_page_cost to 0.2. I.e. give the query planner candy and flowers, don't just bonk it on the head with a big stick and drag it back home...

    #cpu_tuple_cost = 0.01          # (same)
    #cpu_index_tuple_cost = 0.001   # (same)
    #cpu_operator_cost = 0.0025     # (same)

Does explain analyze show a big difference in expected and actual rows returned for any of the parts of the query plan?
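These can be tried per session too, so the experiment doesn't require touching postgresql.conf. The values below are only illustrative nudges (raising the CPU costs relative to page costs makes row-heavy seqscan plans look more expensive, without lying about random I/O the way random_page_cost = 0.2 does):

    SET cpu_tuple_cost = 0.02;          -- default 0.01
    SET cpu_index_tuple_cost = 0.005;   -- default 0.001
    SET cpu_operator_cost = 0.005;      -- default 0.0025
    EXPLAIN ANALYZE SELECT 1;           -- substitute the real query
    RESET cpu_tuple_cost; RESET cpu_index_tuple_cost; RESET cpu_operator_cost;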
[ADMIN] Performance problem...
Hello... Our company is going to change SQL engine from MySQL to PSQL. Of course some performance problems occurred. Our server is a dual Xeon 3.0 GHz + 8 GB RAM + RAID1 (software, two 146 GB SCSI 15k) for SQL data + RAID1 (software, two 146 GB SCSI 15k) for pg_xlog. postgresql.conf parameters are as follows:

    max_connections = 150
    shared_buffers = 50000          # min 16, at least max_connections*2, 8KB each
    work_mem = 2048                 # min 64, size in KB
    maintenance_work_mem = 524288   # min 1024, size in KB
    checkpoint_segments = 32        # in logfile segments, min 1, 16MB each
    archive_command = 'cp %p /mnt/logs/%f'  # command to use to archive a logfile segment
    effective_cache_size = 655360   # typically 8KB each
    random_page_cost = 1.2          # units are one sequential page fetch cost
    stats_start_collector = true
    stats_row_level = true

Of course our system is Debian Sarge, with shared memory size = 1 GB.

Here is an example. I know you don't have our schemas/tables etc., but I also attached the QUERY PLAN for this query. Maybe there is something wrong with the query; maybe it should be changed? Or so?

    SELECT DISTINCT t1.Id, t1.IdTypNazwa, t1.IdFirma, t1.Nazwa, t1.NazwaAscii,
           t1.MskNazwa, t3.Id, t3.numer, t3.MskNumer, t4.Id, t4.numer, t4.MskNumer,
           t5.Id, t5.numer, t5.MskNumer, t6.Id, t6.numer, t6.MskNumer,
           t7.Id, t7.numer, t7.MskNumer, t8.Id, t8.numer, t8.MskNumer,
           t9.Id, t9.numer, t9.MskNumer, t10.Id, t10.IdTypFormaPrawna,
           t10.MskFormaPrawna, t11.slowo
    FROM firmy.adres AS n
    INNER JOIN firmy.nazwa AS t1 ON t1.IdFirma = n.IdPodmiot AND t1.IdTypNazwa = 153 AND t1.Historia = 0
    LEFT JOIN firmy.numer AS t3 ON t3.IdPodmiot = t1.IdFirma AND t3.IdTypNumer = 156 AND t3.Historia = 0
    LEFT JOIN firmy.numer AS t4 ON t4.IdPodmiot = t1.IdFirma AND t4.IdTypNumer = 160 AND t4.Historia = 0
    LEFT JOIN firmy.numer AS t5 ON t5.IdPodmiot = t1.IdFirma AND t5.IdTypNumer = 155 AND t5.Historia = 0
    LEFT JOIN firmy.numer AS t6 ON t6.IdPodmiot = t1.IdFirma AND t6.IdTypNumer = 627 AND t6.Historia = 0
    LEFT JOIN firmy.numer AS t7 ON t7.IdPodmiot = t1.IdFirma AND t7.IdTypNumer = 848 AND t7.Historia = 0
    LEFT JOIN firmy.numer AS t8 ON t8.IdPodmiot = t1.IdFirma AND t8.IdTypNumer = 763 AND t8.Historia = 0
    LEFT JOIN firmy.numer AS t9 ON t9.IdPodmiot = t1.IdFirma AND t9.IdTypNumer = 762 AND t9.Historia = 0
    LEFT JOIN firmy.formaprawna AS t10 ON t10.IdFirma = t1.IdFirma
    LEFT JOIN slowniki.tslownik AS t11 ON t11.idslownik = t10.IdTypFormaPrawna AND t11.idjezyktyp = 2
    WHERE n.Ulica ILIKE 'pu%' AND n.IdKraj = 190
    LIMIT 25

pl=# EXPLAIN ANALYZE [the query above];

    QUERY PLAN
    Limit  (cost=84757.05..84759.05 rows=25 width=264) (actual time=3153.752..3154.418 rows=25 loops=1)
      ->  Unique  (cost=84757.05..88861.61 rows=51307 width=264) (actual time=3153.748..3154.391 rows=25 loops=1)
            ->  Sort  (cost=84757.05..84885.32 rows=51307 width=264) (actual time=3153.745..3153.768 rows=44 loops=1)
                  Sort Key: t1.id, t1.idtypnazwa, t1.idfirma, t1.nazwa, t1.nazwaascii, t1.msknazwa, t3.id, t3.numer, t3.msknumer, t4.id, t4.numer, t4.msknumer, t5.id, t5.numer, t5.msknumer, t6.id, t6.numer, t6.msknumer, t7.id, t7.numer, t7.msknumer, t8.id, t8.numer, t8.msknumer, t9.id, t9.numer, t9.msknumer, t10.id, t10.idtypformaprawna, t10.mskformaprawna, t11.slowo
                  ->  Hash Left Join  (cost=18104.92..77085.08 rows=51307 width=264) (actual time=643.240..3131.874 rows=1128 loops=1)
                        Hash Cond: ("outer".idtypformaprawna =
Re: [ADMIN] Performance problem...
On Mon, 2005-03-14 at 12:03, Marcin Giedz wrote:
> Hello... Our company is going to change SQL engine from MySQL to PSQL. Of course some performance problems occurred. Our server is a dual Xeon 3.0 GHz + 8 GB RAM + RAID1 (software, two 146 GB SCSI 15k) for SQL data + RAID1 (software, two 146 GB SCSI 15k) for pg_xlog. postgresql.conf parameters are as follows:
>     max_connections = 150
>     shared_buffers = 50000   # min 16, at least max_connections*2, 8KB each
>     work_mem = 2048          # min 64, size in KB

50,000 shared buffers may or may not be too much. Try it at different sizes from 5,000 or so up to 50,000 and find the knee. It's usually closer to 10,000 than 50,000, but YMMV...

On the other hand, for a machine with 8 gigs of RAM, 2 megs of work_mem is pretty small. Try bumping it up to 8 or 16 megs. You can change this one on the fly for testing, so just do:

    set work_mem=16384;

and then run the query again and see if that helps. The hash aggregate method uses sort/work mem to do its work, and if it doesn't think it can hold the result set in that space, the planner will pick another method, like the merge left join.

In your explain analyze output, look for gross mismatches between estimated and actual rows. Most of yours here look pretty good in the areas where the data is being collected, but during the merges the numbers are WAY off, and I'm not sure what to do to change that.
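A note on units for anyone trying this: in 7.4/8.0, work_mem is expressed in kilobytes, so 16384 means 16 MB, and the change affects only the current session. A sketch of the round trip:

    SET work_mem = 16384;        -- 16 MB for this session only (value is in KB)
    EXPLAIN ANALYZE SELECT 1;    -- substitute the query being tuned
    RESET work_mem;              -- back to the postgresql.conf value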
Re: [ADMIN] Performance Problem
Thanks to all. I will try the various options out, look to see if we can upgrade, and let you know the result. It will take a few days, as I am not officially here but on holiday.

Frank

Frank Finner [EMAIL PROTECTED], 31/05/04 15:03:02, wrote:
> Hi, I had a similar problem with 7.3.5 some time ago (March 23rd). The query durations increased extremely, as did the time vacuum needed. The database continuously (every 10 minutes) drops and creates tables as a replication of a production database. Vacuum analyze obviously did not take care of that and did not release space correctly. Because my database is not so big (about 1 GB), I finally decided to dump, drop, and recreate it out of the dump every night. This works fine for me. I was told on the list (by Tom Lane) that in 7.4 vacuum handles these things in a much better way, so dump/recreate is no longer necessary. If possible (alas, not for me with my database), I recommend upgrading to 7.4.
> Regards, Frank Finner.
>
> On 31 May 2004 12:41:07 +0200, Harald Fuchs [EMAIL PROTECTED] sat down, thought long and then wrote:
> > If VACUUM does not stop the delay from growing, you might be suffering index bloat. Either REINDEX by crontab, or upgrade to 7.4, where VACUUM seems to take care of that.
Re: [ADMIN] Performance Problem
In article [EMAIL PROTECTED], Frank Smith [EMAIL PROTECTED] writes:
> I am running PostgreSQL 7.2.2 on Red Hat 9 and I am suffering a growing performance problem. The problem shows through a slowing of queries and an increase in system CPU usage. Queries that took less than 6 seconds climb to take more than 5 minutes, and as the system is driven by Apache through Perl scripts, the web server times out. [...] I use cron to 'VACUUM ANALYZE' the system every night, and this greatly improved the performance but has not stopped the delay from growing. The strange thing is that apart from the delay everything seems fine.

If VACUUM does not stop the delay from growing, you might be suffering index bloat. Either REINDEX by crontab, or upgrade to 7.4, where VACUUM seems to take care of that.
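The crontab approach, sketched (database name, table name, and schedule are placeholders, not from the thread; note REINDEX takes an exclusive lock on the table while it runs, so off-hours is wise):

    # nightly at 03:15: rebuild the indexes that bloat
    15 3 * * *  psql -c "REINDEX TABLE mytable;" mydatabase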
[ADMIN] Performance Problem
Hi ID:7

I am running PostgreSQL 7.2.2 on Red Hat 9 and I am suffering a growing performance problem. The problem shows through a slowing of queries and an increase in system CPU usage. Queries that took less than 6 seconds climb to take more than 5 minutes, and as the system is driven by Apache through Perl scripts, the web server times out. Clearly I could reset the Apache timers; however, this would just hide the problem for a little longer, and of course once the problem starts to happen the system tends to cascade, because the users try again and the orphaned processes continue to use processor time until they complete.

I use cron to 'VACUUM ANALYZE' the system every night, and this greatly improved the performance but has not stopped the delay from growing. The strange thing is that apart from the delay everything seems fine.

Some facts:
- Rebooting does not remove the problem.
- Restarting Apache and/or Postgres makes no difference.
- Other than the ANALYZE mentioned above, the system has not been tuned in any way.

The following sequence does, however, cure the problem:

    pg_dump database > temp.db
    dropdb database
    createdb database
    psql -e database < temp.db

I have not tried leaving out the dropdb and createdb, but I do not need to stop Postgres. Has anybody any idea how to fix my problem? Is it something I have or have not done? Please do not tell me to upgrade to the latest version of Postgres unless it is a clearly identifiable Postgres problem that is at the root of my dilemma.

Thanks
Frank

I get balder by the handful, daily.
Re: [ADMIN] Performance Problem
Centuries ago, Nostradamus foresaw when [EMAIL PROTECTED] (Frank Smith) would write:
> I am running PostgreSQL 7.2.2 on Red Hat 9 and I am suffering a growing performance problem. [...] I use cron to 'VACUUM ANALYZE' the system every night, and this greatly improved the performance but has not stopped the delay from growing. [...] Please do not tell me to upgrade to the latest version of Postgres unless it is a clearly identifiable Postgres problem that is at the root of my dilemma.

Well, there certainly have been _major_ improvements in performance between 7.2 and 7.4, so it is more than plausible that that could have a significant impact on performance.

It sounds as though the traffic on the system is causing query plans to get out of whack. But it is not clear what your problem actually is; you have not explained the nature of the queries that are performing badly.

I would imagine that it would be beneficial to run VACUUM and/or ANALYZE somewhat more often:

- Let's suppose you run VACUUM on tables getting lots of UPDATEs and DELETEs (note, I did _not_ say INSERT; tables that only see INSERTs essentially NEVER need to be vacuumed) once every 4 hours.

- Tables seeing frequent updates of any sort (INSERT/DELETE/UPDATE) perhaps should get an ANALYZE every hour.

There are significant merits to jumping to a newer version in this regard:

1. As of 7.3, you can run pg_autovacuum, which will automatically do VACUUMs and ANALYZEs, which hopefully would prevent things from going downhill.

2. As of 7.4, there is a new logging parameter that allows logging just those queries that are running slowly, which should help you find just the offending queries.

The sort of information we'd actually need to help fix the problem is the queries themselves and the output of running EXPLAIN ANALYZE on some of the slow ones.

There are some tables in our applications that we happen to ANALYZE _and_ VACUUM every couple of minutes, as they are UPDATEd almost continually. Waiting an hour between VACUUMs would lead to _really_ bad performance. You might have an instance of a table like that, but you haven't said anything that would indicate it.

-- (reverse (concatenate 'string gro.mca @ enworbbc)) http://cbbrowne.com/
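For point 2, the 7.4 parameter being referred to is presumably log_min_duration_statement; a sketch of its use (the threshold is an example, not a recommendation):

    # postgresql.conf (7.4 and later): log only statements slower than 1000 ms
    log_min_duration_statement = 1000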
Re: [ADMIN] performance problem - 10.000 databases
On Thu, 6 Nov 2003, William Yu wrote:
> scott.marlowe wrote:
> > Note that if you're on an IDE drive and you haven't disabled the write cache, you may as well turn off fsync as well, as it's just getting in the way and doing nothing, i.e. the IDE drives are already lying about fsync, so why bother.
> What about Serial ATA?

I haven't gotten my hands on one yet to test. We might be getting some in in the next few months where I work, and I'll test them and report back here then.
Re: [ADMIN] performance problem - 10.000 databases
Quoth [EMAIL PROTECTED] (Stephen Frost):
> * Christopher Browne ([EMAIL PROTECTED]) wrote:
> > On one of our test servers, I set fsync=false, and a test load's load time dropped from about 90 minutes to 3 minutes. (It was REALLY update heavy, with huge numbers of tiny transactions.) Which is, yes, quite spectacularly faster. But also quite spectacularly unsafe. I'm willing to live with the risk on a test box whose purpose is _testing_; it's certainly not a good thing to do in production.
> Would it be possible to have that effectively done for a specific transaction? If this was done as a single large transaction, could there be an option to say "don't fsync this until it's all done and then do it all", or something? Just looking for a way to get the 'best of both worlds'...

Oh, for sure, the whole thing could be invoked as one giant transaction, which would reduce the cost dramatically. But it diminishes the value of the benchmark for my purposes. It's useful to measure how costly those individual transactions are.

-- wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','cbbrowne.com'). http://www.ntlug.org/~cbbrowne/oses.html
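For readers wanting the "one giant transaction" variant Christopher describes, it is just explicit transaction demarcation; the table and statements below are hypothetical:

    BEGIN;
    UPDATE t SET n = n + 1 WHERE id = 1;
    UPDATE t SET n = n + 1 WHERE id = 2;
    -- ... thousands more tiny statements ...
    COMMIT;   -- WAL is flushed (fsynced) once, here, instead of per statement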
Re: [ADMIN] performance problem - 10.000 databases
Hi,

Christopher Browne wrote, on 11/6/2003 4:40 PM:
> On one of our test servers, I set fsync=false, and a test load's load time dropped from about 90 minutes to 3 minutes. [...] I'm willing to live with the risk on a test box whose purpose is _testing_; it's certainly not a good thing to do in production.

There is something like:

    set fsync to off;

or

    set fsync to on;

But it says: 'fsync' cannot be changed now. It could, however, be very useful to set this option from SQL, not just from the config. Tom Lane probably knows why :)

C.
Re: [ADMIN] performance problem - 10.000 databases
[EMAIL PROTECTED] (Jeff) writes:
> On 06 Nov 2003 15:21:03 +0100, Marek Florianczyk [EMAIL PROTECTED] wrote:
> > fsync = false
> HOLD THE BOAT THERE, BATMAN! I would *STRONGLY* advise not running with fsync=false in production, as PG _CANNOT_ guarantee data consistency in the event of a hardware failure. It would sure suck to have a power failure screw up your nice db for the users!

On one of our test servers, I set fsync=false, and a test load's load time dropped from about 90 minutes to 3 minutes. (It was REALLY update heavy, with huge numbers of tiny transactions.)

Which is, yes, quite spectacularly faster. But also quite spectacularly unsafe. I'm willing to live with the risk on a test box whose purpose is _testing_; it's certainly not a good thing to do in production.

-- Christopher Browne (416) 646 3304 x124 (land)
Re: [ADMIN] performance problem - 10.000 databases
scott.marlowe wrote:
> Note that if you're on an IDE drive and you haven't disabled the write cache, you may as well turn off fsync as well, as it's just getting in the way and doing nothing, i.e. the IDE drives are already lying about fsync, so why bother.

What about Serial ATA?
Re: [ADMIN] performance problem - 10.000 databases
* Christopher Browne ([EMAIL PROTECTED]) wrote:
> On one of our test servers, I set fsync=false, and a test load's load time dropped from about 90 minutes to 3 minutes. (It was REALLY update heavy, with huge numbers of tiny transactions.) Which is, yes, quite spectacularly faster. But also quite spectacularly unsafe. I'm willing to live with the risk on a test box whose purpose is _testing_; it's certainly not a good thing to do in production.

Would it be possible to have that effectively done for a specific transaction? If this was done as a single large transaction, could there be an option to say "don't fsync this until it's all done and then do it all", or something? Just looking for a way to get the 'best of both worlds'...

Stephen
Re: [ADMIN] performance problem - 10.000 databases
> Heck, you're already pushing the performance envelope with 3,000 users, might as well go for the faster of the two and you'll have one less scheduled upgrade ahead of you. When do you need to go live? If it's 1 month, then I'd definitely recommend 7.4.

heh... ;) PostgreSQL 7.4 is so fu*#$%^ fast!!! Unbelievable... I've made the test as usual (some posts earlier): 3.000 schemas and 300 simultaneously connected clients. I've tuned postgresql.conf with my friend, who is our SQL guru (but he usually runs Oracle). 7.4 is so fast that sometimes the client (a laptop with a full X11 workstation and a Celeron 700) could not keep up forking the perl script that tests the new database ;)

My conf:

    max_connections = 512
    shared_buffers = 32768
    sort_mem = 2048
    vacuum_mem = 20480
    max_fsm_pages = 589824
    max_fsm_relations = 32768
    fsync = false
    wal_sync_method = fsync
    wal_buffers = 1024
    checkpoint_segments = 4
    checkpoint_timeout = 1800
    checkpoint_warning = 30
    commit_delay = 10000
    commit_siblings = 2
    effective_cache_size = 131072
    random_page_cost = 4
    log_connections = true
    log_duration = true
    log_pid = true
    log_statement = true
    log_timestamp = true
    search_path = '$user'
    max_locks_per_transaction = 512

From this test - 4 tables (int, text, int), 1000 rows in each, no indexes. Times in seconds; columns are (dbname) (conn. time) (1 row) (250 rows) (triple join) (update 250 rows) (update 1000 rows):

    test2291: connect:1 q_fast:1 q_med:0 q_slow:4 q_upd:0 q_upd_all:9
    test2260: connect:0 q_fast:1 q_med:0 q_slow:4 q_upd:0 q_upd_all:10
    test2274: connect:0 q_fast:1 q_med:0 q_slow:4 q_upd:0 q_upd_all:8
    test2296: connect:0 q_fast:1 q_med:0 q_slow:6 q_upd:0 q_upd_all:6
    test2283: connect:0 q_fast:1 q_med:0 q_slow:4 q_upd:0 q_upd_all:8
    test2302: connect:0 q_fast:1 q_med:0 q_slow:4 q_upd:0 q_upd_all:8
    test2290: connect:0 q_fast:1 q_med:0 q_slow:3 q_upd:0 q_upd_all:8
    test2287: connect:0 q_fast:1 q_med:0 q_slow:6 q_upd:0 q_upd_all:6
    test2267: connect:0 q_fast:1 q_med:0 q_slow:1 q_upd:0 q_upd_all:11

The \d queries work just fine under this load!

Now I just have to modify phpPgAdmin (it's for users to modify their own database). I don't know why, when I select a database, it tries to fetch all table names from all schemas. From the log:

    2003-11-06 22:53:06 [8880] LOG: statement: SET SEARCH_PATH TO test998
    2003-11-06 22:53:06 [8880] LOG: duration: 1.207 ms
    2003-11-06 22:53:06 [8880] LOG: statement: SELECT tablename, tableowner FROM pg_catalog.pg_tables WHERE schemaname='test998' ORDER BY tablename
    2003-11-06 22:53:06 [8880] LOG: duration: 31.005 ms
    2003-11-06 22:53:06 [8880] LOG: statement: SET SEARCH_PATH TO test999
    2003-11-06 22:53:06 [8880] LOG: duration: 1.202 ms
    2003-11-06 22:53:06 [8880] LOG: statement: SELECT tablename, tableowner FROM pg_catalog.pg_tables WHERE schemaname='test999' ORDER BY tablename
    2003-11-06 22:53:06 [8880] LOG: duration: 30.604 ms

I should go live with this hosting at the end of the month, but at the beginning we shouldn't have many customers, so we decided to try v7.4 in beta now and wait for the official release.

... And my management says there is no good support for Open Source, heh... ;)))

Thanks all,
Marek
Re: [ADMIN] performance problem - 10.000 databases
On 06 Nov 2003 15:21:03 +0100, Marek Florianczyk [EMAIL PROTECTED] wrote:
> fsync = false

HOLD THE BOAT THERE, BATMAN! I would *STRONGLY* advise not running with fsync=false in production, as PG _CANNOT_ guarantee data consistency in the event of a hardware failure. It would sure suck to have a power failure screw up your nice db for the users!

> wal_buffers = 1024

This also seems high. Come to think about it, shared_buffers is also high.

> commit_delay = 10000

This could also lead to data loss, but you'll get a speed increase on inserts. One of the best things you can do to increase insert speed is a nice battery-backed RAID card with a pile of disks hanging off of it.

-- Jeff Trout [EMAIL PROTECTED] http://www.jefftrout.com/ http://www.stuarthamm.net/
Re: [ADMIN] performance problem - 10.000 databases
On Thu, 06-11-2003 at 15:37, Jeff wrote:
> HOLD THE BOAT THERE, BATMAN! I would *STRONGLY* advise not running with fsync=false in production, as PG _CANNOT_ guarantee data consistency in the event of a hardware failure. It would sure suck to have a power failure screw up your nice db for the users!

Sure, I know, but with WAL it will still fsync every few seconds, right? And maybe the users' data isn't so critical ;) it's not for a bank, only for www sites. I will try with fsync=true also.

> One of the best things you can do to increase insert speed is a nice battery-backed RAID card with a pile of disks hanging off of it.

We will put 4 disks for the /data directory (RAID 1+0), so it will have performance and fault tolerance; it should be OK.

greetings
Marek
Re: [ADMIN] performance problem - 10.000 databases
On Thu, 6 Nov 2003, Jeff wrote:
> On 06 Nov 2003 15:21:03 +0100, Marek Florianczyk [EMAIL PROTECTED] wrote:
> > fsync = false
> HOLD THE BOAT THERE, BATMAN! I would *STRONGLY* advise not running with fsync=false in production, as PG _CANNOT_ guarantee data consistency in the event of a hardware failure. It would sure suck to have a power failure screw up your nice db for the users!

Note that if you're on an IDE drive and you haven't disabled the write cache, you may as well turn off fsync as well, as it's just getting in the way and doing nothing, i.e. the IDE drives are already lying about fsync, so why bother. Step the first: get on SCSI or a good IDE RAID controller. Then step the second: turn fsync back on. Without reliable storage, fsync is a dunsel.
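For reference, checking and disabling an IDE drive's write cache on Linux of this era looks roughly like the sketch below (/dev/hda is a placeholder device; disabling the cache costs write throughput, which is exactly the trade being made):

    hdparm -I /dev/hda | grep -i 'write cache'   # see whether the cache is enabled
    hdparm -W0 /dev/hda                          # turn the write cache off
    hdparm -W1 /dev/hda                          # turn it back on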
Re: [ADMIN] performance problem - 10.000 databases
Marek Florianczyk wrote, on Thursday, 11/06/03:
> ... And my management says there is no good support for Open Source, heh... ;)))

In my experience, there is better support for Open Source than Closed Source when it comes to development (and usually all around).

David Green
Sage Automation, Inc.
Re: [ADMIN] performance problem - 10.000 databases
On 6 Nov 2003, Marek Florianczyk wrote:
> ... And my management says there is no good support for Open Source, heh... ;)))

That's because your support needs are different. A developer wants answers and solutions; a manager often wants someone to blame. :-)
Re: [ADMIN] performance problem - 10.000 databases
David Green wrote:
> Marek Florianczyk wrote, on Thursday, 11/06/03:
> > ... And my management says there is no good support for Open Source, heh... ;)))
> In my experience, there is better support for Open Source than Closed Source when it comes to development (and usually all around).

There are two different kinds of support. One is support for code/fixes; this is probably better in OSS. But there is also support for a product, and here there is still a big gap. Many Oracle, DB2, MSSQL, etc. consultants are available, but only a few OSS consultants/companies are able to help in case of problems. They did a survey here in CH among top managers, and this is the answer they gave: they do not know where to get support. E.g. there is no company called PostgreSQL (Switzerland) Ltd.

---
Oli Sennhauser
Database-Engineer (Oracle & PostgreSQL)
Rebenweg 6, CH - 8610 Uster / Switzerland
e-Mail [EMAIL PROTECTED]
Website http://mypage.bluewin.ch/shinguz/PostgreSQL/
Re: [ADMIN] performance problem - 10.000 databases
Hi all

I was asking for help a week ago. The performance tests took me some more time because of other things I had to do. The problem was: tune PostgreSQL to work with 10.000 databases. Tom Lane (thanks) suggested a solution: one database and 10.000 schemas. From now on I will write "switch database" - that means one database, but a different schema.

I don't know how much traffic there will be on the production site, so I tested 3.000 schemas and 400 clients simultaneously connected. Each database has 4 tables (int, text, int) with 1.000 records and no indexes. Each client was doing:

    10 x: connect, select * from table[rand(1-4)] where number=[rand(1-1000)], disconnect                   (fetch one row)
    5 x:  connect, select * from table[rand(1-4)] where position=[rand(1-5)],
          update table[rand(1-4)] set text='some text', position='[rand(1-5)]'
          where number=[rand(1-1000)], disconnect                                                           (fetch 250 rows / update one row)
    1 x:  connect, update table[rand(1-4)] set text='some text', position='[rand(1-5)]', disconnect        (update 250 rows)
    1 x:  connect, select * from table1 where position in
          (select position from table2 where number in (select number from table3)), disconnect

After that the client switches to another database and starts testing from the beginning.

During this test I was changing some parameters in Postgres and sending kill -HUP (pg_ctl reload). I still don't know what settings will be best for me, except shared buffers and some kernel and shell settings. I noticed that queries like \d table1, \di, \dp are extremely slow: when 400 clients are connected, not even the postgres user can do such a query. Why, and how can I improve that? Will it be a problem?

Below are some results of this test and the postgresql.conf settings. I didn't change random_page_cost because on this test machine I've got only one SCSI disk for the /data directory. Postgres uses 90% of the 4 CPUs and takes 2 GB RAM, but the load average doesn't jump over 10-20, so it works better with a lot of schemas than with a lot of DBs.

Maybe some suggestions on my postgresql.conf settings? And why are the \d queries so extremely slow?

Thank you,
Marek

[PostgreSQL]
    max_connections = 512
    shared_buffers = 65536
    max_fsm_relations = 10000
    max_fsm_pages = 100000
    max_locks_per_transaction = 512
    wal_buffers = 32
    sort_mem = 16384
    vacuum_mem = 8192
    effective_cache_size = 1000000
    random_page_cost = 4

[kernel]
    kernel/shmmni = 8192
    kernel/shmall = 134217728
    kernel/shmmax = 536870912

[test] times in sec.: (dbname) (conn. time) (1 row) (250 rows) (triple join) (update 250 rows) (update 1000 rows)
    test2374: connect:1 q_fast:4 q_med:0 q_slow:46 q_upd:0 q_upd_all:33
    test2347: connect:1 q_fast:4 q_med:1 q_slow:48 q_upd:1 q_upd_all:32
    test2351: connect:0 q_fast:4 q_med:2 q_slow:49 q_upd:0 q_upd_all:31
    test2373: connect:0 q_fast:5 q_med:0 q_slow:46 q_upd:0 q_upd_all:25

[PostgreSQL]
    max_connections = 512
    shared_buffers = 4096
    max_fsm_relations = 1000
    max_fsm_pages = 10000
    max_locks_per_transaction = 512
    wal_buffers = 32
    sort_mem = 16384
    vacuum_mem = 8192
    effective_cache_size = 1000000
    random_page_cost = 4

[test]
    test2430: connect:0 q_fast:2 q_med:1 q_slow:40 q_upd:0 q_upd_all:17
    test2425: connect:0 q_fast:2 q_med:0 q_slow:45 q_upd:0 q_upd_all:20
    test2434: connect:0 q_fast:2 q_med:0 q_slow:44 q_upd:0 q_upd_all:23
    test2435: connect:1 q_fast:2 q_med:0 q_slow:50 q_upd:0 q_upd_all:18

[PostgreSQL]
    max_fsm_relations = 2000
    max_fsm_pages = 20000

[test]
    test2171: connect:0 q_fast:3 q_med:1 q_slow:42 q_upd:1 q_upd_all:20
    test2177: connect:1 q_fast:3 q_med:0 q_slow:43 q_upd:0 q_upd_all:21
    test2166: connect:1 q_fast:3 q_med:1 q_slow:39 q_upd:0 q_upd_all:20
    test2165: connect:1 q_fast:3 q_med:1 q_slow:42 q_upd:0 q_upd_all:24
    test2162: connect:1 q_fast:3 q_med:1 q_slow:39 q_upd:1 q_upd_all:23
Re: [ADMIN] performance problem - 10.000 databases
On 05 Nov 2003 14:33:33 +0100 Marek Florianczyk [EMAIL PROTECTED] wrote:

> During this test I was changing some parameters in postgres and sending
> kill -HUP (pg_ctl reload). I still don't know what settings will be
> best for me, except shared buffers and some kernel and shell settings.

As far as I know, -HUP won't make things like shared buffer changes take; you need a full restart of PG... but your numbers are different... I guess it did take. Huh.

How much disk IO is going on during these tests? (vmstat 1) Any swapping (also shown in vmstat)?

Were any of these tables analyzed? I see you used no indexes, so each of your tables must get a seq scan. Try adding an index to your test tables and rerun.

--
Jeff Trout [EMAIL PROTECTED]
http://www.jefftrout.com/
http://www.stuarthamm.net/
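Jeff's suggestion, as a sketch against the test tables above (the index name and the ANALYZE are the obvious additions; 500 is an arbitrary probe value):

    CREATE INDEX table1_number_idx ON table1 (number);
    ANALYZE table1;     -- refresh planner statistics for the new index
    EXPLAIN ANALYZE     -- compare plan and timing against the seq scan
    SELECT * FROM table1 WHERE number = 500;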
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Wed, 05-11-2003 at 14:48, Jeff wrote:

> As far as I know, -HUP won't make things like shared buffer changes
> take; you need a full restart of PG... but your numbers are
> different... I guess it did take. Huh.

Well, I'm not sure, but I only did pg_ctl reload.

> How much disk IO is going on during these tests? (vmstat 1)
> Any swapping (also shown in vmstat)?

I was watching iostat 1, and it shows about 600 tps, so it's not much, and when we do RAID(1+0) on the production machine, the disks will be fine.

> Were any of these tables analyzed? I see you used no indexes, so each
> of your tables must get a seq scan. Try adding an index to your test
> tables and rerun.

No, they weren't analyzed, and I left out indexes deliberately. I'm testing postgres as the SQL engine for a hosting environment; these databases will be used by users=lamers, so many of them will not create any indexes. I wanted to make a test really close to reality and see how many databases I can put on a single machine.

One database with 3.000 schemas works better than 3.000 databases, but there is a REAL, BIG problem, and I won't be able to use this solution: every query like \d table or \di takes a veeery long time. Users get phpPgAdmin, which I modified to suit our needs, but now it doesn't work -- not even log-in. If I rewrite phpPgAdmin to log users in without checking all schemas and the tables within schemas, no user will be able to examine the structure of a table. A query like \d table from the psql monitor takes about 2-5 MINUTES :(

I see that the only option is to create one database for every user, and monitor traffic and machine load to see when we need another PC and another PostgreSQL...

Marek
Re: [ADMIN] performance problem - 10.000 databases
On Wed, Nov 05, 2003 at 16:14:59 +0100, Marek Florianczyk [EMAIL PROTECTED] wrote:

> One database with 3.000 schemas works better than 3.000 databases, but
> there is a REAL, BIG problem, and I won't be able to use this solution:
> every query like \d table or \di takes a veeery long time. [...]
> A query like \d table from the psql monitor takes about 2-5 MINUTES :(

Analyzing the system tables will likely make these queries go faster.
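As a sketch, Bruno's suggestion amounts to something like this (run as superuser; the catalog list is a guess at the tables the \d queries join):

    VACUUM ANALYZE pg_catalog.pg_class;
    VACUUM ANALYZE pg_catalog.pg_attribute;
    VACUUM ANALYZE pg_catalog.pg_namespace;
    VACUUM ANALYZE pg_catalog.pg_index;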
Re: [ADMIN] performance problem - 10.000 databases
On Wed, Nov 05, 2003 at 08:48:52AM -0500, Jeff wrote:

> As far as I know, -HUP won't make things like shared buffer changes
> take; you need a full restart of PG.

It definitely will not. Anything that can only be set on startup actually means startup.

> but your numbers are different... I guess it did take. Huh.

No, that probably just means that there is more variability between runs than people like to imagine.

A

--
Andrew Sullivan [EMAIL PROTECTED]
Afilias Canada
204-4141 Yonge Street
Toronto, Ontario, Canada M2P 2A8
+1 416 646 3304 x110
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Wed, 05-11-2003 at 17:24, Andrew Sullivan wrote:

> > but your numbers are different... I guess it did take. Huh.
>
> No, that probably just means that there is more variability between
> runs than people like to imagine.

It looks like I will have to run these tests again. It's no big deal, just once I solve the slow "\d table" problem. VACUUM ANALYZE in progress...

thanks,
Marek
Re: [ADMIN] performance problem - 10.000 databases
Marek Florianczyk [EMAIL PROTECTED] writes:

> Each client was doing:
> 10 x connect, select * from table[rand(1-4)] where
> number=[rand(1-1000)], disconnect  (fetch one row)

Seems like this is testing the cost of connect and disconnect to the exclusion of nearly all else. PG is not designed to process just one query per connection --- backend startup is too expensive for that. Consider using a connection-pooling module if your application wants short-lived connections.

> I noticed that queries like: \d table1 \di \dp are extremely slow

I thought maybe you'd uncovered a performance issue with lots of schemas, but I can't reproduce it here. I made 10000 schemas each containing a table "mytab", which is about the worst case for an unqualified "\d mytab", but it doesn't seem excessively slow --- maybe about a quarter second to return the one mytab that's actually in my search path. In realistic conditions where the users aren't all using the exact same table names, I don't think there's an issue.

regards, tom lane
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Wed, 05-11-2003 at 17:18, Bruno Wolff III wrote:

> Analyzing the system tables will likely make these queries go faster.

I've done:

VACUUM FULL;
ANALYZE;

and it works better, but it's no revelation. When I do \d schemaname.table it's better, but I still have to wait about 10-30 sec, and now only 100 clients are connected. :(

Marek
Re: [ADMIN] performance problem - 10.000 databases
On 05 Nov 2003 19:01:38 +0100 Marek Florianczyk [EMAIL PROTECTED] wrote:

> and it works better, but it's no revelation. When I do
> \d schemaname.table it's better, but I still have to wait about
> 10-30 sec, and now only 100 clients are connected. :(

So it only goes slow with hundred(s) of clients connecting and disconnecting? I'm guessing the CPU is pretty pegged on the box.

--
Jeff Trout [EMAIL PROTECTED]
http://www.jefftrout.com/
http://www.stuarthamm.net/
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Wed, 05-11-2003 at 18:59, Tom Lane wrote:

> Seems like this is testing the cost of connect and disconnect to the
> exclusion of nearly all else. PG is not designed to process just one
> query per connection --- backend startup is too expensive for that.
> Consider using a connection-pooling module if your application wants
> short-lived connections.

You're right, a typical php page will probably run more queries per view. But how good is a connection-pooling module when the connection for each virtual site is unique? Different user and password, and different schemas and permissions -- this connection-pooling module would have to switch between users without reconnecting to the database. Impossible?

> I thought maybe you'd uncovered a performance issue with lots of
> schemas, but I can't reproduce it here. [...] In realistic conditions
> where the users aren't all using the exact same table names, I don't
> think there's an issue.

But did you do that under some database load? E.g. 100 clients connected, like in my example? When I run these \d queries without any clients connected and after ANALYZE, it's fast; but 100 clients is enough to lengthen the query time to 30 sec. :(

I have 3000 schemas named test[1-3000] and 3000 users named test[1-3000]. In each schema there are four tables (table1 table2 table3 table4); each table has 3 columns (int,text,int) and some of them also have indexes. If you want, I will send the perl script that forks into 100 processes and performs my queries.

greetings
Marek
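For what it's worth, 7.3 does provide one building block for the switch Marek calls impossible: a pool that authenticates once as a superuser can impersonate each user per request. A sketch (test42 is a placeholder username/schema):

    SET SESSION AUTHORIZATION test42;   -- act as that user (superuser-only)
    SET search_path TO test42, public;  -- resolve unqualified names in his schema
    -- ... run the user's queries ...
    RESET SESSION AUTHORIZATION;        -- return the connection to the pool

The catch is that the pooled connection must belong to a superuser, so security then rests entirely on the pooling layer.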
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Wed, 05-11-2003 at 19:23, Jeff wrote:

> So it only goes slow with hundred(s) of clients connecting and
> disconnecting? I'm guessing the CPU is pretty pegged on the box.

That's not exactly it: my test script (some posts earlier) is doing queries. Look at the results below -- the queries are very fast; just the ones with \d are slow:

test2364: connect:0 q_fast:0 q_med:0 q_slow:15 q_upd:0 q_upd_all:0
test2346: connect:0 q_fast:0 q_med:0 q_slow:17 q_upd:0 q_upd_all:7
test2347: connect:0 q_fast:0 q_med:0 q_slow:15 q_upd:0 q_upd_all:9
test2350: connect:0 q_fast:0 q_med:0 q_slow:16 q_upd:0 q_upd_all:8
test2349: connect:0 q_fast:1 q_med:0 q_slow:15 q_upd:0 q_upd_all:8
test2343: connect:0 q_fast:1 q_med:0 q_slow:17 q_upd:0 q_upd_all:7
test2351: connect:0 q_fast:0 q_med:0 q_slow:17 q_upd:0 q_upd_all:9

Output from the top command:

02:20:00 up 1 day, 6:19, 2 users, load average: 6.62, 6.67, 6.60
130 processes: 123 sleeping, 7 running, 0 zombie, 0 stopped
CPU states: 82.4% user, 16.2% system, 0.0% nice, 1.4% idle
Mem:  2070084K total, 2052932K used,  17152K free,   78048K buffers
Swap:  995988K total,    6272K used, 989716K free, 1771140K cached

  PID USER     PRI NI SIZE  RSS SHARE STAT %CPU %MEM  TIME COMMAND
18775 postgres  18  0 5532 5440  4688 S     8.7  0.2  0:01 postmaster
19001 postgres  15  0 5484 5392  4664 S     8.3  0.2  0:00 postmaster
19085 postgres  13  0 5492 5400  4680 S     8.3  0.2  0:00 postmaster
18831 postgres  16  0 5532 5440  4680 S     7.9  0.2  0:01 postmaster
19114 postgres  19  0 5436 5344  4624 S     7.9  0.2  0:00 postmaster
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Wed, 05-11-2003 at 19:34, Tom Lane wrote:

> Then it's not \d's fault --- you simply don't have enough horsepower to
> support 100 concurrent clients, regardless of what specific query
> you're testing. You might find that not reconnecting so often would
> improve matters; I'm sure that a lot of your cycles are being taken by
> backend startup.

Maybe I reconnect too often, but then how to explain that regular queries like "select * from table1" are much faster than \d's? (see my post to Jeff)

Marek
Re: [ADMIN] performance problem - 10.000 databases
Marek Florianczyk [EMAIL PROTECTED] writes:

> Maybe I reconnect too often, but then how to explain that regular
> queries like "select * from table1" are much faster than \d's?
> (see my post to Jeff)

[ further experimentation... ] Ah-hah, I see the problem in 7.3, though not in 7.4 which is what I was testing to begin with. 7.4 is smarter about optimizing the LEFT JOINs that are used in \d's queries.

regards, tom lane
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Wed, 05-11-2003 at 19:52, Tom Lane wrote:

> [ further experimentation... ] Ah-hah, I see the problem in 7.3, though
> not in 7.4 which is what I was testing to begin with. 7.4 is smarter
> about optimizing the LEFT JOINs that are used in \d's queries.

So what do you think: should I try v7.4 on the production machine, or wait for the stable version? Or better use v7.3 with 3000 databases? I have to do this by December. Will this optimization solve the \d query problem, or just speed it up by a few seconds?

greetings
Marek
Re: [ADMIN] performance problem - 10.000 databases
On 5 Nov 2003, Marek Florianczyk wrote:

> So what do you think: should I try v7.4 on the production machine, or
> wait for the stable version? Or better use v7.3 with 3000 databases? I
> have to do this by December. Will this optimization solve the \d query
> problem, or just speed it up by a few seconds?

7.4 just went RC1 and is looking good for release in 2 weeks. I'd test it to see if it works for you. I've found the .0 releases in postgresql to be good enough for production, even if they do have one or two minor bugs that pop up. Take frequent backups (you should anyway) and you'll probably be alright. Heck, you're already pushing the performance envelope with 3,000 users, might as well go for the faster of the two and you'll have one less scheduled upgrade ahead of you. When do you need to go live? If it's more than 1 month away, then I'd definitely recommend 7.4.
[ADMIN] performance problem - 10.000 databases
Hi all

We are building hosting with apache + php (our own mod_virtual module) with about 10.000 virtual domains + PostgreSQL. PostgreSQL is on a different machine (2 x Intel Xeon 2.4GHz, 1GB RAM, SCSI RAID 1+0). I've made some tests -- 3000 databases and 400 clients connected at the same time. Each client ran a set of queries 10 times on one database, then connected to the next database. The queries:

select * from table1 where number='$i'
update table2 set some_text='int(rand(5))'
select * from table1 where position in (select position from table2 where number in (select number from table3))

Each database has four tables (int,text,int) with 1000 records. Postgres is taking all memory and all processors (4 CPUs with Hyper-Threading). The first two queries take 0 to 10 sec; the third query takes 15-70 sec.

But my problem is that when I run the command:

psql -h 127.0.0.1 dbname dbuser

I wait about 3-5 sec to enter the psql monitor, so every new connection from apache will wait about 3-5 sec before it can send a query to the server. That's a very long time... 4 sec to connect to the server and 4 sec to process a query. Why is the time to connect to the server so long??? I could make persistent connections, but with 10.000 clients that would kill the server. Has anyone an idea how to tune postgres to accept connections faster? Maybe some other settings to speed up the server?

My settings:

PostgreSQL:
max_connections = 512
shared_buffers = 8192
max_fsm_relations = 1
max_fsm_pages = 10
max_locks_per_transaction = 512
wal_buffers = 32
sort_mem = 327681
vacuum_mem = 8192
fsync = true
effective_cache_size = 100
log_connections = true
log_pid = true
log_statement = true
log_duration = true
log_timestamp = true

Kernel:
kernel.shmmni = 8192
kernel.shmall = 134217728
kernel.shmmax = 536870912
RLIMIT_NPROC=1000

greetings
Marek
Re: [ADMIN] performance problem - 10.000 databases
> I could make persistent connections, but with 10.000 clients that
> would kill the server.

But if they're virtual domains, why would you need one connection per domain? You should only need one connection per apache process...
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Fri, 31-10-2003 at 11:52, Matt Clark wrote:

> But if they're virtual domains, why would you need one connection per
> domain? You should only need one connection per apache process...

Because every virtual domain has its own database, username and password. So one client, domain1.com with db: domain1db, user: domain1user, cannot access the second client's database domain2.com with db: domain2db, user: domain2user. pg_hba.conf looks like this:

host  domain1db  domain1user  ip_addr  netmask  md5
host  domain2db  domain2user  ip_addr  netmask  md5
...
(10.000 records)
...

You know, one client cannot access another client's database. So if one process is connected to domain1db, it cannot access domain2db and the others.

greetings
Marek
Re: [ADMIN] performance problem - 10.000 databases
Marek Florianczyk wrote:

> But my problem is that when I run the command: psql -h 127.0.0.1
> dbname dbuser I wait about 3-5 sec to enter the psql monitor, so every
> new connection from apache will wait about 3-5 sec before it can send
> a query to the server. That's a very long time...

Why don't you use a connection manager?

Regards
Gaetano Mendola
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Fri, 31-10-2003 at 12:25, Matt Clark wrote:

> Ooh, I see. That's a tricky one. Do you really need that level of
> separation?

Well, if you talk with the clients, and they promise that they will not access other databases, and especially won't do "drop database my_best_friend_db", I can put:

host  any  any  0.0.0.0  0.0.0.0  trust

at the very beginning of pg_hba.conf ;)

greetings ;)
Marek
Re: [ADMIN] performance problem - 10.000 databases
> In a message dated Fri, 31-10-2003 at 12:25, Matt Clark wrote:
> > Ooh, I see. That's a tricky one. Do you really need that level of
> > separation?
>
> Well, if you talk with the clients, and they promise that they will not
> access other databases, and especially won't do "drop database
> my_best_friend_db", I can put:
>
> host  any  any  0.0.0.0  0.0.0.0  trust
>
> at the very beginning of pg_hba.conf ;)

I was more thinking that it might be possible to manage the security at a different level than the DB.
Re: [ADMIN] performance problem - 10.000 databases
On Fri, 31 Oct 2003, Matt Clark wrote:

> I was more thinking that it might be possible to manage the security
> at a different level than the DB.

We do this with users and permissions. Each virtual host has an apache config include specifying a db user, pass (and database, although most of them use the same one). Permissions on the database tables are set so that a given vhost can only access their own data.

Our setup is mod_perl. Don't know how one would go about doing this with PHP, but I imagine it has some mechanism for per-vhost variables or similar.

-j

--
Jamie Lawrence [EMAIL PROTECTED]
"Remember, half-measures can be very effective if all you deal with are half-wits." - Chris Klein
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Fri, 31-10-2003 at 13:54, Jamie Lawrence wrote:

> We do this with users and permissions. Each virtual host has an apache
> config include specifying a db user, pass (and database, although most
> of them use the same one). Permissions on the database tables are set
> so that a given vhost can only access their own data.
>
> Our setup is mod_perl. Don't know how one would go about doing this
> with PHP, but I imagine it has some mechanism for per-vhost variables
> or similar.

So, as I understand it, an apache vhost can only connect to a specified database. Strange... no PHP, only mod_perl that fetches data from the database and writes the html document? So clients don't write any scripts and don't use functions like pg_connect? Do they use CGI with mod_perl and write their scripts in perl? Interesting. I don't know if it's possible with PHP -- I don't think so.

But... if I had 200 or even 900 clients, I would do apache with vhosts. But when I have 10.000 clients, apache cannot work with vhosts (some system limitation). So we use our own dynamic vhost module. When a request is made to the server, it checks the domain part of the request, searches LDAP for that domain's DocumentRoot, and then returns the proper file. The config looks as if there were only one vhost, but it works with 10.000 domains ;)

No, I think that your solution would not work for us. Everything is complicated when a large number of anything occurs. ;)

greetings
sorry for my bad english
Re: [ADMIN] performance problem - 10.000 databases
Marek Florianczyk [EMAIL PROTECTED] writes:

> We are building hosting with apache + php (our own mod_virtual module)
> with about 10.000 virtual domains + PostgreSQL. PostgreSQL is on a
> different machine (2 x Intel Xeon 2.4GHz, 1GB RAM, SCSI RAID 1+0).
> I've made some tests -- 3000 databases and 400 clients connected at
> the same time.

You are going to need much more serious iron than that if you want to support 10000 active databases. The required working set per database is a couple hundred K just for system catalogs (I don't have an exact figure in my head, but it's surely of that order of magnitude). So the system catalogs alone would require 2 gig of RAM to keep 'em swapped in; never mind caching any user data.

The recommended way to handle this is to use *one* database and create 10000 users each with his own schema. That should scale a lot better.

Also, with a large max_connections setting, you have to beware that your kernel settings are adequate --- particularly the open-files table. It's pretty easy for Postgres to eat all your open-files slots. PG itself will usually survive this condition just fine, but everything else you run on the machine will start falling over :-(. For safety you should make sure that max_connections * max_files_per_process is comfortably less than the size of the kernel's open-files table.

regards, tom lane
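A minimal sketch of the setup Tom recommends, assuming 7.3 and placeholder names:

    CREATE USER domain1user WITH PASSWORD 'secret';
    CREATE SCHEMA domain1user AUTHORIZATION domain1user;  -- owner gets USAGE + CREATE

With the default search_path of '$user, public', domain1user's unqualified table names then resolve to his own schema automatically, so the application SQL does not have to change.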
Re: [ADMIN] performance problem - 10.000 databases
Hmm, maybe you need to back off a bit here on your expectations. You said your test involved 400 clients simultaneously running queries that hit pretty much all the data in each client's DB. Why would you expect that to be anything *other* than slow? And does it reflect expected production use? Unless those 10,000 sites are all fantastically popular, surely it's more likely that only a small number of queries will be in progress at any given time? You're effectively simulating running 400 _very_ popular dynamic websites off one 2-cpu DB server.

You also said that CPU is pegged at 100%. Given that you've got 400 backends all competing for CPU time, you must have an insane load average too, so improving the connect time might prove to be of no use: you could well just get faster connects and then slower queries!

Sorry this email wasn't more constructive ;-)

M
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Fri, 31-10-2003 at 15:23, Tom Lane wrote:

> You are going to need much more serious iron than that if you want to
> support 10000 active databases. The required working set per database
> is a couple hundred K just for system catalogs (I don't have an exact
> figure in my head, but it's surely of that order of magnitude).

It's about 3.6M.

> So the system catalogs alone would require 2 gig of RAM to keep 'em
> swapped in; never mind caching any user data.
>
> The recommended way to handle this is to use *one* database and create
> 10000 users each with his own schema. That should scale a lot better.
>
> Also, with a large max_connections setting, you have to beware that
> your kernel settings are adequate --- particularly the open-files
> table. [...]

Yes, I have made some updates: number of processes, semaphores, and file descriptors. I'm aware of this limitation; on this machine there will be only PostgreSQL, nothing else.

This idea with one database and 10.000 schemas is very interesting; I never thought about that. I will make some tests on Monday and send the results to the list.

greetings
Marek
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Fri, 31-10-2003 at 15:30, Matt Clark wrote:

> Hmm, maybe you need to back off a bit here on your expectations. [...]
> You're effectively simulating running 400 _very_ popular dynamic
> websites off one 2-cpu DB server.

Well, maybe these queries will not happen in production life, but if many clients make large tables with no indexes, the effect can be the same. Besides, I wanted to identify the bottleneck on this machine before we put it into production. PostgreSQL was working quite well, and if not for this long time to connect to the database I would be quite happy. But the solution from Tom is great, I think, so I must test it.

have a nice weekend!
Marek
Re: [ADMIN] performance problem - 10.000 databases
On Friday 31 October 2003 09:59 am, Marek Florianczyk wrote:

> In a message dated Fri, 31-10-2003 at 15:23, Tom Lane wrote:
> > The recommended way to handle this is to use *one* database and
> > create 10000 users each with his own schema. That should scale a lot
> > better.
>
> This idea with one database and 10.000 schemas is very interesting; I
> never thought about that. I will make some tests on Monday and send
> the results to the list.

Following this logic, if you are willing to place the authentication in front of the database instead of inside it, you can use a connection pool and simply change the search_path each time a new user accesses the database.

--
Mike Rylander
Re: [ADMIN] performance problem - 10.000 databases
In a message dated Fri, 31-10-2003 at 16:51, Mike Rylander wrote:

> Following this logic, if you are willing to place the authentication in
> front of the database instead of inside it, you can use a connection
> pool and simply change the search_path each time a new user accesses
> the database.

Well, it's not so simple, I think. I've got apache+php+phpAccelerator with persistent connections on. The server holds a pool of connections, but when a new request comes from the php scripts, apache looks in its connection pool for a connection with the parameters dbname, dbuser, dbserver. So for each of the 10.000 virtual domains (10.000 databases) it would have to hold such a connection.

Second thing: how to change search_path dynamically? I can set in postgresql.conf: search_path = '$user, public', but that works when a new client (username, password) connects to the server -- it gets its own schema with the proper privileges (USAGE, CREATE), and that's all. Right? Or maybe I don't understand something?

If I have only one record in pg_hba.conf:

host  any  any  ip_addr  netmask  md5

and only one database, I must make 10.000 schemas with a proper access list (USAGE, CREATE only for one user, and the schema name the same as the dbusername). This is what I want to test ;)

Now a user connects from a php script with dbusername=unique_user dbpass=unique_pass dbname=shared_db. The server holds persistent connections, but it's still one connection per user, so it would have to hold 10.000 simultaneous connections. I can't see the benefit of the connection pool -- or I did not understand what you wanted to tell me. How do you place authentication in front of the database when clients are using php scripts?
greetings
Marek
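On the "how to change search_path dynamically" question: it is an ordinary run-time setting, so a shared connection can switch it per request. A sketch, assuming schema names match usernames:

    SET search_path TO test123, public;  -- unqualified names now hit test123.*
    SELECT * FROM table1;                -- reads test123.table1
    SET search_path TO test124, public;  -- switch to the next user's schema
    SELECT * FROM table1;                -- now reads test124.table1

Note this only controls name resolution, not privileges, which is why Mike's scheme puts authentication in the application layer.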
Re: [ADMIN] performance problem - 10.000 databases
We have a similar issue regarding security. Some of the access to our database will be by ODBC connections for reporting purposes (i.e. Actuate Report/Crystal Reports). Without creating a zillion or so views (which I suspect carries with it a lot of overhead), I believe it would be tricky to maintain security. Our application is medical related, and we are bound by HIPAA rules, so security is most important. How would you architect this scenario so our ASP customers cannot see each other's data?

Naomi

> We do this with users and permissions. Each virtual host has an apache
> config include specifying a db user, pass (and database, although most
> of them use the same one). Permissions on the database tables are set
> so that a given vhost can only access their own data.
>
> Our setup is mod_perl. Don't know how one would go about doing this
> with PHP, but I imagine it has some mechanism for per-vhost variables
> or similar.

-
Naomi Walker
Chief Information Officer
Eldorado Computing, Inc.
[EMAIL PROTECTED]
602-604-3100
-
"Insanity is doing things in the same way and expecting different results."
Re: [ADMIN] performance problem - 10.000 databases
On 31 Oct 2003, Marek Florianczyk wrote:

> Hi all
> We are building hosting with apache + php (our own mod_virtual module)
> with about 10.000 virtual domains + PostgreSQL. PostgreSQL is on a
> different machine (2 x Intel Xeon 2.4GHz, 1GB RAM, SCSI RAID 1+0).

Tom's right, you need more memory, period, and probably want a very large RAID1+0 (with like 10 or more disks).

> Has anyone an idea how to tune postgres to accept connections faster?

Postgresql will take the amount of time it needs. Connections, especially in a contentious environment, aren't cheap.

> Maybe some other settings to speed up the server? My settings:
> PostgreSQL:
> max_connections = 512
> shared_buffers = 8192
> [...]
> sort_mem = 327681

THIS IS WAY TOO HIGH. That's ~320Meg PER SORT. Drop this down to something reasonable like 8192 (i.e. 8 meg). If there were lots of big sorts going on by all 300 users, that's 300*320 Meg of memory that could get used up -- i.e. a swap storm.

Have you adjusted random_page_cost to reflect your I/O setup? While the default of 4 is a good number for a single-drive server, it's kinda high for a machine with 4 or more drives in an array. Figures from 1.2 to 2.0 seem common. My database under 7.2.4 runs best with a random_page_cost of about 1.4.
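For illustration, the per-session form of that sort_mem fix (the value is scott's suggested 8192, i.e. ~8 MB per sort):

    SET sort_mem = 8192;  -- instead of ~320 MB per sort
    SHOW sort_mem;        -- confirm the active value

The same setting in postgresql.conf changes the default for every backend.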
Re: [ADMIN] performance problem - 10.000 databases
On Friday 31 October 2003 11:19 am, Marek Florianczyk wrote:

> Second thing: how to change search_path dynamically? I can set in
> postgresql.conf: search_path = '$user, public', but that works when a
> new client (username, password) connects to the server -- it gets its
> own schema with the proper privileges (USAGE, CREATE), and that's all.
> Right? Or maybe I don't understand something?

search_path documentation is here:
http://www.postgresql.org/docs/view.php?version=7.3&idoc=1&file=ddl-schemas.html

> How do you place authentication in front of the database when clients
> are using php scripts?

I suppose I didn't really explain what I was thinking. The scenario I was thinking of would go something like this:

User logins (ssh, etc...), if available, would be PAM based. The user/customer creation process would create a new schema in the single database with the username used for web/shell/ftp/etc logins. PostgreSQL can also use PAM for logins, and this would allow logins to the database from outside your web app. The web app would always connect to the database as a user with access to all schemas, but would look at the session authentication information to change the active search path to be [username],public. In the case of shell (psql) logins, the default search path would be $user,public as the docs show. If the schemas are created with an AUTHORIZATION of the [username], then local (psql) logins would only allow them to see their own schema. But, because the web app is connecting as a user with privileges that allow it to see (the tables in) all schemas, it can act as any user by changing its search_path on a connection by connection basis.
[ADMIN] Performance Problem Index Ignored, but why
I've the task of porting a current Oracle application to PostgreSQL.

Database: 7.2.1
OS: Linux 2.4.9-13smp

I've an odd thing happening with a query. Using a simple table:

         Table "state_tst"
 Column  |         Type         | Modifiers
---------+----------------------+-----------
 id      | integer              | not null
 v_state | character varying(2) |
 f_state | character(2)         |
Indexes: st_f_state_idx, st_v_state_idx
Primary key: state_tst_pkey

id is a sequence number and the primary key; v_state and f_state are 2-character U.S. states. I created v_state as varchar(2) and f_state as char(2) to test whether the query explained/performed differently (it doesn't).

CREATE INDEX st_v_state_idx ON state_tst USING btree (v_state);
CREATE INDEX st_f_state_idx ON state_tst USING btree (f_state);

Load the table using a copy from ...
vacuum verbose analyze state_tst;

Total rows: 14309241

Queries using either f_state = or v_state = explain (and appear to execute) using a sequential scan, resulting in 60-80 second query times. Can I force the use of an index? Or do I have something wrong? Any ideas?

pg_test=# explain select count(*) from state_tst where f_state = 'PA';
NOTICE:  QUERY PLAN:

Aggregate  (cost=277899.65..277899.65 rows=1 width=0)
  ->  Seq Scan on state_tst  (cost=0.00..277550.51 rows=139654 width=0)

EXPLAIN
Re: [ADMIN] Performance Problem Index Ignored, but why
On Wed, May 22, 2002 at 12:26:35AM -0400, Tom Lane wrote:

> Thomas A. Lowery [EMAIL PROTECTED] writes:
> > Can I force the use of an index?
>
> Try set enable_seqscan = off. But on the basis of what you've shown,
> it's not obvious that an indexscan will be faster. Is the planner's
> estimate that 139654 rows will match f_state = 'PA' in the right
> ballpark?

No, 375342 is the actual number. Using the index does appear slower (limited testing noted).

explain select count(*) from state_tst where f_state = 'PA';
NOTICE:  QUERY PLAN:

Aggregate  (cost=277899.65..277899.65 rows=1 width=0)
  ->  Seq Scan on state_tst  (cost=0.00..277550.51 rows=139654 width=0)

select count(*) from state_tst where f_state = 'PA';
count
'375342'
Elapsed: 139 wallclock secs

set enable_seqscan = off;

explain select count(*) from state_tst where f_state = 'PA';
NOTICE:  QUERY PLAN:

Aggregate  (cost=542303.53..542303.53 rows=1 width=0)
  ->  Index Scan using st_f_state_idx on state_tst  (cost=0.00..541954.39 rows=139654 width=0)

select count(*) from state_tst where f_state = 'PA';
count
'375342'
Elapsed: 222 wallclock secs

Tom
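One detail worth noting in those numbers: the planner's estimate (139654 rows) is well under the actual count (375342). A sketch of tightening that estimate on 7.2 -- though for count(*) over ~375k of 14.3M rows the seq scan may well remain the right plan:

    ALTER TABLE state_tst ALTER COLUMN f_state SET STATISTICS 100;  -- default target is 10
    ANALYZE state_tst;
    EXPLAIN SELECT count(*) FROM state_tst WHERE f_state = 'PA';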
Re: [ADMIN] Performance Problem Index Ignored, but why
I'm not sure how well this works in Postgres. For Informix, we could trick the optimizer into using an index with something like:

select colname from table where colname != NULL;

Specifically mentioning the column in the query was the trick. In esql/C, there were return parameters that then told you how many rows were found.

> CREATE INDEX st_v_state_idx ON state_tst USING btree (v_state);
> CREATE INDEX st_f_state_idx ON state_tst USING btree (f_state);
> [...]
> Can I force the use of an index? Or do I have something wrong?
Re: [ADMIN] Performance Problem Index Ignored, but why
Thomas A. Lowery [EMAIL PROTECTED] writes:

> Can I force the use of an index?

Try set enable_seqscan = off. But on the basis of what you've shown, it's not obvious that an indexscan will be faster. Is the planner's estimate that 139654 rows will match f_state = 'PA' in the right ballpark?

regards, tom lane
[ADMIN] performance problem with version 7.0.3
Hello there

Running PostgreSQL 7.0.3 on a SuSE Linux server (1GHz PIII, 384 MB RAM), I have problems with performance. There are three to six postmaster processes running: one with an ODBC connection to an NT server, the others use JDBC.

Each month we have this situation: the customer encounters bad/slow performance. A list that normally takes 20 seconds to display takes up to 3 minutes. I did a vacuum verbose analyze zylinder on the main table for this list and got the following output:

mondadori=# vacuum verbose analyze zylinder;
NOTICE:  --Relation zylinder--
NOTICE:  Pages 12275: Changed 2, reaped 12264, Empty 0, New 0; Tup 1609: Vac 143979, Keep/VTL 4/4, Crash 0, UnUsed 0, MinLen 557, MaxLen 703; Re-using: Free/Avail. Space 98833500/98830092; EndEmpty/Avail. Pages 0/12263. CPU 0.47s/0.45u sec.
NOTICE:  Index z_a_nr_idx: Pages 555; Tuples 1609: Deleted 143979. CPU 0.05s/1.03u sec.
NOTICE:  Index zylinder_pkey: Pages 830; Tuples 1609: Deleted 143979. CPU 0.06s/0.94u sec.
NOTICE:  Rel zylinder: Pages: 12275 --> 134; Tuple(s) moved: 700. CPU 1.14s/0.37u sec.
NOTICE:  Index z_a_nr_idx: Pages 555; Tuples 1609: Deleted 700. CPU 0.04s/0.00u sec.
NOTICE:  Index zylinder_pkey: Pages 830; Tuples 1609: Deleted 700. CPU 0.06s/0.01u sec.
VACUUM

mondadori=# vacuum verbose analyze zylinder;
NOTICE:  --Relation zylinder--
NOTICE:  Pages 134: Changed 0, reaped 3, Empty 0, New 0; Tup 1609: Vac 0, Keep/VTL 4/0, Crash 0, UnUsed 7, MinLen 557, MaxLen 703; Re-using: Free/Avail. Space 4348/0; EndEmpty/Avail. Pages 0/0. CPU 0.01s/0.22u sec.
NOTICE:  Index z_a_nr_idx: Pages 555; Tuples 1609: Deleted 0. CPU 0.02s/0.00u sec.
NOTICE:  Index zylinder_pkey: Pages 830; Tuples 1609: Deleted 0. CPU 0.02s/0.03u sec.
VACUUM

mondadori=# vacuum verbose analyze zylinder;
NOTICE:  --Relation zylinder--
NOTICE:  Pages 134: Changed 0, reaped 7, Empty 0, New 0; Tup 1605: Vac 4, Keep/VTL 0/0, Crash 0, UnUsed 7, MinLen 557, MaxLen 703; Re-using: Free/Avail. Space 8564/4216; EndEmpty/Avail. Pages 0/4. CPU 0.01s/0.23u sec.
NOTICE:  Index z_a_nr_idx: Pages 555; Tuples 1605: Deleted 4. CPU 0.03s/0.00u sec.
NOTICE:  Index zylinder_pkey: Pages 830; Tuples 1605: Deleted 4. CPU 0.04s/0.00u sec.
NOTICE:  Rel zylinder: Pages: 134 --> 134; Tuple(s) moved: 4. CPU 0.00s/0.00u sec.
NOTICE:  Index z_a_nr_idx: Pages 555; Tuples 1605: Deleted 4. CPU 0.02s/0.01u sec.
NOTICE:  Index zylinder_pkey: Pages 830; Tuples 1605: Deleted 4. CPU 0.04s/0.00u sec.
VACUUM

Please help me analyze this output. After vacuum/analyze, the performance is OK for another month.

Thanks in advance... jr

__
PFISTER + PARTNER, SYSTEM - ENGINEERING AG
Juerg Rietmann
Grundstrasse 22a
6343 Rotkreuz
Switzerland
internet: www.pup.ch
phone: +4141 790 4040
fax: +4141 790 2545
mobile: +4179 211 0315
__
Re: [ADMIN] performance problem with version 7.0.3
[EMAIL PROTECTED] writes:

> Please help me analyze this output. After vacuum/analyze, the
> performance is OK for another month.

So, vacuum on a regular basis. Most people do it at least once a day.

regards, tom lane
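Concretely, the kind of thing a nightly scheduled job would run (a sketch; 7.0 has nothing automatic for this):

    VACUUM ANALYZE;            -- whole database
    VACUUM ANALYZE zylinder;   -- or just the table that bloats fastest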
Re: [ADMIN] performance problem with version 7.0.3
[EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

> Hello there
> [...]
> Please help me analyze this output. After vacuum/analyze, the
> performance is OK for another month.

Are you regularly doing a vacuum analyze? This is a critical step in maintaining the database, as PostgreSQL doesn't re-use deleted/updated tuples until after a vacuum. In order to keep db queries fast within PostgreSQL, you'll need to perform a vacuum analyze regularly. The frequency will depend on how often the database has updated/deleted/inserted tuples.

HTH,
-Tony