[Pgpool-general] performance sol10 zone

2010-08-03 Thread Heiko L.
Hallo,

Im running pg-8 on sol10-zone.


After update sol10u7 queries on coltype timestamp are very slow.
System: sparc, 2GB RAM

This DB is a greylist-DB to fight spam.
500 connections should be easy.
But 16 connection  consum 10sec/query.
On another system (sparc) only 1 sec.i

s. details

howto diag?

regards heiko


- details

  $ /opt/csw/postgresql/bin/postmaster -V
  postgres (PostgreSQL) 8.3.1

  - numDS: 200k-800k
  - conn: 50-500

- testbed
 tab=blacklist
 filter=^ .*[0-9]|elaps
 cmd=/usr/bin/time psql -t -p 5432 -h 192.168.5.126 -U smtpuser smtp
 sql=select count(*) from $tab;
 sql1=select count(*) from $tab where create_time  abstime(int4(timenow()) -30
00);
 sql2=select count(*) from $tab where create_time  abstime(int4(timenow()) -30
00);
 echo $sql | $cmd 21 | egrep $filter;
 echo $sql1 | $cmd 21 | egrep $filter;
 echo $sql2 | $cmd 21 | egrep $filter;

- result:
  152603
  0.01user 0.01system 0:00.29elapsed 10%CPU (0avgtext+0avgdata 0maxresident)k
 0
  0.02user 0.01system 0:00.05elapsed 58%CPU (0avgtext+0avgdata 0maxresident)k
  152603
  0.00user 0.02system 0:01.95elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k


  - select without where: 15 / 0.3 s
  - select with where:  15 / 2 s
  - time is depended on recv DS (linear)

I simulate an paralle queries:
$ ./sqltestcon.sh -v 1 -i 20 -min 1 -max 64 -sql $sql1; $sql2 -cmd $cmd
3081844 sum: 1 (5) * 13 = 13 -  0 R/s, e=0 0:01.36elapsed 3%CPU
3081901 sum: 2 (7) * 10 = 20 -  1 R/s, e=0 0:01.53elapsed 3%CPU
3081922 sum: 4 (11) * 8 = 32 -  1 R/s, e=0 0:02.40elapsed 0%CPU
3081947 sum: 8 (19) * 5 = 40 -  2 R/s, e=0 0:04.69elapsed 0%CPU
3082005 sum: 16 (36) * 2 = 32 -  1 R/s, e=0 0:09.06elapsed 0%CPU
3082043 sum: 32 (67) * 2 = 64 -  3 R/s, e=0 0:17.83elapsed 0%CPU
3082119 sum: 64 (130) * 1 = 64 -  3 R/s, e=0 0:34.19elapsed 0%CPU

  - 16 connections:  only 2 passes - aprox. 10s /query

$ connstat -cols _sy_load 5432 5433 _ps_pool _ps_postmaster 60
## time load54325433pgpoolpostmaster
08:18:340260125536
08:19:3647111055314
08:20:401804   2 3555342
08:21:51587763128553132

  - 128 connections: Load=58!

- following test are used:

  - zfs recordsize: 128,8  - no differ
  - zfs load:  10% ops, 10% read
  - shmmax: (0x8000)
  - shared_buffers
  - work_mem 64,256,1024
  - wal_buffers
  - effective_cache_size
  - #log_disconnections = off
  - autovacuum = on , off

What can i do?
Howto interpret explain output?

$ echo EXPLAIN ANALYSE $sql2 | $cmd 21
 Aggregate  (cost=5236.53..5236.54 rows=1 width=0) (actual time=2018.985..2018.9
86 rows=1 loops=1)
   -  Seq Scan on blacklist  (cost=0.00..4855.06 rows=152588 width=0) (actual t
ime=0.329..1883.275 rows=152603 loops=1)
 Filter: (create_time  timenow())::integer - 3000))::abstime)::time
stamp with time zone)
 Total runtime: 2019.371 ms

0.00user 0.02system 0:02.07elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+2353minor)pagefaults 0swaps



- test 5 frische DB

tab=testbl
f=/tmp/test.sql
cat  $f EOF
create table $tab
(
relay_ipinet,
create_time timestampdefault now() NOT NULL
);

create index ${tab}_relay_ip_idx on ${tab}(relay_ip);
create index ${tab}_create_time_idx on ${tab}(create_time);
EOF
cat $f | $cmd


- memstat [10]

 $  echo ::memstat|mdb -k
 Page SummaryPagesMB  %Tot
      
 Kernel 189712  1482   74%
 Anon25308   197   10%
 Exec and libs1991151%
 Page cache   1849141%
 Free (cachelist) 2949231%
 Free (freelist) 35060   273   14%

 Total  256869  2006
 Physical   255327  1994


- shm (org)
 $ echo shminfo_shmmax/E | mdb -k
 shminfo_shmmax:
 shminfo_shmmax: 8388608



___
Pgpool-general mailing list
Pgpool-general@pgfoundry.org
http://pgfoundry.org/mailman/listinfo/pgpool-general


Re: [Pgpool-general] performance sol10 zone

2010-08-03 Thread Josh Berkus
Heiko,

First, what does this have to do with pgPool?

  Filter: (create_time  timenow())::integer - 
 3000))::abstime)::time
 stamp with time zone)
  Total runtime: 2019.371 ms

Looks like you don't have a usable index on create_time.


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com
___
Pgpool-general mailing list
Pgpool-general@pgfoundry.org
http://pgfoundry.org/mailman/listinfo/pgpool-general


Re: [Pgpool-general] Order of Query Results

2010-08-03 Thread Sean Brown
On Wednesday, July 28, 2010 07:08:42 pm you wrote:
  Le 28/07/2010 17:16, Sean Brown a écrit :
   On Wednesday, July 28, 2010 10:56:27 am you wrote:
   Le 28/07/2010 16:50, Sean Brown a écrit :
   Does PgPool have an issue with the order of results from a query
   being returned in a different order?
   
   With the ongoing issue of our cluster falling out, we just had one
   member fall out again and again, no error in PostgreSQLs log is
   recorded. PgPool reports the possible last query was:
   select * from carts where cart_id in (11835,11824,11819)
   
   Specifically the error in PgPools log is:
   read_kind_from_backend: 1 th kind D does not match with master or
   majority connection kind ^@
   2010-07-28 10:21:23 ERROR: pid 28242: kind mismatch among backends.
   Possible last query was: select * from carts where cart_id in
   (11835,11824,11819) kind details are: 1[D]
   
   If I run that query on the remaining node and on the one that just
   fell out, I get the same 3 results, but the orders of the records
   are different.
   
   The query you show doesn't ask for a specific order (no ORDER BY
   clause), so each backend can send the data in whatever order they
   prefer.
   
   That I understand, what I am wondering is if this will cause PgPool to
   believe there is an error in the information returned from the
   backends, even if the only difference is the difference in order?
   
   If it does, I assume the best way to deal with it is add order by's to
   every query passed to pgpool?
  
  That's actually really good questions, and I don't have an answer to
  them. CC back the list to know if other people have an answer to give us.
 
 No, pgpool does not care about the order of data returned by SELECT.
 More details...
 
  PostgreSQL returns data packet like this:
 T(descriptions about tuple)
 D(actual one raw tuple data)
 D
 D
 
 
 C(indicates succeeded in sending data)
 
 where each single capital letter is packet kind(see PostgreSQL docs
 for more details). What pgpool actually does is, checking the packet
 kind, not content of tuple data.  So as long as same number of tuples
 are returned from each backend, pgpool is happy.
 
 I'm not sure what pgpool version Sean's uses, it seems it's a little
 bit old (from the error message I guess). Also, majority connection
 kind ^@ looks strange. ^@ = 0x00, which is not valid kind at all.
 It seems something unusual is going on...
 
 If self cantained test case is provided, I will be able to look into
 this.
 --
 Tatsuo Ishii
 SRA OSS, Inc. Japan
 English: http://www.sraoss.co.jp/index_en.php
 Japanese: http://www.sraoss.co.jp

Ok I have some more information. The cluster fell apart again yesterday. 
Again, there was nothing out of the ordinary in the postgresql logs leading 
upto the nodes falling out but this was logged in the pgpool log

First node fell out at 10:36am:

2010-08-02 10:36:47 DEBUG: pid 16699: AsciiRow: 24 th field size does not match 
between master(16777216) and 2 th backend(0)
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 0 
th backend ^@ NUM_BACKENDS: 3
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 1 
th backend ^@ NUM_BACKENDS: 3
2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind from 2 
th backend D NUM_BACKENDS: 3
2010-08-02 10:36:47 ERROR: pid 16699: read_kind_from_backend: 2 th kind D does 
not match with master or majority connection kind ^@
2010-08-02 10:36:47 ERROR: pid 16699: kind mismatch among backends. Possible 
last query was: select * from carts where cart_id in 
(12979,12984,12987,12986,12982,12981) kind details are: 2[D]
2010-08-02 10:36:47 LOG:   pid 16699: notice_backend_error: 2 fail over 
request from pid 16699
2010-08-02 10:36:47 DEBUG: pid 24933: failover_handler called

Second node fell out at 1:56pm:

2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t
2010-08-02 13:56:21 DEBUG: pid 19720: read_kind_from_backend: read kind from 0 
th backend ^@ NUM_BACKENDS: 3
2010-08-02 13:56:21 DEBUG: pid 19720: read_kind_from_backend: read kind from 1 
th backend D NUM_BACKENDS: 3
2010-08-02 13:56:21 ERROR: pid 19720: read_kind_from_backend: 1 th kind D does 
not match with master or majority connection kind ^@
2010-08-02 13:56:21 ERROR: pid 19720: kind mismatch among backends. Possible 
last query was: select * from carts where cart_id in (13019,13018) kind 
details are: 1[D]
2010-08-02 13:56:21 LOG:   pid 19720: notice_backend_error: 1 fail over 
request from pid 19720
2010-08-02 13:56:21 DEBUG: pid 24933: failover_handler called

I notice two things, first that we died both times on a query of the carts 
table, but this could just be a coincidence. Second, we have the ^@ showing up 
as a kind again, except while two nodes returned ^@ when the first node fell 
out, only one returned ^@ when the second fell out, even though it was one of 
the nodes that returned ^@ that morning.

I have PgPool 2.3.2 and PostgreSQL 8.4.4 

[Pgpool-general] performance sol10 zone

2010-08-03 Thread Heiko L.
 First, what does this have to do with pgPool?
maybe less.
to butter sb. uppgpool admin eq. postgres master/to butter sb. up

pgpool running to minimized fork postmaster processes.
(pgpoolII :5432, postmaster :5433, s. output connstat.sh)

 Looks like you don't have a usable index on create_time.
my test (without success):
- reindex
- drop, create index
- drop, create database

Howto diag?

regards heiko

___
Pgpool-general mailing list
Pgpool-general@pgfoundry.org
http://pgfoundry.org/mailman/listinfo/pgpool-general