Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-11-15 Thread Bruce Momjian

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Jignesh K. Shah wrote:
> 
> I  changed  CLOG Buffers to 16
> 
> Running the test again:
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU IDFUNCTION:NAME
>   0   1027   :tick-5sec
> 
>   /export/home0/igen/pgdata/pg_clog/0024   
> -27530282192961
>   /export/home0/igen/pgdata/pg_clog/0025   
> -27530282111041
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU IDFUNCTION:NAME
>   1   1027   :tick-5sec
> 
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU IDFUNCTION:NAME
>   1   1027   :tick-5sec
> 
> # ./read.d
> dtrace: script './read.d' matched 2 probes
> CPU IDFUNCTION:NAME
>   0   1027   :tick-5sec
> 
>   /export/home0/igen/pgdata/pg_clog/0025   
> -27530281947201
> 
> 
> So Tom seems to be correct that it is a case of CLOG Buffer thrashing. 
> But since I saw the same problem with two different workloads, I think 
> people hitting this problem is pretty high.
> 
> Also I am bit surprised that CLogControlFile did not show up as being 
> hot.. Maybe because not much writes are going on .. Or maybe since I did 
> not trace all 500 users to see their hot lock status..
> 
> 
> Dmitri has another workload to test, I might try that out later on to 
> see if it causes similar impact or not.
> 
> Of course I havent seen my throughput go up yet since I am already CPU 
> bound... But this is good since the number of IOPS to the disk are 
> reduced (and hence system calls).
> 
> 
> If I take this as my baseline number.. I can then proceed to hunt other 
> bottlenecks
> 
> 
> Whats the view of the community?
> 
> Hunt down CPU utilizations or Lock waits next?
> 
> Your votes are crucial on where I put my focus.
> 
> Another thing Josh B told me to check out was the wal_writer_delay setting:
> 
> I have done two settings with almost equal performance (with the CLOG 16 
> setting) .. One with 100ms and other default at 200ms.. Based on the 
> runs it seemed that the 100ms was slightly better than the default .. 
> (Plus the risk of loosing data is reduced from 600ms to 300ms)
> 
> Thanks.
> 
> Regards,
> Jignesh
> 
> 
> 
> 
> Tom Lane wrote:
> > "Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
> >   
> >> So the ratio of reads vs writes to clog files is pretty huge..
> >> 
> >
> > It looks to me that the issue is simply one of not having quite enough
> > CLOG buffers.  Your first run shows 8 different pages being fetched and
> > the second shows 10.  Bearing in mind that we "pin" the latest CLOG page
> > into buffers, there are only NUM_CLOG_BUFFERS-1 buffers available for
> > older pages, so what we've got here is thrashing for the available
> > slots.
> >
> > Try increasing NUM_CLOG_BUFFERS to 16 and see how it affects this test.
> >
> > regards, tom lane
> >
> > ---(end of broadcast)---
> > TIP 3: Have you checked our extensive FAQ?
> >
> >http://www.postgresql.org/docs/faq
> >   
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Jignesh K. Shah


I  changed  CLOG Buffers to 16

Running the test again:
# ./read.d
dtrace: script './read.d' matched 2 probes
CPU IDFUNCTION:NAME
 0   1027   :tick-5sec

 /export/home0/igen/pgdata/pg_clog/0024   
-27530282192961
 /export/home0/igen/pgdata/pg_clog/0025   
-27530282111041

# ./read.d
dtrace: script './read.d' matched 2 probes
CPU IDFUNCTION:NAME
 1   1027   :tick-5sec

# ./read.d
dtrace: script './read.d' matched 2 probes
CPU IDFUNCTION:NAME
 1   1027   :tick-5sec

# ./read.d
dtrace: script './read.d' matched 2 probes
CPU IDFUNCTION:NAME
 0   1027   :tick-5sec

 /export/home0/igen/pgdata/pg_clog/0025   
-27530281947201



So Tom seems to be correct that it is a case of CLOG Buffer thrashing. 
But since I saw the same problem with two different workloads, I think 
people hitting this problem is pretty high.


Also I am bit surprised that CLogControlFile did not show up as being 
hot.. Maybe because not much writes are going on .. Or maybe since I did 
not trace all 500 users to see their hot lock status..



Dmitri has another workload to test, I might try that out later on to 
see if it causes similar impact or not.


Of course I havent seen my throughput go up yet since I am already CPU 
bound... But this is good since the number of IOPS to the disk are 
reduced (and hence system calls).



If I take this as my baseline number.. I can then proceed to hunt other 
bottlenecks



Whats the view of the community?

Hunt down CPU utilizations or Lock waits next?

Your votes are crucial on where I put my focus.

Another thing Josh B told me to check out was the wal_writer_delay setting:

I have done two settings with almost equal performance (with the CLOG 16 
setting) .. One with 100ms and other default at 200ms.. Based on the 
runs it seemed that the 100ms was slightly better than the default .. 
(Plus the risk of loosing data is reduced from 600ms to 300ms)


Thanks.

Regards,
Jignesh




Tom Lane wrote:

"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
  

So the ratio of reads vs writes to clog files is pretty huge..



It looks to me that the issue is simply one of not having quite enough
CLOG buffers.  Your first run shows 8 different pages being fetched and
the second shows 10.  Bearing in mind that we "pin" the latest CLOG page
into buffers, there are only NUM_CLOG_BUFFERS-1 buffers available for
older pages, so what we've got here is thrashing for the available
slots.

Try increasing NUM_CLOG_BUFFERS to 16 and see how it affects this test.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq
  


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Tom Lane
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
> So the ratio of reads vs writes to clog files is pretty huge..

It looks to me that the issue is simply one of not having quite enough
CLOG buffers.  Your first run shows 8 different pages being fetched and
the second shows 10.  Bearing in mind that we "pin" the latest CLOG page
into buffers, there are only NUM_CLOG_BUFFERS-1 buffers available for
older pages, so what we've got here is thrashing for the available
slots.

Try increasing NUM_CLOG_BUFFERS to 16 and see how it affects this test.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Jignesh K. Shah

Also to give perspective on the equivalent writes on CLOG

I used the following script which runs for 10 sec to track all writes to 
the clog directory and here is what it came up with... (This is with 500 
users running)


# cat write.d
#!/usr/sbin/dtrace -s
syscall::write:entry
/execname=="postgres" && 
dirname(fds[arg0].fi_pathname)=="/export/home0/igen/pgdata/pg_clog"/

{
   @write[fds[arg0].fi_pathname,arg1] = count();
}
tick-10sec
{
exit(0);
}

# ./write.d
dtrace: script './write.d' matched 2 probes
CPU IDFUNCTION:NAME
 3   1026  :tick-10sec

 /export/home0/igen/pgdata/pg_clog/001E   
-27530282770881

#
I modified read.d to do a 5sec read
# ./read.d
dtrace: script './read.d' matched 3 probes
CPU IDFUNCTION:NAME
 0  1   :BEGIN
 0   1027   :tick-5sec

 /export/home0/igen/pgdata/pg_clog/001F   
-27530282688961
 /export/home0/igen/pgdata/pg_clog/001F   
-27530282525121
 /export/home0/igen/pgdata/pg_clog/001F   
-27530282852802
 /export/home0/igen/pgdata/pg_clog/001F   
-27530282770883
 /export/home0/igen/pgdata/pg_clog/001F   
-27530282361283
 /export/home0/igen/pgdata/pg_clog/001E   
-27530282852805
 /export/home0/igen/pgdata/pg_clog/001E   
-27530282361289
 /export/home0/igen/pgdata/pg_clog/001E   
-2753028277088   13
 /export/home0/igen/pgdata/pg_clog/001E   
-2753028268896   15
 /export/home0/igen/pgdata/pg_clog/001E   
-2753028252512   27

#

So the ratio of reads vs writes to clog files is pretty huge..


-Jignesh



Jignesh K. Shah wrote:

Tom,

Here is what I did:

I started aggregating all read information:

First I also had added group by pid(arg0,arg1, pid) and the counts 
were all coming as 1


Then I just grouped by filename and location (arg0,arg1 of reads) and 
the counts came back as


# cat read.d
#!/usr/sbin/dtrace -s
syscall::read:entry
/execname=="postgres"/
{
   @read[fds[arg0].fi_pathname, arg1] = count();
}


# ./read.d
dtrace: script './read.d' matched 1 probe
^C

 /export/home0/igen/pgdata/pg_clog/0014   
-27530282934721
 /export/home0/igen/pgdata/pg_clog/0014   
-27530282770881
 /export/home0/igen/pgdata/pg_clog/0015   
-27530282443202
 /export/home0/igen/pgdata/pg_clog/0015   
-2753028268896   14
 /export/home0/igen/pgdata/pg_clog/0015   
-2753028260704   25
 /export/home0/igen/pgdata/pg_clog/0015   
-2753028252512   27
 /export/home0/igen/pgdata/pg_clog/0015   
-2753028277088   28
 /export/home0/igen/pgdata/pg_clog/0015   
-2753028293472   37



FYI  I pressed ctrl-c within like less than a second

So to me this seems that multiple processes are reading the same page 
from different pids. (This was with about 600 suers active.


Aparently we do have a problem that we are reading the same buffer 
address again.  (Same as not being cached anywhere or not finding it 
in cache anywhere).


I reran lock wait script on couple of processes and did not see 
CLogControlFileLock  as a problem..


# ./83_lwlock_wait.d 14341

Lock IdMode   Count
  WALInsertLock   Exclusive   1
  ProcArrayLock   Exclusive  16

Lock Id   Combined Time (ns)
  WALInsertLock   383109
  ProcArrayLock198866236

# ./83_lwlock_wait.d 14607

Lock IdMode   Count
  WALInsertLock   Exclusive   2
  ProcArrayLock   Exclusive  15

Lock Id   Combined Time (ns)
  WALInsertLock55243
  ProcArrayLock 69700140

#

What will help you find out why it is reading the same page again?


-Jignesh



Jignesh K. Shah wrote:
I agree with Tom..  somehow I think  increasing NUM_CLOG_BUFFERS is 
just avoiding the symptom to a later value.. I promise to look more 
into it before making any recommendations to increase NUM_CLOG_BUFFERs.



Because though "iGen"  showed improvements in that area by increasing 
num_clog_buffers , EAStress had shown no improvements.. Plus the 
reason I think this is not the problem in 8.3beta1 since the Lock 
Output clearly does not show CLOGControlFile as to be the issue which 
I had seen in earlier case.  So I dont think that increasing 
NUM_CLOG_BUFFERS will change thing here.


Now I dont understand the code pretty well yet I see three hotspots 
and not sure if they are related to each other
* ProcArrayLock waits  - causing Waits  as reported by 
83_lockwait.

Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Jignesh K. Shah

Tom,

Here is what I did:

I started aggregating all read information:

First I also had added group by pid(arg0,arg1, pid) and the counts 
were all coming as 1


Then I just grouped by filename and location (arg0,arg1 of reads) and 
the counts came back as


# cat read.d
#!/usr/sbin/dtrace -s
syscall::read:entry
/execname=="postgres"/
{
   @read[fds[arg0].fi_pathname, arg1] = count();
}


# ./read.d
dtrace: script './read.d' matched 1 probe
^C

 /export/home0/igen/pgdata/pg_clog/0014   
-27530282934721
 /export/home0/igen/pgdata/pg_clog/0014   
-27530282770881
 /export/home0/igen/pgdata/pg_clog/0015   
-27530282443202
 /export/home0/igen/pgdata/pg_clog/0015   
-2753028268896   14
 /export/home0/igen/pgdata/pg_clog/0015   
-2753028260704   25
 /export/home0/igen/pgdata/pg_clog/0015   
-2753028252512   27
 /export/home0/igen/pgdata/pg_clog/0015   
-2753028277088   28
 /export/home0/igen/pgdata/pg_clog/0015   
-2753028293472   37



FYI  I pressed ctrl-c within like less than a second

So to me this seems that multiple processes are reading the same page 
from different pids. (This was with about 600 suers active.


Aparently we do have a problem that we are reading the same buffer 
address again.  (Same as not being cached anywhere or not finding it in 
cache anywhere).


I reran lock wait script on couple of processes and did not see 
CLogControlFileLock  as a problem..


# ./83_lwlock_wait.d 14341

Lock IdMode   Count
  WALInsertLock   Exclusive   1
  ProcArrayLock   Exclusive  16

Lock Id   Combined Time (ns)
  WALInsertLock   383109
  ProcArrayLock198866236

# ./83_lwlock_wait.d 14607

Lock IdMode   Count
  WALInsertLock   Exclusive   2
  ProcArrayLock   Exclusive  15

Lock Id   Combined Time (ns)
  WALInsertLock55243
  ProcArrayLock 69700140

#

What will help you find out why it is reading the same page again?


-Jignesh



Jignesh K. Shah wrote:
I agree with Tom..  somehow I think  increasing NUM_CLOG_BUFFERS is 
just avoiding the symptom to a later value.. I promise to look more 
into it before making any recommendations to increase NUM_CLOG_BUFFERs.



Because though "iGen"  showed improvements in that area by increasing 
num_clog_buffers , EAStress had shown no improvements.. Plus the 
reason I think this is not the problem in 8.3beta1 since the Lock 
Output clearly does not show CLOGControlFile as to be the issue which 
I had seen in earlier case.  So I dont think that increasing 
NUM_CLOG_BUFFERS will change thing here.


Now I dont understand the code pretty well yet I see three hotspots 
and not sure if they are related to each other
* ProcArrayLock waits  - causing Waits  as reported by 
83_lockwait.d script
* SimpleLRUReadPage - causing read IOs as reported by 
iostat/rsnoop.d

* GetSnapshotData - causing CPU utilization  as reported by hotuser

But I will shut up and do more testing.

Regards,
Jignesh



Tom Lane wrote:

Josh Berkus <[EMAIL PROTECTED]> writes:
 
Actually, 32 made a significant difference as I recall ... do you 
still have the figures for that, Jignesh?



I'd want to see a new set of test runs backing up any call for a change
in NUM_CLOG_BUFFERS --- we've changed enough stuff around this area that
benchmarks using code from a few months back shouldn't carry a lot of
weight.

regards, tom lane
  


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Jignesh K. Shah
I agree with Tom..  somehow I think  increasing NUM_CLOG_BUFFERS is just 
avoiding the symptom to a later value.. I promise to look more into it 
before making any recommendations to increase NUM_CLOG_BUFFERs.



Because though "iGen"  showed improvements in that area by increasing 
num_clog_buffers , EAStress had shown no improvements.. Plus the reason 
I think this is not the problem in 8.3beta1 since the Lock Output 
clearly does not show CLOGControlFile as to be the issue which I had 
seen in earlier case.  So I dont think that increasing NUM_CLOG_BUFFERS 
will change thing here.


Now I dont understand the code pretty well yet I see three hotspots and 
not sure if they are related to each other
* ProcArrayLock waits  - causing Waits  as reported by 
83_lockwait.d script
* SimpleLRUReadPage - causing read IOs as reported by 
iostat/rsnoop.d

* GetSnapshotData - causing CPU utilization  as reported by hotuser

But I will shut up and do more testing.

Regards,
Jignesh



Tom Lane wrote:

Josh Berkus <[EMAIL PROTECTED]> writes:
  
Actually, 32 made a significant difference as I recall ... do you still have 
the figures for that, Jignesh?



I'd want to see a new set of test runs backing up any call for a change
in NUM_CLOG_BUFFERS --- we've changed enough stuff around this area that
benchmarks using code from a few months back shouldn't carry a lot of
weight.

regards, tom lane
  


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Jignesh K. Shah
The problem I saw was first highlighted by EAStress runs with PostgreSQL 
on Solaris with 120-150 users. I just replicated that via my smaller 
internal benchmark that we use here to recreate that problem.


EAStress should be just fine to highlight it.. Just put pg_clog on 
O_DIRECT or something so that all IOs go to disk making it easier to 
observe.


In the meanwhile I will try to get more information.


Regards,
Jignesh


Tom Lane wrote:

Gregory Stark <[EMAIL PROTECTED]> writes:
  

Didn't we already go through this? He and Simon were pushing to bump up
NUM_CLOG_BUFFERS and you were arguing that the test wasn't representative and
some other clog.c would have to be reengineered to scale well to larger
values. 



AFAIR we never did get any clear explanation of what the test case is.
I guess it must be write-mostly, else lazy XID assignment would have
helped this by reducing the rate of XID consumption.

It's still true that I'm leery of a large increase in the number of
buffers without reengineering slru.c.  That code was written on the
assumption that there were few enough buffers that a linear search
would be fine.  I'd hold still for 16, or maybe even 32, but I dunno
how much impact that will have for such a test case.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match
  


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-26 Thread Gregory Stark
"Josh Berkus" <[EMAIL PROTECTED]> writes:

> Actually, 32 made a significant difference as I recall ... do you still have 
> the figures for that, Jignesh?

Well it made a difference but it didn't remove the bottleneck, it just moved
it. IIRC under that benchmark Jignesh was able to run with x sessions
efficiently with 8 clog buffers, x + 100 or so sessions with 16 clog buffers
and x + 200 or so sessions with 32 clog buffers.

It happened that x + 200 was > the number of sessions he wanted to run the
benchmark at so it helped the benchmark results quite a bit. But that was just
an artifact of how many sessions the benchmark needed. A user who needs 1200
sessions or who has a different transaction load might find he needs more clog
buffers to alleviate the bottleneck. And of course most (all?) normal users
use far fewer sessions and won't run into this bottleneck at all.

Raising NUM_CLOG_BUFFERS just moves around the arbitrary bottleneck. This
benchmark is useful in that it gives us an idea where the bottleneck lies for
various values of NUM_CLOG_BUFFERS but it doesn't tell us what value realistic
users are likely to bump into.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-25 Thread Tom Lane
Josh Berkus <[EMAIL PROTECTED]> writes:
> Actually, 32 made a significant difference as I recall ... do you still have 
> the figures for that, Jignesh?

I'd want to see a new set of test runs backing up any call for a change
in NUM_CLOG_BUFFERS --- we've changed enough stuff around this area that
benchmarks using code from a few months back shouldn't carry a lot of
weight.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-25 Thread Josh Berkus
Tom,

> It's still true that I'm leery of a large increase in the number of
> buffers without reengineering slru.c.  That code was written on the
> assumption that there were few enough buffers that a linear search
> would be fine.  I'd hold still for 16, or maybe even 32, but I dunno
> how much impact that will have for such a test case.

Actually, 32 made a significant difference as I recall ... do you still have 
the figures for that, Jignesh?

The test case is a workload called "iGen" which is a "fixed" TPCC-like 
workload.  I've been trying to talk Sun into open-sourcing it, but no dice so 
far.  It is heavy on writes, and (like TPCC) consists mostly of one-line 
transactions.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] [HACKERS] 8.3beta1 testing on Solaris

2007-10-25 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> Didn't we already go through this? He and Simon were pushing to bump up
> NUM_CLOG_BUFFERS and you were arguing that the test wasn't representative and
> some other clog.c would have to be reengineered to scale well to larger
> values. 

AFAIR we never did get any clear explanation of what the test case is.
I guess it must be write-mostly, else lazy XID assignment would have
helped this by reducing the rate of XID consumption.

It's still true that I'm leery of a large increase in the number of
buffers without reengineering slru.c.  That code was written on the
assumption that there were few enough buffers that a linear search
would be fine.  I'd hold still for 16, or maybe even 32, but I dunno
how much impact that will have for such a test case.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] 8.3beta1 testing on Solaris

2007-10-25 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> "Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
>> CLOG data is 
>> not cached in any PostgreSQL shared memory segments
>
> The above statement is utterly false, so your trace seems to indicate
> something broken.  Are you sure these were the only reads of pg_clog
> files?  Can you extend the tracing to determine which page of the file
> got read?  I am wondering if your (unspecified) test load was managing
> to touch more pages of the clog than there is room for in shared memory.

Didn't we already go through this? He and Simon were pushing to bump up
NUM_CLOG_BUFFERS and you were arguing that the test wasn't representative and
some other clog.c would have to be reengineered to scale well to larger
values. 

Also it seemed there were only modest improvements from raising the value and
there would always be a ceiling to bump into so just raising the number of
buffers isn't particularly interesting unless there's some magic numbers we're
trying to hit.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] 8.3beta1 testing on Solaris

2007-10-25 Thread Gregory Stark

"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:

> CLOG data is not cached in any PostgreSQL shared memory segments and hence
> becomes the bottleneck as it has to constantly go to the filesystem to get
> the read data.

This is the same bottleneck you discussed earlier. CLOG reads are cached in
the Postgres shared memory segment but only NUM_CLOG_BUFFERS are which
defaults to 8 buffers of 8kb each. With 1,000 clients and the transaction rate
you're running you needed a larger number of buffers.

Using the filesystem buffer cache is also an entirely reasonable solution
though. That's surely part of the logic behind not trying to keep more of the
clog in shared memory. Do you have any measurements of how much time is being
spent just doing the logical I/O to the buffer cache for the clog pages? 4MB/s
seems like it's not insignificant but your machine is big enough that perhaps
I'm thinking at the wrong scale.

I'm really curious whether you see any benefit from the vxid read-only
transactions. I'm not sure how to get an apples to apples comparison though.
Ideally just comparing it to CVS HEAD from immediately prior to the vxid patch
going in. Perhaps calling some function which forces an xid to be allocated
and seeing how much it slows down the benchmark would be a good substitute.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 8.3beta1 testing on Solaris

2007-10-25 Thread Tom Lane
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
> CLOG data is 
> not cached in any PostgreSQL shared memory segments

The above statement is utterly false, so your trace seems to indicate
something broken.  Are you sure these were the only reads of pg_clog
files?  Can you extend the tracing to determine which page of the file
got read?  I am wondering if your (unspecified) test load was managing
to touch more pages of the clog than there is room for in shared memory.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] 8.3beta1 testing on Solaris

2007-10-25 Thread Jignesh K. Shah

Update on my testing 8.3beta1 on Solaris.

* CLOG reads
* Asynchronous Commit benefit
* Hot CPU Utilization

Regards,
Jignesh

__Background_:_
We were using PostgreSQL 8.3beta1 testing on our latest Sun SPARC 
Enterprise T5220 Server using Solaris 10 8/07 and Sun Fire X4200 using 
Solaris 10 8/07. Generally for performance benefits in Solaris  we put 
file systems on forcedirectio we bypass the filesystem cache and go 
direct to disks.


__Problem_:_
What we were observing that there were lots of reads happening  about 
4MB/sec on the file system holding $PGDATA and the database tables 
during an OLTP Benchmark run. Initially we thought that our bufferpools 
were not big enough. But thanks to 64-bit builds  we could use bigger 
bufferpools. However even with extraordinary bufferpool sizes we still 
saw lots of reads going to the disks.


__DTrace to the Rescue_:_

I modified iosnoop.d to just snoop on reads. The modified rsnoop.d is as 
follows:

$ cat rsnoop.d
#!/usr/sbin/dtrace -s
syscall::read:entry
/execname=="postgres"/
{
  printf("pid %d reading  %s\n", pid, fds[arg0].fi_pathname);
}

Based on it I found that most postgresql  processes were doing lots of 
reads from pg_clog directory.
CLOG or commit logs keep track of transactions in flight. Writes of CLOG 
comes from recording of transaction commits( or when it aborts) or when 
an XLOG is generated. However though I am not clear on reads yet, it 
seems every process constantly reads it to get some status. CLOG data is 
not cached in any PostgreSQL shared memory segments and hence becomes 
the bottleneck as it has to constantly go to the filesystem to get the 
read data.

# ./rsnoop.d
dtrace: script './rsnoop.d' matched 1 probe
CPU IDFUNCTION:NAME
 0  49222   read:entry pid 8739 reading  
/export/home0/igen/pgdata/pg_clog/000C


 0  49222   read:entry pid 9607 reading  
/export/home0/igen/pgdata/pg_clog/000C


 0  49222   read:entry pid 9423 reading  
/export/home0/igen/pgdata/pg_clog/000C


 0  49222   read:entry pid 8731 reading  
/export/home0/igen/pgdata/pg_clog/000C


 0  49222   read:entry pid 8719 reading  
/export/home0/igen/pgdata/pg_clog/000C


 0  49222   read:entry pid 9019 reading  
/export/home0/igen/pgdata/pg_clog/000C


 1  49222   read:entry pid 9255 reading  
/export/home0/igen/pgdata/pg_clog/000C


 1  49222   read:entry pid 8867 reading  
/export/home0/igen/pgdata/pg_clog/000C



Later on  during another run I added  ustack() after the printf in the 
above script to get the function name also:


# ./rsnoop.d
dtrace: script './rsnoop.d' matched 1 probe
CPU IDFUNCTION:NAME
 0  49222   read:entry pid 10956 reading  
/export/home0/igen/pgdata/pg_clog/0011

 libc.so.1`_read+0xa
 postgres`SimpleLruReadPage+0x3e6
 postgres`SimpleLruReadPage_ReadOnly+0x9b
 postgres`TransactionIdGetStatus+0x1f
 postgres`TransactionIdDidCommit+0x42
 postgres`HeapTupleSatisfiesVacuum+0x21a
 postgres`heap_prune_chain+0x14b
 postgres`heap_page_prune_opt+0x1e6
 postgres`index_getnext+0x144
 postgres`IndexNext+0xe1
 postgres`ExecScan+0x189
 postgres`ExecIndexScan+0x43
 postgres`ExecProcNode+0x183
 postgres`ExecutePlan+0x9e
 postgres`ExecutorRun+0xab
 postgres`PortalRunSelect+0x47a
 postgres`PortalRun+0x262
 postgres`exec_execute_message+0x565
 postgres`PostgresMain+0xf45
 postgres`BackendRun+0x3f9

 0  49222   read:entry pid 10414 reading  
/export/home0/igen/pgdata/pg_clog/0011

 libc.so.1`_read+0xa
 postgres`SimpleLruReadPage+0x3e6
 postgres`SimpleLruReadPage_ReadOnly+0x9b
 postgres`TransactionIdGetStatus+0x1f
 postgres`TransactionIdDidCommit+0x42
 postgres`HeapTupleSatisfiesVacuum+0x21a
 postgres`heap_prune_chain+0x14b
 postgres`heap_page_prune_opt+0x1e6
 postgres`index_getnext+0x144
 postgres`IndexNext+0xe1
 postgres`ExecScan+0x189
^C  libc.so.1`_read+0xa
 postgres`SimpleLruReadPage+0x3e6
 postgres`SimpleLruReadPage_ReadOnly+0x9b
 postgres`TransactionIdGetStatus+0x1f
 postgres`TransactionIdDidCommit+0x42
 postgres`HeapTupleSatisfiesMVCC+0x34f
 postgres`index_getnext+0x29e
 postgres`IndexNext+0xe1
 postgres`ExecScan+0x189
 postgres`ExecIndexScan+0x43
 postgres`ExecProcNode+0x183
 postgres`ExecutePlan+0x9e
 postgres`ExecutorRun+0xab
 postgres`PortalRunSelect+0x47a
 postg