Re: Performance issues lately.

2010-11-15 Thread Doug Leavitt

Jorgen,

I suggest that you use dtrace to get a better understanding of what is
going on.

You can start with some pre-existing, documented scripts from the DTrace
Toolkit here:

http://hub.opensolaris.org/bin/view/Community+Group+dtrace/dtracetoolkit

The dtrace guide is here:

http://wikis.sun.com/display/DTrace/Documentation

There are many examples in the DTrace Toolkit that should help sort out
which other processes or system resources are affecting the LDAP server's
performance in your specific situation.

Doug.


On 11/14/10 07:42 PM, Jorgen Lundman wrote:



Howard Chu wrote:


If it slows down after you wait a while, that means some other process
on the machine is using the system RAM and forcing the BDB data out of
the system cache. Find out what other program is hogging the memory,
it's obvious that BDB is not doing anything wrong on its own.


If I db_stat another large file, like dn2id.bdb, the subsequent 
id2entry.bdb will be slower. So maybe it is fighting itself.


However, since I am executing separate db_stat processes each time,
the set_cachesize would have no chance to help improve things. I will
have to try different values with slapd running.


Could be I should investigate various Solaris-specific process limits
as well. It is all 64-bit now, but per-process limits may still be
interfering.






Re: Performance issues lately.

2010-11-14 Thread Jorgen Lundman


No real reason, tried various different settings but to no real advantage.

Now I have:

Filesystem   size   used  avail capacity  Mounted on
swap          19G   7.7G    11G    42%    /tmp

# grep cache DB_CONFIG
set_cachesize 8 0 1

# time /usr/local/BerkeleyDB.4.8/bin/db_stat -d id2entry.bdb
real    6m6.099s

# time cp id2entry.bdb /dev/null
real    0m0.040s
(It's not on disk)

I thought I could delete id2entry.bdb and use slapindex to regenerate it, but
that appears not to be a supported feature: slapindex cannot run without a valid
id2entry.bdb. This is why I tried slapcat, rm *, slapadd instead, but it made no
difference in speed.


If I truss with -u *:* (all inter-library calls), I see no single large system
call, just a lot of work somewhere that does not call read/write etc. Alas,
the number of lines in the truss file is:

 6209933 /var/tmp/db_stat_truss


Lund

Quanah Gibson-Mount wrote:

--On Sunday, November 14, 2010 7:13 PM +0900 Jorgen Lundman
lund...@lundman.net wrote:


dbconfig set_cachesize 4 0 8


Why are you breaking your cache into segments? This has always had a
negative performance impact in all tests I've done, and stopped being
necessary to do with BDB 4.3 and later.
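
For reference, the three arguments to set_cachesize are gigabytes, additional bytes, and the number of cache segments. Below is a minimal Python sketch of that arithmetic, applied to the two settings mentioned in this thread. The helper name parse_cachesize is hypothetical and is not part of BDB or OpenLDAP; the sketch assumes a gigabyte means 1024^3 bytes and that an ncache of 0 or 1 means one contiguous cache.

```python
GIB = 1024 ** 3

def parse_cachesize(line):
    """Parse a DB_CONFIG 'set_cachesize <gbytes> <bytes> <ncache>' line.

    Returns (total_bytes, segments, bytes_per_segment).
    """
    fields = line.split()
    if len(fields) != 4 or fields[0] != "set_cachesize":
        raise ValueError("not a set_cachesize line: %r" % line)
    gbytes, extra_bytes, ncache = (int(f) for f in fields[1:])
    total = gbytes * GIB + extra_bytes
    # Assumption: an ncache of 0 or 1 means a single contiguous cache.
    segments = max(ncache, 1)
    return total, segments, total // segments

for line in ("set_cachesize 4 0 8", "set_cachesize 8 0 1"):
    total, segments, per_segment = parse_cachesize(line)
    print("%s -> %.1f GiB total, %d segment(s), %.2f GiB each"
          % (line, total / GIB, segments, per_segment / GIB))
```

So the earlier setting split 4 GB into eight smaller regions, while the later one asks for a single 8 GB region.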

--Quanah



--

Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc

Zimbra :: the leader in open source messaging and collaboration



--
Jorgen Lundman   | lund...@lundman.net
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo| +81 (0)90-5578-8500  (cell)
Japan| +81 (0)3 -3375-1767  (home)


Re: Performance issues lately.

2010-11-14 Thread Jorgen Lundman

truss is pretty much useless in this context. Most of BDB's activity is
thru memory-mapping, which involves no system calls for truss to trace.
You need an actual profile (e.g. using oprofile) to identify where the
time is going.



This is very true. But reach for the tools you have, even if it is a hammer. I 
guess Purify would be the Solaris equivalent, unless we find the problem also 
occurs on a Linux box.


truss has a simple profiler, but it only counts system and library calls, which
does not help in this case:


Library:  Function        calls
libaio:   close              16
libc:     membar_exit   1633814
libc:     thr_self       653455
libc:     _lock_clear    326545
libc:     _lock_try      326545
libc:     memcpy         163311
libc:     memcmp            242
libc:     strcasecmp        195
libc:     free               77

sys totals:     .921     303     22
usr time:     21.669
elapsed:     829.890

 Now repeat the db_stat call and see how long it takes the 2nd time.

It does indeed speed up, if I do not wait too long between tests.

real    1m27.712s
real    0m29.696s
real    0m4.332s
real    0m4.452s
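
To compare runs like these numerically, the "real" values can be converted to seconds. This is a small Python sketch; the function name real_seconds is made up for illustration, and it only handles the NmS.SSSs form shown in this thread.

```python
import re

def real_seconds(text):
    """Convert a shell `time` value such as '1m27.712s' into seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError("unrecognised time value: %r" % text)
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))

# The four consecutive db_stat timings reported above.
runs = ["1m27.712s", "0m29.696s", "0m4.332s", "0m4.452s"]
seconds = [real_seconds(r) for r in runs]
for label, value in zip(runs, seconds):
    print("%-10s %8.3f s" % (label, value))
print("first run / fastest run: %.1fx" % (seconds[0] / min(seconds)))
```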

4 seconds is much nicer. So what you are saying is that BDB uses mmap, and 
operations inside this memory will trigger reads inside the kernel which do not 
show as libc syscalls. Rats. So it may be IO? I need to throw even more memory 
at it, and live with the increasing startup times?
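
As an illustration of why a syscall tracer sees so little here, the Python sketch below reads a whole file through a memory mapping. After the open and the mmap call there are no read() calls at all; pages that are not yet resident are brought in by page faults inside the kernel. This shows the general effect only and makes no claim about how BDB itself is implemented. The function name and the sample file are invented for the example.

```python
import mmap
import os
import tempfile

def checksum_via_mmap(path):
    """Sum every byte of a non-empty file via a read-only memory mapping."""
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
            total = 0
            # Walk the mapping one page at a time. Touching a page that is
            # not resident causes a page fault, not a read() system call.
            for offset in range(0, len(view), mmap.PAGESIZE):
                total += sum(view[offset:offset + mmap.PAGESIZE])
            return total

# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 64)
    sample = tmp.name
try:
    print(checksum_via_mmap(sample))
finally:
    os.unlink(sample)
```

Running something like this under truss or strace should show the open and mmap calls, but nothing per page for the loop itself.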


How does the set_cachesize relate to the mmap usage?






Re: Performance issues lately.

2010-11-14 Thread Howard Chu

Jorgen Lundman wrote:

Now repeat the db_stat call and see how long it takes the 2nd time.

It does indeed speed up, if I do not wait too long between tests.

real    1m27.712s
real    0m29.696s
real    0m4.332s
real    0m4.452s


If it slows down after you wait a while, that means some other process on the 
machine is using the system RAM and forcing the BDB data out of the system 
cache. Find out what other program is hogging the memory, it's obvious that 
BDB is not doing anything wrong on its own.


4 seconds is much nicer. So what you are saying is that BDB uses mmap, and
operations inside this memory will trigger reads inside the kernel which do not
show as libc syscalls. Rats. So it may be IO? I need to throw even more memory
at it, and live with the increasing startup times?

How does the set_cachesize relate to the mmap usage?



--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: Performance issues lately.

2010-11-14 Thread Jorgen Lundman



Howard Chu wrote:


If it slows down after you wait a while, that means some other process
on the machine is using the system RAM and forcing the BDB data out of
the system cache. Find out what other program is hogging the memory,
it's obvious that BDB is not doing anything wrong on its own.


If I db_stat another large file, like dn2id.bdb, the subsequent id2entry.bdb 
will be slower. So maybe it is fighting itself.


However, since I am executing separate db_stat processes each time, the
set_cachesize would have no chance to help improve things. I will have to try
different values with slapd running.


Could be I should investigate various Solaris-specific process limits as well.
It is all 64-bit now, but per-process limits may still be interfering.
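
One quick way to look at per-process limits is Python's standard resource module, sketched below. It reports the limits of the Python process itself, so it only reflects slapd's limits if it is run from the same startup environment; that is an assumption to verify on the machine in question. The helper names are made up for the example.

```python
import resource

def process_limits():
    """Return {name: (soft, hard)} for limits relevant to a large process."""
    names = ("RLIMIT_AS", "RLIMIT_DATA", "RLIMIT_STACK", "RLIMIT_NOFILE")
    limits = {}
    for name in names:
        which = getattr(resource, name, None)  # not defined on every platform
        if which is not None:
            limits[name] = resource.getrlimit(which)
    return limits

def show(value):
    """Render a limit value, spelling out the 'no limit' sentinel."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)

for name, (soft, hard) in sorted(process_limits().items()):
    print("%-14s soft=%s hard=%s" % (name, show(soft), show(hard)))
```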



