Re: Major SMP problems with lstat/namei

2008-10-01 Thread John Baldwin
On Thursday 25 September 2008 07:00:04 pm Jeff Wheelhouse wrote:
 
 On Sep 24, 2008, at 12:12 PM, John Baldwin wrote:
  Shared lookups only work on the NFS client in 6.x.  I'm about to turn them on
  for UFS in HEAD (8.x) and will backport the needed fixes to 7.x after 7.1
  (too risky to merge to 7.x this close to a release).
 
 OK, given all the patches you referenced, I did make a decent effort
 at backporting to 7.0.

It sounds like you missed some of the dirhash changes somehow, as dirhash no
longer has any lockmgr stuff in it (and only ever did in HEAD).  I've
generated a patch using svn, though.  You can grab it from
http://www.FreeBSD.org/~jhb/patches/ufs_lookup7.patch
Note that you will have to set vfs.lookup_shared=1 to enable shared locks
(either as a loader tunable or via sysctl).
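
For reference, the knob is normally set from the shell with
`sysctl vfs.lookup_shared=1`, or at boot with a `vfs.lookup_shared="1"` line
in /boot/loader.conf.  The same thing can be done from a program with
sysctlbyname(3).  The following is only a minimal sketch: the helper names
are made up for illustration, it assumes the sysctl is an int as on the
releases discussed in this thread, and on non-FreeBSD systems it simply
reports ENOSYS.

```c
#include <errno.h>
#include <stddef.h>
#if defined(__FreeBSD__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Hypothetical helper: read vfs.lookup_shared into *value.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int
read_lookup_shared(int *value)
{
#if defined(__FreeBSD__)
	size_t len = sizeof(*value);

	return (sysctlbyname("vfs.lookup_shared", value, &len, NULL, 0));
#else
	(void)value;
	errno = ENOSYS;		/* sysctlbyname(3) is not available here */
	return (-1);
#endif
}

/*
 * Hypothetical helper: set vfs.lookup_shared to 1 (needs root).
 * Returns 0 on success, -1 with errno set on failure.
 */
static int
enable_lookup_shared(void)
{
#if defined(__FreeBSD__)
	int one = 1;

	return (sysctlbyname("vfs.lookup_shared", NULL, NULL, &one,
	    sizeof(one)));
#else
	errno = ENOSYS;
	return (-1);
#endif
}
```

A caller would check the return value of read_lookup_shared() before
trusting the result, since the sysctl may not exist on a given kernel.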

Also, I found a few other changes I had missed earlier that needed to be 
included.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Major SMP problems with lstat/namei

2008-09-26 Thread Dag-Erling Smørgrav
Jeff Wheelhouse [EMAIL PROTECTED] writes:
 http://software.wheelhouse.org/rptest.tar.bz2

Thanks.  I get similar results on head; vfs.lookup_shared actually seems
to *reduce* performance by about 10% - 20%.  I ran the test on both UFS
and ZFS; there is no significant difference.

DES
-- 
Dag-Erling Smørgrav - [EMAIL PROTECTED]


Re: Major SMP problems with lstat/namei

2008-09-26 Thread John Baldwin
On Friday 26 September 2008 05:20:14 am Dag-Erling Smørgrav wrote:
 Jeff Wheelhouse [EMAIL PROTECTED] writes:
  http://software.wheelhouse.org/rptest.tar.bz2
 
 Thanks.  I get similar results on head; vfs.lookup_shared actually seems
 to *reduce* performance by about 10% - 20%.  I ran the test on both UFS
 and ZFS; there is no significant difference.

You might try http://www.FreeBSD.org/~jhb/patches/namei_rwlock.patch

However, it might also be useful in general to enable lock profiling and see 
which locks (if any) are contested.

-- 
John Baldwin


Re: Major SMP problems with lstat/namei

2008-09-25 Thread Dag-Erling Smørgrav
Jeff Wheelhouse [EMAIL PROTECTED] writes:
 I've written a quick benchmark with a pair of tests to
 simplify/measure the problem.  [...]

Care to share?

DES
-- 
Dag-Erling Smørgrav - [EMAIL PROTECTED]


Re: Major SMP problems with lstat/namei

2008-09-25 Thread Jeff Wheelhouse


On Sep 25, 2008, at 10:51 AM, Dag-Erling Smørgrav wrote:
 Jeff Wheelhouse [EMAIL PROTECTED] writes:
  I've written a quick benchmark with a pair of tests to
  simplify/measure the problem.  [...]
 
 Care to share?


No problem:

http://software.wheelhouse.org/rptest.tar.bz2

Thanks,
Jeff



Re: Major SMP problems with lstat/namei

2008-09-25 Thread Jeff Wheelhouse


On Sep 24, 2008, at 12:12 PM, John Baldwin wrote:
 Shared lookups only work on the NFS client in 6.x.  I'm about to turn them on
 for UFS in HEAD (8.x) and will backport the needed fixes to 7.x after 7.1
 (too risky to merge to 7.x this close to a release).


OK, given all the patches you referenced, I did make a decent effort  
at backporting to 7.0.


Here are the results:

  Revision  Changes    Path
  1.87      +48 -29    src/sys/ufs/ufs/ufs_lookup.c

Applied, changing a couple of VOP_ISLOCKED() and vn_lock() calls to  
add td as the last parameter.


  Revision  Changes    Path
  1.53      +0 -1      src/sys/ufs/ufs/inode.h
  1.88      +10 -13    src/sys/ufs/ufs/ufs_lookup.c

Applied successfully.

  SVN rev 181018 on 2008-07-30 21:07:56Z by jhb

NOT applied, because it was a whitespace tweak on ufs_lookup 1.89  
which was not on your list.


 SVN rev 183079 on 2008-09-16 16:18:36Z by jhb

Applied cleanly.

  Modified files:
   sys/ufs/ufs  inode.h ufs_lookup.c
  Log:
 SVN rev 183093 on 2008-09-16 19:06:44Z by jhb

Applied cleanly.

  1.6       +2 -1      src/sys/ufs/ufs/dirhash.h
  1.24      +289 -227  src/sys/ufs/ufs/ufs_dirhash.c

This patch applies but generates an awful lot of errors (enclosed at  
end).  I think it may be dependent on the 8.0 lockmgr.  Since most of  
the remaining patches are against the same files, I bailed out here.


 SVN rev 183080 on 2008-09-16 16:23:56Z by jhb

Skipped.

 SVN rev 183280 on 2008-09-22 20:53:22Z by jhb

Skipped.

 There are additional fixes needed to fix races with umount -f,
 so if you backport all this stuff, don't use umount -f or you
 risk panics. :)

Noted.

 -   mp->mnt_kern_flag |= MNTK_MPSAFE;
 +   mp->mnt_kern_flag |= MNTK_MPSAFE | MNTK_LOOKUP_SHARED;

Applied.

If I can make the backport work (a big if, given the dirhash changes)  
on 7.0, I am happy to maintain and test the diffs locally until after  
the 7.1 release and send them over to you at that time, if it will  
save you some effort.


Thanks,
Jeff

Dirhash compile errors:

/usr/src/sys/ufs/ufs/ufs_dirhash.c:132:37: error: macro lockmgr requires 4 arguments, but only 3 given

/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_release':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:132: error: 'lockmgr' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:132: error: (Each undeclared identifier is reported only once
/usr/src/sys/ufs/ufs/ufs_dirhash.c:132: error: for each function it appears in.)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:161:45: error: macro lockmgr requires 4 arguments, but only 3 given

/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_create':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:161: error: 'lockmgr' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:178:17: error: macro lockmgr requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c:193:60: error: macro lockmgr requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c:198:42: error: macro lockmgr requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c:222:39: error: macro lockmgr requires 4 arguments, but only 3 given

/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_acquire':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:222: error: 'lockmgr' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:248:17: error: macro lockmgr requires 4 arguments, but only 3 given

/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_free':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:247: error: 'lockmgr' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:385:39: error: macro lockmgr requires 4 arguments, but only 3 given

/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_build':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:385: error: 'lockmgr' undeclared (first use in this function)

cc1: warnings being treated as errors
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_free_locked':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:403: warning: implicit declaration of function 'lockmgr_assert'
/usr/src/sys/ufs/ufs/ufs_dirhash.c:403: warning: nested extern declaration of 'lockmgr_assert'
/usr/src/sys/ufs/ufs/ufs_dirhash.c:403: error: 'KA_LOCKED' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:417:37: error: macro lockmgr requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c:417: error: 'lockmgr' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:418:35: error: macro lockmgr requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c:438:37: error: macro lockmgr requires 4 arguments, but only 3 given

/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_lookup':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:473: error: 'KA_LOCKED' undeclared (first use in this function)

/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_findfree':

Re: Major SMP problems with lstat/namei

2008-09-24 Thread Ivan Voras
Jeff Wheelhouse wrote:

 This is on 6.3-RELEASE-p4 with vfs.lookup_shared=1.
 
 I believe this is the same issue that was previously discussed as 2 x
 quad-core system is slower that 2 x dual core on FreeBSD archived here:
 
 http://lists.freebsd.org/pipermail/freebsd-stable/2007-November/038441.html

 This is becoming a huge problem for us.  Is there anything that at all
 can be done, or any news?  In the case linked above, improvement was
 made by changing a PHP setting that isn't applicable in our case.

There is nothing that can be done within the 6.x branch. 7.x contains
many improvements but I think only 8.x will directly change the lockmgr
and the namei cache. The best thing you can try right now is to use
7-STABLE (or the soon-to-be-released 7.1; you might need tuning with
7.0-RELEASE) or try 8-CURRENT (it's quite stable).





Re: Major SMP problems with lstat/namei

2008-09-24 Thread Ivan Voras
Ivan Voras wrote:

 There is nothing that can be done within the 6.x branch. 7.x contains
 many improvements but I think only 8.x will directly change the lockmgr
 and the namei cache. The best thing you can try right now is to use
 7-STABLE (or the soon-to-be-released 7.1; you might need tuning with
 7.0-RELEASE) or try 8-CURRENT (it's quite stable).

I remembered two more things:

* The problematic load can also be generated with benchmarks/blogbench
* I don't have the numbers here but I think I remember that ZFS had
a noticeably higher score than UFS in this workload. Of course, ZFS has
other problems.






Re: Major SMP problems with lstat/namei

2008-09-24 Thread Jeremy Chadwick
On Wed, Sep 24, 2008 at 09:26:55AM +0200, Daniel Gerzo wrote:
 Hello Jeff,
 
 On Wed, 24 Sep 2008 00:52:59 -0400, Jeff Wheelhouse
 [EMAIL PROTECTED] wrote:
  
  We have encountered some serious SMP performance/scalability problems  
  that we've tracked back to lstat/namei calls.  I've written a quick  
 
 This all seems to be a cause of PHP's very poor performance when used with
 open_basedir and safe_mode enabled. It would be nice to see whether
 something could be done to improve it.

Both of which are features which will, thankfully, be removed in PHP 6.
Whoever uses these features in PHP deserves the pain -- they're
worthless and provide no security whatsoever.  Consider using suPHP
or an MPM like mpm-itk.

Also, PHP and performance shouldn't be put in the same sentence. /rant

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



Re: Major SMP problems with lstat/namei

2008-09-24 Thread Jeff Wheelhouse


On Sep 24, 2008, at 6:12 AM, Ivan Voras wrote:
 There is nothing that can be done within the 6.x branch. 7.x contains
 many improvements but I think only 8.x will directly change the lockmgr
 and the namei cache. The best thing you can try right now is to use
 7-STABLE (or the soon-to-be-released 7.1; you might need tuning with
 7.0-RELEASE) or try 8-CURRENT (it's quite stable).


Really?  Nothing?

We get lockmgr-related panics on FreeBSD 7.0, as detailed elsewhere on  
this list.


Stability issues aside, what else would we need to tune on 7.0,  
besides enabling the ULE scheduler, and how much benefit would we  
really get?


These servers are in production, so 8-CURRENT is not an option.  I've  
already had my knuckles rapped by a customer for trying 7.1-PRERELEASE  
on one of their machines.


Thanks,
Jeff





Re: Major SMP problems with lstat/namei

2008-09-24 Thread John Baldwin
On Wednesday 24 September 2008 12:52:59 am Jeff Wheelhouse wrote:
 
 We have encountered some serious SMP performance/scalability problems  
 that we've tracked back to lstat/namei calls.  I've written a quick  
 benchmark with a pair of tests to simplify/measure the problem.  Both  
 tests use a tree of directories: the top level directory contains five  
 subdirectories a, b, c, d, and e.  Each subdirectory contains five  
 subdirectories a, b, c, d, and e, and so on: 1 directory at level
 one, 5 at level two, 25 at level three, 125 at level four, 625 at  
 level five, and 3125 at level six.
 
 In the realpath test, a random path is constructed at the bottom of  
 the tree (e.g. /tmp/lstat/a/b/c/d/e) and realpath() is called on that,  
 provoking lstat() calls on the whole tree.  This is to simulate a mix  
 of high-contention and low-contention lstat() calls.
 
 In the lstat test, lstat is called directly on a path at the bottom  
 of the tree.  Since there are 3125 files, this simulates relatively  
 low-contention lstat() calls.
 
 In both cases, the test repeats as many times as possible for 60  
 seconds.  Each test is run simultaneously by multiple processes, with  
 progressively doubling concurrency from 1 to 512.
 
 What I found was that everything is fine at concurrency 2, probably  
 indicating that the benchmark pegged on some other resource limit.  At  
 concurrency 4, realpath drops to 31.8% of concurrency 1.  At  
 concurrency 8, performance is down to 18.3%.  In the interim, CPU load  
 goes to 80-90% system CPU.  I've confirmed via ktrace and the rusage  
 that the CPU usage is all system time, and that lstat() is the *only*  
 system call in the test (realpath() is called with an absolute path).
 
 I then reran the 32-process test on 1-7 cores, and found that
 performance peaks at 2 cores and drops sharply from there.  Eight
 cores run *fifteen* times slower than two cores.
 
 The full test results are at the bottom of this message.
 
 This is on 6.3-RELEASE-p4 with vfs.lookup_shared=1.
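
The lstat test described in the quoted message could be approximated along
the following lines.  This is not the rptest code, only a sketch under
stated assumptions: the function names and the small LCG are invented for
illustration, the loop is bounded by an iteration count rather than 60
seconds, and forking the concurrent workers is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Tiny LCG so each worker can keep its own reproducible sequence. */
static unsigned
next_rand(unsigned *state)
{
	*state = *state * 1103515245u + 12345u;
	return ((*state >> 16) & 0x7fffu);
}

/*
 * Build "<root>/x/x/.../x" with `depth` one-letter components, each
 * chosen from a..e.  Returns 0 on success, -1 if buf is too small.
 */
static int
random_leaf_path(char *buf, size_t len, const char *root, int depth,
    unsigned *state)
{
	int n = snprintf(buf, len, "%s", root);
	size_t used;

	if (n < 0 || (size_t)n >= len)
		return (-1);
	used = (size_t)n;
	for (int i = 0; i < depth; i++) {
		if (used + 3 > len)	/* '/', one letter, NUL */
			return (-1);
		buf[used++] = '/';
		buf[used++] = (char)('a' + next_rand(state) % 5);
		buf[used] = '\0';
	}
	return (0);
}

/*
 * Call lstat() on `iterations` random paths below root and return how
 * many of the calls succeeded.
 */
static unsigned long
lstat_random_paths(const char *root, int depth, unsigned long iterations,
    unsigned *state)
{
	char path[1024];
	struct stat sb;
	unsigned long ok = 0;

	for (unsigned long i = 0; i < iterations; i++) {
		if (random_leaf_path(path, sizeof(path), root, depth,
		    state) != 0)
			break;
		if (lstat(path, &sb) == 0)
			ok++;
	}
	return (ok);
}
```

A worker for the lstat test would call lstat_random_paths("/tmp/lstat", 5,
...) after the directory tree has been created; the realpath variant would
pass each generated path to realpath(3) instead, which is what provokes
lstat() on every component of the path.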

Shared lookups only work on the NFS client in 6.x.  I'm about to turn them on 
for UFS in HEAD (8.x) and will backport the needed fixes to 7.x after 7.1 
(too risky to merge to 7.x this close to a release).  So lookup_shared=1 
isn't going to really help on 6.x unless you are doing it all over NFS.  You 
also want to backport my fix to cache_enter() before using lookup_shared at 
all:

jhb 2008-08-23 15:13:39 UTC

  FreeBSD src repository

  Modified files:
sys/kern vfs_cache.c 
  Log:
  SVN rev 182061 on 2008-08-23 15:13:39Z by jhb
  
  Fix a race condition with concurrent LOOKUP namecache operations for a vnode
  not in the namecache when shared lookups are enabled (vfs.lookup_shared=1,
  it is currently off by default) and the filesystem supports shared lookups
  (e.g. NFS client).  Specifically, if multiple concurrent LOOKUPs both miss
  in the name cache in parallel, each of the lookups may end up adding an
  entry to the namecache resulting in duplicate entries in the namecache
  for the same pathname.  A subsequent removal of the mapping of that
  pathname to that vnode (via remove or rename) would only evict one of the
  entries from the name cache.  As a result, subsequent lookups for that
  pathname would still return the old vnode.
  
  This race was observed with shared lookups over NFS where a file was updated
  by writing a new file out to a temporary file name and then renaming that
  temporary file to the real file to effect atomic updates of a file.  Other
  processes on the same client that were periodically reading the file would
  occasionally receive an ESTALE error from open(2) because the VOP_GETATTR()
  in nfs_open() would receive that error when given the stale vnode.
  
  The fix here is to check for duplicates in cache_enter() and just return
  if an entry for this same directory and leaf file name for this vnode is
  already in the cache.  The check for duplicates is done by walking the
  per-vnode list of name cache entries.  It is expected that this list should
  be very small in the common case (usually 0 or 1 entries during a
  cache_enter() since most files only have 1 leaf name).
  
  Reviewed by:    ups, scottl
  MFC after:      2 months
  
  Revision  Changes    Path
  1.124     +33 -9     src/sys/kern/vfs_cache.c
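
The duplicate check described in that commit log can be illustrated with a
small userland model.  This is not the vfs_cache.c code: the structure and
function names below are hypothetical, locking is not modeled at all, and
the directory is represented by an opaque pointer.  It only shows the idea
of walking the per-vnode list and returning early when the same
(directory, leaf name) pair is already present.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NC_NAME_MAX 32			/* arbitrary limit for this model */

struct nc_entry {
	struct nc_entry *next;		/* next entry naming the same vnode */
	const void *dir;		/* identity of the parent directory */
	char name[NC_NAME_MAX];		/* leaf name */
};

struct vnode_model {
	struct nc_entry *names;		/* per-vnode list of cache entries */
};

/*
 * Add (dir, name) for this vnode unless an identical entry exists.
 * Returns 1 if an entry was added, 0 if it was a duplicate, -1 on error.
 */
static int
cache_enter_model(struct vnode_model *vp, const void *dir, const char *name)
{
	struct nc_entry *e;

	if (strlen(name) >= NC_NAME_MAX)
		return (-1);
	/* The list is expected to be short: usually 0 or 1 entries. */
	for (e = vp->names; e != NULL; e = e->next)
		if (e->dir == dir && strcmp(e->name, name) == 0)
			return (0);
	e = malloc(sizeof(*e));
	if (e == NULL)
		return (-1);
	e->dir = dir;
	strcpy(e->name, name);
	e->next = vp->names;
	vp->names = e;
	return (1);
}

/* Count the entries currently attached to the vnode. */
static size_t
cache_count(const struct vnode_model *vp)
{
	size_t n = 0;

	for (const struct nc_entry *e = vp->names; e != NULL; e = e->next)
		n++;
	return (n);
}
```

With this check in place, two lookups that both miss and both try to enter
the same name leave a single entry behind, so a later remove or rename has
only one mapping to evict.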

If you want to try the UFS stuff on 7, you would probably need to backport at 
least the following, and maybe more:

jeff        2008-04-11 09:44:25 UTC

  FreeBSD src repository

  Modified files:
sys/ufs/ufs  ufs_lookup.c 
  Log:
   - cache dp->i_offset in the local 'i_offset' variable for use in loop
 indexes so directory lookup becomes shared lock safe.  In the modifying
 cases an exclusive lock is held here so the commit routine may
 rely on the state of i_offset.
   - Similarly handle i_diroff by fetching at the start and setting only once
 the operation is complete.  Without the exclusive 

Re: Major SMP problems with lstat/namei

2008-09-24 Thread Jeff Wheelhouse


On Sep 24, 2008, at 12:12 PM, John Baldwin wrote:
 Shared lookups only work on the NFS client in 6.x.  I'm about to turn them on
 for UFS in HEAD (8.x) and will backport the needed fixes to 7.x after 7.1
 (too risky to merge to 7.x this close to a release).

Testers available, when you get to that.  :-)

 So lookup_shared=1
 isn't going to really help on 6.x unless you are doing it all over NFS.  You
 also want to backport my fix to cache_enter() before using lookup_shared at
 all:

Since it sounds like 6.x is a dead end, we'll focus on 7.x, provided  
we can get it to be stable for us.


Having never used svn, I do need to figure out how to pull the  
specific  patches you referenced, but I'm sure that's not an  
unclimbable mountain. :-)


I appreciate your insight on this, it's very helpful.

Thanks,
Jeff




Re: Major SMP problems with lstat/namei

2008-09-24 Thread John Baldwin
On Wednesday 24 September 2008 01:47:32 pm Jeff Wheelhouse wrote:
 
 On Sep 24, 2008, at 12:12 PM, John Baldwin wrote:
  Shared lookups only work on the NFS client in 6.x.  I'm about to turn them on
  for UFS in HEAD (8.x) and will backport the needed fixes to 7.x after 7.1
  (too risky to merge to 7.x this close to a release).
 
 Testers available, when you get to that.  :-)
 
  So lookup_shared=1
  isn't going to really help on 6.x unless you are doing it all over NFS.  You
  also want to backport my fix to cache_enter() before using lookup_shared at
  all:
 
 Since it sounds like 6.x is a dead end, we'll focus on 7.x, provided  
 we can get it to be stable for us.

Yes.

 Having never used svn, I do need to figure out how to pull the  
 specific  patches you referenced, but I'm sure that's not an  
 unclimbable mountain. :-)

You can still use cvs to pull the revisions.  All those e-mail msg's have the 
CVS revisions in them, too.

-- 
John Baldwin


Re: Major SMP problems with lstat/namei

2008-09-24 Thread Jeff Wheelhouse


On Sep 24, 2008, at 2:10 PM, John Baldwin wrote:
 You can still use cvs to pull the revisions.  All those e-mail msg's have the
 CVS revisions in them, too.


If I'm ever to do anything that will benefit someone besides myself,  
it's worth my making the effort to learn SVN.  We have coasted on the  
back of FreeBSD without giving back for long enough.


Thanks,
Jeff



Re: Major SMP problems with lstat/namei

2008-09-24 Thread Julian Elischer

Jeff Wheelhouse wrote:
 
 On Sep 24, 2008, at 6:12 AM, Ivan Voras wrote:
  There is nothing that can be done within the 6.x branch. 7.x contains
  many improvements but I think only 8.x will directly change the lockmgr
  and the namei cache. The best thing you can try right now is to use
  7-STABLE (or the soon-to-be-released 7.1; you might need tuning with
  7.0-RELEASE) or try 8-CURRENT (it's quite stable).
 
 Really?  Nothing?
 
 We get lockmgr-related panics on FreeBSD 7.0, as detailed elsewhere on
 this list.
 
 Stability issues aside, what else would we need to tune on 7.0, besides
 enabling the ULE scheduler, and how much benefit would we really get?
 
 These servers are in production, so 8-CURRENT is not an option.  I've
 already had my knuckles rapped by a customer for trying 7.1-PRERELEASE
 on one of their machines.

You are supposed to edit the uname info back to 7.0 before installing
experimental 7.1 systems!

Didn't you get the memo?

 Thanks,
 Jeff


