Re: dogfooding over in clusteradm land

2012-01-03 Thread Sean Bruno


On Tue, 2012-01-03 at 04:46 -0800, Florian Smeets wrote:
> Yes, the patch fixes the problem. The cvs2svn run completed this time.
> 
>  9132.25 real  8387.05 user   403.86 sys
> 
> I did not see any significant syncer activity in top -S anymore.
> 
> Thanks a lot.
> Florian 

Currently running stable-9 + this patch on crush.freebsd.org.  First run
was successful and took about 4 hours start to finish.  Nicely done
folks.

diff --git a/sys/vm/vm_object.c b/sys/vm/vm_object.c
index 716916f..52fc08b 100644
--- a/sys/vm/vm_object.c
+++ b/sys/vm/vm_object.c
@@ -841,7 +841,8 @@ rescan:
 		if (p->valid == 0)
 			continue;
 		if (vm_page_sleep_if_busy(p, TRUE, "vpcwai")) {
-			if (object->generation != curgeneration)
+			if ((flags & OBJPC_SYNC) != 0 &&
+			    object->generation != curgeneration)
 				goto rescan;
 			np = vm_page_find_least(object, pi);
 			continue;
@@ -851,7 +852,8 @@ rescan:
 
 		n = vm_object_page_collect_flush(object, p, pagerflags,
 		    flags, &clearobjflags);
-		if (object->generation != curgeneration)
+		if ((flags & OBJPC_SYNC) != 0 &&
+		    object->generation != curgeneration)
 			goto rescan;
 
 		/*




Re: dogfooding over in clusteradm land

2012-01-03 Thread Florian Smeets

On 03.01.2012 10:18, Kostik Belousov wrote:
> On Tue, Jan 03, 2012 at 12:02:22AM -0800, Don Lewis wrote:
>> On  2 Jan, Don Lewis wrote:
>>> On  2 Jan, Don Lewis wrote:
 On  2 Jan, Florian Smeets wrote:
>>> 
> This does not make a difference. I tried on 32K/4K 
> with/without journal and on 16K/2K all exhibit the same 
> problem. At some point during the cvs2svn conversion the 
> syncer starts to use 100% CPU. The whole process hangs at 
> that point sometimes for hours, from time to time it does 
> continue doing some work, but really really slow. It's 
> usually between revision 21 and 22, when the 
> resulting svn file gets bigger than about 11-12Gb. At that 
> point an ls in the target dir hangs in state ufs.
> 
> I broke into ddb and ran all commands which I thought
> could be useful. The output is at 
> http://tb.smeets.im/~flo/giant-ape_syncer.txt
 
 Tracing command syncer pid 9 tid 100183 td 0xfe00120e9000
 cpustop_handler() at cpustop_handler+0x2b
 ipi_nmi_handler() at ipi_nmi_handler+0x50
 trap() at trap+0x1a8
 nmi_calltrap() at nmi_calltrap+0x8
 --- trap 0x13, rip = 0x8082ba43, rsp = 0xff8000270fe0, rbp = 0xff88c97829a0 ---
 _mtx_assert() at _mtx_assert+0x13
 pmap_remove_write() at pmap_remove_write+0x38
 vm_object_page_remove_write() at vm_object_page_remove_write+0x1f
 vm_object_page_clean() at vm_object_page_clean+0x14d
 vfs_msync() at vfs_msync+0xf1
 sync_fsync() at sync_fsync+0x12a
 sync_vnode() at sync_vnode+0x157
 sched_sync() at sched_sync+0x1d1
 fork_exit() at fork_exit+0x135
 fork_trampoline() at fork_trampoline+0xe
 --- trap 0, rip = 0, rsp = 0xff88c9782d00, rbp = 0 ---
 
 I think this explains why the r228838 patch seems to help 
 the problem. Instead of an application call to msync(), 
 you're getting bitten by the syncer doing the equivalent.  I 
 don't know why the syncer is CPU bound, though.  From my 
 understanding of the patch it only optimizes the I/O.
 Without the patch, I would expect that the syncer would just
 spend a lot of time waiting on I/O.  My guess is that this
 is actually a vm problem. There are nested loops in 
 vm_object_page_clean() and vm_object_page_remove_write(), so 
 you could be doing something that's causing lots of looping 
 in that code.
>>> 
>>> Does the machine recover if you suspend cvs2svn?  I think what 
>>> is happening is that cvs2svn is continuing to dirty pages
>>> while the syncer is trying to sync the file.  From my limited 
>>> understanding of this code, it looks to me like every time 
>>> cvs2svn dirties a page, it will trigger a call to 
>>> vm_object_set_writeable_dirty(), which will increment 
>>> object->generation.  Whenever vm_object_page_clean() detects a 
>>> change in the generation count, it restarts its scan of the 
>>> pages associated with the object.  This is probably not
>>> optimal ...
>> 
>> Since the syncer is only trying to flush out pages that have
>> been dirty for the last 30 seconds, I think that 
>> vm_object_page_clean() should just make one pass
>> through the object, ignoring generation, and then return when it
>> is called from the syncer.  That should keep 
>> vm_object_page_clean() from looping over the object 
>> again and again if another process is actively dirtying the 
>> object.
>> 
> This sounds very plausible. I think that there is no sense in 
> restarting the scan if it is requested in async mode at all. See 
> below.
> 
> Would be thrilled if this finally solves the cvs2svn issues.
> 
> commit 41aaafe5e3be5387949f303b8766da64ee4a521f
> Author: Kostik Belousov 
> Date:   Tue Jan 3 11:16:30 2012 +0200
> 
> Do not restart the scan in vm_object_page_clean() if requested
> mode is async.
> 
> Proposed by:  truckman
> 
> diff --git a/sys/vm/vm_object.c b/sys/vm/vm_object.c
> index 716916f..52fc08b 100644
> --- a/sys/vm/vm_object.c
> +++ b/sys/vm/vm_object.c
> @@ -841,7 +841,8 @@ rescan:
> 		if (p->valid == 0)
> 			continue;
> 		if (vm_page_sleep_if_busy(p, TRUE, "vpcwai")) {
> -			if (object->generation != curgeneration)
> +			if ((flags & OBJPC_SYNC) != 0 &&
> +			    object->generation != curgeneration)
> 				goto rescan;
> 			np = vm_page_find_least(object, pi);
> 			continue;
> @@ -851,7 +852,8 @@ rescan:
> 
> 		n = vm_object_page_collect_flush(object, p, pagerflags,
> 		    flags, &clearobjflags);
> -		if (object->generation != curgeneration)
> +		if ((flags & OBJPC_SYNC) != 0 &&
> +		    object->generation != curgeneration)
> 			goto rescan;
> 
> 		/*

Yes, the patch fixes the problem. The cvs2svn run completed this time.

 9132.25 real  8387.05 user   403.86 sys

I did not see any significant syncer activity in top -S anymore.

Thanks a lot.
Florian

Re: dogfooding over in clusteradm land

2012-01-03 Thread Don Lewis
On  3 Jan, Kostik Belousov wrote:

>> With your change above, the code will skip the busy page after sleeping
>> if it is running in async mode.  It won't make another attempt to write
>> this page because it no longer attempts to rescan.
> Why would it skip it ? Please note the call to vm_page_find_least()
> with the pindex of the busy page right after the check for
> generation. If a page with the pindex is still present in the object,
> vm_page_find_least() should return it, and vm_object_page_clean() should
> make another attempt at processing it.
> 
> Am I missing something ?

Nope, I was missing something ...



Re: dogfooding over in clusteradm land

2012-01-03 Thread Kostik Belousov
On Tue, Jan 03, 2012 at 02:57:17AM -0800, Don Lewis wrote:
> On  3 Jan, Kostik Belousov wrote:
> > On Tue, Jan 03, 2012 at 01:45:26AM -0800, Don Lewis wrote:
> >> On  3 Jan, Kostik Belousov wrote:
> >> 
> >> > This sounds very plausible. I think that there is no sense in restarting
> >> > the scan if it is requested in async mode at all. See below.
> >> > 
> >> > Would be thrilled if this finally solves the cvs2svn issues.
> >> > 
> >> > commit 41aaafe5e3be5387949f303b8766da64ee4a521f
> >> > Author: Kostik Belousov 
> >> > Date:   Tue Jan 3 11:16:30 2012 +0200
> >> > 
> >> > Do not restart the scan in vm_object_page_clean() if requested
> >> > mode is async.
> >> > 
> >> > Proposed by: truckman
> >> > 
> >> > diff --git a/sys/vm/vm_object.c b/sys/vm/vm_object.c
> >> > index 716916f..52fc08b 100644
> >> > --- a/sys/vm/vm_object.c
> >> > +++ b/sys/vm/vm_object.c
> >> > @@ -841,7 +841,8 @@ rescan:
> >> >  if (p->valid == 0)
> >> >  continue;
> >> >  if (vm_page_sleep_if_busy(p, TRUE, "vpcwai")) {
> >> > -if (object->generation != curgeneration)
> >> > +if ((flags & OBJPC_SYNC) != 0 &&
> >> > +object->generation != curgeneration)
> >> >  goto rescan;
> >> >  np = vm_page_find_least(object, pi);
> >> >  continue;
> >> 
> >> I wonder if it would make more sense to just skip the busy pages in
> >> async mode instead of sleeping ...
> >> 
> > It would weaken the guarantee of vfs_msync(MNT_NOWAIT) too much to not
> > write such pages, IMO. The busy state indeed means that the page is most
> > likely undergoing I/O, but in case it is not, we would not write it
> > at all.
> 
> If the original code detects a busy page, it sleeps and then continues
> with the next page if generation hasn't changed.  If generation has
> changed, then it restarts the scan.
> 
> With your change above, the code will skip the busy page after sleeping
> if it is running in async mode.  It won't make another attempt to write
> this page because it no longer attempts to rescan.
Why would it skip it ? Please note the call to vm_page_find_least()
with the pindex of the busy page right after the check for
generation. If a page with the pindex is still present in the object,
vm_page_find_least() should return it, and vm_object_page_clean() should
make another attempt at processing it.

Am I missing something ?
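
For anyone reading along without the source handy: vm_page_find_least(object, pindex)
returns the resident page with the lowest pindex greater than or equal to the one
passed in, which is why the formerly busy page is revisited rather than skipped.
A tiny standalone model of just that lookup (an illustrative toy, not kernel code;
the page indexes are made up):

#include <stdio.h>
#include <stddef.h>

/*
 * Toy model: "pages" of an object identified only by their pindex, kept
 * sorted as the object's page list would be.  find_least(pi) mimics
 * vm_page_find_least(): least pindex >= pi, or -1 if there is none.
 */
static const long pindexes[] = { 3, 5, 8, 13 };
#define NPAGES	(sizeof(pindexes) / sizeof(pindexes[0]))

static long
find_least(long pi)
{
	size_t i;

	for (i = 0; i < NPAGES; i++)
		if (pindexes[i] >= pi)
			return (pindexes[i]);
	return (-1);
}

int
main(void)
{
	/*
	 * After sleeping on a busy page at pindex 5, the cleaner does
	 * np = find_least(5): the same page comes back if it is still
	 * in the object, so it gets another processing attempt.
	 */
	printf("find_least(5) = %ld\n", find_least(5));	/* prints 5 */
	/* Only if that page were gone would we move on to the next one. */
	printf("find_least(6) = %ld\n", find_least(6));	/* prints 8 */
	return (0);
}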
> 
> My suggestion just omits the sleep in this particular case.
> 
> The syncer should write the page the next time it runs, unless we're
> particularly unlucky ...
> 
> > Let's see whether the change alone helps. Do you agree?
> 
> Your patch is definitely worth trying as-is.  My latest suggestion is
> probably a minor additional optimization.




Re: dogfooding over in clusteradm land

2012-01-03 Thread Don Lewis
On  3 Jan, Kostik Belousov wrote:
> On Tue, Jan 03, 2012 at 01:45:26AM -0800, Don Lewis wrote:
>> On  3 Jan, Kostik Belousov wrote:
>> 
>> > This sounds very plausible. I think that there is no sense in restarting
>> > the scan if it is requested in async mode at all. See below.
>> > 
>> > Would be thrilled if this finally solves the cvs2svn issues.
>> > 
>> > commit 41aaafe5e3be5387949f303b8766da64ee4a521f
>> > Author: Kostik Belousov 
>> > Date:   Tue Jan 3 11:16:30 2012 +0200
>> > 
>> > Do not restart the scan in vm_object_page_clean() if requested
>> > mode is async.
>> > 
>> > Proposed by:   truckman
>> > 
>> > diff --git a/sys/vm/vm_object.c b/sys/vm/vm_object.c
>> > index 716916f..52fc08b 100644
>> > --- a/sys/vm/vm_object.c
>> > +++ b/sys/vm/vm_object.c
>> > @@ -841,7 +841,8 @@ rescan:
>> >if (p->valid == 0)
>> >continue;
>> >if (vm_page_sleep_if_busy(p, TRUE, "vpcwai")) {
>> > -  if (object->generation != curgeneration)
>> > +  if ((flags & OBJPC_SYNC) != 0 &&
>> > +  object->generation != curgeneration)
>> >goto rescan;
>> >np = vm_page_find_least(object, pi);
>> >continue;
>> 
>> I wonder if it would make more sense to just skip the busy pages in
>> async mode instead of sleeping ...
>> 
> It would weaken the guarantee of vfs_msync(MNT_NOWAIT) too much to not
> write such pages, IMO. The busy state indeed means that the page is most
> likely undergoing I/O, but in case it is not, we would not write it
> at all.

If the original code detects a busy page, it sleeps and then continues
with the next page if generation hasn't changed.  If generation has
changed, then it restarts the scan.

With your change above, the code will skip the busy page after sleeping
if it is running in async mode.  It won't make another attempt to write
this page because it no longer attempts to rescan.

My suggestion just omits the sleep in this particular case.

The syncer should write the page the next time it runs, unless we're
particularly unlucky ...

> Let's see whether the change alone helps. Do you agree?

Your patch is definitely worth trying as-is.  My latest suggestion is
probably a minor additional optimization.



Re: dogfooding over in clusteradm land

2012-01-03 Thread Kostik Belousov
On Tue, Jan 03, 2012 at 01:45:26AM -0800, Don Lewis wrote:
> On  3 Jan, Kostik Belousov wrote:
> 
> > This sounds very plausible. I think that there is no sense in restarting
> > the scan if it is requested in async mode at all. See below.
> > 
> > Would be thrilled if this finally solves the cvs2svn issues.
> > 
> > commit 41aaafe5e3be5387949f303b8766da64ee4a521f
> > Author: Kostik Belousov 
> > Date:   Tue Jan 3 11:16:30 2012 +0200
> > 
> > Do not restart the scan in vm_object_page_clean() if requested
> > mode is async.
> > 
> > Proposed by:	truckman
> > 
> > diff --git a/sys/vm/vm_object.c b/sys/vm/vm_object.c
> > index 716916f..52fc08b 100644
> > --- a/sys/vm/vm_object.c
> > +++ b/sys/vm/vm_object.c
> > @@ -841,7 +841,8 @@ rescan:
> > if (p->valid == 0)
> > continue;
> > if (vm_page_sleep_if_busy(p, TRUE, "vpcwai")) {
> > -   if (object->generation != curgeneration)
> > +   if ((flags & OBJPC_SYNC) != 0 &&
> > +   object->generation != curgeneration)
> > goto rescan;
> > np = vm_page_find_least(object, pi);
> > continue;
> 
> I wonder if it would make more sense to just skip the busy pages in
> async mode instead of sleeping ...
> 
It would weaken the guarantee of vfs_msync(MNT_NOWAIT) too much to not
write such pages, IMO. The busy state indeed means that the page is most
likely undergoing I/O, but in case it is not, we would not write it
at all.

Let's see whether the change alone helps. Do you agree?




Re: dogfooding over in clusteradm land

2012-01-03 Thread Don Lewis
On  3 Jan, Kostik Belousov wrote:

> This sounds very plausible. I think that there is no sense in restarting
> the scan if it is requested in async mode at all. See below.
> 
> Would be thrilled if this finally solves the cvs2svn issues.
> 
> commit 41aaafe5e3be5387949f303b8766da64ee4a521f
> Author: Kostik Belousov 
> Date:   Tue Jan 3 11:16:30 2012 +0200
> 
> Do not restart the scan in vm_object_page_clean() if requested
> mode is async.
> 
> Proposed by:  truckman
> 
> diff --git a/sys/vm/vm_object.c b/sys/vm/vm_object.c
> index 716916f..52fc08b 100644
> --- a/sys/vm/vm_object.c
> +++ b/sys/vm/vm_object.c
> @@ -841,7 +841,8 @@ rescan:
>   if (p->valid == 0)
>   continue;
>   if (vm_page_sleep_if_busy(p, TRUE, "vpcwai")) {
> - if (object->generation != curgeneration)
> + if ((flags & OBJPC_SYNC) != 0 &&
> + object->generation != curgeneration)
>   goto rescan;
>   np = vm_page_find_least(object, pi);
>   continue;

I wonder if it would make more sense to just skip the busy pages in
async mode instead of sleeping ...




Re: dogfooding over in clusteradm land

2012-01-03 Thread Kostik Belousov
On Tue, Jan 03, 2012 at 12:02:22AM -0800, Don Lewis wrote:
> On  2 Jan, Don Lewis wrote:
> > On  2 Jan, Don Lewis wrote:
> >> On  2 Jan, Florian Smeets wrote:
> > 
> >>> This does not make a difference. I tried on 32K/4K with/without journal
> >>> and on 16K/2K all exhibit the same problem. At some point during the
> >>> cvs2svn conversion the syncer starts to use 100% CPU. The whole process
> >>> hangs at that point sometimes for hours, from time to time it does
> >>> continue doing some work, but really really slow. It's usually between
> >>> revision 21 and 22, when the resulting svn file gets bigger than
> >>> about 11-12Gb. At that point an ls in the target dir hangs in state ufs.
> >>> 
> >>> I broke into ddb and ran all commands which I thought could be useful.
> >>> The output is at http://tb.smeets.im/~flo/giant-ape_syncer.txt
> >> 
> >> Tracing command syncer pid 9 tid 100183 td 0xfe00120e9000
> >> cpustop_handler() at cpustop_handler+0x2b
> >> ipi_nmi_handler() at ipi_nmi_handler+0x50
> >> trap() at trap+0x1a8
> >> nmi_calltrap() at nmi_calltrap+0x8
> >> --- trap 0x13, rip = 0x8082ba43, rsp = 0xff8000270fe0, rbp = 
> >> 0xff88c97829a0 ---
> >> _mtx_assert() at _mtx_assert+0x13
> >> pmap_remove_write() at pmap_remove_write+0x38
> >> vm_object_page_remove_write() at vm_object_page_remove_write+0x1f
> >> vm_object_page_clean() at vm_object_page_clean+0x14d
> >> vfs_msync() at vfs_msync+0xf1
> >> sync_fsync() at sync_fsync+0x12a
> >> sync_vnode() at sync_vnode+0x157
> >> sched_sync() at sched_sync+0x1d1
> >> fork_exit() at fork_exit+0x135
> >> fork_trampoline() at fork_trampoline+0xe
> >> --- trap 0, rip = 0, rsp = 0xff88c9782d00, rbp = 0 ---
> >> 
> >> I think this explains why the r228838 patch seems to help the problem.
> >> Instead of an application call to msync(), you're getting bitten by the
> >> syncer doing the equivalent.  I don't know why the syncer is CPU bound,
> >> though.  From my understanding of the patch it only optimizes the I/O.
> >> Without the patch, I would expect that the syncer would just spend a lot
> >> of time waiting on I/O.  My guess is that this is actually a vm problem.
> >> There are nested loops in vm_object_page_clean() and
> >> vm_object_page_remove_write(), so you could be doing something that's
> >> causing lots of looping in that code.
> > 
> > Does the machine recover if you suspend cvs2svn?  I think what is
> > happening is that cvs2svn is continuing to dirty pages while the syncer
> > is trying to sync the file.  From my limited understanding of this code,
> > it looks to me like every time cvs2svn dirties a page, it will trigger a
> > call to vm_object_set_writeable_dirty(), which will increment
> > object->generation.  Whenever vm_object_page_clean() detects a change in
> > the generation count, it restarts its scan of the pages associated with
> > the object.  This is probably not optimal ...
> 
> Since the syncer is only trying to flush out pages that have been dirty
> for the last 30 seconds, I think that vm_object_page_clean()
> should just make one pass through the object, ignoring generation, and
> then return when it is called from the syncer.  That should keep
> vm_object_page_clean() from looping over the object again and
> again if another process is actively dirtying the object.
> 
This sounds very plausible. I think that there is no sense in restarting
the scan if it is requested in async mode at all. See below.

Would be thrilled if this finally solves the cvs2svn issues.

commit 41aaafe5e3be5387949f303b8766da64ee4a521f
Author: Kostik Belousov 
Date:   Tue Jan 3 11:16:30 2012 +0200

Do not restart the scan in vm_object_page_clean() if requested
mode is async.

Proposed by:	truckman

diff --git a/sys/vm/vm_object.c b/sys/vm/vm_object.c
index 716916f..52fc08b 100644
--- a/sys/vm/vm_object.c
+++ b/sys/vm/vm_object.c
@@ -841,7 +841,8 @@ rescan:
 		if (p->valid == 0)
 			continue;
 		if (vm_page_sleep_if_busy(p, TRUE, "vpcwai")) {
-			if (object->generation != curgeneration)
+			if ((flags & OBJPC_SYNC) != 0 &&
+			    object->generation != curgeneration)
 				goto rescan;
 			np = vm_page_find_least(object, pi);
 			continue;
@@ -851,7 +852,8 @@ rescan:
 
 		n = vm_object_page_collect_flush(object, p, pagerflags,
 		    flags, &clearobjflags);
-		if (object->generation != curgeneration)
+		if ((flags & OBJPC_SYNC) != 0 &&
+		    object->generation != curgeneration)
 			goto rescan;
 
 		/*
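
To make the "async mode" in the commit message concrete: if I remember the call
chain correctly (treat the exact flag dispatch as an assumption, not a quote of
the tree), the syncer reaches this code via sync_fsync() -> vfs_msync(mp, MNT_NOWAIT),
which passes OBJPC_NOSYNC rather than OBJPC_SYNC to vm_object_page_clean(), so with
this patch the generation check can never trigger a rescan on the syncer's path,
while a synchronous caller such as msync(MS_SYNC) should keep the old behaviour.
A minimal standalone model of that dispatch and of the patched test (the flag
values here are illustrative, not copied from vm_object.h):

#include <stdio.h>

/* Illustrative flag values only. */
#define OBJPC_SYNC	0x1
#define OBJPC_NOSYNC	0x4
#define MNT_WAIT	1
#define MNT_NOWAIT	2

/* The patched check from the diff above: restart only in sync mode. */
static int
should_rescan(int flags, unsigned gen, unsigned curgen)
{
	return ((flags & OBJPC_SYNC) != 0 && gen != curgen);
}

/* vfs_msync()-style dispatch: MNT_WAIT means sync, otherwise async. */
static void
msync_like(int waitfor, unsigned gen, unsigned curgen)
{
	int flags = (waitfor == MNT_WAIT) ? OBJPC_SYNC : OBJPC_NOSYNC;

	printf("waitfor=%s generation_changed=%d -> rescan=%d\n",
	    waitfor == MNT_WAIT ? "MNT_WAIT" : "MNT_NOWAIT",
	    gen != curgen, should_rescan(flags, gen, curgen));
}

int
main(void)
{
	msync_like(MNT_NOWAIT, 7, 5);	/* syncer path: never rescans */
	msync_like(MNT_WAIT, 7, 5);	/* sync path: still rescans */
	return (0);
}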




Re: dogfooding over in clusteradm land

2012-01-03 Thread Don Lewis
On  2 Jan, Don Lewis wrote:
> On  2 Jan, Don Lewis wrote:
>> On  2 Jan, Florian Smeets wrote:
> 
>>> This does not make a difference. I tried on 32K/4K with/without journal
>>> and on 16K/2K all exhibit the same problem. At some point during the
>>> cvs2svn conversion the syncer starts to use 100% CPU. The whole process
>>> hangs at that point sometimes for hours, from time to time it does
>>> continue doing some work, but really really slow. It's usually between
>>> revision 21 and 22, when the resulting svn file gets bigger than
>>> about 11-12Gb. At that point an ls in the target dir hangs in state ufs.
>>> 
>>> I broke into ddb and ran all commands which I thought could be useful.
>>> The output is at http://tb.smeets.im/~flo/giant-ape_syncer.txt
>> 
>> Tracing command syncer pid 9 tid 100183 td 0xfe00120e9000
>> cpustop_handler() at cpustop_handler+0x2b
>> ipi_nmi_handler() at ipi_nmi_handler+0x50
>> trap() at trap+0x1a8
>> nmi_calltrap() at nmi_calltrap+0x8
>> --- trap 0x13, rip = 0x8082ba43, rsp = 0xff8000270fe0, rbp = 
>> 0xff88c97829a0 ---
>> _mtx_assert() at _mtx_assert+0x13
>> pmap_remove_write() at pmap_remove_write+0x38
>> vm_object_page_remove_write() at vm_object_page_remove_write+0x1f
>> vm_object_page_clean() at vm_object_page_clean+0x14d
>> vfs_msync() at vfs_msync+0xf1
>> sync_fsync() at sync_fsync+0x12a
>> sync_vnode() at sync_vnode+0x157
>> sched_sync() at sched_sync+0x1d1
>> fork_exit() at fork_exit+0x135
>> fork_trampoline() at fork_trampoline+0xe
>> --- trap 0, rip = 0, rsp = 0xff88c9782d00, rbp = 0 ---
>> 
>> I think this explains why the r228838 patch seems to help the problem.
>> Instead of an application call to msync(), you're getting bitten by the
>> syncer doing the equivalent.  I don't know why the syncer is CPU bound,
>> though.  From my understanding of the patch it only optimizes the I/O.
>> Without the patch, I would expect that the syncer would just spend a lot
>> of time waiting on I/O.  My guess is that this is actually a vm problem.
>> There are nested loops in vm_object_page_clean() and
>> vm_object_page_remove_write(), so you could be doing something that's
>> causing lots of looping in that code.
> 
> Does the machine recover if you suspend cvs2svn?  I think what is
> happening is that cvs2svn is continuing to dirty pages while the syncer
> is trying to sync the file.  From my limited understanding of this code,
> it looks to me like every time cvs2svn dirties a page, it will trigger a
> call to vm_object_set_writeable_dirty(), which will increment
> object->generation.  Whenever vm_object_page_clean() detects a change in
> the generation count, it restarts its scan of the pages associated with
> the object.  This is probably not optimal ...

Since the syncer is only trying to flush out pages that have been dirty
for the last 30 seconds, I think that vm_object_page_clean()
should just make one pass through the object, ignoring generation, and
then return when it is called from the syncer.  That should keep
vm_object_page_clean() from looping over the object again and
again if another process is actively dirtying the object.




Re: dogfooding over in clusteradm land

2012-01-02 Thread Don Lewis
On  2 Jan, Don Lewis wrote:
> On  2 Jan, Florian Smeets wrote:

>> This does not make a difference. I tried on 32K/4K with/without journal
>> and on 16K/2K all exhibit the same problem. At some point during the
>> cvs2svn conversion the syncer starts to use 100% CPU. The whole process
>> hangs at that point sometimes for hours, from time to time it does
>> continue doing some work, but really really slow. It's usually between
>> revision 21 and 22, when the resulting svn file gets bigger than
>> about 11-12Gb. At that point an ls in the target dir hangs in state ufs.
>> 
>> I broke into ddb and ran all commands which I thought could be useful.
>> The output is at http://tb.smeets.im/~flo/giant-ape_syncer.txt
> 
> Tracing command syncer pid 9 tid 100183 td 0xfe00120e9000
> cpustop_handler() at cpustop_handler+0x2b
> ipi_nmi_handler() at ipi_nmi_handler+0x50
> trap() at trap+0x1a8
> nmi_calltrap() at nmi_calltrap+0x8
> --- trap 0x13, rip = 0x8082ba43, rsp = 0xff8000270fe0, rbp = 
> 0xff88c97829a0 ---
> _mtx_assert() at _mtx_assert+0x13
> pmap_remove_write() at pmap_remove_write+0x38
> vm_object_page_remove_write() at vm_object_page_remove_write+0x1f
> vm_object_page_clean() at vm_object_page_clean+0x14d
> vfs_msync() at vfs_msync+0xf1
> sync_fsync() at sync_fsync+0x12a
> sync_vnode() at sync_vnode+0x157
> sched_sync() at sched_sync+0x1d1
> fork_exit() at fork_exit+0x135
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xff88c9782d00, rbp = 0 ---
> 
> I think this explains why the r228838 patch seems to help the problem.
> Instead of an application call to msync(), you're getting bitten by the
> syncer doing the equivalent.  I don't know why the syncer is CPU bound,
> though.  From my understanding of the patch it only optimizes the I/O.
> Without the patch, I would expect that the syncer would just spend a lot
> of time waiting on I/O.  My guess is that this is actually a vm problem.
> There are nested loops in vm_object_page_clean() and
> vm_object_page_remove_write(), so you could be doing something that's
> causing lots of looping in that code.

Does the machine recover if you suspend cvs2svn?  I think what is
happening is that cvs2svn is continuing to dirty pages while the syncer
is trying to sync the file.  From my limited understanding of this code,
it looks to me like every time cvs2svn dirties a page, it will trigger a
call to vm_object_set_writeable_dirty(), which will increment
object->generation.  Whenever vm_object_page_clean() detects a change in
the generation count, it restarts its scan of the pages associated with
the object.  This is probably not optimal ...
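
A toy userland model of that interaction shows how restart-on-generation-change
can degenerate into the syncer spinning: the "cleaner" below restarts its pass
whenever a simulated writer bumps the generation counter, mirroring the goto
rescan pattern in vm_object_page_clean() (all constants are made up for
illustration; this is not kernel code):

#include <stdio.h>

#define NPAGES		1024	/* pages in the object (made-up) */
#define DIRTY_EVERY	100	/* writer dirties a page every N page visits */
#define MAX_VISITS	2000000

int
main(void)
{
	unsigned generation = 0;	/* bumped by the writer, as
					   vm_object_set_writeable_dirty() would */
	unsigned curgeneration;
	long visits = 0, restarts = 0;
	int completed = 0;

	while (!completed && visits < MAX_VISITS) {
		curgeneration = generation;		/* snapshot at "rescan:" */
		for (int i = 0; i < NPAGES; i++) {
			visits++;			/* look at (and clean) page i */
			if (visits % DIRTY_EVERY == 0)
				generation++;		/* concurrent writer dirtied a page */
			if (generation != curgeneration) {
				restarts++;
				goto rescan;		/* old behaviour: start over */
			}
		}
		completed = 1;				/* made it through a full pass */
rescan:		;
	}
	printf("page visits: %ld, restarts: %ld, completed a pass: %s\n",
	    visits, restarts, completed ? "yes" : "no");
	return (0);
}

With DIRTY_EVERY larger than NPAGES the pass completes on the first try; the gap
between those two cases is roughly the difference between the CPU-bound syncer
reported here and normal behaviour.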



Re: dogfooding over in clusteradm land

2012-01-02 Thread Kostik Belousov
On Mon, Jan 02, 2012 at 12:47:03PM -0800, Don Lewis wrote:
> On  2 Jan, Florian Smeets wrote:
> > On 29.12.11 01:04, Kirk McKusick wrote:
> >> Rather than changing BKVASIZE, I would try running the cvs2svn
> >> conversion on a 16K/2K filesystem and see if that sorts out the
> >> problem. If it does, it tells us that doubling the main block
> >> size and reducing the number of buffers by half is the problem.
> >> If that is the problem, then we will have to increase the KVM
> >> allocated to the buffer cache.
> >> 
> > 
> > This does not make a difference. I tried on 32K/4K with/without journal
> > and on 16K/2K all exhibit the same problem. At some point during the
> > cvs2svn conversion the syncer starts to use 100% CPU. The whole process
> > hangs at that point sometimes for hours, from time to time it does
> > continue doing some work, but really really slow. It's usually between
> > revision 21 and 22, when the resulting svn file gets bigger than
> > about 11-12Gb. At that point an ls in the target dir hangs in state ufs.
> > 
> > I broke into ddb and ran all commands which I thought could be useful.
> > The output is at http://tb.smeets.im/~flo/giant-ape_syncer.txt
> 
> Tracing command syncer pid 9 tid 100183 td 0xfe00120e9000
> cpustop_handler() at cpustop_handler+0x2b
> ipi_nmi_handler() at ipi_nmi_handler+0x50
> trap() at trap+0x1a8
> nmi_calltrap() at nmi_calltrap+0x8
> --- trap 0x13, rip = 0x8082ba43, rsp = 0xff8000270fe0, rbp = 
> 0xff88c97829a0 ---
> _mtx_assert() at _mtx_assert+0x13
> pmap_remove_write() at pmap_remove_write+0x38
> vm_object_page_remove_write() at vm_object_page_remove_write+0x1f
> vm_object_page_clean() at vm_object_page_clean+0x14d
> vfs_msync() at vfs_msync+0xf1
> sync_fsync() at sync_fsync+0x12a
> sync_vnode() at sync_vnode+0x157
> sched_sync() at sched_sync+0x1d1
> fork_exit() at fork_exit+0x135
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xff88c9782d00, rbp = 0 ---
> 
> I think this explains why the r228838 patch seems to help the problem.
> Instead of an application call to msync(), you're getting bitten by the
> syncer doing the equivalent.  I don't know why the syncer is CPU bound,
> though.  From my understanding of the patch it only optimizes the I/O.
> Without the patch, I would expect that the syncer would just spend a lot
> of time waiting on I/O.  My guess is that this is actually a vm problem.
> There are nested loops in vm_object_page_clean() and
> vm_object_page_remove_write(), so you could be doing something that's
> causing lots of looping in that code.
r228838 allows the system to skip 50-70% of the code when initiating a
write of a UFS file page, due to async clustering. The system has to
keep track of 75% fewer writes in progress.

> I think that ls is hanging because it's stumbling across the vnode that
> the syncer has locked.
This is the only reasonable explanation.

A low-tech way to profile this is to periodically break into ddb and take a
backtrace of the syncer thread. More advanced techniques are to use dtrace or
normal profiling.




Re: dogfooding over in clusteradm land

2012-01-02 Thread Don Lewis
On  2 Jan, Florian Smeets wrote:
> On 29.12.11 01:04, Kirk McKusick wrote:
>> Rather than changing BKVASIZE, I would try running the cvs2svn
>> conversion on a 16K/2K filesystem and see if that sorts out the
>> problem. If it does, it tells us that doubling the main block
>> size and reducing the number of buffers by half is the problem.
>> If that is the problem, then we will have to increase the KVM
>> allocated to the buffer cache.
>> 
> 
> This does not make a difference. I tried on 32K/4K with/without journal
> and on 16K/2K all exhibit the same problem. At some point during the
> cvs2svn conversion the syncer starts to use 100% CPU. The whole process
> hangs at that point sometimes for hours, from time to time it does
> continue doing some work, but really really slow. It's usually between
> revision 21 and 22, when the resulting svn file gets bigger than
> about 11-12Gb. At that point an ls in the target dir hangs in state ufs.
> 
> I broke into ddb and ran all commands which I thought could be useful.
> The output is at http://tb.smeets.im/~flo/giant-ape_syncer.txt

Tracing command syncer pid 9 tid 100183 td 0xfe00120e9000
cpustop_handler() at cpustop_handler+0x2b
ipi_nmi_handler() at ipi_nmi_handler+0x50
trap() at trap+0x1a8
nmi_calltrap() at nmi_calltrap+0x8
--- trap 0x13, rip = 0x8082ba43, rsp = 0xff8000270fe0, rbp = 
0xff88c97829a0 ---
_mtx_assert() at _mtx_assert+0x13
pmap_remove_write() at pmap_remove_write+0x38
vm_object_page_remove_write() at vm_object_page_remove_write+0x1f
vm_object_page_clean() at vm_object_page_clean+0x14d
vfs_msync() at vfs_msync+0xf1
sync_fsync() at sync_fsync+0x12a
sync_vnode() at sync_vnode+0x157
sched_sync() at sched_sync+0x1d1
fork_exit() at fork_exit+0x135
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xff88c9782d00, rbp = 0 ---

I think this explains why the r228838 patch seems to help the problem.
Instead of an application call to msync(), you're getting bitten by the
syncer doing the equivalent.  I don't know why the syncer is CPU bound,
though.  From my understanding of the patch it only optimizes the I/O.
Without the patch, I would expect that the syncer would just spend a lot
of time waiting on I/O.  My guess is that this is actually a vm problem.
There are nested loops in vm_object_page_clean() and
vm_object_page_remove_write(), so you could be doing something that's
causing lots of looping in that code.

I think that ls is hanging because it's stumbling across the vnode that
the syncer has locked.



Re: dogfooding over in clusteradm land

2012-01-02 Thread Florian Smeets
On 29.12.11 01:04, Kirk McKusick wrote:
> Rather than changing BKVASIZE, I would try running the cvs2svn
> conversion on a 16K/2K filesystem and see if that sorts out the
> problem. If it does, it tells us that doubling the main block
> size and reducing the number of buffers by half is the problem.
> If that is the problem, then we will have to increase the KVM
> allocated to the buffer cache.
> 

This does not make a difference. I tried on 32K/4K with/without journal
and on 16K/2K all exhibit the same problem. At some point during the
cvs2svn conversion the syncer starts to use 100% CPU. The whole process
hangs at that point sometimes for hours, from time to time it does
continue doing some work, but really really slow. It's usually between
revision 21 and 22, when the resulting svn file gets bigger than
about 11-12Gb. At that point an ls in the target dir hangs in state ufs.

I broke into ddb and ran all commands which I thought could be useful.
The output is at http://tb.smeets.im/~flo/giant-ape_syncer.txt

The machine is still in ddb and I could run any additional commands, the
kernel is from Attilio's vmcontention branch, which was MFCed yesterday,
and updated after the MFC. The same problem happens on 9.0-RC3.

If I run the same test on a ZFS filesystem I don't see any problems.

Florian





Re: dogfooding over in clusteradm land

2011-12-28 Thread Kirk McKusick
Rather than changing BKVASIZE, I would try running the cvs2svn
conversion on a 16K/2K filesystem and see if that sorts out the
problem. If it does, it tells us that doubling the main block
size and reducing the number of buffers by half is the problem.
If that is the problem, then we will have to increase the KVM
allocated to the buffer cache.

Kirk McKusick


Re: dogfooding over in clusteradm land

2011-12-27 Thread Florian Smeets
On 14.12.11 14:20, Sean Bruno wrote:
> We're seeing what looks like a syncer/ufs resource starvation on 9.0 on
> the cvs2svn ports conversion box.  I'm not sure what resource is tapped
> out.  Effectively, I cannot access the directory under use and the
> converter application stalls out waiting for some resource that isn't
> clear. (Peter had posited kmem of some kind).
> 
> I've upped maxvnodes a bit on the host, turned off SUJ and mounted the
> f/s in question with async and noatime for performance reasons.
> 
> Can someone hit me up with the cluebat?  I can give you direct access to
> the box for debuginationing.
> 

Just for the archives. This is fixed or at least considerably improved
by r228838.

The ports cvs2svn run went from panicking after about ~22h to finishing
after ~10h.

Thanks to Sean and Attilio for giving me access to test boxes.

Florian





Re: dogfooding over in clusteradm land

2011-12-16 Thread Ulrich Spörlein
On Thu, 2011-12-15 at 18:39:59 -0800, Doug Barton wrote:
> On 12/14/2011 05:20, Sean Bruno wrote:
> > We're seeing what looks like a syncer/ufs resource starvation on 9.0 on
> > the cvs2svn ports conversion box.
> 
> ... sounds like a good reason not to migrate the history to me. :)

Sounds more like a new regression test that we could use :)

Uli


Re: dogfooding over in clusteradm land

2011-12-15 Thread Doug Barton
On 12/14/2011 05:20, Sean Bruno wrote:
> We're seeing what looks like a syncer/ufs resource starvation on 9.0 on
> the cvs2svn ports conversion box.

... sounds like a good reason not to migrate the history to me. :)


-- 


Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/



Re: dogfooding over in clusteradm land

2011-12-15 Thread Don Lewis
On 14 Dec, Poul-Henning Kamp wrote:
> In message <1323868832.5283.9.ca...@hitfishpass-lx.corp.yahoo.com>, Sean 
> Bruno 
> writes:
> 
>>We're seeing what looks like a syncer/ufs resource starvation on 9.0 on
>>the cvs2svn ports conversion box.  I'm not sure what resource is tapped
>>out.
> 
> Search the mail archive for "lemming-syncer"

That should only produce a slowdown every 30 seconds but not cause a
deadlock.

I'd be more suspicious of a memory allocation deadlock.  This can happen
if the system runs short of free memory because there are a large number
of dirty buffers, but it needs to allocate some memory to flush the
buffers to disk.

This could be more likely to happen if you are using a software raid
layer, but I suspect that the recent change to the default UFS block
size from 16K to 32K is the culprit.  In another thread bde pointed out
that the BKVASIZE definition in sys/param.h hadn't been updated to match
the new default UFS block size.

 * BKVASIZE -   Nominal buffer space per buffer, in bytes.  BKVASIZE is the
 *  minimum KVM memory reservation the kernel is willing to make.
 *  Filesystems can of course request smaller chunks.  Actual 
 *  backing memory uses a chunk size of a page (PAGE_SIZE).
 *
 *  If you make BKVASIZE too small you risk seriously fragmenting
 *  the buffer KVM map which may slow things down a bit.  If you
 *  make it too big the kernel will not be able to optimally use 
 *  the KVM memory reserved for the buffer cache and will wind 
 *  up with too-few buffers.
 *
 *  The default is 16384, roughly 2x the block size used by a
 *  normal UFS filesystem.
 */
#define MAXBSIZE	65536	/* must be power of 2 */
#define BKVASIZE	16384	/* must be power of 2 */

The problem is that BKVASIZE is used in a number of the tuning
calculations in vfs_bio.c:

/*
 * The nominal buffer size (and minimum KVA allocation) is BKVASIZE.
 * For the first 64MB of ram nominally allocate sufficient buffers to
 * cover 1/4 of our ram.  Beyond the first 64MB allocate additional
 * buffers to cover 1/10 of our ram over 64MB.  When auto-sizing
 * the buffer cache we limit the eventual kva reservation to
 * maxbcache bytes.
 *
 * factor represents the 1/4 x ram conversion.
 */
if (nbuf == 0) {
int factor = 4 * BKVASIZE / 1024;

nbuf = 50;
if (physmem_est > 4096)
nbuf += min((physmem_est - 4096) / factor,
65536 / factor);
if (physmem_est > 65536)
nbuf += (physmem_est - 65536) * 2 / (factor * 5);

if (maxbcache && nbuf > maxbcache / BKVASIZE)
nbuf = maxbcache / BKVASIZE;
tuned_nbuf = 1;
} else
tuned_nbuf = 0;

/* XXX Avoid unsigned long overflows later on with maxbufspace. */
maxbuf = (LONG_MAX / 3) / BKVASIZE;


/*
 * maxbufspace is the absolute maximum amount of buffer space we are 
 * allowed to reserve in KVM and in real terms.  The absolute maximum
 * is nominally used by buf_daemon.  hibufspace is the nominal maximum
 * used by most other processes.  The differential is required to 
 * ensure that buf_daemon is able to run when other processes might 
 * be blocked waiting for buffer space.
 *
 * maxbufspace is based on BKVASIZE.  Allocating buffers larger then
 * this may result in KVM fragmentation which is not handled optimally
 * by the system.
 */
maxbufspace = (long)nbuf * BKVASIZE;
hibufspace = lmax(3 * maxbufspace / 4, maxbufspace - MAXBSIZE * 10);
lobufspace = hibufspace - MAXBSIZE;


If you are using the new 32K default filesystem block size, then you may
be consuming twice as much memory for buffers as the tuning
calculations think you are using.  Increasing maxvnodes is probably the
wrong way to go, since it will increase memory pressure.

As a quick and dirty test, try cutting kern.nbuf in half.  The correct
fix is probably to rebuild the kernel with BKVASIZE doubled.
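
Plugging numbers into the tuning code quoted above makes the mismatch concrete.
The sketch below just re-evaluates that nbuf formula in userland for a
hypothetical 8 GB machine (physmem_est appears to be in kilobytes, judging by
the 64MB threshold of 65536; the maxbcache clamp is ignored) and compares the
buffer space the tuning assumes against what the same number of 32K-block
buffers would actually map:

#include <stdio.h>

#define MIN(a, b)	((a) < (b) ? (a) : (b))

/* Re-evaluation of the nbuf auto-tuning quoted above, maxbcache clamp omitted. */
static long
tune_nbuf(long physmem_est /* in KB */, long bkvasize)
{
	long factor = 4 * bkvasize / 1024;
	long nbuf = 50;

	if (physmem_est > 4096)
		nbuf += MIN((physmem_est - 4096) / factor, 65536 / factor);
	if (physmem_est > 65536)
		nbuf += (physmem_est - 65536) * 2 / (factor * 5);
	return (nbuf);
}

int
main(void)
{
	long physmem_est = 8L * 1024 * 1024;	/* hypothetical 8 GB box, in KB */
	long nbuf = tune_nbuf(physmem_est, 16384);	/* stock BKVASIZE */

	printf("nbuf = %ld\n", nbuf);
	printf("assumed buffer space: nbuf * 16K = %ld MB\n",
	    nbuf * 16384 / (1024 * 1024));
	printf("actual with 32K blocks: nbuf * 32K = %ld MB\n",
	    nbuf * 32768 / (1024 * 1024));
	printf("nbuf with BKVASIZE doubled = %ld\n",
	    tune_nbuf(physmem_est, 32768));
	return (0);
}

The point is only the ratio: with BKVASIZE left at 16384, nbuf comes out about
twice as large as it would with BKVASIZE at 32768, so 32K-block buffers can
occupy roughly twice the space the calculations budgeted for; halving kern.nbuf
and doubling BKVASIZE attack the same mismatch from either side.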




Re: dogfooding over in clusteradm land

2011-12-14 Thread Poul-Henning Kamp
In message <1323868832.5283.9.ca...@hitfishpass-lx.corp.yahoo.com>, Sean Bruno 
writes:

>We're seeing what looks like a syncer/ufs resource starvation on 9.0 on
>the cvs2svn ports conversion box.  I'm not sure what resource is tapped
>out.

Search the mail archive for "lemming-syncer"

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: dogfooding over in clusteradm land [cvs2svn for ports]

2011-12-14 Thread Garrett Cooper
On Wed, Dec 14, 2011 at 10:39 AM, Sean Bruno  wrote:
> On Wed, 2011-12-14 at 05:20 -0800, Sean Bruno wrote:
>> We're seeing what looks like a syncer/ufs resource starvation on 9.0 on
>> the cvs2svn ports conversion box.  I'm not sure what resource is tapped
>> out.  Effectively, I cannot access the directory under use and the
>> converter application stalls out waiting for some resource that isn't
>> clear. (Peter had posited kmem of some kind).
>>
>> I've upped maxvnodes a bit on the host, turned off SUJ and mounted the
>> f/s in question with async and noatime for performance reasons.
>>
>> Can someone hit me up with the cluebat?  I can give you direct access to
>> the box for debuginationing.
>>
>> Sean
>
> BTW, this project is sort of stalled out by this problem.

A few things come to mind (in no particular order):

1. What does svn say before it dies?
2. What does df for the affected partition output?
3. Do you have syslog output that indicates where the starvation is occurring?
4. What do the following sysctls print out?

kern.maxvnodes kern.minvnodes vfs.freevnodes vfs.wantfreevnodes vfs.numvnodes

5. What does top / vmstat -z say for memory right before svn goes south?
6. Are you running the import as an unprivileged user, or root?
7. Has the login.conf been changed on the box?

Thanks,
-Garrett


Re: dogfooding over in clusteradm land [cvs2svn for ports]

2011-12-14 Thread Sean Bruno
On Wed, 2011-12-14 at 05:20 -0800, Sean Bruno wrote:
> We're seeing what looks like a syncer/ufs resource starvation on 9.0 on
> the cvs2svn ports conversion box.  I'm not sure what resource is tapped
> out.  Effectively, I cannot access the directory under use and the
> converter application stalls out waiting for some resource that isn't
> clear. (Peter had posited kmem of some kind).
> 
> I've upped maxvnodes a bit on the host, turned off SUJ and mounted the
> f/s in question with async and noatime for performance reasons.
> 
> Can someone hit me up with the cluebat?  I can give you direct access to
> the box for debuginationing.
> 
> Sean

BTW, this project is sort of stalled out by this problem.

Sean




dogfooding over in clusteradm land

2011-12-14 Thread Sean Bruno
We're seeing what looks like a syncer/ufs resource starvation on 9.0 on
the cvs2svn ports conversion box.  I'm not sure what resource is tapped
out.  Effectively, I cannot access the directory under use and the
converter application stalls out waiting for some resource that isn't
clear. (Peter had posited kmem of some kind).

I've upped maxvnodes a bit on the host, turned off SUJ and mounted the
f/s in question with async and noatime for performance reasons.

Can someone hit me up with the cluebat?  I can give you direct access to
the box for debuginationing.

Sean
