A little progress.

I have a machine with a KTR-enabled kernel running.

Another machine is running David's ffs_vfsops.c patch.

I left two other machines (GENERIC kernels) running the packet loss test
overnight. At ~32480 seconds of uptime the problem starts. This is really
close to a 16-bit overflow... See http://www.eng.oar.net/~maf/bsd6/p1.png
and http://www.eng.oar.net/~maf/bsd6/p2.png. The missing impulses at the
31-second marks are the intervals between test runs. The window of missing
packets (the time between the two received packets on either side of a
missing sequence number) is usually less than 4us, although I'm not sure
gettimeofday() can be trusted for measuring this. See
https://www.eng.oar.net/~maf/bsd6/p3.png
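
A minimal sketch of how the missing-packet window described above could be computed from a receive log. The record layout (struct pkt_sample) and the function names are hypothetical, not taken from the actual test program, and the code assumes packets arrive in order, so any forward jump in sequence number counts as loss.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical record for one received test packet. */
struct pkt_sample {
	uint32_t seq;		/* sender's sequence number */
	struct timeval rx;	/* gettimeofday() taken at receive time */
};

/* Microseconds elapsed from a to b. */
static int64_t
tv_delta_us(const struct timeval *a, const struct timeval *b)
{
	return ((int64_t)(b->tv_sec - a->tv_sec) * 1000000 +
	    (int64_t)(b->tv_usec - a->tv_usec));
}

/*
 * Walk the samples in arrival order. Each forward jump of more than one
 * in the sequence number is a gap. *lost receives the total number of
 * missing packets, *max_window_us the widest receive-time window that
 * surrounds a gap. Returns the number of gaps.
 */
static size_t
scan_gaps(const struct pkt_sample *s, size_t n, uint32_t *lost,
    int64_t *max_window_us)
{
	size_t gaps = 0;

	assert(lost != NULL && max_window_us != NULL);
	*lost = 0;
	*max_window_us = 0;
	for (size_t i = 1; i < n; i++) {
		/* Signed step so duplicates or reordering are not counted. */
		int32_t step = (int32_t)(s[i].seq - s[i - 1].seq);
		if (step <= 1)
			continue;
		int64_t w = tv_delta_us(&s[i - 1].rx, &s[i].rx);
		gaps++;
		*lost += (uint32_t)(step - 1);
		if (w > *max_window_us)
			*max_window_us = w;
	}
	return (gaps);
}
```

The window is only as good as the timestamps: if gettimeofday() is itself delayed during the stall, the measured window will understate the real one.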

Things I'll try tonight:

  o Check on the patched kernel.

  o Try with KTR debugging enabled before and after an expected
    high-latency period.

  o Dump all files to /dev/null to trigger the behavior.

I would expect the vnode problem to look a little different on the packet
loss graphs over time.  If this leads anywhere I'll add a counter
before the msleep() and see how often it's getting there.
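
As a rough idea of what that counter might show, here is a userland model of the throttle test in the patch quoted below. It is only a sketch: count_sleeps() is a hypothetical helper, and it assumes every loop iteration reaches the counter, which may not hold if the kernel loop skips some vnodes before that point.

```c
#include <stddef.h>

/* Threshold taken from the quoted patch. */
#define FLUSH_THRESHOLD 500

/*
 * Model of the patched loop: returns how many times the msleep()
 * branch would be taken while walking nvnodes vnodes.
 */
static size_t
count_sleeps(size_t nvnodes)
{
	int flushed_count = 0;
	size_t sleeps = 0;

	for (size_t i = 0; i < nvnodes; i++) {
		/* Same test as the patch: post-increment, strict '>'. */
		if (flushed_count++ > FLUSH_THRESHOLD) {
			flushed_count = 0;
			sleeps++;	/* the kernel would msleep() one tick here */
		}
	}
	return (sleeps);
}
```

Because of the post-increment and the strict comparison, the branch fires on every 502nd iteration in this model rather than exactly every 500th, so 50000 vnodes would give 99 one-tick sleeps.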

On Dec 17, 2007, at 5:24 AM, David G Lawrence wrote:

I noticed this as well some time ago. The problem has to do with the
processing (syncing) of vnodes. When the total number of allocated vnodes
in the system grows to tens of thousands, the ~31 second periodic sync
process takes a long time to run. Try this patch and let people know if it
helps your problem. It will periodically wait for one tick (1ms) every
500 vnodes of processing, which will allow other things to run.

Index: ufs/ffs/ffs_vfsops.c
===================================================================
RCS file: /home/ncvs/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.290.2.16
diff -c -r1.290.2.16 ffs_vfsops.c
*** ufs/ffs/ffs_vfsops.c        9 Oct 2006 19:47:17 -0000       1.290.2.16
--- ufs/ffs/ffs_vfsops.c        25 Apr 2007 01:58:15 -0000
***************
*** 1109,1114 ****
--- 1109,1115 ----
        int softdep_deps;
        int softdep_accdeps;
        struct bufobj *bo;
+       int flushed_count = 0;

        fs = ump->um_fs;
        if (fs->fs_fmod != 0 && fs->fs_ronly != 0) {              /* XXX */
***************
*** 1174,1179 ****
--- 1175,1184 ----
                        allerror = error;
                vput(vp);
                MNT_ILOCK(mp);
+               if (flushed_count++ > 500) {
+                       flushed_count = 0;
+                       msleep(&flushed_count, MNT_MTX(mp), PZERO, "syncw", 1);
+               }
        }
        MNT_IUNLOCK(mp);
        /*

-DG

David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of life with opportunities.

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
