Sorry, no need to guess, it was in my monitoring client.

21378 4248 root 4:17PM Sl 43 0.0 0:00.12 0.2 5184 46288 /usr/pkg/libexec/openafs/davolserver -sleep 5/60 -nojumbo
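
In case it helps anyone matching lock holders against ps output, here is a minimal sketch that pulls the fd, offset, pid, and locktype out of the "_VLockFd: conflicting lock held" messages. The regex is based only on the one log line quoted in this thread, so other OpenAFS versions may format the message differently. The name lock_holders is just a hypothetical helper, not part of OpenAFS.

```python
import re

# Pattern derived solely from the log line quoted in this thread;
# treat the exact message format as an assumption.
LOCK_RE = re.compile(
    r"_VLockFd: conflicting lock held on fd (?P<fd>\d+),\s+"
    r"offset (?P<offset>\d+) by pid (?P<pid>\d+) "
    r"\(locktype=(?P<locktype>\d+)\)"
)


def lock_holders(log_text):
    """Return one dict per conflicting-lock message found in log_text."""
    # Collapse whitespace first, because mail clients wrap long log lines.
    flat = " ".join(log_text.split())
    return [
        {key: int(value) for key, value in match.groupdict().items()}
        for match in LOCK_RE.finditer(flat)
    ]


# Sample copied from the quoted log, including the mail client's line wrap.
sample = """Sun Feb 9 00:00:03 2014 _VLockFd: conflicting lock held on fd 225,
offset 538046785 by pid 4129 (locktype=1)"""

holders = lock_holders(sample)
print(holders)
```

In the quoted log the offset is the same number as the volume ID in the VAttachVolume message that follows, so the pid field can be looked up directly in the ps listings to see which davolserver held the lock.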

On Mon, Feb 10, 2014 at 12:15 PM, Tracy Di Marco White <genda...@gmail.com> wrote:

> Somehow, I still have two of them in my scroll back.
> root 4129 0.0 0.2 46288 5124 ? Sl 7:46AM 0:00.02 /usr/pkg/libexec/openafs/davolserver -sleep 5/60 -nojumbo
> root 7155 0.0 1.2 85200 42424 ? Il 8:06AM 1:27.36 /usr/pkg/libexec/openafs/davolserver -sleep 5/60 -nojumbo
>
> I'd assume that means you can guess the third.
>
> On Mon, Feb 10, 2014 at 7:00 AM, Peter Grandi <p...@afs.list.sabi.co.uk> wrote:
>
>> > Every night at midnight, we run 'vos backupsys'. For three
>> > nights in a row, on one of the servers I've upgraded to 1.6.5
>> > and dafs, I've been getting the following errors, and it
>> > mostly stops being a fileserver.
>>
>> [ ... ]
>> > Sun Feb 9 00:00:03 2014 SYNC_getCom: error receiving command
>> > Sun Feb 9 00:00:03 2014 FSYNC_com: read failed; dropping connection (cnt=493489)
>> > Sun Feb 9 00:00:03 2014 _VLockFd: conflicting lock held on fd 225, offset 538046785 by pid 4129 (locktype=1)
>> > Sun Feb 9 00:00:03 2014 VAttachVolume: another program has vol 538046785 locked
>> > Sun Feb 9 00:00:03 2014 VPreattachVolumeByVp_r: volume 538046785 not in quiescent state (state 2 flags 0x18)
>> [ ... ]
>> > Sun Feb 9 00:00:03 2014 1 Volser: Clone: Recloning volume 538046785 to volume 538046787
>> > Sun Feb 9 00:00:03 2014 SYNC_ask: length field in response inconsistent on circuit 'FSSYNC'
>> > Sun Feb 9 00:00:03 2014 SYNC_ask: protocol communications failure on circuit 'FSSYNC'; attempting reconnect to server
>> [ ... ]
>>
>> That "_VLockFd: conflicting lock held" and "VAttachVolume:
>> another program has vol NNNN locked" looks vaguely familiar, and
>> in a case that I have seen it was because a DB server was
>> offline, and 'vos' took a very very long time to switch to an
>> online one. But this was with 1.4 and supposedly 1.6 should have
>> a shorter timeout.
>>
>> In another case that vaguely resembles this there was a race
>> between creating a clone and registering it in the VLDB:
>>
>> http://rt.central.org/rt/Ticket/Display.html?id=131797
>>
>> It would be interesting to know what processes 21378, 4129, 7155
>> were doing and why they held a lock on the RW original.
>> _______________________________________________
>> OpenAFS-info mailing list
>> OpenAFS-info@openafs.org
>> https://lists.openafs.org/mailman/listinfo/openafs-info