Hello,

We recently upgraded our OpenAFS servers to 1.6.2, all running on Solaris 10 (Generic_147440-27 sun4v sparc).
Since the buserver upgrade, backups have been failing for various servers and various partitions.

Works:
  fileservers = 1.4.14.1, 1.6.1, 1.6.2
  butc (client side) = 1.6.1
  buserver = 1.4.14.1

Fails:
  fileservers = 1.4.14.1, 1.6.1, 1.6.2
  butc (client side) = 1.6.1, 1.6.2
  buserver = 1.6.2

For a partition (volset) that doesn't complete the 'backup dump', /usr/afs/backup/TL_<port-offset> looks to be waiting for a DumpID from the buserver.

---------------------
srv3:/usr/afs/backup:# cat TL_3106
Tue Mar 26 10:30:11 2013: Starting Tape Coordinator: Port offset 3106 Debug level 0
Tue Mar 26 10:30:11 2013: Token expires: Wed Dec 31 19:00:01 1969
Tue Mar 26 10:31:21 2013: Task 3106001: Dump TSM_srv3_f_135.04
---------------------

For the butc/dump processes that do proceed, the subsequent lines have more info:

---------------------
srv3:/usr/afs/backup:# head TL_3115
Tue Mar 26 10:30:17 2013: Starting Tape Coordinator: Port offset 3115 Debug level 0
Tue Mar 26 10:30:17 2013: Token expires: Wed Dec 31 19:00:01 1969
Tue Mar 26 10:31:40 2013: Task 3115001: Dump TSM_srv3_o_157.26
Tue Mar 26 10:31:42 2013: Task 3115001: Dump TSM_srv3_o_157.26 (DumpID 1364308301)
Tue Mar 26 10:31:42 2013: Task 3115001: Starting pass 1
Tue Mar 26 10:31:42 2013: Task 3115001: Volume h.abcd.jchen114.backup (1971521033) not dumped - has not been modified since last dump
...
---------------------

The set of vicep* partitions (or volsets) for which the backup dump/butc hangs is not consistent. If we kill and restart the dump process, some of the previously hung volsets finish while others hang.

What info do we need to grab from butc and buserver in order to track down the problem?

Thanks.
-pkd

-- 
Prasad Dharmasena
University of Maryland, College Park
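
P.S. For anyone trying to spot the same symptom, here is a rough sketch of how the stuck coordinators could be picked out of the TL_<port-offset> logs. It relies only on the pattern visible in the excerpts above: a "Task N: Dump NAME" line that is never followed by the same line carrying "(DumpID ...)". The function name find_stalled_dumps is made up for illustration, and the sketch assumes one log entry per line. It cannot tell a hung task from one that has only just started, so treat the result as a list of candidates.

```python
import re

# Matches the butc task lines seen in the TL_<port-offset> excerpts, e.g.
#   "... Task 3115001: Dump TSM_srv3_o_157.26 (DumpID 1364308301)"
# The DumpID part is optional so that both forms of the line are caught.
TASK_DUMP = re.compile(r"Task (\d+): Dump (\S+)(?: \(DumpID (\d+)\))?\s*$")


def find_stalled_dumps(log_text):
    """Return {task_id: dump_name} for tasks that never logged a DumpID.

    Hypothetical helper: it only looks at the log text it is given and
    makes no calls to butc or the buserver.
    """
    started = {}      # task id -> dump name, from any "Task N: Dump NAME" line
    assigned = set()  # task ids for which a "(DumpID ...)" line was seen
    for line in log_text.splitlines():
        match = TASK_DUMP.search(line)
        if not match:
            continue
        task_id, dump_name, dump_id = match.groups()
        started.setdefault(task_id, dump_name)
        if dump_id:
            assigned.add(task_id)
    return {t: name for t, name in started.items() if t not in assigned}
```

Feeding it the contents of each /usr/afs/backup/TL_* file in turn would flag TL_3106 above and leave TL_3115 alone.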