On Mon, 7 Dec 2015, Garance A Drosehn wrote:

> Hi.
>
> I've been busy moving our AFS volumes from ancient file servers to
> up-to-date file servers.  So far this has been going along well,
> but last week I ran into an odd error moving one 10.79 GiB file.
>
> My main question is:  Could a problem like this be caused by my
> AFS token expiring in the middle of the transfer?  Here's the
> output from vos-move:
>
> /usr/sbin/vos move -id <_details_> -verbose
> Starting transaction on source volume <__old__> ... done
> Allocating new volume id for clone of volume <__old__> ... done
> Cloning source volume <__old__> ... done
> Ending the transaction on the source volume <__old__> ... done
> Starting transaction on the cloned volume <_clone_> ... done
> Setting flags on cloned volume <_clone_> ... done
> Getting status of cloned volume <_clone_> ... done
> Deleting pre-existing destination volume <__old__> ...Creating the
> destination volume <__old__> ... done
> Setting volume flags on destination volume <__old__> ... done
> Dumping from clone <_clone_> on source to volume <__old__> on destination
> ...vos move: operation interrupted, cleanup in progress...
> clear transaction contexts
> Recovery: Releasing VLDB lock on volume <__old__> ... done
> Recovery: Ending transaction on clone volume ... done
> Recovery: Ending transaction on destination volume ... done
> Recovery: Accessing VLDB.
> FATAL: VLDB access error: abort cleanup
> cleanup complete - user verify desired result
> #------>Error-> *** cs=256 ***
>
> The vos-move command took about 54 minutes.  It started after I
> had moved several other large volumes, and it happened that my
> AFS token expired in the middle of this vos-move.  I was doing
> some other things in AFS at the time, and the token could not
> have been expired longer than a minute or two before I noticed
> it.  I did a new 'klog', and it was at least five minutes later
> before the vos-move terminated.  I suspect it was more like
> 10-15 minutes, but I didn't really keep track of that.
>
> So, could the problem have been caused by the token expiring in
> the middle of the transfer?

Yes.  The client will not create a new connection to pick up the new
token, and will continue using the old token until the server notices
it is bad and sends a new challenge (usually around expiry plus the
clock-skew window).

> At this point, if I do a 'listvol' on both the source and
> destination servers, the volume exists on both of them.  On
> the destination server the volume is marked as 'Off-line'.
> If I do a 'vos examine', the volume is listed as being on
> the original (source) server, and is also marked as LOCKED.
>
> I assume that the thing to do right now would be to:
>   1. vos-remove the copy which exists on the destination
>      file server (and which is not shown in vos-examine).
>   2. vos-unlock the copy which exists on the original
>      file server.
>   3. Retry the vos-move, this time making sure my AFS token
>      won't expire in the middle of the transfer!
>
> Does this seem reasonable?  Are there any other checks I should
> do before trying those?  I was able to read all the data in the
> volume (using 'md5sum') without warnings or errors showing up
> in any log files on the server.

That sounds like a correct procedure.

Note that the credentials used by -localauth do not expire; I suggest
using that for a long-running move.

-Ben
_______________________________________________
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info
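
For reference, the three recovery steps agreed on in the message above
could be sketched as a small shell script.  This is only an
illustration, not a tested procedure: the volume name, server names and
partition letters are hypothetical placeholders (the original message
elides the real ones), and the script only prints the commands unless
DRY_RUN=0 is set.  It assumes -localauth is usable, which means running
as root on a server machine that holds the cell's key.

```shell
#!/bin/sh
# Sketch of the recovery steps from the thread. Every value below is a
# hypothetical placeholder; substitute the real volume and servers.
VOLUME="${VOLUME:-example.vol}"
SRC_SERVER="${SRC_SERVER:-fs-old.example.org}"
SRC_PART="${SRC_PART:-a}"
DST_SERVER="${DST_SERVER:-fs-new.example.org}"
DST_PART="${DST_PART:-a}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recover_and_retry() {
    # 1. Remove the off-line leftover copy on the destination server.
    run vos remove -server "$DST_SERVER" -partition "$DST_PART" \
        -id "$VOLUME" -localauth || return 1
    # 2. Release the VLDB lock left behind by the interrupted move.
    run vos unlock -id "$VOLUME" -localauth || return 1
    # 3. Retry the move; -localauth credentials do not expire mid-transfer.
    run vos move -id "$VOLUME" \
        -fromserver "$SRC_SERVER" -frompartition "$SRC_PART" \
        -toserver "$DST_SERVER" -topartition "$DST_PART" \
        -localauth -verbose
}

recover_and_retry
```

Checking the printed commands against 'vos listvol' and 'vos examine'
output before setting DRY_RUN=0 seems prudent, since step 1 must only
ever name the destination server.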