I guess I should elaborate a little. The "RX epoch" is a value chosen by each copy of the RX network stack and is used, in part, to disambiguate different instances of RX running on the same port. In OpenAFS, the RX stack lives inside the process that uses RX, not in the kernel's networking code, so each independent program has a completely independent RX stack.
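For anyone who wants to check whether a given process start time falls on the wrong side of the 0x60000000 boundary mentioned in this thread, here is a minimal Python sketch. It assumes the RX epoch is close to the process start time, which is how the (still unconfirmed) analysis frames it. The names SUSPECT_THRESHOLD and started_after_threshold are made up for illustration and are not part of OpenAFS.

```python
from datetime import datetime, timezone

# Threshold quoted in this thread (unconfirmed analysis):
# 0x60000000 seconds after the Unix epoch.
SUSPECT_THRESHOLD = 0x60000000


def started_after_threshold(start_time: int) -> bool:
    """Return True if a process start time (Unix seconds) is at or past
    the threshold.

    Assumption: the RX epoch is close to the process start time. Whether
    the boundary value itself is affected is not stated in the thread;
    it is included here.
    """
    return start_time >= SUSPECT_THRESHOLD


# Convert the threshold to a readable UTC timestamp.
threshold_utc = datetime.fromtimestamp(SUSPECT_THRESHOLD, tz=timezone.utc)
print(SUSPECT_THRESHOLD)          # 1610612736
print(threshold_utc.isoformat())  # 2021-01-14T08:25:36+00:00
```

Feeding it the boot time of a client, or the last restart time of a server process, gives a rough idea of whether that RX stack picked its epoch before or after the boundary.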
Each time you run vos, a new RX stack is spun up with a new epoch. The cache manager (afsd) uses an epoch chosen when it was started (i.e. during boot). The fileserver, ptserver, and vlserver each have their own RX stack, with an epoch chosen when they were last restarted.

On Thu, Jan 14, 2021 at 10:26 AM Ben Carter <b...@pitt.edu> wrote:
>
> So we are running 1.6 code and we definitely have a problem. However
> for us, a sync site is being elected, but doing a vos examine from a
> client seems to hang. Actual access to files in AFS seems to be working
> fine but we've not restarted any file server processes.
>
> Ben
>
> On 1/14/21 10:21 AM, Chaskiel Grundman wrote:
> > None of these things is confirmed yet, but based on some analysis and
> > testing carnegie mellon has done today:
> >
> > - The problem is in RX (the transport layer), not any of the applications
> > - It likely affects 1.8.0 and newer, but not 1.6
> > - It seems to be triggered by the RX epoch being after the unix time
> >   0x60000000 aka 1610612736, aka Thu Jan 14 08:25:36 UTC 2021
> >
> > So any cache manager and server that has been running since before that
> > time will continue to work until they are restarted. Sites may wish to
> > try and avoid having critical systems reboot or restart until a fix or
> > workaround for this issue is identified.
> >
> > If anyone has a system running something 1.8.0 or newer where the command
> > vos status afs-01.andrew.cmu.edu -noauth
> > succeeds, I'd appreciate knowing about it, as it will change this
> > analysis.
> >
>
> --
> Ben Carter
> System Engineer/Operations
> University of Pittsburgh Information Technology
> Office: 412-624-6470
> b...@pitt.edu