I guess I should elaborate a little.
The "RX Epoch" is a value chosen by each copy of the RX network stack and
is used, in part, to disambiguate different instances of RX running on the
same port.
In OpenAFS, the RX stack lives inside the process that uses RX, not in the
kernel's networking code, so each independent program has a completely
independent RX stack.

Each time you run vos, a new RX stack is spun up with a new epoch.
The cache manager (afsd) uses an epoch chosen when it was started (i.e.
during boot).
The fileserver, ptserver, vlserver each have their own RX stack, with an
epoch chosen when they were last restarted.
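
To make the arithmetic in the quoted message below concrete, here is a
minimal Python sketch that converts the 0x60000000 threshold to a UTC
timestamp and checks whether a given start time falls after it. It assumes
the epoch is simply the Unix time at which the RX stack was started, which
is how the thread describes it. The function names are hypothetical and are
not part of OpenAFS.

```python
from datetime import datetime, timezone

# Threshold quoted in the thread: epochs after this value seem to trigger
# the problem.
THRESHOLD = 0x60000000


def threshold_as_utc() -> datetime:
    """Return the threshold as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(THRESHOLD, tz=timezone.utc)


def started_after_threshold(start_unix_seconds: int) -> bool:
    """Hypothetical helper: True if an RX stack started at this Unix time
    would have an epoch after the threshold. Assumes epoch == start time."""
    return start_unix_seconds > THRESHOLD


if __name__ == "__main__":
    print(THRESHOLD)                       # decimal form of 0x60000000
    print(threshold_as_utc().isoformat())  # the same instant in UTC
```

Under that assumption, a fileserver last restarted in 2020 would return
False, while every new vos invocation today would return True.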

On Thu, Jan 14, 2021 at 10:26 AM Ben Carter <b...@pitt.edu> wrote:

>
> So we are running 1.6 code and we definitely have a problem.  However
> for us, a sync site is being elected, but doing a vos examine from a
> client seems to hang.  Actual access to files in AFS seems to be working
> fine but we've not restarted any file server processes.
>
> Ben
>
> On 1/14/21 10:21 AM, Chaskiel Grundman wrote:
> > None of these things is confirmed yet, but based on some analysis and
> > testing carnegie mellon has done today:
> >
> > - The problem is in RX (the transport layer), not any of the applications
> > - It likely affects 1.8.0 and newer, but not 1.6
> > - It seems to be triggered by the RX epoch being after the unix time
> > 0x60000000 aka 1610612736, aka Thu Jan 14 08:25:36 UTC 2021
> >
> >
> > So any cache manager and server that has been running since before that
> > time will continue to work until they are restarted. Sites may wish to
> > try and avoid having critical systems reboot or restart until a fix or
> > workaround for this issue is identified.
> >
> > If anyone has a system running something 1.8.0 or newer where the command
> > vos status afs-01.andrew.cmu.edu -noauth
> >
> > succeeds, I'd appreciate knowing about it, as it will change this
> > analysis.
>
>
> --
> Ben Carter
> System Engineer/Operations
> University of Pittsburgh Information Technology
> Office: 412-624-6470
> b...@pitt.edu
>
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>
