Re: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_64 kernel lock up

2018-03-23 Thread Stephan Wiesand

> On 23. Mar 2018, at 12:27, Kodiak Firesmith wrote:
> 
> I've also tested gsgatlin's 7.5beta RPMs and they work great.  Any chance 
> we'll see the rh75enotdir patch integrated into a release of 1.6.22.3 soon?  
> I'm wondering if it'll be worth it to manually apply that patch to a rebuild 
> of the official OpenAFS RPMs if this isn't on the block for being merged and 
> released soon - but I don't want to blow the time applying that patch to a 
> re-roll if a fixed official release is forthcoming.

We are planning to release a 1.6.22.3 addressing the ENOTDIR issue with the 
EL7.5 kernel soon after the EL7.5 GA release.

- Stephan

> Thanks!
>  - Kodiak
> 
> 
> On Fri, Mar 2, 2018 at 3:47 AM, Anders Nordin wrote:
> Hello,
> 
> Is there any progress on this issue? Can we expect a stable release for RHEL 
> 7.5?
> 
> MVH
> Anders
> 
> -----Original Message-----
> From: openafs-info-ad...@openafs.org [mailto:openafs-info-ad...@openafs.org] 
> On Behalf Of Benjamin Kaduk
> Sent: den 9 februari 2018 01:02
> To: Kodiak Firesmith 
> Cc: openafs-info 
> Subject: Re: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_64 kernel lock up
> 
> On Wed, Feb 07, 2018 at 11:46:28AM -0500, Kodiak Firesmith wrote:
> > Hello again All,
> >
> > As part of continued testing, I've been able to confirm that the
> > SystemD double-service startup thing only happens to my hosts when
> > going from RHEL 7.4 to RHEL 7.5beta.  On a test host installed
> > directly as RHEL 7.5beta, I get a bit farther with 1.6.18.22, in
> > that I get to the point where OpenAFS "kind of" works.
> 
> Thanks for tracking this down.  The rpm packaging maintainers may want to try 
> to track down why the double-start happens in the upgrade scenario, as that's 
> pretty nasty behavior.
> 
> > What I'm observing is that the openafs client Kernel module (built by
> > DKMS) loads fine, and just so long as you know where you need to go in
> > /afs, you can get there, and you can read and write files and the
> > OpenAFS 'fs' command works.  But doing an 'ls' of /afs or any path
> > underneath results in
> > "ls: reading directory /afs/: Not a directory".
> >
> > I ran an strace of a good RHEL 7.4 host running ls on /afs, and a RHEL
> > 7.5beta host running ls on /afs and have created pastebins of both, as
> > well as an inline diff.
> >
> > All can be seen at the following locations:
> >
> > works
> > https://paste.fedoraproject.org/paste/Hiojt2~Be3wgez47bKNucQ
> >
> > fails
> > https://paste.fedoraproject.org/paste/13ZXBfJIOMsuEJFwFShBfg
> >
> >
> > diff
> > https://paste.fedoraproject.org/paste/FJKRwep1fWJogIDbLnkn8A
> >
> > Hopefully this might help the OpenAFS devs, or someone might know what
> > might be borking on every RHEL 7.5 beta host.  It does fit with what
> > other 7.5 beta users have observed OpenAFS doing.
> 
> Yes, now it seems like all our reports are consistent, and we just have to 
> wait for a developer to get a better look at what Red Hat changed in the 
> kernel that we need to adapt to.
> 
> -Ben
> 
> > Thanks!
> >  - Kodiak
> >
> > On Mon, Feb 5, 2018 at 12:31 PM, Stephan Wiesand wrote:
> >
> > >
> > > > On 04.Feb 2018, at 02:11, Jeffrey Altman wrote:
> > > >
> > > > On 2/2/2018 6:04 PM, Kodiak Firesmith wrote:
> > > >> I'm relatively new to handling OpenAFS.  Are these problems part
> > > >> of a normal "kernel release; openafs update" cycle and perhaps
> > > >> I'm getting snagged just by being too early of an adopter?  I
> > > >> wanted to raise the alarm on this and see if anything else was
> > > >> needed from me as the reporter of the issue, but perhaps that's
> > > >> an overreaction to what is just part of a normal process I just
> > > >> haven't been tuned into in prior RHEL release cycles?
> > > >
> > > >
> > > > Kodiak,
> > > >
> > > > On RHEL, DKMS is safe to use for kernel modules that restrict
> > > > themselves to using the restricted set of kernel interfaces (the
> > > > RHEL KABI) that Red Hat has designated will be supported across
> > > > the lifespan of the RHEL major version number.  OpenAFS is not
> > > > such a kernel module.  As a result it is vulnerable to breakage
> > > > each and every time a new kernel is shipped.
> > >
> > > Jeffrey,
> > >
> > > the usual way to use DKMS is to either have it build a module for a
> > > newly installed kernel or install a prebuilt module for that kernel.
> > > It may be possible to abuse it for providing a module built for
> > > another kernel, but I think that won't happen accidentally.
> > >
> > > You may be confusing DKMS with RHEL's "KABI tracking kmods". Those
> > > should be safe to use within a RHEL minor release (and the SL
> > > packaging has been using them like this since EL6.4), but aren't
> > > across minor releases (and that's why the SL packaging modifies the
> > > kmod handling to require a build for 

Re: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_64 kernel lock up

2018-03-23 Thread Kodiak Firesmith
I've also tested gsgatlin's 7.5beta RPMs and they work great.  Any chance
we'll see the rh75enotdir patch integrated into a release of 1.6.22.3
soon?  I'm wondering if it'll be worth it to manually apply that patch to a
rebuild of the official OpenAFS RPMs if this isn't on the block for being
merged and released soon - but I don't want to blow the time applying that
patch to a re-roll if a fixed official release is forthcoming.

Thanks!
 - Kodiak


On Fri, Mar 2, 2018 at 3:47 AM, Anders Nordin wrote:

> Hello,
>
> Is there any progress on this issue? Can we expect a stable release for
> RHEL 7.5?
>
> MVH
> Anders
>
> -----Original Message-----
> From: openafs-info-ad...@openafs.org [mailto:openafs-info-admin@openafs.org]
> On Behalf Of Benjamin Kaduk
> Sent: den 9 februari 2018 01:02
> To: Kodiak Firesmith 
> Cc: openafs-info 
> Subject: Re: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_64 kernel lock up
>
> On Wed, Feb 07, 2018 at 11:46:28AM -0500, Kodiak Firesmith wrote:
> > Hello again All,
> >
> > As part of continued testing, I've been able to confirm that the
> > SystemD double-service startup thing only happens to my hosts when
> > going from RHEL 7.4 to RHEL 7.5beta.  On a test host installed
> > directly as RHEL 7.5beta, I get a bit farther with 1.6.18.22, in
> > that I get to the point where OpenAFS "kind of" works.
>
> Thanks for tracking this down.  The rpm packaging maintainers may want to
> try to track down why the double-start happens in the upgrade scenario, as
> that's pretty nasty behavior.
>
> > What I'm observing is that the openafs client Kernel module (built by
> > DKMS) loads fine, and just so long as you know where you need to go in
> > /afs, you can get there, and you can read and write files and the
> > OpenAFS 'fs' command works.  But doing an 'ls' of /afs or any path
> > underneath results in
> > "ls: reading directory /afs/: Not a directory".
> >
> > I ran an strace of a good RHEL 7.4 host running ls on /afs, and a RHEL
> > 7.5beta host running ls on /afs and have created pastebins of both, as
> > well as an inline diff.
> >
> > All can be seen at the following locations:
> >
> > works
> > https://paste.fedoraproject.org/paste/Hiojt2~Be3wgez47bKNucQ
> >
> > fails
> > https://paste.fedoraproject.org/paste/13ZXBfJIOMsuEJFwFShBfg
> >
> >
> > diff
> > https://paste.fedoraproject.org/paste/FJKRwep1fWJogIDbLnkn8A
> >
> > Hopefully this might help the OpenAFS devs, or someone might know what
> > might be borking on every RHEL 7.5 beta host.  It does fit with what
> > other 7.5 beta users have observed OpenAFS doing.
>
> Yes, now it seems like all our reports are consistent, and we just have to
> wait for a developer to get a better look at what Red Hat changed in the
> kernel that we need to adapt to.
>
> -Ben
>
> > Thanks!
> >  - Kodiak
> >
> > On Mon, Feb 5, 2018 at 12:31 PM, Stephan Wiesand wrote:
> >
> > >
> > > > On 04.Feb 2018, at 02:11, Jeffrey Altman wrote:
> > > >
> > > > On 2/2/2018 6:04 PM, Kodiak Firesmith wrote:
> > > >> I'm relatively new to handling OpenAFS.  Are these problems part
> > > >> of a normal "kernel release; openafs update" cycle and perhaps
> > > >> I'm getting snagged just by being too early of an adopter?  I
> > > >> wanted to raise the alarm on this and see if anything else was
> > > >> needed from me as the reporter of the issue, but perhaps that's
> > > >> an overreaction to what is just part of a normal process I just
> > > >> haven't been tuned into in prior RHEL release cycles?
> > > >
> > > >
> > > > Kodiak,
> > > >
> > > > On RHEL, DKMS is safe to use for kernel modules that restrict
> > > > themselves to using the restricted set of kernel interfaces (the
> > > > RHEL KABI) that Red Hat has designated will be supported across
> > > > the lifespan of the RHEL major version number.  OpenAFS is not
> > > > such a kernel module.  As a result it is vulnerable to breakage
> > > > each and every time a new kernel is shipped.
> > >
> > > Jeffrey,
> > >
> > > the usual way to use DKMS is to either have it build a module for a
> > > newly installed kernel or install a prebuilt module for that kernel.
> > > It may be possible to abuse it for providing a module built for
> > > another kernel, but I think that won't happen accidentally.
> > >
> > > You may be confusing DKMS with RHEL's "KABI tracking kmods". Those
> > > should be safe to use within a RHEL minor release (and the SL
> > > packaging has been using them like this since EL6.4), but aren't
> > > across minor releases (and that's why the SL packaging modifies the
> > > kmod handling to require a build for the minor release in question).
> > >
> > > > There are two types of failures that can occur:
> > > >
> > > > 1. a change results in failure to build the OpenAFS kernel module
> > > >    for the new kernel
> > > >
> > > > 2. a change results in the OpenAFS kernel