On 29 Aug 2012, at 16:21, Stephan Wiesand wrote:

> Since SL6, we have been using "kABI tracking kmods" for installing 
> the OpenAFS kernel module on clients. For full information on this mechanism, 
> see http://people.redhat.com/jcm/el6/dup/docs/dup_book.pdf . In short, you 
> only have to compile and install the module once, and it will be used with 
> future kernels as long as it doesn't use parts of the ABI that changed.
> 
> Trying this may have been stupid in the first place. If so, happy bashing :-)

Not trying to bash, but you've encountered the problem that is built into this 
approach.

What RedHat's kABI stuff guarantees is that a small whitelist of function 
signatures will not change across all of the kernels which claim to share the 
same ABI. They go to great lengths to make this the case - radically modifying 
changes that they backport from the mainline kernel so those changes can be 
incorporated without changing their guaranteed kABI. Roughly speaking, the 
guarantee is that kernel modules built against one GA kernel will work against 
all kernels in that major version. Each minor version may add new functions, 
but they will never change the signatures of existing whitelisted functions, or 
remove whitelisted functions.

The critical thing to realise is that this guarantee only applies to functions 
on RedHat's whitelist. If you use non-whitelisted functions, you can't rely on 
the ABI guarantees. These functions may go away, or (worse) the number or 
nature of their arguments may change between any two kernel releases. Because 
you aren't recompiling for each release, you won't notice the change in 
arguments, and so will quite possibly end up calling something that expects a 
struct inode with a struct nameidata, or something that needs 5 arguments with 
2, and so on.

Needless to say, OpenAFS uses many symbols which aren't on RedHat's whitelist. 
So, you are pretty much sitting on a ticking timebomb, with quite significant 
data integrity ramifications. To be honest, I'm surprised it has taken so long 
for things to break.

> Thanks a lot in advance for any insights.

If you want to track this down, I'd advise building a list of all of the 
symbols that OpenAFS uses that aren't on RedHat's whitelist for EL6 (from 
memory, this is a fairly considerable set), and then checking whether the 
function signatures, or the structure definitions used by those signatures, 
have changed with the new kernel revision. That should give you an idea of 
where to look.
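
As a rough illustration of that audit, here is a minimal Python sketch. It 
makes several assumptions that are mine rather than anything specific to 
OpenAFS or RedHat's tooling: that each kernel's exported symbols are available 
as Module.symvers-style text (whitespace-separated, CRC first, symbol name 
second), that the whitelist is a text file with one symbol per line plus 
optional "[section]" or "#" lines, and that a changed CRC is a usable hint that 
a signature or a structure it depends on has changed. All function names are 
hypothetical.

```python
def parse_symvers(text):
    """Map symbol name -> CRC from Module.symvers-style text.

    Assumed layout per line: CRC, symbol, module, export type.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            table[fields[1]] = fields[0]
    return table


def parse_whitelist(text):
    """Collect symbol names, skipping blanks, '#' comments and '[section]' lines.

    The file layout is an assumption; adjust to the real whitelist format.
    """
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith(("#", "[")):
            names.add(line.split()[0])
    return names


def audit(used, whitelist, old, new):
    """Classify the symbols a module uses that the whitelist does not cover.

    used      -- iterable of symbol names the module imports
    whitelist -- set of symbol names covered by the kABI guarantee
    old, new  -- symbol -> CRC maps for the build kernel and the new kernel
    """
    report = {"unprotected": [], "removed": [], "changed": []}
    for sym in sorted(used):
        if sym in whitelist:
            continue  # covered by the guarantee, nothing to check
        report["unprotected"].append(sym)
        if sym not in old:
            continue  # no baseline to compare against
        if sym not in new:
            report["removed"].append(sym)
        elif old[sym] != new[sym]:
            report["changed"].append(sym)
    return report
```

The "changed" and "removed" lists are only a starting point for where to look; 
each entry still needs its prototype and any structures it takes compared by 
hand between the two kernel source trees.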

However, I suspect that you're going to continue to encounter this problem. 
Unless someone with a support contract can convince RedHat to include all of 
the symbols OpenAFS requires in their whitelist, it just isn't safe to use the 
kABI stuff for OpenAFS.

Cheers,

Simon.

_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel
