On 29 Aug 2012, at 16:21, Stephan Wiesand wrote:

> Since SL6, we have been using "kABI tracking kmods" for installing
> the OpenAFS kernel module on clients. For full information on this mechanism,
> see http://people.redhat.com/jcm/el6/dup/docs/dup_book.pdf . In short, you
> only have to compile and install the module once, and it will be used with
> future kernels as long as it doesn't use parts of the ABI that changed.
>
> Trying this may have been stupid in the first place. If so, happy bashing :-)

Not trying to bash, but you've encountered the problem that is built into this approach.

What RedHat's kABI stuff guarantees is that a small whitelist of function signatures will not change across all of the kernels which claim to share the same ABI. They go to great lengths to make this the case - radically modifying changes that they backport from the mainline kernel so those changes can be incorporated without changing their guaranteed kABI. Roughly speaking, the guarantee is that kernel modules built against one GA kernel will work against all kernels in that major version. Each minor version may add new functions, but they will never change the signatures of existing whitelisted functions, or remove whitelisted functions.

The critical thing to realise is that this guarantee only applies to functions on RedHat's whitelist. If you use non-whitelisted functions, you can't rely on the ABI guarantees. These functions may go away, or (worse) the number or nature of their arguments may change between any two kernel releases. Because you aren't recompiling for each release, you won't notice the change in arguments, and so will quite possibly end up calling something that expects a struct inode with a struct nameidata, or something that needs 5 arguments with 2, and so on.

Needless to say, OpenAFS uses many symbols which aren't on RedHat's whitelist. So, you are pretty much sitting on a ticking timebomb, with quite significant data integrity ramifications. To be honest, I'm surprised it has taken so long for things to break.

> Thanks a lot in advance for any insights.

If you want to track this down, I'd advise building a list of all of the symbols that OpenAFS uses that aren't on RedHat's whitelist for EL6 (from memory, this is a fairly considerable set), and then looking at whether the function signatures, or the structure definitions used by those function signatures, have changed with the new kernel revision. That should give you an idea of where to look.

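As a rough sketch of that comparison (not something from the OpenAFS tree), the Python below takes the symbols the module imports, drops the whitelisted ones, and compares the remaining symbols' CRCs between two kernels. It assumes the kernels are built with symbol versioning, so that a changed CRC in Module.symvers hints at a changed signature or structure, and that the whitelist is a plain list of names with optional `[section]` headers. The function names, symbol names and CRC values are all hypothetical. The list of used symbols would have to come from somewhere else, for example the undefined symbols that `nm -u` reports for the built module.

```python
def parse_symvers(text):
    """Parse Module.symvers-style text (CRC, symbol, ...) into {symbol: crc}."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("0x"):
            table[fields[1]] = fields[0]
    return table


def parse_whitelist(text):
    """Parse a whitelist file: one symbol per line, skipping headers and comments."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        names.add(line.split()[0])
    return names


def risky_symbols(used, whitelist, old_symvers, new_symvers):
    """Classify every used symbol that is NOT whitelisted.

    Returns {symbol: status}, where status is one of
    "removed", "changed", "unchanged" or "unknown".
    """
    report = {}
    for sym in sorted(set(used) - set(whitelist)):
        old_crc = old_symvers.get(sym)
        new_crc = new_symvers.get(sym)
        if new_crc is None:
            report[sym] = "removed"      # no longer exported by the new kernel
        elif old_crc is None:
            report[sym] = "unknown"      # nothing to compare against
        elif old_crc != new_crc:
            report[sym] = "changed"      # signature or a structure it uses differs
        else:
            report[sym] = "unchanged"
    return report


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    whitelist = parse_whitelist("[example_whitelist]\n\texample_stable_fn\n")
    old = parse_symvers(
        "0x11111111\texample_stable_fn\tvmlinux\tEXPORT_SYMBOL\n"
        "0x22222222\texample_inode_fn\tvmlinux\tEXPORT_SYMBOL\n"
    )
    new = parse_symvers(
        "0x11111111\texample_stable_fn\tvmlinux\tEXPORT_SYMBOL\n"
        "0x33333333\texample_inode_fn\tvmlinux\tEXPORT_SYMBOL\n"
    )
    used = ["example_stable_fn", "example_inode_fn"]
    for sym, status in risky_symbols(used, whitelist, old, new).items():
        print(sym, status)
```

Anything reported as "changed" or "removed" is a candidate for the breakage. "unchanged" only means the CRC matched between those two kernels, not that the symbol is safe in future ones.
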
However, I suspect that you're going to continue to encounter this problem. Unless someone with a support contract can convince RedHat to include all of the symbols OpenAFS requires in their whitelist, it just isn't safe to use the kABI stuff for OpenAFS.

Cheers,

Simon.
