Re: Bug#607368: Please decide how kernel ABI should be managed

2011-01-26 Thread Julien Cristau
On Sun, Dec 19, 2010 at 19:30:58 +0100, Julien BLACHE wrote:

> I think it would be best if this matter would be decided upon before the
> release of Squeeze, or not too long after it, so as to avoid further
> breakages in early kernel updates for Squeeze.
> 
We're getting close to the squeeze release.  Is the technical committee
going to reach a decision on this?

Cheers,
Julien




Re: Bug#607368: Please decide how kernel ABI should be managed

2011-01-05 Thread Julien BLACHE
Don Armstrong  wrote:

Hi,

> Ok. My main concern here is what exactly would happen if we were to
> ignore the ABI change for this particular issue, and then put in place
> some kind of a process where the kernel team could be informed of
> downstream users of the ABI.

The harm is done now; reverting or bumping the ABI at this point only
makes things worse.

>> Full deployment involves over a thousand workstations.
>
> But presumably they're not running a testing version affected by this.

At this time I have no assurance that this issue or a similar issue with
another symbol won't happen again during the Squeeze lifetime, so they
are potentially affected until proven otherwise as far as I'm concerned.

To the thousand machines given above, you can add several hundred
machines that are part of several HPC clusters; the nodes use external InfiniBand
drivers from ofa-kernel 1.5.2 in the pkg-ofed repository. Having the
cluster fail to come online after a kernel upgrade would be interesting.

We also have servers using the Brocade FC HBA/CNA drivers from Brocade,
due to the 2.6.32 drivers being way out of date (2.6.32->2.6.37 is
ca. 100 commits and needs new firmware files with new names, if anyone
is interested).

>> package is upgraded, we'd still have issues with on-disk modules not
>> matching the running kernel ABI until the machine is rebooted. This
>> can sometimes take two or three weeks if a long-running computation
>> is running on the machine.
>
> Presumably this wouldn't be much of an issue, unless users are going
> to be newly loading these modules. [Which I would hope wouldn't be the
> case if you were running a long-running computation.]

Modules get loaded automatically pretty much all the time on a
workstation: filesystem modules for a USB key or when upgrading grub,
drivers for USB devices, you name it.

>> And I'll ask again: what's the point of the kernel ABI number if we
>> have to use strict dependencies?
>
> Some modules may need strict dependencies if they are using symbols
> not covered by the ABI; this is one possible way that we can resolve
> this issue.

The issue I have with that, other than the fact that it is just plain
wrong, is that all the module packaging tools were built on the premise
that changes to the kernel ABI are reflected by the ABI number. None of
the tools work if that premise doesn't hold true.

JB.

-- 
 Julien BLACHE   |  Debian, because code matters more 
 Debian & GNU/Linux Developer|   
 Public key available on  - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 


-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/871v4rwpv1@sonic.technologeek.org



Re: Bug#607368: Please decide how kernel ABI should be managed

2011-01-04 Thread Russ Allbery
Ben Hutchings  writes:

> DKMS does build real Debian packages.  And that means that OOT module
> sources do not need to be packaged differently depending on where the
> modules will be built.

Oh, huh, I hadn't noticed that.  Thanks for the pointer!  I'll have to
play with that; I'd only previously seen the tarball distribution and
installation mechanism.

The work of providing both the -dkms and the traditional -source package
is fairly trivial and not much of a drain on the packager's time once the
original -source rules have been written.  I'm doing it right now for
multiple packages.  But writing the original -source package rules file is
arcane and very under-documented, so this is potentially a long-term
improvement.

-- 
Russ Allbery (r...@debian.org)   





Re: Bug#607368: Please decide how kernel ABI should be managed

2011-01-04 Thread Ben Hutchings
On Tue, 2011-01-04 at 17:55 -0800, Russ Allbery wrote:
> Ben Hutchings  writes:
> > On Tue, 2011-01-04 at 17:23 -0800, Russ Allbery wrote:
> 
> >> With hundreds of servers, we'd rather not install compilers and DKMS on
> >> every one of them, and with lots of machines, the loss of
> >> reproducibility from separately compiling the modules on every system
> >> is an increasingly large drawback.
> 
> > This is why DKMS has the facility to build packages for installation
> > elsewhere.
> 
> But there would be no purpose served in using DKMS for this.  The only
> place where DKMS has an advantage over building real Debian packages for
> the modules is if you're going to let every machine build its own modules.
[...]

DKMS does build real Debian packages.  And that means that OOT module
sources do not need to be packaged differently depending on where the
modules will be built.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.




Re: Bug#607368: Please decide how kernel ABI should be managed

2011-01-04 Thread Russ Allbery
Ben Hutchings  writes:
> On Tue, 2011-01-04 at 17:23 -0800, Russ Allbery wrote:

>> With hundreds of servers, we'd rather not install compilers and DKMS on
>> every one of them, and with lots of machines, the loss of
>> reproducibility from separately compiling the modules on every system
>> is an increasingly large drawback.

> This is why DKMS has the facility to build packages for installation
> elsewhere.

But there would be no purpose served in using DKMS for this.  The only
place where DKMS has an advantage over building real Debian packages for
the modules is if you're going to let every machine build its own modules.
As soon as you are distributing modules built once to multiple machines,
using DKMS to do that is vaguely absurd: you have to reinvent all the
mechanisms of a repository and package upgrade system, when we already
have a perfectly useful and reasonable one in apt repositories with
package versioning and proper dependencies.

-- 
Russ Allbery (r...@debian.org)   





Re: Bug#607368: Please decide how kernel ABI should be managed

2011-01-04 Thread Ben Hutchings
On Tue, 2011-01-04 at 17:23 -0800, Russ Allbery wrote:
> Ben Hutchings  writes:
> 
> > Do pay attention.  We were discussing the implications of changing our
> > current practice of trying to avoid ABI bumps during freeze and stable
> > updates.  We would then probably change the uname release (the ABI
> > identifier) in each version of the package.
> 
> This is certainly becoming more appealing with DKMS, but with my Stanford
> sysadmin hat on, I have to admit that we'd find it rather annoying if the
> ABI changed in stable.  I think that may be a good way to go in unstable
> and testing up to the release, but it would be very nice to not do that
> after the release.

However, the upstream policy for stable updates does not support this.

> With hundreds of servers, we'd rather not install compilers and DKMS on
> every one of them, and with lots of machines, the loss of reproducibility
> from separately compiling the modules on every system is an increasingly
> large drawback.

This is why DKMS has the facility to build packages for installation
elsewhere.

> We currently build internal packages (from the *-source
> packages provided by Debian) for those external modules that we use so
> that we can deploy the same thing everywhere, and having to rebuild
> modules for every kernel update and deploy those new builds with the
> kernel update would be fairly annoying. With that system, we know for
> sure that if the module mysteriously fails on one system but not on
> others, it's not because it's a weird build or has some other compilation
> issue.
> 
> In fact, we know almost exactly how annoying it would be, since Red Hat
> has this policy, and it's been a major pain.  The handling of the kernel
> versioning in stable is currently one of the major selling points for
> Debian over Red Hat for us.
[...]

Note that Red Hat does maintain the ABI for most functions, even though
it changes the uname release.  If you package OOT modules using the 'KMP'
macros for RPM, binary modules will be sym-linked into a 'weak-updates'
subdirectory for a newer kernel if their symbol dependencies are still
met.

We could try to implement something like that in Debian.
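
As a rough illustration of the symbol-dependency check behind the
'weak-updates' idea described above, here is a minimal sketch. It is not
the actual KMP or RPM implementation. It assumes that exported symbols
are listed one per line with the CRC in the first column and the symbol
name in the second (the layout used by Module.symvers files), and all
symbol names and CRC values below are made up for the example.

```python
def parse_symvers(text):
    """Parse Module.symvers-style lines into {symbol: crc}.

    Assumption: the first two whitespace-separated columns are the CRC
    (hex) and the symbol name; any further columns are ignored.
    """
    exported = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        crc, symbol = fields[0], fields[1]
        exported[symbol] = int(crc, 16)
    return exported


def unmet_symbols(required, exported):
    """Return the symbols a module needs that the kernel does not
    export with the same CRC (missing, or exported with another CRC)."""
    return sorted(
        sym for sym, crc in required.items() if exported.get(sym) != crc
    )


def can_weak_link(required, exported):
    """A prebuilt module is only reusable if every dependency is met."""
    return not unmet_symbols(required, exported)


# Hypothetical exports of a newer kernel build.
new_kernel = parse_symvers(
    "0x11111111\tdemo_alloc\tvmlinux\tEXPORT_SYMBOL\n"
    "0x22222222\tdemo_xmit\tvmlinux\tEXPORT_SYMBOL_GPL\n"
)

# Hypothetical dependencies recorded in two prebuilt modules.
module_ok = {"demo_alloc": 0x11111111}
module_broken = {"demo_xmit": 0x33333333}

print(can_weak_link(module_ok, new_kernel))
print(unmet_symbols(module_broken, new_kernel))
```

The point of checking per-module symbol dependencies, rather than a
single ABI number, is that a module is only affected by changes to the
symbols it actually uses.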

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.




Re: Bug#607368: Please decide how kernel ABI should be managed

2011-01-04 Thread Russ Allbery
Ben Hutchings  writes:

> Do pay attention.  We were discussing the implications of changing our
> current practice of trying to avoid ABI bumps during freeze and stable
> updates.  We would then probably change the uname release (the ABI
> identifier) in each version of the package.

This is certainly becoming more appealing with DKMS, but with my Stanford
sysadmin hat on, I have to admit that we'd find it rather annoying if the
ABI changed in stable.  I think that may be a good way to go in unstable
and testing up to the release, but it would be very nice to not do that
after the release.

With hundreds of servers, we'd rather not install compilers and DKMS on
every one of them, and with lots of machines, the loss of reproducibility
from separately compiling the modules on every system is an increasingly
large drawback.  We currently build internal packages (from the *-source
packages provided by Debian) for those external modules that we use so
that we can deploy the same thing everywhere, and having to rebuild
modules for every kernel update and deploy those new builds with the
kernel update would be fairly annoying.  With that system, we know for
sure that if the module mysteriously fails on one system but not on
others, it's not because it's a weird build or has some other compilation
issue.

In fact, we know almost exactly how annoying it would be, since Red Hat
has this policy, and it's been a major pain.  The handling of the kernel
versioning in stable is currently one of the major selling points for
Debian over Red Hat for us.  The very few times an ABI change was forced
in Debian stable due to some security issue, we had to put a fair bit of
work into making sure that everything was upgraded properly everywhere to
the new ABI.

(So thank you very much for all the work that you put into maintaining the
ABI!)

-- 
Russ Allbery (r...@debian.org)   





Re: Bug#607368: Please decide how kernel ABI should be managed

2011-01-04 Thread Don Armstrong
On Tue, 04 Jan 2011, Julien BLACHE wrote:
> Don Armstrong  wrote:
> > Julien: Are you currently shipping a kernel in production which
> > would be affected by this change if we don't change the ABI
> > number? Or does this only affect cases where you are testing
> > squeeze? Could it be
> 
> I have 30 beta-testers that are affected by this issue on the
> workstations they have started using for their everyday work.
> Although it's still a beta phase, at this point, these workstations
> are to be considered "in production" given the users have basically
> made the switch now.

Ok. My main concern here is what exactly would happen if we were to
ignore the ABI change for this particular issue, and then put in place
some kind of a process where the kernel team could be informed of
downstream users of the ABI.

From my current understanding, the ABI number is only meant to cover
some of the symbols which can be used externally, not all of them.
[Specifically, those that the kernel team are aware of being used
externally.]

> Full deployment involves over a thousand workstations.

But presumably they're not running a testing version affected by this.

> > worked around by using DKMS or similar with prebuilt binaries and
> > requiring exact kernel version dependencies?
> 
> DKMS is useless if the ABI number doesn't change, in its current
> form. If DKMS was changed to rebuild all modules when the kernel
> package is upgraded, we'd still have issues with on-disk modules not
> matching the running kernel ABI until the machine is rebooted. This
> can sometimes take two or three weeks if a long-running computation
> is running on the machine.

Presumably this wouldn't be much of an issue, unless users are going
to be newly loading these modules. [Which I would hope wouldn't be the
case if you were running a long-running computation.]

> As to using strict dependencies... it makes all of the above even
> worse.

Certainly; there's a cost to be borne on both sides. The most important
thing to avoid from my perspective is a kernel which when booted has
modules that cannot be loaded.
 
> And I'll ask again: what's the point of the kernel ABI number if we
> have to use strict dependencies?

Some modules may need strict dependencies if they are using symbols
not covered by the ABI; this is one possible way that we can resolve
this issue.

> Seriously?

Let's restrict ourselves to discussing the technical issues and
possible solutions instead of rhetorical flourishes.


Don Armstrong

-- 
The computer allows you to make mistakes faster than any other
invention, with the possible exception of handguns and tequila
 -- Mitch Ratcliffe

http://www.donarmstrong.com  http://rzlab.ucr.edu





Re: Bug#607368: Please decide how kernel ABI should be managed

2011-01-04 Thread Ben Hutchings
On Tue, Jan 04, 2011 at 12:28:22PM +0100, Julien BLACHE wrote:
> Don Armstrong  wrote:
[...]
> > worked around by using DKMS or similar with prebuilt binaries and
> > requiring exact kernel version dependencies?
> 
> DKMS is useless if the ABI number doesn't change, in its current
> form. If DKMS was changed to rebuild all modules when the kernel package
> is upgraded, we'd still have issues with on-disk modules not matching
> the running kernel ABI until the machine is rebooted. This can sometimes
> take two or three weeks if a long-running computation is running on the
> machine.
> 
> We switched to DKMS to reduce the maintenance cost associated with
> prebuilt binaries. We'd rather not come back to that if we can help
> it. It also adds a delay to kernel updates that we'd rather avoid.
> 
> As to using strict dependencies... it makes all of the above even
> worse.
> 
> And I'll ask again: what's the point of the kernel ABI number if we have
> to use strict dependencies? Seriously?
[...]
 
Do pay attention.  We were discussing the implications of changing our
current practice of trying to avoid ABI bumps during freeze and stable
updates.  We would then probably change the uname release (the ABI
identifier) in each version of the package.

Ben.

-- 
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
  - Albert Camus





Re: Bug#607368: Please decide how kernel ABI should be managed

2011-01-04 Thread Julien BLACHE
Don Armstrong  wrote:

Hi,

> Ok. For some reason, I hadn't originally noticed that this was
> concerning an OOT module which Debian itself didn't actually
> distribute. [Julien: I'm correct in that, right?] But that's probably
> fine.

You are correct.

> Julien: Are you currently shipping a kernel in production which would
> be affected by this change if we don't change the ABI number? Or does
> this only affect cases where you are testing squeeze? Could it be

I have 30 beta-testers that are affected by this issue on the
workstations they have started using for their everyday work. Although
it's still a beta phase, at this point, these workstations are to be
considered "in production" given the users have basically made the
switch now.

Full deployment involves over a thousand workstations.

> worked around by using DKMS or similar with prebuilt binaries and
> requiring exact kernel version dependencies?

DKMS is useless if the ABI number doesn't change, in its current
form. If DKMS was changed to rebuild all modules when the kernel package
is upgraded, we'd still have issues with on-disk modules not matching
the running kernel ABI until the machine is rebooted. This can sometimes
take two or three weeks if a long-running computation is running on the
machine.
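
To make the first point concrete, here is a simplified model of a
rebuild trigger keyed on the kernel release string. This is only a
sketch of the behaviour described above, not DKMS's actual code; the
class, the function and the package version numbers are illustrative
assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelPackage:
    release: str  # uname release, which embeds the ABI number
    version: str  # package version, which changes on every upload


def needs_rebuild(new_kernel, built_releases):
    """Rebuild only when no modules exist yet for this release string."""
    return new_kernel.release not in built_releases


# Modules were already built for this release.
built = {"2.6.32-5-amd64"}

# An update that changes the package version but keeps the ABI number:
# the release string is unchanged, so nothing triggers a rebuild even
# if a symbol used by an out-of-tree module has changed.
same_abi = KernelPackage(release="2.6.32-5-amd64", version="2.6.32-30")

# An update that bumps the ABI number: new release string, so rebuild.
bumped_abi = KernelPackage(release="2.6.32-6-amd64", version="2.6.32-31")

print(needs_rebuild(same_abi, built))
print(needs_rebuild(bumped_abi, built))
```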

We switched to DKMS to reduce the maintenance cost associated with
prebuilt binaries. We'd rather not come back to that if we can help
it. It also adds a delay to kernel updates that we'd rather avoid.

As to using strict dependencies... it makes all of the above even
worse.

And I'll ask again: what's the point of the kernel ABI number if we have
to use strict dependencies? Seriously?

We need a kernel ABI numbering we can rely on.

JB.

-- 
 Julien BLACHE   |  Debian, because code matters more 
 Debian & GNU/Linux Developer|   
 Public key available on  - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 





Re: Bug#607368: Please decide how kernel ABI should be managed

2011-01-03 Thread Don Armstrong
On Mon, 27 Dec 2010, Ben Hutchings wrote:
> On Sun, 2010-12-26 at 15:55 -0800, Don Armstrong wrote:
> > Ok. And am I correct in assuming that if the ABI change would
> > break an OOT module, you would normally change the ABI number?
> 
> In the time I've been involved in the kernel team, I haven't yet
> seen a case where a bug fix required an ABI change that I knew would
> break an OOT module.

So in this case, if it was clear that the change would have broken an
OOT module, the kernel team would normally either postpone the change,
or change the ABI number.

> Anything distributed by Debian should meet those qualifications, but
> users such as Julien also care about modules from other sources. I
> normally use Google Code Search to check for OOT modules using
> symbols that have changed ABI and which I think might be ignorable.

Ok. For some reason, I hadn't originally noticed that this was
concerning an OOT module which Debian itself didn't actually
distribute. [Julien: I'm correct in that, right?] But that's probably
fine.
 
> > How are the symbols that those OOT modules use communicated to the
> > kernel team?
> 
> They aren't.

Would putting the onus on OOT maintainers to maintain such a list be
of benefit to the kernel maintainer team?

> > What does the kernel maintainer team feel should be done by the
> > maintainer in this case to ensure continuity of upgrades and rebuilds
> > of the OOT modules?
> [...]
> 
> We recommend that OOT module packages make use of DKMS. DKMS
> includes hook scripts to trigger rebuilding OOT modules
> automatically for each new kernel ABI version, if the end user or
> administrator installs the module source and the appropriate
> linux-headers package. In a more tightly controlled environment
> where such packages should not be installed on production servers,
> the administrator must rebuild modules elsewhere and deploy them
> along with the kernel upgrade. DKMS provides various means for this.

Makes sense. What about this case? What should Julien do?

Julien: Are you currently shipping a kernel in production which would
be affected by this change if we don't change the ABI number? Or does
this only affect cases where you are testing squeeze? Could it be
worked around by using DKMS or similar with prebuilt binaries and
requiring exact kernel version dependencies?


Don Armstrong

-- 
I don't care how poor and inefficient a little country is; they like
to run their own business.  I know men that would make my wife a
better husband than I am; but, darn it, I'm not going to give her to
'em.
 -- The Best of Will Rogers

http://www.donarmstrong.com  http://rzlab.ucr.edu





Re: Bug#607368: Please decide how kernel ABI should be managed

2010-12-26 Thread Ben Hutchings
On Sun, 2010-12-26 at 15:55 -0800, Don Armstrong wrote:
> On Sun, 26 Dec 2010, Ben Hutchings wrote:
> > On Sun, 2010-12-26 at 12:23 -0800, Don Armstrong wrote:
> > > or possibly by using Breaks: for all of the affected out-of-tree
> > > modules where the change wasn't wide-spread enough to bump the ABI
> > > number.
> > 
> > No. Firstly, if we know that an ABI change would break an OOT module
> > then we try to avoid making that change.
> 
> Ok. And am I correct in assuming that if the ABI change would break an
> OOT module, you would normally change the ABI number?

In the time I've been involved in the kernel team, I haven't yet seen a
case where a bug fix required an ABI change that I knew would break an
OOT module.

I understand that in the past the kernel team has deferred such bug
fixes and eventually applied such deferred changes as a batch while
changing the ABI number, after coordinating with affected people (such
as the d-i and CD teams).

> Which OOT modules are important enough to result in ABI number
> changes?

We don't have a formal policy but I think we consider OOT modules that
(1) appear to be used in production and (2) have published source code
for at least the part that directly uses kernel symbols.

Anything distributed by Debian should meet those qualifications, but
users such as Julien also care about modules from other sources.  I
normally use Google Code Search to check for OOT modules using symbols
that have changed ABI and which I think might be ignorable.

> How are the symbols that those OOT modules use communicated to the
> kernel team?

They aren't.

> What does the kernel maintainer team feel should be done by the
> maintainer in this case to ensure continuity of upgrades and rebuilds
> of the OOT modules?
[...]

We recommend that OOT module packages make use of DKMS.  DKMS includes
hook scripts to trigger rebuilding OOT modules automatically for each
new kernel ABI version, if the end user or administrator installs the
module source and the appropriate linux-headers package.  In a more
tightly controlled environment where such packages should not be
installed on production servers, the administrator must rebuild modules
elsewhere and deploy them along with the kernel upgrade.  DKMS provides
various means for this.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.




Re: Bug#607368: Please decide how kernel ABI should be managed

2010-12-26 Thread Don Armstrong
On Sun, 26 Dec 2010, Ben Hutchings wrote:
> On Sun, 2010-12-26 at 12:23 -0800, Don Armstrong wrote:
> > or possibly by using Breaks: for all of the affected out-of-tree
> > modules where the change wasn't wide-spread enough to bump the ABI
> > number.
> 
> No. Firstly, if we know that an ABI change would break an OOT module
> then we try to avoid making that change.

Ok. And am I correct in assuming that if the ABI change would break an
OOT module, you would normally change the ABI number?

Which OOT modules are important enough to result in ABI number
changes?

How are the symbols that those OOT modules use communicated to the
kernel team?

What does the kernel maintainer team feel should be done by the
maintainer in this case to ensure continuity of upgrades and rebuilds
of the OOT modules?

> > A slightly wilder alternative, is to Provides:
> > linux-kernel-abi-2.6.32-vmware-5 or something for out-of-tree
> > modules which aren't going to be covered by the main ABI, but are
> > important enough to require compatibility.
> 
> I refuse to support any specific OOT module in this way unless paid
> to do so. I expect that other kernel team members will tell you the
> same.

I personally don't think a Provides: solution is going to be feasible,
for both technical and coordination reasons. Let's restrict ourselves
to discussing the technical reasons why a solution is infeasible,
rather than the possible monetary impetus required to implement it.


Don Armstrong

-- 
No matter how many instances of white swans we may have observed, this
does not justify the conclusion that all swans are white.
 -- Sir Karl Popper _Logic of Scientific Discovery_

http://www.donarmstrong.com  http://rzlab.ucr.edu





Re: Bug#607368: Please decide how kernel ABI should be managed

2010-12-26 Thread Ben Hutchings
On Sun, 2010-12-26 at 12:23 -0800, Don Armstrong wrote:
> On Sun, 26 Dec 2010, Ben Hutchings wrote:
> > On Thu, 2010-12-23 at 12:08 -0800, Don Armstrong wrote:
> > > On Sun, 19 Dec 2010, Julien BLACHE wrote:
> > > > I think it would be best if this matter would be decided upon before
> > > > the release of Squeeze, or not too long after it, so as to avoid
> > > > further breakages in early kernel updates for Squeeze.
> > > 
> > > I have a couple of (possibly naïve) questions that would help me
> > > understand the space of solutions here.
> > > 
> > > 1) What is the kernel ABI currently used to indicate?
> > 
> > The ABI *number* indicates a range of versions within which newer
> > versions are likely to remain compatible with modules built for an
> > older version.
> 
> So currently there is no guarantee that a specific ABI maintains any
> kind of compatibility for out of tree modules; it is a best effort
> based on the kernel maintainer's understanding of what symbols have
> changed and what out of tree (or even in-tree) modules are affected.
> 
> Do the kernel maintainers currently track compatibility of in-tree
> modules for modules which may reasonably be loaded during the lifetime
> of the install? [I'm thinking of removable device drivers, things like
> KVM, etc.]

Not specifically.  *Most* modules will remain compatible, but we expect
users to reboot shortly after a kernel upgrade.

[...]
> What I think is missing now, is a discussion of which cases where
> changing the ABI number is necessary for proper functioning, and which
> cases of malfunction we feel are acceptable, and which are not.
> 
> For in tree modules, all of the problems that would occur from
> upgrading a kernel where the ABI had changed (but not the number) can
> be resolved by rebooting. I'm personally a bit concerned that these
> errors may be a bit disconcerting to our users, but that may be
> something we decide to live with and document.
> 
> For out of tree modules, these problems can either be resolved by
> changing the ABI number,

Yes.

> or possibly by using Breaks: for all of the
> affected out-of-tree modules where the change wasn't wide-spread
> enough to bump the ABI number.

No.  Firstly, if we know that an ABI change would break an OOT module
then we try to avoid making that change.  Therefore, if an ABI change
does break an OOT module then we would not know that we should add the
Breaks relation.  Also, we now recommend that OOT module sources are
packaged using dkms, which means the module binaries are *not* packaged
and no such relation can be declared.

> A slightly wilder alternative, is to
> Provides: linux-kernel-abi-2.6.32-vmware-5 or something for
> out-of-tree modules which aren't going to be covered by the main ABI,
> but are important enough to require compatibility.
[...]

I refuse to support any specific OOT module in this way unless paid to
do so.  I expect that other kernel team members will tell you the same.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.




Re: Bug#607368: Please decide how kernel ABI should be managed

2010-12-26 Thread Russ Allbery
Don Armstrong  writes:

> So currently there is no guarantee that a specific ABI maintains any
> kind of compatibility for out of tree modules; it is a best effort based
> on the kernel maintainer's understanding of what symbols have changed
> and what out of tree (or even in-tree) modules are affected.

I feel like I should note here that I've been maintaining a complex
out-of-tree kernel module for Debian for many years now (openafs) and am
also involved in maintaining the non-free NVIDIA modules, and I can't
remember ever having the kernel ABI break for those modules without the
ABI number changing.  It's probably happened and I just don't remember it,
but certainly not enough to be memorable.

*Upstream* has caused us all sorts of problems from time to time because
of taking public symbols and making them GPL-only (OpenAFS predates Linux
and the core of the source is licensed under a free but GPL-incompatible
license, which also affects the kernel module), but the Debian kernel
maintainers have always done a great job at maintaining ABI guarantees,
insofar as my packages are affected.  The only problem that I recall with
the ABI numbering was the unfortunate use of -trunk as an ABI version
during the squeeze development cycle, and there mostly because -trunk
sorted inappropriately after regular ABI numbers were introduced, not
because of an inherent problem with the use of that technique in unstable.

So while I do recognize that there was a problem with an out-of-tree
module that brought this particular bug to the technical committee, I have
to say that with my out-of-tree module maintainer hat on the kernel team
seems to, by and large, be doing a good job of maintaining the kernel ABI
already.  That inclines me against supporting any major change in how this
is handled.

-- 
Russ Allbery (r...@debian.org)   





Re: Bug#607368: Please decide how kernel ABI should be managed

2010-12-26 Thread Don Armstrong
On Sun, 26 Dec 2010, Ben Hutchings wrote:
> On Thu, 2010-12-23 at 12:08 -0800, Don Armstrong wrote:
> > On Sun, 19 Dec 2010, Julien BLACHE wrote:
> > > I think it would be best if this matter would be decided upon before
> > > the release of Squeeze, or not too long after it, so as to avoid
> > > further breakages in early kernel updates for Squeeze.
> > 
> > I have a couple of (possibly naïve) questions that would help me
> > understand the space of solutions here.
> > 
> > 1) What is the kernel ABI currently used to indicate?
> 
> The ABI *number* indicates a range of versions within which newer
> versions are likely to remain compatible with modules built for an
> older version.

So currently there is no guarantee that a specific ABI maintains any
kind of compatibility for out of tree modules; it is a best effort
based on the kernel maintainers' understanding of what symbols have
changed and what out of tree (or even in-tree) modules are affected.
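
To make concrete what "the ABI number" refers to here, the following is
a minimal sketch, assuming the package naming scheme
linux-image-<upstream version>-<ABI>-<flavour> (for example
linux-image-2.6.32-5-amd64). The function names are hypothetical and
are not part of any Debian tool. Note that two packages sharing an ABI
name only says what the name claims; as described above, it is not a
guarantee of actual compatibility.

```python
import re

# Assumed naming scheme: linux-(image|headers)-<upstream>-<ABI>-<flavour>
_PKG_RE = re.compile(r"^linux-(?:image|headers)-(\d+(?:\.\d+)+)-([^-]+)-(.+)$")


def kernel_abi(package_name):
    """Split a kernel package name into (upstream, abi, flavour)."""
    match = _PKG_RE.match(package_name)
    if match is None:
        raise ValueError("not a recognised kernel package name: %r" % package_name)
    return match.groups()


def same_abi_name(package_a, package_b):
    """True if both package names carry the same upstream version, ABI and flavour.

    This compares names only; it cannot detect a symbol change that was
    made without bumping the ABI number.
    """
    return kernel_abi(package_a) == kernel_abi(package_b)
```

A module built against linux-headers-2.6.32-5-amd64 would, by name
alone, be expected to load on any revision of
linux-image-2.6.32-5-amd64.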

Do the kernel maintainers currently track compatibility of in-tree
modules for modules which may reasonably be loaded during the lifetime
of the install? [I'm thinking of removable device drivers, things like
KVM, etc.]

> I think I should explain at this point the trade-off we're trying to
> make.
> 
> As you know, the kernel-space ABI is volatile and upstream has no
> intention of maintaining it, even within a stable/long-term series.
> Build configuration changes may also change the ABI in unexpected
> ways. Therefore it is generally not practical to maintain ABI within
> a single upstream version.

Right.
 
> Changing the ABI number requires (1) changing the package names and
> (2) rebuilding out-of-tree modules. (1) means linux-2.6 must go
> through the NEW queue and also disrupts d-i development (the latter
> problem may be reduced within the wheezy release cycle). It also
> requires end users and administrators to explicitly remove old
> kernel image packages. (2) should not be a huge burden so long as
> the modules are packaged using dkms, but auto-rebuilding relies on
> having a toolchain installed. Therefore we do not like to change the
> ABI number during a stable release or the preceding freeze.

So from what I can see, the ideal situation is to not change the
kernel ABI number unless we absolutely have to.

What I think is missing now is a discussion of which cases require
changing the ABI number for proper functioning, and which cases of
malfunction we feel are acceptable and which are not.

For in tree modules, all of the problems that would occur from
upgrading a kernel where the ABI had changed (but not the number) can
be resolved by rebooting. I'm personally a bit concerned that these
errors may be a bit disconcerting to our users, but that may be
something we decide to live with and document.

For out of tree modules, these problems can either be resolved by
changing the ABI number, or possibly by using Breaks: for all of the
affected out-of-tree modules where the change wasn't wide-spread
enough to bump the ABI number. A slightly wilder alternative is to
Provides: linux-kernel-abi-2.6.32-vmware-5 or something for
out-of-tree modules which aren't going to be covered by the main ABI,
but are important enough to require compatibility. Alternatively, we
can ignore them, and require that end-users of these out of tree
modules know that they must upgrade their out-of-tree modules in
lockstep with the kernel.
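
The choice between bumping the ABI number and adding Breaks: could be
sketched roughly as below. This is only an illustration of the
reasoning above, not an existing process: the registry of tracked
out-of-tree modules, the package names in it, the threshold for
"wide-spread enough" and the function name are all assumptions, and a
real Breaks: field would normally be versioned.

```python
def plan_for_symbol_change(changed_symbols, tracked_modules, bump_threshold=3):
    """Suggest how to handle an ABI change that has not yet bumped the number.

    changed_symbols: set of kernel symbol names whose ABI changed.
    tracked_modules: dict mapping an out-of-tree module package name
        (hypothetical registry) to the set of kernel symbols it uses.
    bump_threshold: assumed cut-off; at or above this many affected
        packages the change counts as wide-spread and the ABI number
        is bumped instead.
    """
    affected = sorted(
        package
        for package, symbols in tracked_modules.items()
        if symbols & changed_symbols
    )
    if len(affected) >= bump_threshold:
        return {"action": "bump-abi", "affected": affected}
    if affected:
        # Unversioned for brevity; a real field would name the broken versions.
        return {
            "action": "breaks",
            "affected": affected,
            "breaks_field": "Breaks: " + ", ".join(affected),
        }
    return {"action": "none", "affected": []}
```

The open questions below are exactly the inputs this sketch takes for
granted: who maintains the list of tracked modules, and where the
threshold sits.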

Which in-tree modules should we change the ABI number for?

Which out-of-tree modules?

How does an out-of-tree module writer know? How can they promote their
module to get a Breaks or Provides or whatever?


Don Armstrong

-- 
It has always been Debian's philosophy in the past to stick to what
makes sense, regardless of what crack the rest of the universe is
smoking.
 -- Andrew Suffield in 20030403211305.gd29...@doc.ic.ac.uk

http://www.donarmstrong.com  http://rzlab.ucr.edu


--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101226202304.gx5...@teltox.donarmstrong.com



Bug#607368: Please decide how kernel ABI should be managed

2010-12-24 Thread Don Armstrong
On Sun, 19 Dec 2010, Julien BLACHE wrote:
> I think it would be best if this matter would be decided upon before
> the release of Squeeze, or not too long after it, so as to avoid
> further breakages in early kernel updates for Squeeze.

I have a couple of (possibly naïve) questions that would help me
understand the space of solutions here.

1) What is the kernel ABI currently used to indicate? Where do we
specify what it guarantees?

2) What are all of the options for handling this situation?
Specifically, how should a package maintainer who is maintaining an
out-of-tree module which uses symbols from the kernel handle them
through an upgrade which changes the symbols? If the symbols need to
be covered by the ABI, how can the maintainer get them covered by ABI?
What should they do in cases when they are not covered by the ABI?

My main concern is that there seems to be no way for oot modules like
the vmware modules to sanely keep in step with the kernel ABI. While
this may not be a concern for kernel upstream, it's something that we
would ideally deal with to avoid issues for our users on upgrades.


Don Armstrong

-- 
He no longer wished to be dead. At the same time, it cannot be said
that he was glad to be alive. But at least he did not resent it. He
was alive, and the stubbornness of this fact had little by little
begun to fascinate him -- as if he had managed to outlive himself, as
if he were somehow living a posthumous life.
 -- Paul Auster _City of Glass_

http://www.donarmstrong.com  http://rzlab.ucr.edu



--
To UNSUBSCRIBE, email to debian-ctte-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20101223200804.gq5...@teltox.donarmstrong.com





Bug#607368: Please decide how kernel ABI should be managed

2010-12-19 Thread Julien BLACHE
reopen 607368
tags 607368 - wontfix
reassign 607368 tech-ctte
retitle 607368 Please decide how kernel ABI should be managed
thanks

Hi,

I am hereby asking the tech-ctte to decide how the kernel ABI should be
managed.

Case in point: the kernel team decided to ignore changes to the smp_ops
symbol in 2.6.32-28 which broke external modules (vmware) without any
prior warning.

I am worried that this is going to happen again during the lifetime of
Squeeze, silently breaking working setups upon reboot after a kernel
update, even though the new kernel carries the same ABI number as the
previous one.

I do agree that it is fine to ignore changes to symbols that are only
exported and used inside a self-contained group of modules to which no
additional modules will ever need to be added.

I disagree with the kernel team's take that it is OK for them to ignore
symbol changes in all other cases, especially for symbols exported by
the core kernel (like smp_ops).

This kind of silent breakage is a nightmare from an ops standpoint and
it does have a cost for our users. The ABI number should guarantee that
upgrading from a revision of linux-image to another carrying the same
ABI number will not cause any breakage with external modules built for
this ABI.
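
As an illustration of how such a breakage could be detected before an
upload, here is a minimal sketch that compares the symbol CRCs of two
kernel builds and reports the ones an external module depends on. It
assumes the usual Module.symvers layout (CRC first, symbol name second,
whitespace-separated) and that the set of symbols used by the module is
already known; the function names are hypothetical.

```python
def parse_symvers(text):
    """Map symbol name -> CRC from the contents of a Module.symvers file.

    Only the first two whitespace-separated fields (CRC, symbol) are
    used; any further fields are ignored.
    """
    crcs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        crc, symbol = fields[0], fields[1]
        crcs[symbol] = crc
    return crcs


def broken_symbols(old_symvers, new_symvers, used_symbols):
    """Return the used symbols whose CRC changed or which disappeared.

    old_symvers / new_symvers: file contents for the kernel the module
        was built against and for the upgraded kernel.
    used_symbols: set of symbol names the external module imports.
    """
    old = parse_symvers(old_symvers)
    new = parse_symvers(new_symvers)
    return sorted(
        symbol
        for symbol in used_symbols
        if symbol in old and old[symbol] != new.get(symbol)
    )
```

A non-empty result for a module built against the same ABI number is
the situation described above: the number stayed the same, but the
module will no longer load.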

As the kernel team made it clear that they make their decision partly
based on symbol usage, I'd like to highlight once again, for the
specific case of smp_ops, that VMware modules aren't exactly pet modules
that only a few of our users care about. There is ample proof of this on
several web forums and mailing-lists dedicated to either VMware or
Debian.

I am seeking a generic ruling by the tech-ctte to ensure that the kernel
ABI number remains meaningful and dependable.

I think it would be best if this matter would be decided upon before the
release of Squeeze, or not too long after it, so as to avoid further
breakages in early kernel updates for Squeeze.

Thanks,

JB.

-- 
 Julien BLACHE   |  Debian, because code matters more 
 Debian & GNU/Linux Developer|   
 Public key available on  - KeyID: F5D6 5169 
 GPG Fingerprint : 935A 79F1 C8B3 3521 FD62 7CC7 CD61 4FD7 F5D6 5169 



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/87zks1ty1p@sonic.technologeek.org