Re: 3.16.0 Debian kernel hang

2015-12-05 Thread Duncan
Austin S Hemmelgarn posted on Fri, 04 Dec 2015 08:08:58 -0500 as
excerpted:

> On 2015-12-04 05:00, Russell Coker wrote:
>>
>> When I mounted the filesystem with a 4.2.0 kernel it said "The free
>> space cache file (1103101952) is invalid, skip it" and then things
>> worked.  Now that the machine is running 4.2.0 everything is fine.
>>
>> I know that there are no plans to backport things to 3.16 and I don't
>> think the Debian people are going to be very interested in this.  So
>> this message is a FYI for users, maybe consider not using the
>> Debian/Jessie kernel for BTRFS systems.
>>
> I'd suggest extending that suggestion to:
> If you're not using an Enterprise distro (RHEL, SLES, CentOS, OEL), then
> you should probably be building your own kernel, ideally using upstream
> sources.

My personal recommendation differs from that somewhat, in both directions.

The first thing to consider is that in terms of this list, at least, 
while btrfs is considered to be stabilizING, it's not yet either fully 
stable or mature.  It's "good enough for daily use" provided you're 
following reasonable admin backup policies, as you should be in any 
case.  Those policies respect the general admin rule that if the data is 
of more value than the time and resources required to back it up, then 
it IS backed up, and conversely, that if it's not backed up to a 
particular level, you are by your actions defining it as worth less than 
the time and trouble to do that Nth level backup, taking into account 
the risk of actually having to use it.

But, that assumes keeping "reasonably" current with both the kernel and 
tools.  Here, the recommendation is much relaxed from what it used to be, 
but the assumption and recommendation is that you'll either follow the 
current kernel, being no more than one release series behind (with 4.3 
out, you might still be on 4.2) unless you have a specific bug (btrfs or 
otherwise) that's still being addressed, OR at least follow the upstream 
kernel LTS series, again, being no further than one such series behind 
(the 4.1 and 3.18 LTS series are thus currently covered, with 4.4 already 
taken on as another LTS series, so those on 3.18 should be well into 
their 4.1 upgrade preparations as 4.4 should be out around Christmas).

The reasoning is that development remains quite rapid.  Older kernels 
have known and long-fixed bugs, and even the older LTS kernels (from 
3.12 at least, as that's when the experimental label was stripped), 
which should still at least be getting the critical fixes, are simply 
too prehistoric to reasonably support on-list, as too much has changed 
since then and btrfs is /not/ yet fully stable.

_But_, that's in terms of the mainline kernel and list support.  Some 
distros, primarily enterprise but the same idea applies to distros in 
general, have chosen to support btrfs on their old "stable series" 
kernels, where they presumably backport the critical btrfs patches along 
with other critical kernel patches.

If some btrfs users choose to accept their distro's claims of support on 
older kernels at face value, that's between them and the distro.  But 
here's the thing, this list is focused on the mainline kernel, and many 
here don't know or particularly care what specific patches random distro 
Y may have backported... or not.  So users that choose to use their 
distro's kernels, particularly older kernel series not well supported on-
list, really need to be looking to their distros for that support, not 
the list, because they're not running versions well supported by the list.


So contrary to Austin's recommendation, mine would be, yeah, run your 
distro's kernel if it's within the general list supported range mentioned 
above.  You don't _have_ to build your own kernel, and if it's within the 
two most current kernel series or the two most recent LTS kernel series, 
we'll do our best on-list to help.
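
As a rough illustration of the rule above, here is a minimal Python 
sketch that checks whether a kernel release string falls inside the 
list-supported range.  The two lists hold only the series named in this 
message (4.3 and 4.2 current, 4.1 and 3.18 LTS) and would need updating 
by hand; the function names are hypothetical, not part of any btrfs tool.

```python
# Series named in this message (December 2015); update by hand as new
# kernel series are released.  Both lists are newest first.
CURRENT_SERIES = ["4.3", "4.2"]
LTS_SERIES = ["4.1", "3.18"]


def series_of(release: str) -> str:
    """Reduce a release string such as '3.16.0-4-amd64' to its series, '3.16'."""
    major, minor = release.split(".")[:2]
    # Keep only the leading digits of the minor part, e.g. '2-rc1' -> '2'.
    digits = ""
    for ch in minor:
        if not ch.isdigit():
            break
        digits += ch
    return f"{major}.{digits}"


def in_list_supported_range(release: str) -> bool:
    """True if the release belongs to one of the two newest current
    series or one of the two newest LTS series."""
    series = series_of(release)
    return series in CURRENT_SERIES or series in LTS_SERIES
```

To check the running kernel, pass in the value of platform.release() 
from the standard library.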

But if you choose to run kernels outside that list-supported range, 
regardless of whether you're running distro kernels or building your own, 
you really shouldn't be relying on this list for support, because while 
we'll generally still try to do our best to help, it's out of our focus 
range and the level of support simply isn't going to be as good as it 
would be if you were running a newer kernel series.

For enterprise distro users as well as other distros running ancient 
kernels, that means your best support may well be from your distro, and 
particularly for enterprise distros, that's often what you're paying good 
money to get, so you should be able to expect/demand that support, or you 
can take that money elsewhere.

So my recommendation doesn't really distinguish enterprise users from 
others, except that enterprise users are both often running older kernels 
and paying good money for support.  Enterprise distro or not, however, if 
users choose to run older kernels, they can expect a lower level of 
practical list support because we're simp

Re: 3.16.0 Debian kernel hang

2015-12-04 Thread Austin S Hemmelgarn

On 2015-12-04 09:26, Russell Coker wrote:

> On Sat, 5 Dec 2015 12:53:07 AM Austin S Hemmelgarn wrote:
>>> The only reason I'm not running Unstable kernels on my Debian systems is
>>> because I run some Xen servers and upgrading Xen is problematic.  Linode
>>> is moving from Xen to KVM so I guess I should consider doing the
>>> same.  If I migrate my Xen servers to KVM I can use newer kernels with
>>> less risk.
>>
>> That's interesting, that must be something with how they do kernel
>> development in Debian, because I've never had any issues upgrading
>> either Xen or Linux on any of the systems I've run Xen on, and I
>> directly track mainline (with a small number of patches) for Linux, and
>> stay relatively close to mainline with Xen (Gentoo doesn't have all that
>> many patches on top of the regular release for Xen, aside from XSA
>> patches).
>
> I don't think that Debian does anything wrong in this regard.  It's just that
> my experience of Xen is that it is fragile at the best of times.  The fact
> that Red Hat packaged the Xen kernel in the Linux kernel package is a major
> indication of Xen problems IMHO, the concept of Xen is that it shouldn't be
> tied to a Linux kernel.

In the case of Red Hat, that's probably the way it's done because that's 
originally what was needed to make things work.  Early versions of Xen 
very much did need a special version of Linux running as Domain 0. 
Coupling things like that also simplifies testing for the developers at 
Red Hat, as they then only need to test one combination, instead of a 
big matrix of features.  Less to test means they can test more 
thoroughly, which means they can provide a better guarantee that things 
will work without intervention right out of the box, which is important 
for enterprise distros.


Xen is supposed to be decoupled from the version of the Domain 0 kernel, 
and in most of my experience with it, they do a pretty good job.  90% of 
the issues I've heard of personally have been with patched versions put 
together by Linux distros, not with an upstream release.


> If you haven't had Xen issues then I think you have been lucky.

I have personally had issues using Debian as Domain 0 and keeping Xen up 
to date myself, but all of those issues vanished when I switched to 
Gentoo for that purpose (well, they vanished when I switched to NetBSD, 
but haven't resurfaced since I switched from that to Gentoo Linux after 
about a week of pulling my hair out from fighting with BSD). I'm 
admittedly not doing anything other than small purpose built PV domains 
for service isolation in most cases (although I do use a dedicated PV 
domain for testing kernel patches from time to time), but that really 
shouldn't have any impact.






Re: 3.16.0 Debian kernel hang

2015-12-04 Thread Russell Coker
On Sat, 5 Dec 2015 12:53:07 AM Austin S Hemmelgarn wrote:
> > The only reason I'm not running Unstable kernels on my Debian systems is
> > because I run some Xen servers and upgrading Xen is problematic.  Linode
> > is moving from Xen to KVM so I guess I should consider doing the
> > same.  If I migrate my Xen servers to KVM I can use newer kernels with
> > less risk.
> 
> That's interesting, that must be something with how they do kernel 
> development in Debian, because I've never had any issues upgrading 
> either Xen or Linux on any of the systems I've run Xen on, and I 
> directly track mainline (with a small number of patches) for Linux, and 
> stay relatively close to mainline with Xen (Gentoo doesn't have all that 
> many patches on top of the regular release for Xen, aside from XSA
> patches).

I don't think that Debian does anything wrong in this regard.  It's just that 
my experience of Xen is that it is fragile at the best of times.  The fact 
that Red Hat packaged the Xen kernel in the Linux kernel package is a major 
indication of Xen problems IMHO, the concept of Xen is that it shouldn't be 
tied to a Linux kernel.

If you haven't had Xen issues then I think you have been lucky.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 3.16.0 Debian kernel hang

2015-12-04 Thread Austin S Hemmelgarn

On 2015-12-04 08:42, Russell Coker wrote:

> On Sat, 5 Dec 2015 12:08:58 AM Austin S Hemmelgarn wrote:
>>> I know that there are no plans to backport things to 3.16 and I don't
>>> think the Debian people are going to be very interested in this.  So
>>> this message is a FYI for users, maybe consider not using the
>>> Debian/Jessie kernel for BTRFS systems.
>>
>> I'd suggest extending that suggestion to:
>> If you're not using an Enterprise distro (RHEL, SLES, CentOS, OEL), then
>> you should probably be building your own kernel, ideally using upstream
>> sources.
>
> There are lots of ways of dealing with this.
>
> Debian development doesn't stop.  Anyone who is running a Jessie system can
> easily run a kernel from Testing or Unstable (which really isn't particularly
> unstable).  It's generally expected that Debian user-space will work with a
> kernel from +- one release of Debian.  Also every time I've tried it Debian
> has worked well with a CentOS kernel of a similar version.

Well yes, that does usually work, but that doesn't mean that it keeps up 
with mainline very well.  Back when I used Debian on a regular basis, I 
ran the 'unstable' kernels, and they still lagged behind mainline by at 
least a minor version, and often more than that.  And there have been 
cases where things got horribly broken in mainline due to lack of proper 
vetting of code (Most recent example being the insanity with the 
clustered MD code, which broke non-clustered soft raid for at least two 
major releases), which prevents them from safely keeping up-to-date with 
mainline.


> The only reason I'm not running Unstable kernels on my Debian systems is
> because I run some Xen servers and upgrading Xen is problematic.  Linode is
> moving from Xen to KVM so I guess I should consider doing the same.  If I
> migrate my Xen servers to KVM I can use newer kernels with less risk.

That's interesting, that must be something with how they do kernel 
development in Debian, because I've never had any issues upgrading 
either Xen or Linux on any of the systems I've run Xen on, and I 
directly track mainline (with a small number of patches) for Linux, and 
stay relatively close to mainline with Xen (Gentoo doesn't have all that 
many patches on top of the regular release for Xen, aside from XSA patches).







Re: 3.16.0 Debian kernel hang

2015-12-04 Thread Russell Coker
On Sat, 5 Dec 2015 12:08:58 AM Austin S Hemmelgarn wrote:
> > I know that there are no plans to backport things to 3.16 and I don't
> > think the Debian people are going to be very interested in this.  So
> > this message is a FYI for users, maybe consider not using the
> > Debian/Jessie kernel for BTRFS systems.
> 
> I'd suggest extending that suggestion to:
> If you're not using an Enterprise distro (RHEL, SLES, CentOS, OEL), then 
> you should probably be building your own kernel, ideally using upstream 
> sources.

There are lots of ways of dealing with this.

Debian development doesn't stop.  Anyone who is running a Jessie system can 
easily run a kernel from Testing or Unstable (which really isn't particularly 
unstable).  It's generally expected that Debian user-space will work with a 
kernel from +- one release of Debian.  Also every time I've tried it Debian 
has worked well with a CentOS kernel of a similar version.

The only reason I'm not running Unstable kernels on my Debian systems is 
because I run some Xen servers and upgrading Xen is problematic.  Linode is 
moving from Xen to KVM so I guess I should consider doing the same.  If I 
migrate my Xen servers to KVM I can use newer kernels with less risk.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/


Re: 3.16.0 Debian kernel hang

2015-12-04 Thread Austin S Hemmelgarn

On 2015-12-04 05:00, Russell Coker wrote:

> One of my test laptops started hanging on mounting the root filesystem.  I
> think that it had experienced an unexpected power outage prior to that which
> may have caused corruption.
>
> When I tried to mount the root filesystem the mount process would stick in D
> state, there would be no disk IO, and the computer would get hot - presumably
> due to kernel CPU use even though "top" didn't seem to indicate that.
>
> When I mounted the filesystem with a 4.2.0 kernel it said "The free space cache
> file (1103101952) is invalid, skip it" and then things worked.  Now that the
> machine is running 4.2.0 everything is fine.
>
> I know that there are no plans to backport things to 3.16 and I don't think
> the Debian people are going to be very interested in this.  So this message is
> a FYI for users, maybe consider not using the Debian/Jessie kernel for BTRFS
> systems.


I'd suggest extending that suggestion to:
If you're not using an Enterprise distro (RHEL, SLES, CentOS, OEL), then 
you should probably be building your own kernel, ideally using upstream 
sources.


Ubuntu is notorious for picking 'stable' kernels that then fail to be 
marked by kernel.org as LTS, Debian picks kernels that are multiple 
versions old by the time they make a release, and I've heard similar 
from other non-enterprise distros that don't inherently make you build 
your own kernel.  Even among ones that you have to build the kernel 
yourself anyway, there are issues (Gentoo for example doesn't often mark 
new kernels as stable, even when they are perfectly usable for pretty 
much everyone).






3.16.0 Debian kernel hang

2015-12-04 Thread Russell Coker
One of my test laptops started hanging on mounting the root filesystem.  I 
think that it had experienced an unexpected power outage prior to that which 
may have caused corruption.

When I tried to mount the root filesystem the mount process would stick in D 
state, there would be no disk IO, and the computer would get hot - presumably 
due to kernel CPU use even though "top" didn't seem to indicate that.
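
For anyone wanting to confirm a similar hang from a second shell, here 
is a minimal Python sketch that lists processes in D (uninterruptible 
sleep) state by reading /proc/<pid>/stat.  It assumes a Linux /proc 
filesystem; the function names are illustrative only and it is not the 
method used in this report.

```python
import os


def parse_stat(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the first '(' and the last ')'.
    """
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx])
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """Return (pid, comm) for every process currently in D state."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or entry unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

A mount process that shows up here on every run, with no disk IO going 
on, would match the behaviour described above.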

When I mounted the filesystem with a 4.2.0 kernel it said "The free space cache 
file (1103101952) is invalid, skip it" and then things worked.  Now that the 
machine is running 4.2.0 everything is fine.

I know that there are no plans to backport things to 3.16 and I don't think 
the Debian people are going to be very interested in this.  So this message is 
a FYI for users, maybe consider not using the Debian/Jessie kernel for BTRFS 
systems.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/