Bug#891249: linux: unstable kernel/data corruption on ppc64el

2018-02-23 Thread Aurelien Jarno
Source: linux
Version: 4.9.82-1+deb9u2
Severity: critical
Justification: causes serious data corruption

DSA has installed the latest security kernel (4.9.82-1+deb9u2) on the
Debian POWER8 machines running ppc64el. They boot correctly, but programs
then segfault randomly (apt, sbuild, systemd, etc.). Passing
no_rfi_flush on the kernel command line does not change anything. Looking
in more detail, things look scary, as some code actually gets executed
wrongly. Here are some example build logs:
- https://buildd.debian.org/status/fetch.php?pkg=python-msgpack&arch=ppc64el&ver=0.5.1-1&stamp=1519399908&raw=0
- https://buildd.debian.org/status/fetch.php?pkg=python-msgpack&arch=ppc64el&ver=0.5.1-1&stamp=1519396907&raw=0
- https://buildd.debian.org/status/fetch.php?pkg=tk8.5&arch=ppc64el&ver=8.5.19-3&stamp=1519362938&raw=0

In the above cases the packages fail to build from source, but I guess
there are also some cases of undetected corruption.

I'll try to run the 4.9.80-2 kernel at some point to narrow down the
issue.



Bug#891249: linux: unstable kernel/data corruption on ppc64el

2018-02-26 Thread Frédéric Bonnard
Hi,
I got this as well, though not immediately; adding some
parallelization to the build helped to trigger it. I'll look into it too.

F.





Bug#891249: linux: unstable kernel/data corruption on ppc64el

2018-02-26 Thread Breno Leitao
Hi,

I talked to the powerpc maintainer about this problem, and in fact this is a
known problem: the 4.4 patches were 'backported' to 4.9 without success.

This is already fixed in the stable tree:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/log/arch/powerpc?h=linux-4.9.y

I understand that the commit ids are:
 * 3146a32b39cd78722869bca6e839b3c59155e012
 * efe8bc07c47fff196bbc0822e249a27ae0574d24
 * ec0084d082137b73460303b39f4089970a213ad7

But I suppose that Debian will do a full merge with the stable tree, so I
expect that the next release will just work.



Bug#891249: linux: unstable kernel/data corruption on ppc64el

2018-02-26 Thread Aurelien Jarno
tag 891249 + fixed-upstream
thanks

Thanks for the quick answer. I confirm that these commits are indeed in
the 4.9.84 stable release, which was released yesterday. I guess
3146a32b39cd78722869bca6e839b3c59155e012 is the one that fixes the data
corruption.

Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net
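
For anyone who wants to repeat the check that the three commits are part of
a given stable release, here is a minimal sketch. It is not the method used
in this thread, only one possible way to do it. It assumes a local clone of
the linux-stable tree (the path below is hypothetical) and that the release
is tagged as v4.9.84, which is an assumption based on the usual tag naming.
It relies on the exit status of `git merge-base --is-ancestor`.

```python
import subprocess

# Commit ids quoted earlier in this thread.
FIX_COMMITS = [
    "3146a32b39cd78722869bca6e839b3c59155e012",
    "efe8bc07c47fff196bbc0822e249a27ae0574d24",
    "ec0084d082137b73460303b39f4089970a213ad7",
]


def ancestor_check_command(commit, ref):
    """Build the git command that tests whether `commit` is an ancestor of `ref`."""
    return ["git", "merge-base", "--is-ancestor", commit, ref]


def commits_in_ref(repo_dir, commits, ref):
    """Return {commit: bool}, True when `ref` contains the commit.

    `repo_dir` is a local clone of the stable tree (assumption).
    """
    result = {}
    for commit in commits:
        proc = subprocess.run(
            ancestor_check_command(commit, ref),
            cwd=repo_dir,
            capture_output=True,
            text=True,
        )
        # Exit status 0: is an ancestor. 1: is not. Anything else: git error
        # (for example an unknown commit id or ref).
        if proc.returncode not in (0, 1):
            raise RuntimeError(proc.stderr.strip())
        result[commit] = proc.returncode == 0
    return result


if __name__ == "__main__":
    # Hypothetical clone location and assumed tag name.
    for commit, present in commits_in_ref(
        "/path/to/linux-stable", FIX_COMMITS, "v4.9.84"
    ).items():
        print(commit, "present" if present else "missing")
```

A commit reported as "missing" only means it is not reachable from that tag
under that id; a distribution kernel may still carry the same change as a
separate patch.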



Bug#891249: linux: unstable kernel/data corruption on ppc64el

2018-02-26 Thread Frédéric Bonnard
Hi again,
it looks like a bit was missing from an earlier patch included in the 4.9
stable kernel:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-4.9.y&id=3146a32b39cd78722869bca6e839b3c59155e012

I tested this single patch on top of 4.9.82-1+deb9u2 and could run
some heavy Linux compilation without issue.

The latest upstream 4.9.84 has that fix. 

F.
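
As a rough illustration, the sketch below checks whether the upstream base
of a Debian kernel package version is at least 4.9.84, the release said
above to contain the fix. The function names are hypothetical, and the
check is deliberately narrow: it does not tell whether a given kernel is
affected, and it cannot see a fix carried as an extra patch on top of an
older upstream version (as in the single-patch test described above).

```python
import re

# First upstream 4.9.y release reported in this thread to contain the fix.
FIXED_UPSTREAM = (4, 9, 84)


def upstream_tuple(debian_version):
    """Extract the upstream x.y.z part of a Debian version string as a tuple."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", debian_version)
    if match is None:
        raise ValueError("unrecognised version: %r" % debian_version)
    return tuple(int(part) for part in match.groups())


def upstream_base_has_fix(debian_version):
    """True when the upstream base is 4.9.84 or later in the 4.9 series."""
    version = upstream_tuple(debian_version)
    if version[:2] != FIXED_UPSTREAM[:2]:
        # This sketch only reasons about the 4.9 stable series.
        raise ValueError("not a 4.9 kernel: %r" % debian_version)
    return version >= FIXED_UPSTREAM


if __name__ == "__main__":
    print(upstream_base_has_fix("4.9.82-1+deb9u2"))
```

Here "4.9.82-1+deb9u2" is the version reported in this bug; any later
package version passed to the function would be supplied by the reader.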





Processed: Re: Bug#891249: linux: unstable kernel/data corruption on ppc64el

2018-02-26 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> tag 891249 + fixed-upstream
Bug #891249 [src:linux] linux: unstable kernel/data corruption on ppc64el
Added tag(s) fixed-upstream.
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
891249: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=891249
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems