Re: Compiler segfault when building the kernel

2018-07-15 Thread Celejar
On Tue, 13 Jun 2017 20:41:25 +0300
Adrian Bunk wrote:

> On Tue, Jun 13, 2017 at 11:57:55AM -0400, Celejar wrote:
> > On Mon, 12 Jun 2017 10:45:17 +0300
> > Adrian Bunk wrote:
> > 
> > > On Fri, Jun 09, 2017 at 07:58:12AM -0400, Celejar wrote:
> > > > Hi,

...

> > > > line "root_cmd = fakeroot") without problem. Recently, the builds have
> > > > begun to fail with messages like these:
> > 
> > ...
> > 
> > > > > ./include/linux/rcu_sync.h:29:48: internal compiler error: 
> > > > > Segmentation fault
> > > > >  enum rcu_sync_type { RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC };
> > > > > ^

...

> > > > > The bug is not reproducible, so it is likely a hardware or OS problem.

...

> "internal compiler error that is not 100% reproducible" - at that point 
> it is nearly certain that the underlying problem is a hardware problem.

Just for the record, extensive testing with memtest86 and memtest86+
confirmed that one of my DIMMs was bad. I've replaced it, and have not
had a recurrence of the problem (although I now usually offload kernel
compilation to a different machine anyway).

Thanks, Adrian.

Celejar



Re: Compiler segfault when building the kernel

2017-06-13 Thread Adrian Bunk
On Tue, Jun 13, 2017 at 11:57:55AM -0400, Celejar wrote:
> On Mon, 12 Jun 2017 10:45:17 +0300
> Adrian Bunk wrote:
> 
> > On Fri, Jun 09, 2017 at 07:58:12AM -0400, Celejar wrote:
> > > Hi,
> > > 
> > > I've been building kernels (vanilla from upstream) for years with
> > > kernel-package (typical command line: "time make-kpkg -j2 --initrd
> > > --revision 1.custom kernel_image"; .kernel-pkg.conf contains just the
> > > line "root_cmd = fakeroot") without problem. Recently, the builds have
> > > begun to fail with messages like these:
> 
> ...
> 
> > > > ./include/linux/rcu_sync.h:29:48: internal compiler error: Segmentation 
> > > > fault
> > > >  enum rcu_sync_type { RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC };
> > > > ^
> > > > Please submit a full bug report,
> > > > with preprocessed source if appropriate.
> > > > See  for instructions.
> > > >   CC  fs/posix_acl.o
> > > > The bug is not reproducible, so it is likely a hardware or OS problem.
> 
> ...
> 
> > > This occurred immediately following a cleaning of the source tree
> > > ("make-kpkg ... clean"), the first one I've done in quite some time, so
> > > I'm pretty sure that that's what triggered this, whatever the
> > > underlying problem actually is.
> > > 
> > > Googling suggests that this sort of thing can be triggered by race
> > > conditions caused by build systems' improper handling of
> > > concurrency, e.g.:
> > > 
> > > https://askubuntu.com/questions/343490/the-bug-is-not-reproducible-so-it-is-likely-a-hardware-or-os-problem
> > 
> > That is just an incorrect answer from some random person.
> > 
> > Missing dependencies produce different kinds of errors,
> > never internal compiler errors.
> 
> The suggestion is not that the bug is caused by the missing
> dependencies, but rather that there's an underlying bug getting hit,
> and the fact that it's not reproducible is due to a race condition
> caused by improper concurrency handling.
>...

I fully understand this suggestion - I have seen (and fixed) countless 
cases of such dependency issues.

The errors they cause are different.

"internal compiler error that is not 100% reproducible" - at that point 
it is nearly certain that the underlying problem is a hardware problem.
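
One rough way to check the "not 100% reproducible" part is to rerun the
exact failing compiler invocation several times and see whether the
outcome changes. The Python sketch below is only an illustration: the
function names are made up here and are not part of gcc or
kernel-package, and the command is whatever gcc line actually crashed.

```python
import subprocess
from collections import Counter


def run_once(cmd):
    """Run one compiler invocation and return its exit status."""
    return subprocess.run(cmd, stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL).returncode


def classify(cmd, attempts=10, runner=run_once):
    """Rerun the same command and summarise how consistent the result is."""
    # Count successful (exit status 0) and failed runs of identical input
    results = Counter(runner(cmd) == 0 for _ in range(attempts))
    passed, failed = results[True], results[False]
    if failed == 0:
        return "always succeeds"
    if passed == 0:
        return "always fails (deterministic; worth a minimal test case)"
    return "intermittent (same input, different outcome; suspect hardware)"
```

Calling classify() with the failing gcc command as a list of arguments
gives a quick first impression. An "intermittent" result on identical
input is the pattern described above; it is a hint, not a substitute
for a proper memory test.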

> Celejar

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: Compiler segfault when building the kernel

2017-06-13 Thread Celejar
On Mon, 12 Jun 2017 10:45:17 +0300
Adrian Bunk wrote:

> On Fri, Jun 09, 2017 at 07:58:12AM -0400, Celejar wrote:
> > Hi,
> > 
> > I've been building kernels (vanilla from upstream) for years with
> > kernel-package (typical command line: "time make-kpkg -j2 --initrd
> > --revision 1.custom kernel_image"; .kernel-pkg.conf contains just the
> > line "root_cmd = fakeroot") without problem. Recently, the builds have
> > begun to fail with messages like these:

...

> > > ./include/linux/rcu_sync.h:29:48: internal compiler error: Segmentation 
> > > fault
> > >  enum rcu_sync_type { RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC };
> > > ^
> > > Please submit a full bug report,
> > > with preprocessed source if appropriate.
> > > See  for instructions.
> > >   CC  fs/posix_acl.o
> > > The bug is not reproducible, so it is likely a hardware or OS problem.

...

> > This occurred immediately following a cleaning of the source tree
> > ("make-kpkg ... clean"), the first one I've done in quite some time, so
> > I'm pretty sure that that's what triggered this, whatever the
> > underlying problem actually is.
> > 
> > Googling suggests that this sort of thing can be triggered by race
> > conditions caused by build systems' improper handling of
> > concurrency, e.g.:
> > 
> > https://askubuntu.com/questions/343490/the-bug-is-not-reproducible-so-it-is-likely-a-hardware-or-os-problem
> 
> That is just an incorrect answer from some random person.
> 
> Missing dependencies produce different kinds of errors,
> never internal compiler errors.

The suggestion is not that the bug is caused by the missing
dependencies, but rather that there's an underlying bug getting hit,
and the fact that it's not reproducible is due to a race condition
caused by improper concurrency handling.

> > For the last year or so, I've been building with -j2, so I tried again
> > without it. I still got the same error, but when I once again did a
> > clean and then rebuilt without -j2, the build succeeded.
> > 
> > Any ideas? Is this a bug I should be filing against kernel-package (or
> > anywhere else)?
> 
> Based on what you describe (the problem is not reproducible and the 
> problem started recently), there is a nearly 100% chance that it is

As I mentioned, the one significant recent change of which I am aware
is the fact that this is the first time in a very long time that I've
done a full kernel build (i.e., one preceded by a 'clean' of the source
tree). I am aware of no other changes.

> caused by a hardware defect on your machine.

I suppose that's always possible. This is a Lenovo W550s workstation,
purchased refurbished about a year ago.

> Were there any hardware changes or was there a move of the machine recently?

No hardware changes. The machine is a laptop - it's moved often.

> Are all fans still working?

There's only one fan, and it seems to be working fine. It goes up to
several thousand RPM, to a maximum of about 4K when under heavy
(artificial) load, e.g. sysbench. This has been the behavior for as
long as I've had the system.

> Do all temperatures look normal?

Yes. Mid 40s C when at rest, climbing as high as the high 70s when under
full load. Once again, this has always been the behavior of this system.
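
For anyone who wants to log temperatures during a build rather than
eyeball them, here is a minimal Python sketch. It assumes a Linux
kernel that exposes thermal zones under /sys/class/thermal, with each
"temp" file holding millidegrees Celsius; the function names are made
up for illustration, and the zones present will differ per machine.

```python
from pathlib import Path


def millidegrees_to_celsius(raw):
    """Convert a sysfs reading such as '45000' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone name: temperature in C} for every readable thermal zone."""
    readings = {}
    for temp_file in sorted(Path(base).glob("thermal_zone*/temp")):
        try:
            readings[temp_file.parent.name] = millidegrees_to_celsius(
                temp_file.read_text())
        except (OSError, ValueError):
            # Some zones refuse reads or report nothing usable; skip them
            continue
    return readings
```

Polling read_thermal_zones() every few seconds while the kernel
compiles would show whether the failures line up with temperature
peaks.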

> Do all capacitors on the mainboard look OK?

Haven't opened it.

> Does a RAM testing tool like memtest86 succeed?

Will test when I have a chance - perhaps overnight tonight.

Celejar



Re: Compiler segfault when building the kernel

2017-06-12 Thread Adrian Bunk
On Fri, Jun 09, 2017 at 07:58:12AM -0400, Celejar wrote:
> Hi,
> 
> I've been building kernels (vanilla from upstream) for years with
> kernel-package (typical command line: "time make-kpkg -j2 --initrd
> --revision 1.custom kernel_image"; .kernel-pkg.conf contains just the
> line "root_cmd = fakeroot") without problem. Recently, the builds have
> begun to fail with messages like these:
> 
> *
> 
> > In file included from ./include/linux/percpu-rwsem.h:8:0,
> >  from ./include/linux/fs.h:30,
> >  from ./include/linux/pagemap.h:8,
> >  from block/partitions/check.h:1,
> >  from block/partitions/msdos.c:23:
> > ./include/linux/rcu_sync.h:29:48: internal compiler error: Segmentation 
> > fault
> >  enum rcu_sync_type { RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC };
> > ^
> > Please submit a full bug report,
> > with preprocessed source if appropriate.
> > See  for instructions.
> >   CC  fs/posix_acl.o
> > The bug is not reproducible, so it is likely a hardware or OS problem.
> 
> *
> 
> > In file included from ./include/linux/linkage.h:6:0,
> >  from ./include/linux/kernel.h:6,
> >  from ./include/linux/list.h:8,
> >  from ./include/linux/module.h:9,
> >  from lib/fonts/font_8x16.c:8:
> > ./include/linux/export.h:63:22: internal compiler error: Segmentation fault
> >   static const struct kernel_symbol __ksymtab_##sym  \
> >   ^
> > ./include/linux/export.h:93:25: note: in expansion of macro 
> > ‘___EXPORT_SYMBOL’
> >  #define __EXPORT_SYMBOL ___EXPORT_SYMBOL
> >  ^
> > ./include/linux/export.h:97:2: note: in expansion of macro ‘__EXPORT_SYMBOL’
> >   __EXPORT_SYMBOL(sym, "")
> >   ^
> > lib/fonts/font_8x16.c:4633:1: note: in expansion of macro ‘EXPORT_SYMBOL’
> >  EXPORT_SYMBOL(font_vga_8x16);
> >  ^
> > Please submit a full bug report,
> > with preprocessed source if appropriate.
> 
> This occurred immediately following a cleaning of the source tree
> ("make-kpkg ... clean"), the first one I've done in quite some time, so
> I'm pretty sure that that's what triggered this, whatever the
> underlying problem actually is.
> 
> Googling suggests that this sort of thing can be triggered by race
> conditions caused by build systems' improper handling of
> concurrency, e.g.:
> 
> https://askubuntu.com/questions/343490/the-bug-is-not-reproducible-so-it-is-likely-a-hardware-or-os-problem

That is just an incorrect answer from some random person.

Missing dependencies produce different kinds of errors,
never internal compiler errors.

> For the last year or so, I've been building with -j2, so I tried again
> without it. I still got the same error, but when I once again did a
> clean and then rebuilt without -j2, the build succeeded.
> 
> Any ideas? Is this a bug I should be filing against kernel-package (or
> anywhere else)?

Based on what you describe (the problem is not reproducible and the 
problem started recently), there is a nearly 100% chance that it is
caused by a hardware defect on your machine.

Were there any hardware changes or was there a move of the machine recently?
Are all fans still working?
Do all temperatures look normal?
Do all capacitors on the mainboard look OK?
Does a RAM testing tool like memtest86 succeed?
...

> Celejar

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: Compiler segfault when building the kernel

2017-06-11 Thread Celejar
On Sun, 11 Jun 2017 14:52:55 +0200
deloptes wrote:

> Celejar wrote:
> 
> > So I'll have to decide whether to report against kernel-package, gcc,
> > or not at all.
> 
> Build a simple, clean chroot and test there. If the bug is present, report it;
> if not, use the kernel built there on your current system.
> 
> Avoid mixing build and production systems if possible - chroot is cheap and
> handy.

Thanks. Will try further testing when I get a chance.

Celejar



Re: Compiler segfault when building the kernel

2017-06-11 Thread deloptes
Celejar wrote:

> So I'll have to decide whether to report against kernel-package, gcc,
> or not at all.

Build a simple, clean chroot and test there. If the bug is present, report it;
if not, use the kernel built there on your current system.

Avoid mixing build and production systems if possible - chroot is cheap and
handy.

regards





Re: Compiler segfault when building the kernel

2017-06-10 Thread kamaraju kusumanchi
On Sat, Jun 10, 2017 at 10:51 PM, Celejar wrote:
> On Sat, 10 Jun 2017 09:37:39 -0400
> kamaraju kusumanchi wrote:
>
>> On Fri, Jun 9, 2017 at 7:58 AM, Celejar wrote:
>> > Any ideas? Is this a bug I should be filing against kernel-package (or
>> > anywhere else)?
>>
>> Two things
>>
>> 1) Does the problem go away if you upgrade to the latest compiler?
>> Based on the error message, I believe you are using gcc 4.9? But it is
>> not clear to me what the exact version of the compiler is and the
>> Debian distribution you are using.
>
> Debian stable, with a bunch of stuff from backports, and some from
> unstable. From gcc -v:
>
> gcc version 4.9.2 (Debian 4.9.2-10)

OK. If upgrading to Stretch (currently testing) is an option, please
consider it. It has gcc 6.3.0 and this specific bug might have been
fixed since 4.9.2. Just so you have an idea, gcc 4.9.2 was released on
October 30, 2014; gcc 6.3.0 was released on December 21, 2016.

Another thing to note is that even though Stretch is currently the
testing distribution, it is scheduled to be released as Stable on June
17, 2017 (i.e. in 7 days). It is also in very good shape with only 66
release-critical bugs as per https://bugs.debian.org/release-critical/
.


>> 2) The very fact that the compiler is failing with an "internal"
>> compiler error indicates that it is a bug in the compiler. I would
>> report it against the compiler package. But before you report, try to
>> reproduce the problem with as small a test case as possible. If you
>> just say, my kernel compilation is failing with "internal compiler
>> error" it will be very difficult to reproduce by others and will take
>> longer for it to be fixed.
>
> Thank you. The reason that I thought that it might not be a compiler
> bug is because of the claim here that it can also be caused by a build
> system that isn't designed to be run concurrently:
>
> https://askubuntu.com/questions/343490/the-bug-is-not-reproducible-so-it-is-likely-a-hardware-or-os-problem
>
> So I'll have to decide whether to report against kernel-package, gcc,
> or not at all.

The bug may have been triggered by a build system. But ultimately, the
fix has to go into the compiler. So, gcc is the correct package to
report it against.

-- 
Kamaraju S Kusumanchi | http://raju.shoutwiki.com/wiki/Blog



Re: Compiler segfault when building the kernel

2017-06-10 Thread Celejar
On Sat, 10 Jun 2017 09:37:39 -0400
kamaraju kusumanchi wrote:

> On Fri, Jun 9, 2017 at 7:58 AM, Celejar wrote:
> > Any ideas? Is this a bug I should be filing against kernel-package (or
> > anywhere else)?
> 
> Two things
> 
> 1) Does the problem go away if you upgrade to the latest compiler?
> Based on the error message, I believe you are using gcc 4.9? But it is
> not clear to me what the exact version of the compiler is and the
> Debian distribution you are using.

Debian stable, with a bunch of stuff from backports, and some from
unstable. From gcc -v:

gcc version 4.9.2 (Debian 4.9.2-10) 

> 2) The very fact that the compiler is failing with an "internal"
> compiler error indicates that it is a bug in the compiler. I would
> report it against the compiler package. But before you report, try to
> reproduce the problem with as small a test case as possible. If you
> just say, my kernel compilation is failing with "internal compiler
> error" it will be very difficult to reproduce by others and will take
> longer for it to be fixed.

Thank you. The reason that I thought that it might not be a compiler
bug is because of the claim here that it can also be caused by a build
system that isn't designed to be run concurrently:

https://askubuntu.com/questions/343490/the-bug-is-not-reproducible-so-it-is-likely-a-hardware-or-os-problem

So I'll have to decide whether to report against kernel-package, gcc,
or not at all.

Celejar



Re: Compiler segfault when building the kernel

2017-06-10 Thread kamaraju kusumanchi
On Fri, Jun 9, 2017 at 7:58 AM, Celejar wrote:
> Any ideas? Is this a bug I should be filing against kernel-package (or
> anywhere else)?

Two things

1) Does the problem go away if you upgrade to the latest compiler?
Based on the error message, I believe you are using gcc 4.9? But it is
not clear to me what the exact version of the compiler is and the
Debian distribution you are using.

2) The very fact that the compiler is failing with an "internal"
compiler error indicates that it is a bug in the compiler. I would
report it against the compiler package. But before you report, try to
reproduce the problem with as small a test case as possible. If you
just say, my kernel compilation is failing with "internal compiler
error" it will be very difficult to reproduce by others and will take
longer for it to be fixed.
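
As a starting point for a smaller test case, one common approach is to
preprocess the single failing file and then compile only that
preprocessed output. The Python sketch below just assembles the two
gcc command lines (-E, -c and -o are standard gcc options). The helper
names are hypothetical, and the flags would have to be copied from the
real failing compile line, which is not shown in this thread.

```python
import subprocess


def preprocess_cmd(cc, flags, source, output):
    """Command that writes the preprocessed translation unit to `output`."""
    return [cc, *flags, "-E", source, "-o", output]


def compile_cmd(cc, flags, preprocessed):
    """Command that compiles the preprocessed file and discards the object."""
    return [cc, *flags, "-c", preprocessed, "-o", "/dev/null"]


def fails(cmd):
    """True if the command exits with a non-zero status (e.g. after an ICE)."""
    return subprocess.run(cmd, stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL).returncode != 0
```

If the command from compile_cmd() fails every time on the preprocessed
file, that file is a self-contained reproducer to attach to the
report. If it fails only sometimes, or never, the compiler may not be
the culprit.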

raju
-- 
Kamaraju S Kusumanchi | http://raju.shoutwiki.com/wiki/Blog



Compiler segfault when building the kernel

2017-06-09 Thread Celejar
Hi,

I've been building kernels (vanilla from upstream) for years with
kernel-package (typical command line: "time make-kpkg -j2 --initrd
--revision 1.custom kernel_image"; .kernel-pkg.conf contains just the
line "root_cmd = fakeroot") without problem. Recently, the builds have
begun to fail with messages like these:

*

> In file included from ./include/linux/percpu-rwsem.h:8:0,
>  from ./include/linux/fs.h:30,
>  from ./include/linux/pagemap.h:8,
>  from block/partitions/check.h:1,
>  from block/partitions/msdos.c:23:
> ./include/linux/rcu_sync.h:29:48: internal compiler error: Segmentation fault
>  enum rcu_sync_type { RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC };
> ^
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See  for instructions.
>   CC  fs/posix_acl.o
> The bug is not reproducible, so it is likely a hardware or OS problem.

*

> In file included from ./include/linux/linkage.h:6:0,
>  from ./include/linux/kernel.h:6,
>  from ./include/linux/list.h:8,
>  from ./include/linux/module.h:9,
>  from lib/fonts/font_8x16.c:8:
> ./include/linux/export.h:63:22: internal compiler error: Segmentation fault
>   static const struct kernel_symbol __ksymtab_##sym  \
>   ^
> ./include/linux/export.h:93:25: note: in expansion of macro ‘___EXPORT_SYMBOL’
>  #define __EXPORT_SYMBOL ___EXPORT_SYMBOL
>  ^
> ./include/linux/export.h:97:2: note: in expansion of macro ‘__EXPORT_SYMBOL’
>   __EXPORT_SYMBOL(sym, "")
>   ^
> lib/fonts/font_8x16.c:4633:1: note: in expansion of macro ‘EXPORT_SYMBOL’
>  EXPORT_SYMBOL(font_vga_8x16);
>  ^
> Please submit a full bug report,
> with preprocessed source if appropriate.

This occurred immediately following a cleaning of the source tree
("make-kpkg ... clean"), the first one I've done in quite some time, so
I'm pretty sure that that's what triggered this, whatever the
underlying problem actually is.

Googling suggests that this sort of thing can be triggered by race
conditions caused by build systems' improper handling of
concurrency, e.g.:

https://askubuntu.com/questions/343490/the-bug-is-not-reproducible-so-it-is-likely-a-hardware-or-os-problem

For the last year or so, I've been building with -j2, so I tried again
without it. I still got the same error, but when I once again did a
clean and then rebuilt without -j2, the build succeeded.

Any ideas? Is this a bug I should be filing against kernel-package (or
anywhere else)?

Celejar