Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults

2012-05-19 Thread Steven Chamberlain
Package: src:ruby1.8
Version: 1.8.7.352-2
Severity: serious
Tags: sid wheezy
User: debian-bsd@lists.debian.org
Usertags: kfreebsd
X-Debbugs-Cc: k...@debian.org
X-Debbugs-Cc: debian-bsd@lists.debian.org
Justification: fails to build from source (but built successfully in the
past)

Hi,

On 20/05/12 01:19, Cyril Brulebois wrote:
> https://buildd.debian.org/status/logs.php?arch=kfreebsd-amd64&pkg=ruby1.8&ver=1.8.7.358-2

Seems that this issue *rarely* happens during kfreebsd-i386 builds too
(in the same place, but test_safe_04 isn't necessarily at fault).

https://buildd.debian.org/status/fetch.php?pkg=ruby1.8&arch=kfreebsd-i386&ver=1.8.7.352-2&stamp=1313126333
:
> test_safe_04(TestERBCoreWOStrScan): .
> E: Caught signal 'Terminated': terminating immediately
> make[1]: *** [test-all] Terminated
> make: *** [common-post-build-arch] Terminated
> test_cd(TestFileUtils): Build killed with signal TERM after 150 minutes of 
> inactivity

When I try this myself, I hit segfaults in the testsuite before it even
gets that far. :(


The result of the test-all suite is ignored anyway.  Something was added
for ruby1.9.1, to time out any tests that hang -- maybe we could use it
here too:

http://anonscm.debian.org/gitweb/?p=collab-maint/ruby1.9.1.git;a=commitdiff;h=6c64e43924695aec1f995202a032fb2e0e955eb3

Also #593139 might have something relevant to fixing ruby1.8.

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org
steven@kfreebsd-i386:~/ruby1.8-1.8.7.358$ gdb ruby1.8 -c ruby1.8.core -s 
debian/libruby1.8-dbg/usr/lib/debug/usr/bin/ruby1.8
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-kfreebsd-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /home/steven/ruby1.8-1.8.7.358/ruby1.8...done.
[New process 100385]
[New process 101043]
[New process 101042]
Core was generated by `ruby1.8'.
Program terminated with signal 6, Aborted.
#0  0x282c95f6 in syscall () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
(gdb) thread apply all bt

Thread 3 (process 101042):
#0  0x282c1202 in poll () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#1  0x281869ee in __pthread_manager () from 
/lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#2  0x in ?? ()

Thread 2 (process 101043):
#0  0x2818c272 in nanosleep () from 
/lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#1  0x28187e0f in __pthread_timedsuspend_new_clk () from 
/lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#2  0x28185bce in pthread_cond_timedwait@GLIBC_2.3 () from 
/lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#3  0x280967b9 in thread_timer (dummy=0xbfbf81f8) at eval.c:12325
#4  0x28186671 in pthread_start_thread () from 
/lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#5  0x in ?? ()

Thread 1 (process 100385):
#0  0x282c95f6 in syscall () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#1  0x2818937b in pthread_kill () from 
/lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#2  0x281893b6 in raise () from /lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#3  0x2822e624 in raise () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#4  0x282316c3 in abort () from /lib/i386-kfreebsd-gnu/i686/cmov/libc.so.0.1
#5  0x28091929 in rb_bug (fmt=fmt@entry=0x28132286 "Segmentation fault") at 
error.c:213
#6  0x28100469 in sigsegv (sig=) at signal.c:634
#7  sigsegv (sig=11) at signal.c:622
#8  0x2818bb47 in __pthread_sighandler () from 
/lib/i386-kfreebsd-gnu/i686/cmov/libpthread.so.0
#9  
#10 0x in ?? ()
#11 0x2c10742c in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)


Re: Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults

2012-05-20 Thread Steven Chamberlain
found 673594 1.8.7.352-2
tags 673594 + patch
thanks

Hi,

What about using the attached patch to time out the test-all suite if it
hangs, as was done for ruby1.9.1?  Its exit status is ignored anyway
(some failures are expected on all arches).
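
To illustrate the idea only: the patch itself relies on timeout(1) from
coreutils, and the leading "-" in the make recipe tells make to ignore the
command's exit status.  The Ruby sketch below is an assumption-laden
stand-in, not part of the patch.  run_with_limit is a hypothetical helper,
and its return value of 124 merely mirrors the status timeout(1) reports
when the limit is reached.

```ruby
require "timeout"

# Hypothetical helper (not part of the patch): run a command, but give up
# once `seconds` have elapsed. Returns the command's exit status, or 124
# if the limit was reached (mirroring the convention of timeout(1)).
def run_with_limit(cmd, seconds)
  pid = Process.spawn(*cmd)
  begin
    _, status = Timeout.timeout(seconds) { Process.wait2(pid) }
    status.exitstatus
  rescue Timeout::Error
    # The command is still running: ask it to terminate, then reap it.
    Process.kill("TERM", pid)
    Process.wait(pid)
    124
  end
end

# Example use, with the 1200-second limit from the patch:
#   run_with_limit(["make", "test-all"], 1200)
```

Unlike this sketch, which signals only the direct child, a real build
would need any grandchildren of make to be cleaned up too.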

I think a workaround like this is needed to at least fix the FTBFS since
there are security patches and s390x stuff all waiting on it.  The
version of ruby1.8 in testing seems to already have this problem (it
only built for kfreebsd-i386 on the 4th attempt).

We can separately follow up on working out why some of the tests hang;
probably thread-related races in eglibc and/or the tests themselves.

Thanks,
Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org
diff --git a/debian/control b/debian/control
index c8d77e0..db229e5 100644
--- a/debian/control
+++ b/debian/control
@@ -3,7 +3,7 @@ Section: ruby
 Priority: optional
 Maintainer: akira yamada 
 Uploaders: Daigo Moriwaki , Lucas Nussbaum , Antonio Terceiro 
-Build-Depends: cdbs (>= 0.4.106), debhelper (>= 5), autotools-dev, autoconf, m4, quilt (>= 0.40), patch, bison, binutils (>= 2.14.90.0.7), libgdbm-dev, libncurses5-dev, libreadline-gplv2-dev, tcl-dev, tk-dev, zlib1g-dev, libssl-dev (>= 0.9.6b), file
+Build-Depends: cdbs (>= 0.4.106), debhelper (>= 5), autotools-dev, autoconf, m4, quilt (>= 0.40), patch, bison, binutils (>= 2.14.90.0.7), libgdbm-dev, libncurses5-dev, libreadline-gplv2-dev, tcl-dev, tk-dev, zlib1g-dev, libssl-dev (>= 0.9.6b), file, coreutils
 Standards-Version: 3.9.2
 Homepage: http://www.ruby-lang.org/
 Vcs-Git: git://git.debian.org/collab-maint/ruby1.8.git
diff --git a/debian/rules b/debian/rules
index e238759..1456921 100755
--- a/debian/rules
+++ b/debian/rules
@@ -62,7 +62,7 @@ DEB_MAKE_BUILD_TARGET = all test
 
 common-post-build-arch::
 ifeq (,$(filter nocheck,$(DEB_BUILD_OPTIONS)))
-	-make test-all
+	-timeout 1200 make test-all
 endif
 
 


Re: Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults

2012-05-20 Thread Steven Chamberlain
Whereas the buildds experience hangs during some tests, I see segfaults
instead.  This sometimes happens even before the first test has been run.

This small Ruby testcase results in segfault 50% of the time under
ruby1.8 1.8.7.358-2, but always succeeds with ruby1.9.1 1.9.3.0-2:

> require 'thread'
> Thread.new do
> foo = "bar"
> end

(Measured out of 100 runs, on kfreebsd-i386 with 4-way SMP)
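
A measurement like this could be scripted roughly as follows.  This is
only a sketch: count_failures is a hypothetical helper, the testcase is
assumed to be saved as test.rb, and the harness itself is meant to be run
under a newer Ruby (it uses 1.9-style syntax), not under ruby1.8.

```ruby
# Hypothetical helper: run `cmd` (an array of program and arguments)
# `runs` times and count how many runs did not exit successfully.
def count_failures(cmd, runs)
  failures = 0
  runs.times do
    # Kernel#system returns true only when the command exits with status 0;
    # a non-zero exit, death by signal, or a failure to start all count here.
    ok = system(*cmd, out: File::NULL, err: File::NULL)
    failures += 1 unless ok
  end
  failures
end

# Example use (interpreter and script names are assumptions):
#   failed = count_failures(["ruby1.8", "test.rb"], 100)
#   puts "#{failed} of 100 runs failed"
```

Note that a missing interpreter also counts as a failure here, so check
that the command starts at all before trusting the number.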

Attached are ktrace outputs from a successful run and from a failing run,
along with a diff of the two.  There seems to be a race: thread0 tries
to call thr_kill on thread2, but if that happens too late, thread2
triggers a segfault instead.

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org
--- ok.txt  2012-05-20 20:56:17.734917958 +0100
+++ fail.txt  2012-05-20 20:58:12.337235026 +0100
@@ -356,7 +356,7 @@
  ruby1.8  RET   open 3
  ruby1.8  CALL  read(0x3,0xbfbfe61c,0x4)
  ruby1.8  GIO   fd 3 read 4 bytes
- 0x f0be 5f81  
|.._.|
+ 0x d52b 6642  
|.+fB|
 
  ruby1.8  RET   read 4
  ruby1.8  CALL  close(0x3)
@@ -411,7 +411,7 @@
  ruby1.8  CALL  gettimeofday(0xbfbfe5b8,0)
  ruby1.8  RET   gettimeofday 0
  ruby1.8  CALL  getpid
- ruby1.8  RET   getpid 50320/0xc490
+ ruby1.8  RET   getpid 50346/0xc4aa
  ruby1.8  CALL  break(0x808d000)
  ruby1.8  RET   break 0
  ruby1.8  CALL  sigaction(SIGINT,0xbfbfe564,0xbfbfe5b8)
@@ -750,65 +750,49 @@
  ruby1.8  RET   sigprocmask 0
  ruby1.8  CALL  clock_gettime(0,0x28b68eb8)
  ruby1.8  RET   clock_gettime 0
+ ruby1.8  CALL  sigprocmask(SIG_SETMASK,0x28b68e90,0)
+ ruby1.8  RET   sigprocmask 0
+ ruby1.8  CALL  clock_gettime(0,0x28b68f30)
+ ruby1.8  RET   clock_gettime 0
+ ruby1.8  CALL  sigprocmask(SIG_BLOCK,0,0x28b68e80)
+ ruby1.8  RET   sigprocmask 0
+ ruby1.8  PSIG  SIGSEGV caught handler=0x2818ba50 mask=0x8000 
code=0x1
+ ruby1.8  CALL  sigprocmask(SIG_UNBLOCK,0x28b68ea0,0x28b68e90)
+ ruby1.8  RET   sigprocmask 0
+ ruby1.8  CALL  write(0x2,0xbfbfbf9c,0xb)
+ ruby1.8  CALL  clock_gettime(0,0x28b68eb8)
+ ruby1.8  RET   clock_gettime 0
+ ruby1.8  GIO   fd 2 wrote 11 bytes
+ "test.rb:4: "
  ruby1.8  CALL  nanosleep(0x28b68eb0,0)
- ruby1.8  CALL  thr_kill(,SIG(null))
- ruby1.8  RET   thr_kill 0
- ruby1.8  RET   nanosleep -1 errno 4 Interrupted system call
- ruby1.8  CALL  sigprocmask(SIG_SETMASK,0,0xbfbfdc4c)
- ruby1.8  PSIG  SIG(null) caught handler=0x28188860 mask=0x7ffefeff 
code=0x10001
+ ruby1.8  RET   write 11/0xb
+ ruby1.8  RET   nanosleep -1 errno 22 Invalid argument
+ ruby1.8  CALL  write(0x2,0x2813751f,0x6)
+ ruby1.8  CALL  clock_gettime(0,0x28b68eb8)
+ ruby1.8  GIO   fd 2 wrote 6 bytes
+ "[BUG] "
+ ruby1.8  RET   clock_gettime 0
+ ruby1.8  RET   write 6
+ ruby1.8  CALL  nanosleep(0x28b68eb0,0)
+ ruby1.8  RET   nanosleep -1 errno 22 Invalid argument
+ ruby1.8  CALL  write(0x2,0xbfbf98a0,0x12)
+ ruby1.8  CALL  clock_gettime(0,0x28b68eb8)
+ ruby1.8  GIO   fd 2 wrote 18 bytes
+ "Segmentation fault"
+ ruby1.8  RET   clock_gettime 0
+ ruby1.8  RET   write 18/0x12
+ ruby1.8  CALL  nanosleep(0x28b68eb0,0)
+ ruby1.8  CALL  write(0x2,0xbfbf98a0,0x3d)
+ ruby1.8  GIO   fd 2 wrote 61 bytes
+ "
+(2012-02-08 patchlevel 358) [i486-kfreebsd-gnu]
+   
+ "
+ ruby1.8  RET   write 61/0x3d
+ ruby1.8  CALL  sigprocmask(SIG_UNBLOCK,0xbfbfbf50,0)
  ruby1.8  RET   sigprocmask 0
- ruby1.8  CALL  sigprocmask(SIG_SETMASK,0x28b68e80,0)
- ruby1.8  CALL  sigsuspend(0xbfbfdc4c)
- ruby1.8  RET   sigprocmask 0
- ruby1.8  CALL  thr_kill(,SIG(null))
- ruby1.8  RET   thr_kill 0
- ruby1.8  PSIG  SIG(null) caught handler=0x28188860 mask=0x8000 
code=0x10001
- ruby1.8  CALL  thr_kill(,SIG(null))
- ruby1.8  RET   sigsuspend JUSTRETURN
- ruby1.8  RET   thr_kill 0
- ruby1.8  CALL  sigreturn(0xbfbfd930)
+ ruby1.8  CALL  thr_kill(,SIGIOT)
+ ruby1.8  RET   thr_kill 0
+ ruby1.8  PSIG  SIGIOT SIG_DFL code=0x10001
  ruby1.8  RET   poll -1 errno 4 Interrupted system call
- ruby1.8  RET   sigreturn JUSTRETURN
- ruby1.8  PSIG  SIG(null) caught handler=0x28188710 mask=0xfffefeef 
code=0x10001
- ruby1.8  CALL  thr_exit(0x807c1c0)
- ruby1.8  CALL  write(0x4,0xbfbfdc8c,0x24)
- ruby1.8  CALL  sigreturn(0x807a850)
- ruby1.8  GIO   fd 4 wrote 36 bytes
- 0x a01a 3328 0100  0204  44a1 0e00 0060 0828 bc2a 1828 70fd 
1628 d0d5 1728 e0dc bfbf  
|..3(D`.(.*.(p..(...(|
-
- ruby1.8  RET   sigreturn JUSTRETURN
- ruby1.8  RET   write 36/0x24
- ruby1.8  CALL  poll(0x807ac34,0x1,0x7d0)
- ruby1.8  RET   poll 1
- ruby1.8  CALL  sigaction(SIGINT,0xbfbfe548,0xbfbfe59c)
- ruby1.8  CALL  read(0x3,0x807ac00,0x24)
- ruby1.8  RET   sigaction 0
- ruby1.8  GIO   fd 3 read 36 bytes
- 0x a01a 3328 0100  0204  44a1 0e00 0060 0828 bc2a 1828 70fd 
1628 d0d5 1728 e0dc bfbf  
|..3(D`.(.*.(p..(...(|
-
- rub

Re: Bug#673594: ruby1.8: FTBFS[kfreebsd-*]: test-all hangs/segfaults

2012-05-24 Thread Antonio Terceiro
clone 673594 -1
severity -1 important
retitle -1 ruby1.8: threaded code segfaults under kfreebsd-*
tags 673594 + pending
thanks

Hi Steven,

Steven Chamberlain wrote:
> Whereas the buildds experience hangs during some tests, I see segfaults
> instead.  This sometimes happens even before the first test has been run.
> 
> This small Ruby testcase results in segfault 50% of the time under
> ruby1.8 1.8.7.358-2, but always succeeds with ruby1.9.1 1.9.3.0-2:
> 
> > require 'thread'
> > Thread.new do
> > foo = "bar"
> > end
> 
> (Measured out of 100 runs, on kfreebsd-i386 with 4-way SMP)
> 
> Attached are outputs from ktrace for a success and from a failure;  then
> I've tried to diff them.  There seems to be a race whereby thread0 tries
> to call thr_kill on thread2, but if that happens too late thread2 will
> trigger a segfault instead.

Thanks for the patch. I am preparing an upload to work around the test
timeout and make the FTBFS go away ASAP.

If you can prepare a patch to fix the race condition, please attach it
to the new bug report which I am creating by cloning this one.

-- 
Antonio Terceiro 

