Re: HPPA and lenny (ruby1.9 build problems)

2009-01-06 Thread Helge Deller
dann frazier wrote:
 On Tue, Jan 06, 2009 at 12:46:34AM +0100, Helge Deller wrote:
 CC: linux-parisc mailing list

 Peter Palfrader wrote:
 On Mon, 05 Jan 2009, dann frazier wrote:

 On Tue, Dec 23, 2008 at 11:43:22AM +0100, Helge Deller wrote:
 Peter Palfrader wrote:
 Helge Deller wrote on Tuesday, 23 December 2008:

 Patch in parisc git tree:
 http://git.kernel.org/?p=linux/kernel/git/kyle/parisc-2.6.git;a=commitdiff;h=378fe7c4cc619b561409206605c723c05358edac;hp=6c4dfa8f8bcf032137aacb3640d7dd9d75b2b607
 So just using an SMP kernel should also work?
 Probably yes, since some other developers tried initially to reproduce
 the problem, but they couldn't (as it seems they were running on newer
 SMP machines). But I don't have an SMP server, which is why I can't test
 myself...
 Unfortunately, it looks like we're still having problems on the
 buildds w/ 2.6.26 SMP kernels:
   
 http://buildd.debian.org/build.php?pkg=ruby1.9&ver=1.9.0.2-9&arch=hppa&file=log

 The build doesn't take the system down, but does still hang
 indefinitely while running miniruby - though the hang location varies.

 I'll prepare a UP kernel for one of the buildds w/ the
 up-optimization-removal patch just to see if it improves things. I
 don't see why it would, other than it seemed to solve the problem on
 my test box when I first tested the patch.
 It seemed to fix the problem for me as well.
 
 fyi, I tested w/ a 2.6.26 32-bit UP kernel w/ the
 up-optimization-removal patch, and received another hang:
  
 http://buildd.debian.org/fetch.cgi?pkg=ruby1.9;ver=1.9.0.2-9;arch=hppa;stamp=1231212073

Yes, that's the same hang I can reproduce here as well.
It's AFAICS not the ProtectionID trap kernel bug any longer, which is good :-)

 In principle, looking at the logs, it looks more like a userspace bug
 due to threading functions.
 Anyway, I'll try to reproduce it here as well.
 FWIW, I had some additional irq locking code in load_context(), maybe 
 this helps...?
 
 I'd be happy to test it if you can point me to a changeset.

Sorry, nothing yet.
As the hang does not seem to be related to the Protection ID trap, the
irq locking changes are probably useless anyway.
Overall, this is what I see when running dpkg-buildpackage for ruby1.9:
test_load.rb .
test_exception.rb 
test_thread.rb .
here it hangs

r...@c3000:~/cvs/ruby/ruby1.9-1.9.0.2# ps -efww
root 15817 15815  0 13:36 pts/0 00:00:00 /usr/bin/perl /usr/bin/dpkg-buildpackage
root 25673 3  0 14:56 pts/0 00:00:00 /mnt/sdb4/cvs/ruby/ruby1.9-1.9.0.2/miniruby -I/mnt/sdb4/cvs/ruby/ruby1.9-1.9.0.2/lib -I/mnt/sdb4/cvs/ruby/ruby1.9-1.9.0.2/.ext/common -I./- -r/mnt/sdb4/cvs/ruby/ruby1.9-1.9.0.2/ext/purelib.rb -W0 bootstraptest.tmp.rb
root 25676 25673  0 14:56 pts/0 00:00:00 [miniruby] <defunct>
root 25892  2014  0 17:16 pts/1 00:00:00 ps -efwww
root 29832 15817  0 14:46 pts/0 00:00:00 /usr/bin/make -f debian/rules binary
root 32188 29832  0 14:55 pts/0 00:00:00 make test
root 3 32188  0 14:55 pts/0 00:00:00 ./miniruby -I./lib -I.ext/common -I./- -r./ext/purelib.rb ./bootstraptest/runner.rb --ruby=./miniruby -I./lib -I.ext/common -I./- -r./ext/purelib.rb  -q
root 32223 3  0 14:55 pts/0 00:00:00 ./miniruby -I./lib -I.ext/common -I./- -r./ext/purelib.rb ./bootstraptest/runner.rb --ruby=./miniruby -I./lib -I.ext/common -I./- -r./ext/purelib.rb  -q
root 32224 32223  0 14:55 pts/0 00:00:00 ./miniruby -I./lib -I.ext/common -I./- -r./ext/purelib.rb ./bootstraptest/runner.rb --ruby=./miniruby -I./lib -I.ext/common -I./- -r./ext/purelib.rb  -q

r...@c3000:~/cvs/ruby/ruby1.9-1.9.0.2# strace -p 3
Process 3 attached - interrupt to quit
_newselect(7, [6], NULL, NULL, NULL^C <unfinished ...>
Process 3 detached

r...@c3000:~/cvs/ruby/ruby1.9-1.9.0.2# strace -p 32223
Process 32223 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = 0
getppid()   = 3
poll([{fd=3, events=POLLIN}], 1, 2000)  = 0 (Timeout)
getppid()   = 3
poll([{fd=3, events=POLLIN}], 1, 2000^C <unfinished ...>
Process 32223 detached

r...@c3000:~/cvs/ruby/ruby1.9-1.9.0.2# strace -p 32224
Process 32224 attached - interrupt to quit
nanosleep({0, 1000}, {0, 7191145})  = 0
nanosleep({0, 1000}, {0, 7191145})  = 0
nanosleep({0, 1000}, {0, 7191145})  = 0
nanosleep({0, 1000}, {0, 7191145})  = 0
...

So, it's probably somehow a threading-related problem.
I'm not sure yet why the miniruby process with PID 25676 is defunct.
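
A defunct (zombie) entry like the one in the ps output above means the child has exited but its parent has not yet collected the exit status with wait(). As a minimal sketch for spotting such processes and their parents without eyeballing ps, assuming a Linux /proc filesystem, something like the following could be used. The function names parse_stat and find_zombies are made up for this illustration and are not part of any existing tool.

```python
import os

def parse_stat(line):
    """Parse one /proc/<pid>/stat line into (pid, comm, state, ppid)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    rest = line[rparen + 1:].split()
    return pid, comm, rest[0], int(rest[1])

def find_zombies(proc="/proc"):
    """Return (pid, comm, ppid) for every process currently in state 'Z'."""
    zombies = []
    if not os.path.isdir(proc):
        return zombies  # no procfs here (assumption: Linux-style /proc)
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state, ppid = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process went away or line was unreadable
        if state == "Z":
            zombies.append((pid, comm, ppid))
    return zombies

for pid, comm, ppid in find_zombies():
    print(f"zombie {pid} ({comm}), parent {ppid} has not called wait()")
```

The parent PID reported for each zombie is the process to inspect next (for example with strace -p), since it is the one that is not reaping its child.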

Needs quite some debugging, but we still have threading problems on hppa. 

 Yeah, penalosa got stuck again today, this was on the console:
 Does penalosa have the patched kernel (the same one as on peri)?
 
 Both machines were running an unpatched SMP 2.6.26 until I upgraded
 penalosa for the test I refer to above. The thinking being that -
 though these machines are single CPU - the SMP version should avoid
 the UP 

Re: HPPA and lenny (ruby1.9 build problems)

2009-01-05 Thread Helge Deller
CC: linux-parisc mailing list

Peter Palfrader wrote:
 On Mon, 05 Jan 2009, dann frazier wrote:
 
 On Tue, Dec 23, 2008 at 11:43:22AM +0100, Helge Deller wrote:
 Peter Palfrader wrote:
 Helge Deller wrote on Tuesday, 23 December 2008:

 Patch in parisc git tree:
 http://git.kernel.org/?p=linux/kernel/git/kyle/parisc-2.6.git;a=commitdiff;h=378fe7c4cc619b561409206605c723c05358edac;hp=6c4dfa8f8bcf032137aacb3640d7dd9d75b2b607
 So just using an SMP kernel should also work?
 Probably yes, since some other developers tried initially to reproduce
 the problem, but they couldn't (as it seems they were running on newer
 SMP machines). But I don't have an SMP server, which is why I can't test
 myself...
 Unfortunately, it looks like we're still having problems on the
 buildds w/ 2.6.26 SMP kernels:
   
 http://buildd.debian.org/build.php?pkg=ruby1.9&ver=1.9.0.2-9&arch=hppa&file=log

 The build doesn't take the system down, but does still hang
 indefinitely while running miniruby - though the hang location varies.

 I'll prepare a UP kernel for one of the buildds w/ the
 up-optimization-removal patch just to see if it improves things. I
 don't see why it would, other than it seemed to solve the problem on
 my test box when I first tested the patch.

It seemed to fix the problem for me as well.
In principle, looking at the logs, it looks more like a userspace bug
due to threading functions.
Anyway, I'll try to reproduce it here as well.
FWIW, I had some additional irq locking code in load_context(), maybe 
this helps...?

 Yeah, penalosa got stuck again today, this was on the console:

Does penalosa have the patched kernel (the same one as on peri)?
The protection ID traps shouldn't happen any longer, and from the buildd
logs on peri it does seem that the ProtID traps no longer happen there.

Helge

 ...
 [18255061.952000]
 [18255240.024000] install (pid 15737): Protection id trap (code 27) at 4038d203
 [18255240.116000] Backtrace:
 [18255240.148000]
 [18255258.72] dpkg-deb (pid 15897): Protection id trap (code 27) at 4035defb
 [18255258.812000] Backtrace:
 [18255258.844000]
 [18260696.284000] dpkg-deb (pid 13540): Protection id trap (code 27) at 00026f1b
 [18260696.376000] Backtrace:
 [18260696.408000]
 [18289955.716000] ------------[ cut here ]------------
 [18289955.776000] kernel BUG at fs/inode.c:262!

I think this bug is unrelated to the ruby1.9 issue.
 
 [18289955.824000]
 [18289955.848000]  YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
 [18289955.908000] PSW: 1100 Tainted: G  D
 [18289955.988000] r00-03  00ff0804ff0f 401e7888 401e78a4 a7d1d750
 [18289956.084000] r04-07  405c9660 a7d1d750 404a5d40 00012db86400
 [18289956.184000] r08-11  f000 00017800 000dc2f7 00012f871828
 [18289956.284000] r12-15  7f9d7000 000e7d58 81a4 0001
 [18289956.384000] r16-19  000ad800 000ad800 000f4648 40501e4c
 [18289956.48] r20-23  080f 080f 00012e623b40
 [18289956.58] r24-27  00012f93a2c8  a7d1d750 405c9660
 [18289956.68] r28-31  0002 7d800690 7d8006c0 002ac810
 [18289956.78] sr00-03  03ab8800   03da5800
 [18289956.88] sr04-07

Re: HPPA and lenny (ruby1.9 build problems)

2009-01-05 Thread dann frazier
On Tue, Jan 06, 2009 at 12:46:34AM +0100, Helge Deller wrote:
 CC: linux-parisc mailing list
 
 Peter Palfrader wrote:
  On Mon, 05 Jan 2009, dann frazier wrote:
  
  On Tue, Dec 23, 2008 at 11:43:22AM +0100, Helge Deller wrote:
  Peter Palfrader wrote:
  Helge Deller wrote on Tuesday, 23 December 2008:
 
  Patch in parisc git tree:
  http://git.kernel.org/?p=linux/kernel/git/kyle/parisc-2.6.git;a=commitdiff;h=378fe7c4cc619b561409206605c723c05358edac;hp=6c4dfa8f8bcf032137aacb3640d7dd9d75b2b607
  So just using an SMP kernel should also work?
  Probably yes, since some other developers tried initially to reproduce
  the problem, but they couldn't (as it seems they were running on newer
  SMP machines). But I don't have an SMP server, which is why I can't test
  myself...
  Unfortunately, it looks like we're still having problems on the
  buildds w/ 2.6.26 SMP kernels:

  http://buildd.debian.org/build.php?pkg=ruby1.9&ver=1.9.0.2-9&arch=hppa&file=log
 
  The build doesn't take the system down, but does still hang
  indefinitely while running miniruby - though the hang location varies.
 
  I'll prepare a UP kernel for one of the buildds w/ the
  up-optimization-removal patch just to see if it improves things. I
  don't see why it would, other than it seemed to solve the problem on
  my test box when I first tested the patch.
 
 It seemed to fix the problem for me as well.

fyi, I tested w/ a 2.6.26 32-bit UP kernel w/ the
up-optimization-removal patch, and received another hang:
 
http://buildd.debian.org/fetch.cgi?pkg=ruby1.9;ver=1.9.0.2-9;arch=hppa;stamp=1231212073

 In principle, looking at the logs, it looks more like a userspace bug
 due to threading functions.
 Anyway, I'll try to reproduce it here as well.
 FWIW, I had some additional irq locking code in load_context(), maybe 
 this helps...?

I'd be happy to test it if you can point me to a changeset.

  Yeah, penalosa got stuck again today, this was on the console:
 
 Does penalosa have the patched kernel (the same one as on peri)?

Both machines were running an unpatched SMP 2.6.26 until I upgraded
penalosa for the test I refer to above. The thinking being that -
though these machines are single CPU - the SMP version should avoid
the UP optimization code.

 The protection ID traps shouldn't happen any longer, and from the buildd
 logs on peri it does seem that the ProtID traps no longer happen there.

There were no protection trap messages in penalosa's dmesg after the
above hang. In fact, it contains nothing other than bootup messages.
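
For checking a saved dmesg dump for these messages mechanically rather than by eye, a small sketch along the following lines might help. It assumes the message format seen earlier in this thread ("install (pid 15737): Protection id trap (code 27) at ..."); the regular expression and the function name find_protid_traps are made up for this illustration, and other kernel versions may word the message differently.

```python
import re

# Matches lines such as:
# [18255240.024000] install (pid 15737): Protection id trap (code 27) at 4038d203
TRAP_RE = re.compile(
    r"\[\s*[\d.]+\]\s+(?P<comm>\S+) \(pid (?P<pid>\d+)\): "
    r"Protection id trap \(code (?P<code>\d+)\)"
)

def find_protid_traps(dmesg_text):
    """Return a list of (command, pid, trap code) found in dmesg output."""
    traps = []
    for line in dmesg_text.splitlines():
        m = TRAP_RE.search(line)
        if m:
            traps.append((m.group("comm"), int(m.group("pid")), int(m.group("code"))))
    return traps

sample = (
    "[18255240.024000] install (pid 15737): Protection id trap (code 27) at 4038d203\n"
    "[18255240.116000] Backtrace:\n"
)
print(find_protid_traps(sample))
```

An empty result for a machine's dmesg output would match what is described above: no protection trap messages, only bootup messages.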

 Helge

Thanks for all your help so far - it's really appreciated.

-- 
dann frazier

