Re: Fix for sparc64 cpu hangs.

2007-12-17 Thread Josip Rodin
On Mon, Dec 17, 2007 at 01:57:55AM -0800, David Miller wrote:
> From: Josip Rodin <[EMAIL PROTECTED]>
> Date: Mon, 17 Dec 2007 10:40:05 +0100
> 
> > The machine was stuck with a blank white screen on the monitor, and a
> > dysfunctional keyboard (USB). It's a standard keyboard that came with the
> > same machine, and otherwise works fine. It's completely dead: normal key
> > combinations (including Alt+SysRq) do nothing, pressing Caps Lock doesn't
> > turn on its LED, and Stop+A doesn't work either.
> > 
> > The machine doesn't have any ports for the old-style Sun keyboards, so I'm
> > out of options...
> 
> You're more likely to get a good crash dump on the serial
> console, if that is something you can use.

I tried setting the console devices to rsc-console in PROM, but the RSC
never sees anything other than the initial kernel output, and doesn't
relay any input to the machine. Have you tried that on your 280R?

-- 
 2. That which causes joy or happiness.
-
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fix for sparc64 cpu hangs.

2007-12-17 Thread Josip Rodin
On Sun, Dec 16, 2007 at 11:48:31PM +0100, Josip Rodin wrote:
> > One problem I was pointed to was the build failure of erlang. Here the
> > created erlc binary segfaults with a bus error.
> > 
> > - this only happens on US III machines, works fine on US II.
> > 
> > - on lebrun it doesn't happen on the first call of erlc, but after
> > several successful runs of it - see
> > http://buildd.debian.org/fetch.cgi?&pkg=erlang&ver=1%3A11.b.5dfsg-11&arch=sparc&stamp=1197012623&file=log
> 
> BTW, we've had lebrun crash three times since November 12th, the last time
> a few hours ago. The first time I just cycled it via RSC, the second time
> someone looked at it and the console was blank (!) after which we cycled it.
> I'm going over there tomorrow morning to see what's on the console now.

The machine was stuck with a blank white screen on the monitor, and a
dysfunctional keyboard (USB). It's a standard keyboard that came with the
same machine, and otherwise works fine. It's completely dead: normal key
combinations (including Alt+SysRq) do nothing, pressing Caps Lock doesn't
turn on its LED, and Stop+A doesn't work either.

The machine doesn't have any ports for the old-style Sun keyboards, so I'm
out of options...

-- 
 2. That which causes joy or happiness.


Re: Fix for sparc64 cpu hangs.

2007-12-16 Thread Josip Rodin
On Sun, Dec 16, 2007 at 11:41:19PM +0100, Bernd Zeimetz wrote:
> >> I'll leave the kernel running and make sure the machine gets some more
> >> users and load during the next days.
> > 
> > Thanks for testing, let me know if any more issues trigger.
> 
> One problem I was pointed to was the build failure of erlang. Here the
> created erlc binary segfaults with a bus error.
> 
> - this only happens on US III machines, works fine on US II.
> 
> - on lebrun it doesn't happen on the first call of erlc, but after
> several successful runs of it - see
> http://buildd.debian.org/fetch.cgi?&pkg=erlang&ver=1%3A11.b.5dfsg-11&arch=sparc&stamp=1197012623&file=log

BTW, we've had lebrun crash three times since November 12th, the last time
a few hours ago. The first time I just cycled it via RSC, the second time
someone looked at it and the console was blank (!) after which we cycled it.
I'm going over there tomorrow morning to see what's on the console now.

-- 
 2. That which causes joy or happiness.


Re: Bus error while building Erlang

2007-11-25 Thread Josip Rodin
On Sun, Nov 25, 2007 at 03:53:37PM +0100, BERTRAND Joël wrote:
> >Recent build of erlang package failed on sparc architecture
> >(http://buildd.debian.org/build.php?&pkg=erlang&arch=sparc&file=log).
> >I cannot find a reason looking at the log file, and I don't have an
> >access to any sparc machine to debug the failure.
> 
> 	I have seen the same bus error with gcc 4.x (internal error that
> returns a bus error on Sbus based sparc _and_ sparc64 linux boxes). I have
> been trying to find this bug for a long time without any success. I have
> never seen this bug on PCI based sparc64 boxes. Please note that I'm not
> sure that this trouble comes from the Sbus subsystem ;-) I suspect a bug in
> the sparc/sparc64 kernel.

FWIW, the box that produced that log linked above was running 2.6.23.8, and
it's a sparc64 CPU with PCI.

-- 
 2. That which causes joy or happiness.


Re: Fix for sparc64 cpu hangs.

2007-11-12 Thread Josip Rodin
On Sat, Nov 10, 2007 at 10:13:28PM -0800, David Miller wrote:
> From: Josip Rodin <[EMAIL PROTECTED]>
> Date: Wed, 7 Nov 2007 15:25:46 +0100
> 
> > I applied the patch, rebooted into the new kernel, and let lebrun run its
> > buildd, but the apt package fetching method constantly times out trying to
> > reach incoming.debian.org - that (unrelated) server is having a downtime.
> > So unfortunately I can't properly test this long-awaited fix :)
> > 
> > But I did the artificial tests, like running dpkg-query --search libc.so.6
> > in loops, and this seems to work well. Thanks a lot!
> 
> Please let me know if things go smoothly when the
> build becomes active again.

It became functional again a couple of hours ago, and for now everything
seems just fine: it's happily churning away, load hovers around 1, memory
usage seems normal, and nothing's getting stuck.

-- 
 2. That which causes joy or happiness.


Re: Fix for sparc64 cpu hangs.

2007-11-07 Thread Josip Rodin
On Tue, Nov 06, 2007 at 09:13:56PM -0800, David Miller wrote:
> > [FUTEX]: Fix address computation in compat code.
> Here is an updated patch:

I applied the patch, rebooted into the new kernel, and let lebrun run its
buildd, but the apt package fetching method constantly times out trying to
reach incoming.debian.org - that (unrelated) server is having a downtime.
So unfortunately I can't properly test this long-awaited fix :)

But I did the artificial tests, like running dpkg-query --search libc.so.6
in loops, and this seems to work well. Thanks a lot!
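
An artificial test of this kind could be scripted roughly as follows. This is only a sketch: the helper name `loop_test`, the run count and the time limit are made up for illustration, and it assumes timeout(1) from coreutils is available. Note that if a process becomes truly unkillable, timeout itself will not return either, which is in itself a sign that the bug has triggered.

```shell
#!/bin/sh
# Run a command repeatedly and count the runs that fail or exceed a time limit.
# Usage: loop_test RUNS LIMIT_SECONDS COMMAND [ARGS...]
loop_test() {
    runs=$1; limit=$2; shift 2
    bad=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # timeout(1) exits non-zero (124) when the limit is hit,
        # so timeouts and ordinary failures are counted together.
        timeout "$limit" "$@" >/dev/null 2>&1 || bad=$((bad + 1))
        i=$((i + 1))
    done
    echo "$bad of $runs runs failed or timed out"
}
```

Inside the chroot it would be called as, for example, `loop_test 1000 60 dpkg-query --search libc.so.6`.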

-- 
 2. That which causes joy or happiness.


Re: unkillable dpkg-query processes

2007-11-02 Thread Josip Rodin
On Thu, Nov 01, 2007 at 09:55:44PM -0700, David Miller wrote:
> > I'm working on a kernel patch for 2.6.23 that will allow you to get
> > some useful debugging information in situations like this.
> >
> > I'll try to get you that patch by the end of tonight.
> 
> As promised, here is the patch below.
> echo "g" >/proc/sysrq-trigger
> 
> So when you get a stuck process or whatever, trigger this and
> send the output :-)

Great. Here you go, three of them, while the load was 3 and this process was
stuck:

buildd   10813  100  0.8 987368 17504 ?        RN   14:44 155:49 dpkg-query --search libpthread.so.0 libdl.so.2 libstdc++.so.6 libm.so.6 libgcc_s.so.1 libc.so.6 libFLAC.so.8 libid3tag.so.0 libz.so.1 libmad.so.0 libglib-2.0.so.0 libmikmod.so.2 libsndfile.so.1 libvorbis.so.0 libogg.so.0 libvorbisfile.so.3

-- 
 2. That which causes joy or happiness.
Nov  2 17:01:52 lebrun kernel: SysRq : Show Global CPU Regs
Nov  2 17:01:52 lebrun kernel:   CPU[  0]: TSTATE[] TPC[] TNPC[] TASK[NULL:-1]
Nov  2 17:01:52 lebrun kernel:  TPC[sparc64_realfault_common+0x8/0x20]
Nov  2 17:01:52 lebrun kernel: * CPU[  1]: TSTATE[] TPC[] TNPC[] TASK[sh:12919]
Nov  2 17:02:04 lebrun kernel: SysRq : Show Global CPU Regs
Nov  2 17:02:04 lebrun kernel:   CPU[  0]: TSTATE[80009604] TPC[00407924] TNPC[00407928] TASK[dpkg-query:10813]
Nov  2 17:02:04 lebrun kernel:  TPC[sparc64_realfault_common+0x8/0x20]
Nov  2 17:02:04 lebrun kernel: * CPU[  1]: TSTATE[] TPC[] TNPC[] TASK[sh:12928]
Nov  2 17:17:02 lebrun kernel: SysRq : Show Global CPU Regs
Nov  2 17:17:02 lebrun kernel:   CPU[  0]: TSTATE[] TPC[00407924] TNPC[00407928] TASK[dpkg-query:10813]
Nov  2 17:17:02 lebrun kernel:  TPC[sparc64_realfault_common+0x8/0x20]
Nov  2 17:17:02 lebrun kernel: * CPU[  1]: TSTATE[] TPC[] TNPC[] TASK[sh:16444]
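
When several such samples have been collected, a small filter can show which task keeps turning up on a CPU. The following is a rough sketch only: the function name `task_summary` is invented here, and it assumes the `TASK[name:pid]` format printed by the debugging patch from this thread and a grep that supports `-o`.

```shell
#!/bin/sh
# Summarise which task each "Show Global CPU Regs" sample caught on a CPU.
# Reads kernel log text on stdin; prints "count name:pid", most frequent first.
task_summary() {
    grep -o 'TASK\[[^]]*]' |
        sed 's/^TASK\[\(.*\)]$/\1/' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}
```

It would be fed from the log, for example `grep 'lebrun kernel' /var/log/kern.log | task_summary`; a task that appears in every sample is a likely candidate for the stuck process.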


Re: unkillable dpkg-query processes

2007-11-01 Thread Josip Rodin
Hi,

lebrun.d.o hasn't crashed in a while now, but it has this in the
process list:

buildd    2382  0.0  0.2   8144  4736 ?        Ss   Oct30   0:00 /usr/bin/perl /usr/bin/buildd
buildd    2407  0.0  0.5  13920 11296 ?        SN   Oct30   0:10  \_ /usr/bin/perl /usr/bin/sbuild --batch --stats-dir=/home/buildd/
buildd   18174  0.0  0.0      0     0 ?        ZNs  Oct30   0:00      \_ [su]

buildd   23305  100  1.6 1007296 33288 ?       RN   Oct30 3507:30 dpkg-query --status squashfs-source

At the same time:

% free
             total       used       free     shared    buffers     cached
Mem:       2073040    2021224      51816          0     196808      21144
-/+ buffers/cache:    1803272     269768
Swap:      1048688        104    1048584
% uptime
 22:38:36 up 2 days, 10:53,  1 user,  load average: 3.00, 3.01, 3.00

Given that it's still not catatonic, can I do something to provide some
debugging information?
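
For spotting such a process before the machine goes down, a filter over the ps output might help. This is a minimal sketch under assumptions: the name `runaway_filter` and the 90% threshold are arbitrary, and it expects the column order of `ps -eo pid,stat,pcpu,args`.

```shell
#!/bin/sh
# Print processes that are runnable (state R) and at or above MIN_CPU percent CPU.
# Reads the output of `ps -eo pid,stat,pcpu,args` on stdin; the header line is skipped.
runaway_filter() {
    min_cpu=${1:-90}
    awk -v min="$min_cpu" 'NR > 1 && $2 ~ /^R/ && $3 + 0 >= min'
}

# Example call (not run automatically):
#   ps -eo pid,stat,pcpu,args | runaway_filter 90
```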

(BTW, I'm subscribed to the sparclinux list now.)

-- 
 2. That which causes joy or happiness.


Re: unkillable dpkg-query processes

2007-10-29 Thread Josip Rodin
Hi,

(Sorry for breaking the threading - I didn't subscribe to the list,
I just found this in the web archive. I should probably subscribe... :)

David Miller wrote:
> Ok, since I have a 280R just like Josip, I think a good plan
> is for him to show me the commands he used to create the
> build root where he can trigger bad things.

I can't be 100% sure, because it was James Troup who initially set it up,
but I believe that the chroot on lebrun.d.o was set up by just doing
something mundane like running debootstrap, more specifically something
like this:

sudo debootstrap lenny /mnt http://ftp.us.debian.org/debian

I conclude this because it has a var/log/bootstrap.log in it,
dated 2007-06-19 12:15, which has:

Selecting previously deselected package base-files.
(Reading database ... 0 files and directories currently installed.)
Unpacking base-files (from .../base-files_4.0.0_sparc.deb) ...
[...]
Setting up build-essential (11.3) ...

And it also has a var/log/dpkg.log which has:

2007-06-19 12:13:10 install base-files  4.0.0
[...]
2007-06-19 12:15:23 status installed build-essential 11.3

Again I can't be 100% sure of the exact command line used, but that
really should be it :)

After that, dpkg.log in the chroot also has a purge of the 'procps' package,
and an installation of the 'sparc-utils' package. A few hours after those
two, a random selection of package installations starts - the buildd went
online.

I'd try doing a debootstrap of lenny (that's Debian "testing"),
and then inside it, run one or more of those 'dpkg-query -S libc.so.6'.

-- 
 2. That which causes joy or happiness.


install target in the kernel makefiles

2007-10-28 Thread Josip Rodin
Hi,

Is there any reason why the kernel makefiles don't support the 'install'
target on sparc, just like they do on x86(_64)? The standard installkernel(8)
wrapping works fine.

I posted this question a while ago to the debian-sparc list[1] and
Martin Habets replied with a fairly trivial patch[2], but said it
was not accepted upstream. Can it be accepted now, please? :)

-- 
 2. That which causes joy or happiness.

[1] http://lists.debian.org/debian-sparc/2007/04/msg5.html
[2] http://lists.debian.org/debian-sparc/2007/04/msg00015.html


Re: unkillable dpkg-query processes

2007-10-26 Thread Josip Rodin
On Sat, Oct 27, 2007 at 12:30:56AM +0200, Bernd Zeimetz wrote:
> > Josip, do you guys have libnss-db or similar in use on the buildd
> > machine?
> 
> They have, that's what Debian's userdir-ldap uses.

No, I have to correct you, this machine isn't part of that setup
(at least not yet).

-- 
 2. That which causes joy or happiness.


Re: unkillable dpkg-query processes

2007-10-26 Thread Josip Rodin
On Fri, Oct 26, 2007 at 03:01:24PM -0700, David Miller wrote:
> One thing I notice in the debian bug report is a mention of libnss-db
> 
> So I did some testing here and without libnss-db installed, running
> dpkg-query does not use futexes at all.
> 
> But once I install libnss-db and enable it (by running 'make' under
> /var/lib/misc then editing /etc/nsswitch.conf to make 'db' get
> searched first) indeed dpkg-query starts using futexes via the
> libnss-db library.
> 
> Josip, do you guys have libnss-db or similar in use on the buildd
> machine?

lebrun.d.o doesn't have libnss-db installed, neither outside nor inside
the chroot, sorry.

Both setups have the default /etc/nsswitch.conf that searches 'db' before
'files' for protocols, services, ethers, rpc, but that's it.
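
To check which databases consult 'db' on a given system, something like the following could be used. It is a sketch only: the helper name `nss_uses` is made up, and it assumes the usual "database: service ..." layout of nsswitch.conf.

```shell
#!/bin/sh
# List the databases in an nsswitch.conf-style file whose lookup order includes SERVICE.
# Usage: nss_uses SERVICE [FILE]
nss_uses() {
    svc=$1
    file=${2:-/etc/nsswitch.conf}
    awk -v svc="$svc" '
        { sub(/#.*/, "") }              # strip comments before looking at fields
        $1 ~ /:$/ {                     # lines of the form "database: service ..."
            for (i = 2; i <= NF; i++)
                if ($i == svc) {
                    name = $1
                    sub(/:$/, "", name)
                    print name
                    break
                }
        }
    ' "$file"
}
```

Running `nss_uses db` outside the chroot and `nss_uses db /path/to/chroot/etc/nsswitch.conf` for the chroot (path hypothetical) would show whether the two setups differ.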

BTW, would you benefit from having an account on this machine?

-- 
 2. That which causes joy or happiness.


Re: unkillable dpkg-query processes

2007-10-25 Thread Josip Rodin
On Thu, Oct 25, 2007 at 05:07:36PM +0200, joy wrote:
> > If you try, within that troublesome build-root, a few times to try to
> > fork off a couple hundred:
> > 
> > dpkg-query --something python-2.5
> > 
> > or whatever, can you get some of processes to wedge under that
> > build root?
> 
> I did this in a chrooted bash:
> 
> for i in $(seq 0 100); do (dpkg-query -s python2.5-minimal > /dev/null &); done
> 
> And now the machine went catatonic. :(
> 
> Thankfully the console is still vaguely operational - I can enter my
> username to log in, but I can't get the Password prompt to appear.
> Magic SysRq still works - if you need any output from it, tell me.

The machine stayed in this state for a couple of hours or so and didn't
come back to life. When I went to check up on it, the kernel showed one
message on the console - the OOM killer had killed a make process. I then
gave up, used SysRq to S+U+B, and it booted again. I was able to retrieve
the attached data from kern.log. Hope that helps.

-- 
 2. That which causes joy or happiness.
Oct 25 17:04:09 lebrun kernel: SysRq : Emergency Sync
Oct 25 17:04:20 lebrun kernel: SysRq : HELP : loglevel0-8 reBoot tErm Full kIll saK showMem Nice showPc show-all-timers(Q) unRaw Sync showTasks Unmount shoW-blocked-tasks
Oct 25 17:04:20 lebrun kernel: SysRq : Show Memory
Oct 25 17:04:20 lebrun kernel: Mem-info:
Oct 25 17:04:20 lebrun kernel: Normal per-cpu:
Oct 25 17:04:20 lebrun kernel: CPU0: Hot: hi:   90, btch:  15 usd:   0   Cold: hi:   30, btch:   7 usd:   0
Oct 25 17:04:20 lebrun kernel: CPU1: Hot: hi:   90, btch:  15 usd:   4   Cold: hi:   30, btch:   7 usd:  24
Oct 25 17:04:20 lebrun kernel: Active:202209 inactive:46687 dirty:39 writeback:279 unstable:0
Oct 25 17:04:20 lebrun kernel:  free:723 slab:2826 mapped:2986 pagetables:875 bounce:0
Oct 25 17:04:20 lebrun kernel: Normal free:5616kB min:5760kB low:7200kB high:8640kB active:1619760kB inactive:371344kB present:2077352kB pages_scanned:178 all_unreclaimable? no
Oct 25 17:04:20 lebrun kernel: lowmem_reserve[]: 0 0
Oct 25 17:04:21 lebrun kernel: Normal: 780*8kB 11*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 6512kB
Oct 25 17:04:21 lebrun kernel: Swap cache: add 251630, delete 187227, find 26426/42924, race 80+86
Oct 25 17:04:21 lebrun kernel: Free swap  = 174880kB
Oct 25 17:04:24 lebrun kernel: Total swap = 1048688kB
Oct 25 17:04:24 lebrun kernel: Free swap:   174648kB
Oct 25 17:04:24 lebrun kernel: 261865 pages of RAM
Oct 25 17:04:24 lebrun kernel: 3001 reserved pages
Oct 25 17:04:24 lebrun kernel: 155176 pages shared
Oct 25 17:04:24 lebrun kernel: 64407 pages swap cached
Oct 25 17:04:24 lebrun kernel: 39 pages dirty
Oct 25 17:04:24 lebrun kernel: 124 pages writeback
Oct 25 17:04:24 lebrun kernel: 2986 pages mapped
Oct 25 17:04:24 lebrun kernel: 2826 pages slab
Oct 25 17:04:24 lebrun kernel: 875 pages pagetables
Oct 25 17:05:01 lebrun kernel: SysRq : Emergency Sync
Oct 25 17:05:04 lebrun kernel: SysRq : HELP : loglevel0-8 reBoot tErm Full kIll saK showMem Nice showPc show-all-timers(Q) unRaw Sync showTasks Unmount shoW-blocked-tasks
Oct 25 17:05:07 lebrun kernel: SysRq : Show Blocked State
Oct 25 17:05:07 lebrun kernel:   task                PC stack   pid father
Oct 25 17:05:07 lebrun kernel: kswapd0       D 00528bc8     0   181      2
Oct 25 17:05:07 lebrun kernel: Call Trace:
Oct 25 17:05:08 lebrun kernel:  [006258e0] io_schedule+0x2c/0x38
Oct 25 17:05:08 lebrun kernel:  [00528bc8] get_request_wait+0x11c/0x15c
Oct 25 17:12:13 lebrun kernel:  [0052a220] ges+0x144/0x258
Oct 25 17:12:13 lebrun kernel:  [0048cf34] __alloc_pages+0x1b0/0x330
Oct 25 17:12:13 lebrun kernel:  [0049f50c] read_swap_cache_async+0x40/0x150
Oct 25 17:12:13 lebrun kernel:  [00495908] swapin_readahead+0x3c/0x7c
Oct 25 17:12:13 lebrun kernel:  [004973b4] handle_mm_fault+0x3fc/0x7cc
Oct 25 17:12:13 lebrun kernel:  [0044e084] do_sparc64_fault+0x314/0x594
Oct 25 17:12:13 lebrun kernel:  [0040794c] sparc64_realfault_common+0x18/0x20
Oct 25 17:12:13 lebrun kernel:  [00015078] 0x15080
Oct 25 17:12:13 lebrun kernel: dpkg-query    D 00528bc8     0  3924      1
Oct 25 17:12:13 lebrun kernel: Call Trace:
Oct 25 17:12:13 lebrun kernel:  [006258e0] io_schedule+0x2c/0x38
Oct 25 17:12:13 lebrun kernel:  [00528bc8] get_request_wait+0x11c/0x15c
Oct 25 17:12:13 lebrun kernel:  [0052a220] __make_request+0x5f0/0x6a8
Oct 25 17:12:13 lebrun kernel:  [00526bac] generic_make_request+0x2f8/0x31c
Oct 25 17:12:13 lebrun kernel:  [00526cd4] submit_bio+0x104/0x10c
Oct 25 17:12:13 lebrun kernel:  [0049f30c] swap_writepage+0xa4/0xb4
Oct 25 17:12:13 lebrun kernel:  [004918c4] shrink_page_list+0x410/0x6f4
Oct 25 17:12:13 lebrun kernel:  [004922c8] shrink_zone+0x720/0xa38
Oct 25 17:12:13 lebrun kernel:  [0

Re: unkillable dpkg-query processes

2007-10-25 Thread Josip Rodin
On Wed, Oct 24, 2007 at 04:36:03PM -0700, David Miller wrote:
> > > 2) compiler used to build kernel and is it SMP?
> > 
> > gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
> > 
> > I've no idea if that compiler is SMP, if you want I'll ask someone else.
> 
> I was asking if the running kernel were SMP or not, I'll assume it is
> :-)

Oh, sorry, that sentence structure confused me (not a native speaker).

> > There doesn't appear to be a pattern, on this machine at least - I just let
> > the buildd run, building whatever comes up, and after a few hours it
> > inevitably runs into a wall.
> 
> If you try, within that troublesome build-root, a few times to try to
> fork off a couple hundred:
> 
>   dpkg-query --something python-2.5
> 
> or whatever, can you get some of processes to wedge under that
> build root?

I did this in a chrooted bash:

for i in $(seq 0 100); do (dpkg-query -s python2.5-minimal > /dev/null &); done

And now the machine went catatonic. :(

Thankfully the console is still vaguely operational - I can enter my
username to log in, but I can't get the Password prompt to appear.
Magic SysRq still works - if you need any output from it, tell me.
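
A variant of that one-liner that keeps track of the children might look like this. It is a sketch, not what was actually run: the name `spawn_parallel` is invented, and unlike the one-liner it waits for every child, so it will block if one of them wedges. On an affected machine it can be expected to be just as dangerous as the original.

```shell
#!/bin/sh
# Start COUNT copies of a command in the background, wait for all of them,
# and report how many exited with a non-zero status.
# Usage: spawn_parallel COUNT COMMAND [ARGS...]
spawn_parallel() {
    count=$1; shift
    pids=""
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" >/dev/null 2>&1 &
        pids="$pids $!"          # remember each child's PID
        i=$((i + 1))
    done
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed of $count failed"
}
```

In the chroot this would be `spawn_parallel 100 dpkg-query -s python2.5-minimal`.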

-- 
 2. That which causes joy or happiness.


Re: unkillable dpkg-query processes

2007-10-25 Thread Josip Rodin
On Wed, Oct 24, 2007 at 11:41:13PM -0700, David Miller wrote:
> Josip, give this debugging patch a try.  It is against 2.6.23.1
> but it should apply to most recent kernels.

OK, after resurrecting the machine once again (it had died in the meantime,
reliably as ever), I did:

patching file kernel/futex.c
Hunk #1 succeeded at 1877 (offset 3 lines).
Hunk #2 succeeded at 1903 (offset 3 lines).
Hunk #3 succeeded at 1926 (offset 3 lines).

> It should give you debugging messages in the kernel log that
> start with "FUTEX_BUG" if the debugging code triggers.
> 
> Please post just a few samples of whatever it spits out.

It's been running with the patched kernel for some 6.5 hours now, no
problems yet. I'll let you know as soon as it starts to misbehave.

-- 
 2. That which causes joy or happiness.


Re: unkillable dpkg-query processes

2007-10-24 Thread Josip Rodin
On Wed, Oct 24, 2007 at 03:58:29PM -0700, David Miller wrote:
> I know, I've seen this report a million times :-)

Oh, I know you know, I mailed you a while ago and you told me to mail
the mailing list :)

> I can't reproduce it, I've even tried the fabled test case
> where you spawn thousands of dpkg-query instances and it never
> does anything wrong on my Niagara boxes.
> 
> So something is different about your environment than mine.
> 
> Let's see if there is some aspect of the environment that
> contributed to the problem occurring.  Please reproduce
> with 2.6.23-final and then list (I know this is redundant,
> just humor me :-):

Confirming that the machine could reproduce the problem with 2.6.23.1.
(I can send over the .config if it matters.)

> 1) system type

A Sun Fire 280R, with two CPU boards, each carrying a TI UltraSparc III
(Cheetah), and 2 GB of RAM. If you need more info, just say.

(Bernd Zeimetz has previously suggested that the problem is linked to
the processor type, the USIII.)

> 2) compiler used to build kernel and is it SMP?

gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

I've no idea if that compiler is SMP, if you want I'll ask someone else.

> 3) glibc in use
> 4) compiler used to build running glibc

In that particular chroot, it's:

chroot-unstable% lib/libc-2.6.1.so
GNU C Library stable release version 2.6.1, by Roland McGrath et al.
[...]
Compiled by GNU CC version 4.2.1 (Debian 4.2.1-5).
Compiled on a Linux >>2.6.17-rc1<< system on 2007-09-04.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
software FPU emulation by Richard Henderson, Jakub Jelinek and
others
[...]

Outside of that chroot, it's:

% /lib/libc-2.3.6.so 
GNU C Library stable release version 2.3.6, by Roland McGrath et al.
[...]
Compiled by GNU CC version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21).
Compiled on a Linux 2.6.18 system on 2007-03-01.
Available extensions:
GNU libio by Per Bothner
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
linuxthreads-0.10 by Xavier Leroy
BIND-8.2.3-T5B
libthread_db work sponsored by Alpha Processor Inc
NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
software FPU emulation by Richard Henderson, Jakub Jelinek and
others
Thread-local storage support included.
[...]

> If you have a reproducible test case, that's even better.

There doesn't appear to be a pattern, on this machine at least - I just let
the buildd run, building whatever comes up, and after a few hours it
inevitably runs into a wall.

-- 
 2. That which causes joy or happiness.


unkillable dpkg-query processes

2007-10-24 Thread Josip Rodin
Hi,

(I forgot to send this before...)

We've been having grave issues with a few of our sparc build daemon machines
in Debian. Something causes dpkg-query(1) processes, otherwise harmless, to
run amok and allocate too much memory, yet keep running and resist being
killed. They eventually push the machine to the point where you can only
ping it, but all of userland and the console are dead.

I'm the admin of one of those machines, lebrun.debian.org, and all the
details I noticed so far are at http://bugs.debian.org/433187

I'm certainly no expert on the kernel, but I'd be happy to help by providing
debug information about the issue. Any suggestions will be most welcome,
because at this point that machine lives a sorry life of
reboot->buildd runs for a few hours->machine goes dead->reboot...

-- 
 2. That which causes joy or happiness.


arch/sparc64/kernel/iommu_common.c:237: error: implicit declaration of function 'next_sg'

2007-10-24 Thread Josip Rodin
Hi,

Just tried 2.6.24-rc1... but:

arch/sparc64/kernel/iommu_common.c: In function 'prepare_sg':
arch/sparc64/kernel/iommu_common.c:237: error: implicit declaration of function 'next_sg'
cc1: warnings being treated as errors
arch/sparc64/kernel/iommu_common.c:237: warning: assignment makes pointer from integer without a cast
make[1]: *** [arch/sparc64/kernel/iommu_common.o] Error 1

The error looks like a simple typo - should this be sg_next(), like
elsewhere in the file? The file compiles when I make that replacement.
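
For anyone who wants to try the same local workaround, the substitution can be scripted as follows. This is a sketch and not an official fix: the helper name `fix_next_sg` is made up, and the plain text match would also hit any other identifier ending in `next_sg(`, so the result should be reviewed with a diff.

```shell
#!/bin/sh
# Replace calls to next_sg() with sg_next() in one source file, in place.
# Usage: fix_next_sg FILE
fix_next_sg() {
    file=$1
    tmp=$(mktemp) || return 1
    # Write to a temporary file first, then copy back so permissions are kept.
    sed 's/next_sg(/sg_next(/g' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

From the top of the kernel tree: `fix_next_sg arch/sparc64/kernel/iommu_common.c`.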

-- 
 2. That which causes joy or happiness.