Re: Sparc architecture requalification
On Mon, 22 May 2006, Gustavo Franco wrote:

> Is there a simple way to reproduce this critical bug on an ultra1
> (yeah!)? Btw, I have some suggestions to easily identify and backport
> the .17-rcX fix:

No, it's not easy. Neither I nor Clint Adams was able to reproduce it
locally on similar machines. Yet it was killing two different buildds
every time they tried to build openoffice and other large packages. So
far the only person who can reproduce it reliably (even with 2.6.17-rc3)
is Blars Blarson. If you are lucky (for some values of lucky :-), you can
hit another bug which probably affects ultra1: the esp SCSI driver is
busted and dies with DMA errors on any significant disk activity. Martin
Habets is currently looking into it.

> - Ask David S. Miller

He's aware of this problem.

> - If he can't tell us the exact commit, we can isolate the problem
>   using git bisect[0]
>
> On an ultra1 the bisect game will take ages for me, so I couldn't do
> that, just reproduce the bug with an older kernel and test a patched
> .16.

So far there was only tentative agreement to adopt 2.6.16 for etch (at
least, that's my perception of the situation). Certain arguments against
2.6.16 were presented on the debian-kernel mailing list. For sparc,
2.6.16 is a lose-lose situation, because

a) the status of the 2.6.16 kernel with respect to the SMP crash is
   largely unknown, and testing it out extensively on buildds is not
   very feasible; and

b) 2.6.17 is the first kernel which contains support for Sun's new
   Niagara processor. That support is not trivially backportable, so if
   2.6.16 is adopted as the etch kernel, we might have to copy over the
   whole sparc64 directory from 2.6.17 and hope that we can make it work.

Best regards,

Jurij Smakov [EMAIL PROTECTED]
Key: http://www.wooyd.org/pgpkey/ KeyID: C99E03CC

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]
Re: Sparc architecture requalification
On 5/21/06, Steve Langasek <[EMAIL PROTECTED]> wrote:
> On Sat, May 20, 2006 at 05:34:08AM +, Aurelien Jarno wrote:
> (...)
> > - The kernel failures (which occur only on SMP boxes) seem to be
> > gone, at least on the build daemons. I don't know what has been done
> > (if somebody knows, please tell us), but the two packages that were
> > killing the buildds (ie glibc and openoffice.org) are now building
> > correctly (4 last uploads for the glibc, last upload for
> > openoffice.org).
>
> What's been done is to install a kernel which is newer than any that
> are actually available in sid or etch. The fact that this seems to fix
> the problem is a positive step in the right direction, but it's not
> sufficient for the release qual as it leaves us with very low
> confidence in the usability of the port when we can't use the Debian
> kernels for etch on any of the relevant project machines.[1]
>
> So the ideal solution is that, now that we have a known-working
> version, someone determines whether 2.6.16 includes the same fixes and
> if not, gets them backported to 2.6.16 for etch.

Is there a simple way to reproduce this critical bug on an ultra1
(yeah!)? Btw, I have some suggestions to easily identify and backport
the .17-rcX fix:

- Ask David S. Miller
- If he can't tell us the exact commit, we can isolate the problem
  using git bisect[0]

On an ultra1 the bisect game will take ages for me, so I couldn't do
that, just reproduce the bug with an older kernel and test a patched
.16.

[0] = http://www.kernel.org/pub/software/scm/git/docs/howto/isolate-bugs-with-bisect.txt

Hope that helps,
-- stratus
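The bisection idea suggested in this message can be sketched in Python. This is only an illustration of the search, not the actual git workflow: it assumes a hypothetical linear list of commits running from a known-broken kernel to a known-fixed one, and a hypothetical `is_fixed` callback that stands in for "build this kernel, boot it, and try to trigger the crash". Because the goal here is to find the commit that fixes the bug, not the one that introduced it, the usual good/bad roles are inverted. Real kernel history also contains merges, which this sketch ignores.

```python
from typing import Callable, Sequence, Tuple


def find_first_fixed(
    commits: Sequence[str],
    is_fixed: Callable[[str], bool],
) -> Tuple[str, int]:
    """Return (first commit for which is_fixed is true, kernels tested).

    Assumptions (all hypothetical, for illustration only):
    - commits are in chronological order,
    - commits[0] is known broken and commits[-1] is known fixed,
    - once the fix is in, every later commit is also fixed.
    """
    lo, hi = 0, len(commits) - 1  # lo: known broken, hi: known fixed
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1  # one kernel build plus one reproduction attempt
        if is_fixed(commits[mid]):
            hi = mid  # the fix is at mid or earlier
        else:
            lo = mid  # the fix is after mid
    return commits[hi], tests


if __name__ == "__main__":
    # Made-up history: 1000 commits, with the fix landing at index 637.
    history = [f"c{i:04d}" for i in range(1000)]
    found, tests = find_first_fixed(history, lambda c: int(c[1:]) >= 637)
    print(f"first fixed commit: {found}, kernels tested: {tests}")
```

The number of kernels to build and test grows roughly as log2 of the number of commits, so about ten for a thousand commits; on an ultra1 each of those is a full kernel build plus a reproduction attempt. The sketch also assumes that every test gives a reliable verdict, which matters for a crash that is hard to trigger on demand.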
Re: Sparc architecture requalification
On Sat, May 20, 2006 at 05:34:08AM +, Aurelien Jarno wrote:
> It has been a long time since the sparc status on the architecture
> requalification page [1] has been updated. A few things seem to have
> changed:

> - There are now 3 sparc buildds (mrpurply, spontini and auric), so I
> think the "buildd redundancy" box could be set to green.

Yes, this appears to be correct; I checked with Ryan about this at
DebConf, and we do seem to have full redundancy now for sparc buildds.

> - The kernel failures (which occur only on SMP boxes) seem to be gone,
> at least on the build daemons. I don't know what has been done (if
> somebody knows, please tell us), but the two packages that were killing
> the buildds (ie glibc and openoffice.org) are now building correctly (4
> last uploads for the glibc, last upload for openoffice.org).

What's been done is to install a kernel which is newer than any that are
actually available in sid or etch. The fact that this seems to fix the
problem is a positive step in the right direction, but it's not
sufficient for the release qual as it leaves us with very low confidence
in the usability of the port when we can't use the Debian kernels for
etch on any of the relevant project machines.[1]

So the ideal solution is that, now that we have a known-working version,
someone determines whether 2.6.16 includes the same fixes and if not,
gets them backported to 2.6.16 for etch.

There is also the question of having appropriate kernel images on the
buildds for the remainder of sarge's term as "stable", but I don't see
any way that this should be a blocker for sparc's inclusion as an etch
release arch if the *current* buildd kernel problems don't make sparc
unreleasable package-wise.

> If the kernel failures still appear to be present, would it be possible
> to qualify the port for non-SMP only?

AIUI most of the sparc hardware people want to *use* Debian on is SMP
kit, so I think it would be a shame to call a UP port releasable, but I
would certainly take the opinions of the sparc porters into
consideration.

Cheers,
--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
[EMAIL PROTECTED]                                   http://www.debian.org/

[1] independent of whether DSA actually uses stock Debian kernels on most
Debian systems, which TTBOMK is actually not the case
Re: Sparc architecture requalification
On Sat, 20 May 2006, Aurelien Jarno wrote:

> - The kernel failures (which occur only on SMP boxes) seem to be gone,
> at least on the build daemons. I don't know what has been done (if
> somebody knows, please tell us), but the two packages that were killing
> the buildds (ie glibc and openoffice.org) are now building correctly (4
> last uploads for the glibc, last upload for openoffice.org).

James Troup mentioned that the buildds stopped dying (*knock on wood*)
after 2.6.17-rc kernels were installed on them.

Best regards,

Jurij Smakov [EMAIL PROTECTED]
Key: http://www.wooyd.org/pgpkey/ KeyID: C99E03CC
Sparc architecture requalification
Hi all,

It has been a long time since the sparc status on the architecture
requalification page [1] has been updated. A few things seem to have
changed:

- There are now 3 sparc buildds (mrpurply, spontini and auric), so I
  think the "buildd redundancy" box could be set to green.

- The kernel failures (which occur only on SMP boxes) seem to be gone,
  at least on the build daemons. I don't know what has been done (if
  somebody knows, please tell us), but the two packages that were
  killing the buildds (ie glibc and openoffice.org) are now building
  correctly (4 last uploads for the glibc, last upload for
  openoffice.org).

If the kernel failures still appear to be present, would it be possible
to qualify the port for non-SMP only?

Bye,
Aurelien

[1] http://release.debian.org/etch_arch_qualify.html

--
  .''`.  Aurelien Jarno             | GPG: 1024D/F1BCDB73
 : :' :  Debian GNU/Linux developer | Electrical Engineer
 `. `'   [EMAIL PROTECTED]          | [EMAIL PROTECTED]
   `-    people.debian.org/~aurel32 | www.aurel32.net