Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
On Nov 26, 2008, at 1:12 PM, Ken Smith wrote:
> Unfortunately no. As John indicated in the earlier thread, BIOS issues tend to be extremely hard to diagnose, and so far it seems like it's specific to this one motherboard. Given this problem does cause issues with installs I'd be willing to provide ISOs built at the point we've done the Errata Notice that fixes the problem. But it's too nebulous an issue to hold up the release itself for.

It does *not* cause an issue with installs. Installs work fine. It prevents booting an installed operating system.

This appears to affect *ALL* of the Intel multi-CPU motherboards, including 3 generations of Rackable systems.

The only reason it is nebulous is because absolutely nobody bothered to investigate the issue. I've been asking what information would help. I've offered to set up serial consoles, or even ship systems, to anyone who would work on this problem.

This is a very big problem that will affect thousands of FreeBSD servers.

Ken, the complete lack of action taken by FreeBSD to even CONSIDER investigating a significant bug reported during the testing process is shocking. And it truly puts the lie to those who continue to claim that we should be more active in the testing process. Every time I have done this, I've found significant issues that affect a significant portion of the user base and COMPLETELY prevent deployment of a given release, and absolutely nothing has been done to even investigate the reports, never mind address them.

Congratulations. Good job.

If you aren't going to accept bug reports, why exactly do you release testing candidates at all?

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
On Mon, 2008-12-01 at 10:20 -0800, Jo Rhett wrote:
> On Nov 26, 2008, at 1:12 PM, Ken Smith wrote:
>> Unfortunately no. As John indicated in the earlier thread, BIOS issues tend to be extremely hard to diagnose, and so far it seems like it's specific to this one motherboard. Given this problem does cause issues with installs I'd be willing to provide ISOs built at the point we've done the Errata Notice that fixes the problem. But it's too nebulous an issue to hold up the release itself for.
>
> It does *not* cause an issue with installs. Installs work fine. It prevents booting an installed operating system.
>
> This appears to affect *ALL* of the Intel multi-CPU motherboards, including 3 generations of Rackable systems.

Understood, I guess I wasn't quite specific enough. The machine not being able to boot what got installed on its disk I consider an install problem.

To date this is the first mention I've seen of it affecting more than one specific machine type. I might have missed it but I can't recall you mentioning this affected more than one particular machine. And it does not seem to affect *ALL* of the Intel multi-CPU motherboards.

> The only reason it is nebulous is because absolutely nobody bothered to investigate the issue. I've been asking what information would help. I've offered to set up serial consoles, or even ship systems, to anyone who would work on this problem.

Both John and Xin Li have chimed in on the two threads I've seen that are related to this specific topic. John diagnosed it as an issue with the BIOS. That's what makes it a nebulous problem. When working on those sorts of things most people liken it to Whack-a-mole.

> This is a very big problem that will affect thousands of FreeBSD servers.

It's still not clear it will affect thousands of servers. The same set of changes got made to stable/7 as were done to stable/6, and the test builds for the 7.1 release have been seeing much more testing than the test builds for the 6.4 release. If the problem were as widespread as you're suggesting we'd likely have seen a lot more reports, and that factored into the decision about whether to go ahead or not.

This all left me with a decision. My choices were to back out the BTX changes that were known to fix boot issues with certain motherboards and enabled booting from USB devices, or leave things as they are. The motherboards that didn't boot with the older code had no work-around. The motherboards that did boot with the older code but not the newer code do have a work-around (use the old loader). Decisions like that suck; no matter which choice I make it's wrong. Holding the release until all BIOS issues get resolved isn't a viable option because of the Whack-a-mole thing mentioned above. Fix it for one and two break. It takes a lot of time/work to settle into what seems to work for the widest set of machines.

> Ken, the complete lack of action taken by FreeBSD to even CONSIDER investigating a significant bug reported during the testing process is shocking. And it truly puts the lie to those who continue to claim that we should be more active in the testing process. Every time I have done this, I've found significant issues that affect a significant portion of the user base and COMPLETELY prevent deployment of a given release, and absolutely nothing has been done to even investigate the reports, never mind address them.
>
> Congratulations. Good job.
>
> If you aren't going to accept bug reports, why exactly do you release testing candidates at all?

So you're saying John and Xin Li's responses to you (Xin Li's questions still unanswered) show a complete failure to even consider investigating it?

I know from past email threads your preference is for 6.X right now, but as a test point, if you aren't totally fried over this whole thing, it would still be useful to know for sure if the issue exists with 7.1 test builds. If yes, it eliminates a variety of possibilities and helps focus on the exact problem.
--
Ken Smith [EMAIL PROTECTED]
- From there to here, from here to there, funny things are everywhere. - Theodore Geisel
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
On Dec 1, 2008, at 11:30 AM, Ken Smith wrote:
> Both John and Xin Li have chimed in on the two threads I've seen that are related to this specific topic. John diagnosed it as an issue with the BIOS. That's what makes it a nebulous problem. When working on those sorts of things most people liken it to Whack-a-mole.

Diagnosed without testing. John never asked for any more information than the page fault description from me. When I asked what else to test and offered to supply systems for testing he stopped responding.

Xin Li proposed a work-around that would have castrated the systems. It might work, but it wasn't a useful workaround, so I deferred testing and focused on trying to get someone to address the real problem.

>> This is a very big problem that will affect thousands of FreeBSD servers.
>
> It's still not clear it will affect thousands of servers.

Um... Rackable. Rackable ships cabinets full of systems to people that run FreeBSD. They don't sell to home or small corporate users, period. Any problem that affects a standard Rackable build will by definition affect thousands of systems (much like any standard Dell or HP server build).

> This all left me with a decision. My choices were to back out the BTX changes that were known to fix boot issues with certain motherboards and enabled booting from USB devices, or leave things as they are.

Or do some more testing and determine the problem and fix it. I had a stack of systems demonstrating the problem. I could have shipped one to each FreeBSD developer you wanted to work on it. If you were willing to identify the affected source code and relevant gdb traps I would have happily worked on the source directly if that is what it took.

I would test. I would supply console access and build systems. I would ship them to anyone who wanted one in their hot little hands. I would investigate the source code myself with a mere hour of "here's the relevant bits you need to consider" training.

You could have done *anything* that suited your needs for testing. Instead you did nothing.

> The motherboards that didn't boot with the older code had no work-around. The motherboards that did boot with the older code but not the newer code do have a work-around (use the old loader).

Not true. I tested this, installing the old loader, and it did not change the problem. As reported.

> Decisions like that suck; no matter which choice I make it's wrong. Holding the release until all BIOS issues get resolved isn't a viable option because of the Whack-a-mole thing mentioned above. Fix it for one and two break. It takes a lot of time/work to settle into what seems to work for the widest set of machines.

Break the boot loader for a very wide variety of systems rather than spend EVEN A SINGLE HOUR trying to diagnose the boot problem?

Ken, your diagnosis here would make sense if ANY diagnosis had been attempted. This could be a trivial problem. It could be solved with 5 minutes of actually looking at it. What happened here is that you proceeded WITHOUT EVEN TRYING.

> So you're saying John and Xin Li's responses to you (Xin Li's questions still unanswered) show a complete failure to even consider investigating it?

No actual diagnosis was done. I'm sorry, but if I pull my car up to my mechanic's garage and he makes a diagnosis of "no idea what's wrong" without even popping the hood, yeah, that counts as "didn't even consider investigating".

Worse yet, I would happily have done all of the grunt work for the investigation. But I'm not going to start by reading the source tree and making guesses where to look. If someone had given me some useful tests to do, I would have done them.

> I know from past email threads your preference is for 6.X right now

Not my preference, my ability to justify the evaluation and testing costs based on the support available for a given release. 7.0 doesn't work on this hardware at all. No, I haven't tested 7.1, because 6.4 was the easier testing target and I had thought that the security team was working on fixing the support model.

So now we have the brilliant strategy of a long-term support -REL that we will never be able to use. The same stupid stunt that gave us 6.1, which was unusable, and 6.2, which worked great but expired at the same time as 6.1. And so forth. 6.5 will likely be short-term support again, but the first release we can consider for deployment.

> but as a test point, if you aren't totally fried over this whole thing, it would still be useful to know for sure if the issue exists with 7.1 test builds. If yes, it eliminates a variety of possibilities and helps focus on the exact problem.

I'm not burnt, but testing 7.1 has no meaningful relevance to my day job until we have a reasonable and working support mechanism. And given that I really pulled out the stops to make sure we had hardware for
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
Jo Rhett wrote:
> On Dec 1, 2008, at 11:30 AM, Ken Smith wrote:
>> Both John and Xin Li have chimed in on the two threads I've seen that are related to this specific topic. John diagnosed it as an issue with the BIOS. That's what makes it a nebulous problem. When working on those sorts of things most people liken it to Whack-a-mole.
>
> Diagnosed without testing. John never asked for any more information than the page fault description from me. When I asked what else to test and offered to supply systems for testing he stopped responding.
>
> Xin Li proposed a work-around that would have castrated the systems. It might work, but it wasn't a useful workaround, so I deferred testing and focused on trying to get someone to address the real problem.

What I proposed was to *narrow down* the problem so we can diagnose it further. Since nobody has any idea at the moment what the problem is, we need either further information or a review of the whole 6.3-6.4 diff, which is (in my opinion) not an optimal use of developers' time.

Cheers,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net/
FreeBSD - The Power to Serve!
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
On Dec 1, 2008, at 12:25 PM, Xin LI wrote:
> What I proposed was to *narrow down* the problem so we can diagnose it further. Since nobody has any idea at the moment what the problem is, we need either further information or a review of the whole 6.3-6.4 diff, which is (in my opinion) not an optimal use of developers' time.

I got your request at the beginning of a vacation period where I was out of town. I had explicitly requested that 6.4 be blocked for this issue. I didn't think that just my problem would be enough to hold it up, but I never even considered that -REL would happen without anyone even responding to my request.

Since nobody had responded to my request, and several posts had gone out about more testing for 7.1 (which had the same loader and the same problems), I assumed that 6.4 was similarly delayed.

Had anyone said they needed this information pronto I would have canceled my Thanksgiving plans and spent the day in the lab testing this for you.

For that matter, I had already pulled a diff of 6.3 to 6.4 and was working my way through it trying to find the relevant parts. If you had identified the relevant portions, I would have happily tried backing out some of the changes on a per-component basis to figure it out.

In short, had you told me what you wanted/needed, I would have done it ASAP. It's apparently irrelevant now.

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
On Mon, 2008-11-24 at 13:39 -0800, Jo Rhett wrote:
> Given the nature of this bug, can I persuade someone to mark this as blocking 6.4-RELEASE ?

Unfortunately no. As John indicated in the earlier thread, BIOS issues tend to be extremely hard to diagnose, and so far it seems like it's specific to this one motherboard. Given this problem does cause issues with installs I'd be willing to provide ISOs built at the point we've done the Errata Notice that fixes the problem. But it's too nebulous an issue to hold up the release itself for.

--
Ken Smith [EMAIL PROTECTED]
- From there to here, from here to there, funny things are everywhere. - Theodore Geisel
Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
This is now filed as PR 129149
http://www.freebsd.org/cgi/query-pr.cgi?pr=129149

Given the nature of this bug, can I persuade someone to mark this as blocking 6.4-RELEASE ?

On Nov 5, 2008, at 3:41 PM, Jo Rhett wrote:
> On Oct 27, 2008, at 8:51 AM, John Baldwin wrote:
>> On Friday 24 October 2008 02:48:13 pm Jo Rhett wrote:
>>> So I booted up by CD and used Fixit mode to switch the system to boot via serial (keyboard detached), but this gathered me even less.
>>>
>>> /boot.config: -Dh
>>> Consoles: internal video/keyboard serial port
>>> BIOS drive A: is disk0
>>> BIOS drive C: is disk1
>>> BIOS drive D: is disk2
>>> BIOS 639kB/4062144kB available memory
>>>
>>> FreeBSD/i386 bootstrap loader, Revision 1.1
>>> ([EMAIL PROTECTED]
>>>
>>> Plugging back in the monitor after lockup showed only a single char more:
>>> ([EMAIL PROTECTED]
>>
>> This confirms it is hanging in one of the two BIOS routines to output a character. One thing you can do would be to boot up and do the following:
>>
>> dd if=/dev/mem bs=0x400 count=1 of=idt.out
>> dd if=/dev/mem bs=64k iseek=15 count=1 of=bios.out
>>
>> Then place those files some place I can fetch them.
>
> Both files are at http://support.netconsonance.com/freebsd/
>
> FYI, this is notable -- the keyboard does not respond at the boot prompt. I mean the menu where you can escape to the loader prompt, with the fat freebsd ascii art. No keyboard presses are observed here. This is also true for the boot menu on the 6.4 installation CD too.
>
> No problems with 6.2 or 6.3
>
> --
> Jo Rhett
> Net Consonance : consonant endings by net philanthropy, open source and other randomness
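For readers following the two dd commands quoted above: the first reads 0x400 (1024) bytes from physical address 0, which on a PC is the real-mode interrupt vector table (256 vectors of 4 bytes each, stored as a 16-bit offset followed by a 16-bit segment), and the second reads one 64 KiB block starting at 15 * 64 KiB = 0xF0000, the system BIOS area. The following is only a minimal sketch of how such dumps could be inspected, not what John actually did with them; the function names are hypothetical and the choice of which vectors to look at is an assumption, since the thread does not name the two routines.

```python
import struct

IVT_ENTRIES = 256      # the real-mode vector table holds 256 vectors
IVT_ENTRY_SIZE = 4     # each vector: 16-bit offset, then 16-bit segment (little-endian)
BIOS_BASE = 0xF0000    # 15 * 64 KiB, where the second dd command starts reading
BIOS_SIZE = 0x10000    # one 64 KiB block, the size of bios.out


def decode_ivt(raw):
    """Return a list of (segment, offset) pairs from a 1 KiB vector table dump."""
    if len(raw) < IVT_ENTRIES * IVT_ENTRY_SIZE:
        raise ValueError("expected at least 1024 bytes of vector table data")
    vectors = []
    for number in range(IVT_ENTRIES):
        offset, segment = struct.unpack_from("<HH", raw, number * IVT_ENTRY_SIZE)
        vectors.append((segment, offset))
    return vectors


def linear_address(segment, offset):
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


def bios_dump_offset(segment, offset):
    """Offset of a handler inside bios.out, or None if it lives elsewhere."""
    address = linear_address(segment, offset)
    if BIOS_BASE <= address < BIOS_BASE + BIOS_SIZE:
        return address - BIOS_BASE
    return None


def describe_vector(vectors, number):
    """One-line summary of where interrupt `number` points."""
    segment, offset = vectors[number]
    where = bios_dump_offset(segment, offset)
    location = "outside bios.out" if where is None else f"bios.out+0x{where:04x}"
    return f"INT 0x{number:02x} -> {segment:04x}:{offset:04x} ({location})"
```

With the real files, something like `describe_vector(decode_ivt(open("idt.out", "rb").read()), 0x10)` would show where the video-services vector points and whether its handler falls inside the captured BIOS block. Vector 0x10 is only an illustrative choice here.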
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
Jo Rhett wrote:
> This is now filed as PR 129149
> http://www.freebsd.org/cgi/query-pr.cgi?pr=129149
>
> Given the nature of this bug, can I persuade someone to mark this as blocking 6.4-RELEASE ?

My wild guess is that this is somehow related to SMP handling, since the installation process would install an SMP kernel, but the default CD-ROM kernel is UP for 6.x. Could you please check whether you have the same problem with a UP kernel? (Copy it from the LiveCD or something.)

--
Xin LI [EMAIL PROTECTED] http://www.delphij.net/
FreeBSD - The Power to Serve!
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
So boot from CD, go to LIVE filesystem, mount my root and copy only /boot/kernel?

Are there any other modules I should copy, or settings I should change?

On Nov 24, 2008, at 1:51 PM, Xin LI wrote:
> My wild guess is that this is somehow related to SMP handling, since the installation process would install an SMP kernel, but the default CD-ROM kernel is UP for 6.x. Could you please check whether you have the same problem with a UP kernel? (Copy it from the LiveCD or something.)

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness
Re: Can I get a committer to mark this bug as blocking 6.4-RELEASE ?
Jo Rhett wrote:
> So boot from CD, go to LIVE filesystem, mount my root and copy only /boot/kernel?

Yes.

> Are there any other modules I should copy, or settings I should change?

You should probably replace the whole /boot/kernel directory, i.e. rename /boot/kernel to /boot/kernel.old and copy the LiveCD's /boot/kernel in its place.

BTW could you also test if 7.1-PRERELEASE exhibits the same issue?

--
Xin LI [EMAIL PROTECTED] http://www.delphij.net/
FreeBSD - The Power to Serve!
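The kernel swap suggested above (keep the installed kernel directory as kernel.old, then put the LiveCD's uniprocessor kernel directory in its place) can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the thread does not say where the installed root or the LiveCD would be mounted, and on a real system the same steps would more likely be done with mv and cp from the Fixit shell.

```python
import shutil
from pathlib import Path


def swap_kernel(boot_dir, replacement_kernel_dir):
    """Rename <boot_dir>/kernel to kernel.old and copy the replacement in its place.

    boot_dir is the /boot directory of the mounted installed system and
    replacement_kernel_dir is the kernel directory on the LiveCD; both
    locations are assumptions supplied by the caller.
    """
    boot = Path(boot_dir)
    current = boot / "kernel"
    backup = boot / "kernel.old"
    replacement = Path(replacement_kernel_dir)

    if not replacement.is_dir():
        raise FileNotFoundError(f"no replacement kernel directory at {replacement}")
    if backup.exists():
        raise FileExistsError(f"{backup} already exists; refusing to overwrite it")

    # Keep the installed (SMP) kernel and its modules so the change can be undone.
    current.rename(backup)
    # Copy the whole directory, kernel and modules together, so they stay matched.
    shutil.copytree(replacement, current)
    return backup
```

Copying the whole directory rather than the kernel file alone follows the advice in the message above, since modules built for one kernel configuration may not match another. Undoing the test is then a matter of removing the new directory and renaming kernel.old back.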