Re: [9fans] crashing 9vx
OK, somebody sent a hint that it might make sense to take the -O3 out of the make flags. Done. Result: I can now get through this command:

	hget -v http://plan9.bell-labs.com/plan9/download/plan9.iso.bz2 >/tmp/iso.bz2 |[2]aux/statusbar plan9.iso

without an explosion.

This is weird. I just built 9vx on FreeBSD without the -O3. But instead of being more stable, that one crashed on startup, like Charles reported. Namely:

	9vx panic: user fault: signo=11 addr=3850cb67 [useraddr=cb67] read=1 eip=80b973c esp=493ffac0
	aborting, to dump core.

With the -O3, the crashes are rare, and seem to be associated with heavy I/O.

BLS
[9fans] Ken's FS: WORM Superblock read failed
Hi,

I've been messing around with setting up a Ken's file server with old hardware that's accumulated over the years. Yesterday, I configured a file server with two IDE drives as ch0fh2. The setup seemed to go OK, but one of the last things on the screen before it brought the server up is in the subject line: WORM Superblock read failed. It also said something about block 2 (I know the numeral `2' was in there).

When I booted this morning and let it go through the whole process, I got:

	tag = Tnone/0: expected Tbuck/2697 -- flushed (2697)
	panic: cwinit: ??? tag c bucket

(Embarrassing. The ??? indicates where I can't read my own handwriting.)

Any insight would be appreciated. Here's some more setup info:

	Motherboard is ECS L4S5MG with SiS 650/961 chipset
	Hard Drives are both Western Digital WD800JB (80G ATA drives)
	The kernel is 9fsfs64 from /sources/extra/kensfs.tgz.

I don't know if Erik's fs has mods that are relevant. I haven't tried his contrib version yet.

Thanks. Happy long weekend to those of you who have one.

Greg
Re: [9fans] crashing 9vx
Here are your debug options:

	case '1': singlethread = 1; break;
	case 'A': doabort++; break;
	case 'B': abortonfault++; break;
	case 'K': tracekdev++; break;
	case 'F': nofork = 1; break;
	case 'M': tracemmu++; break;
	case 'P': traceprocs++; break;
	case 'S': tracesyscalls++; break;
	case 'U': nuspace = atoi(EARGF(usage())); break;
	case 'X': vx32_debugxlate++; break;

On Sun, May 30, 2010 at 2:46 PM, Brian L. Stuart blstu...@bellsouth.net wrote:
> This is weird. I just built 9vx on FreeBSD without the -O3. But instead of being more stable, that one crashed on startup, like Charles reported. Namely:
>
> 	9vx panic: user fault: signo=11 addr=3850cb67 [useraddr=cb67] read=1 eip=80b973c esp=493ffac0
> 	aborting, to dump core.

-X is super-slick. But you can probably see what can be done here. I wonder if you could run -X with your immediate failure and put it on pastebin.com or similar.

ron
Re: [9fans] crashing 9vx
You also have to recompile vx library.

Phil

Brian L. Stuart wrote:
> OK, somebody sent a hint that it might make sense to take the -O3 out of the make flags. Done. Result: I can now get through this command:
>
> 	hget -v http://plan9.bell-labs.com/plan9/download/plan9.iso.bz2 >/tmp/iso.bz2 |[2]aux/statusbar plan9.iso
>
> without an explosion.
>
> This is weird. I just built 9vx on FreeBSD without the -O3. But instead of being more stable, that one crashed on startup, like Charles reported. Namely:
>
> 	9vx panic: user fault: signo=11 addr=3850cb67 [useraddr=cb67] read=1 eip=80b973c esp=493ffac0
> 	aborting, to dump core.
>
> With the -O3, the crashes are rare, and seem to be associated with heavy I/O.
>
> BLS
Re: [9fans] Ken's FS: WORM Superblock read failed
> Yesterday, I configured a file server with two IDE drives as ch0fh2. The setup seemed to go OK, but one of the last things on the screen before it brought the server up is in the subject line: WORM Superblock read failed. It also said something about block 2 (I know the numeral `2' was in there).
>
> When I booted this morning and let it go through the whole process, I got:
>
> 	tag = Tnone/0: expected Tbuck/2697 -- flushed (2697)
> 	panic: cwinit: ??? tag c bucket

if you booted from one of these drives, you may have stepped on the partition table and/or configuration. have you tried recover main at the configuration prompt?

i'd recommend making the cache relatively small. p10.10 might be a good plan.

> Motherboard is ECS L4S5MG with SiS 650/961 chipset
> Hard Drives are both Western Digital WD800JB (80G ATA drives)
> The kernel is 9fsfs64 from /sources/extra/kensfs.tgz.
>
> I don't know if Erik's fs has mods that are relevant. I haven't tried his contrib version yet.

if you do use it, you can partition the drives with prep and/or fdisk and use partition names. percentage is pretty sloppy when you get to an 80gb drive. i found that absolute sector numbers are error prone for humans to type and sharing an aoe target makes sense. here's my main configuration

	filsys main cpe2.0kcachee2.1

- erik
Re: [9fans] PCI SATA Controller Cards
> Anyone have any recommendations for a PCI SATA controller card that will work with Plan 9?

if pci (not pcie) is the only option, you'd likely do better for cheaper getting an atom motherboard and using the onboard ports.

marvell 88sx pci/pci-x cards will work, but they're relatively expensive. ahci interfaced cards should also work, but i didn't turn up any pci cards with an ahci interface in a quick search.

- erik
Re: [9fans] crashing 9vx
On Sun May 30 12:00:25 EDT 2010, x...@bouyapop.org wrote:
> You also have to recompile vx library.
>
> Phil
>
> Brian L. Stuart wrote:
> > OK, somebody sent a hint that it might make sense to take the -O3 out of the make flags. Done. Result: I can now get through this command:
> >
> > 	hget -v http://plan9.bell-labs.com/plan9/download/plan9.iso.bz2 >/tmp/iso.bz2 |[2]aux/statusbar plan9.iso
> >
> > without an explosion.
> >
> > This is weird. I just built 9vx on FreeBSD without the -O3. But instead of being more stable, that one crashed on startup, like Charles reported. Namely:
> >
> > 	9vx panic: user fault: signo=11 addr=3850cb67 [useraddr=cb67] read=1 eip=80b973c esp=493ffac0
> > 	aborting, to dump core.
> >
> > With the -O3, the crashes are rare, and seem to be associated with heavy I/O.

you may be right, but it seems too easy to blame gcc. a better fit for the facts so far would seem to me that 9vx' locking is broken. the optimization may just put more pressure on broken locking.

putting a couple of prints in the startup will also change timing. if you can eliminate or cause a crash by adding or removing prints, then you can be sure that there is a locking/timing problem in 9vx. although that doesn't prove that gcc is blameless, it would be a reasonable assumption.

- erik
Re: [9fans] Ken's FS: WORM Superblock read failed
I didn't try recover, but since the config I really want is a partitioned one like you suggest, and the errors I reported didn't make anyone immediately think dying hard drive, I think I'll try getting your (Erik's) fs from contrib. I assume I can do something like this:

1. Put the two hard drives destined to be on the file server on my existing fossil-only machine (I have a hard drive with that somewhere from the last few days of experimentation).

2. Diskpart and diskprep those drives with, for example, 9fat, cache, and worm on the drive destined to be h0 on the server and 9fat, other and worm on h2, using some appropriate sizes (approx 10% of disk size going to cache and other, with h0.worm and h2.worm the same size for mirroring).

3. Put 9load, plan9.ini, and a kernel from fs on h0.9fat. (Something missing? I guess I just want to emulate what pc/bootfloppy would do for me, but I want to copy those things to h0.9fat.)

4. Move the disks to the file server, boot, watch everything come up just fine, and config something like:

	filsys main cph0.cachef{ph0.wormh2.worm}
	filsys dump o
	filsys other ph2.other

I'd appreciate it if someone would let me know if I'm way off base here. Otherwise, I'll probably give this a shot this afternoon or tomorrow.

Thanks.

Greg

On Sun, May 30, 2010 at 11:56 AM, erik quanstrom quans...@quanstro.net wrote:
> > Yesterday, I configured a file server with two IDE drives as ch0fh2. The setup seemed to go OK, but one of the last things on the screen before it brought the server up is in the subject line: WORM Superblock read failed. It also said something about block 2 (I know the numeral `2' was in there).
> >
> > When I booted this morning and let it go through the whole process, I got:
> >
> > 	tag = Tnone/0: expected Tbuck/2697 -- flushed (2697)
> > 	panic: cwinit: ??? tag c bucket
>
> if you booted from one of these drives, you may have stepped on the partition table and/or configuration. have you tried recover main at the configuration prompt?
>
> i'd recommend making the cache relatively small. p10.10 might be a good plan.
>
> > Motherboard is ECS L4S5MG with SiS 650/961 chipset
> > Hard Drives are both Western Digital WD800JB (80G ATA drives)
> > The kernel is 9fsfs64 from /sources/extra/kensfs.tgz.
> >
> > I don't know if Erik's fs has mods that are relevant. I haven't tried his contrib version yet.
>
> if you do use it, you can partition the drives with prep and/or fdisk and use partition names. percentage is pretty sloppy when you get to an 80gb drive. i found that absolute sector numbers are error prone for humans to type and sharing an aoe target makes sense. here's my main configuration
>
> 	filsys main cpe2.0kcachee2.1
>
> - erik
Re: [9fans] Ken's FS: WORM Superblock read failed
> filsys main cph0.cachef{ph0.wormh2.worm}
> filsys dump o
> filsys other ph2.other

should be

	filsys main cph0cachef{ph0wormh1worm}
	filsys dump o
	filsys other ph1other

also recommend against using the f fake worm device. it makes changing the size of the device (say by 1 sector) excessively difficult, and i'm unclear on the actual upside; if the fake worm device saves you, things are already pear shaped.

- erik
Re: [9fans] crashing 9vx
On Sun, 30 May 2010 17:59:49 +0200 Philippe Anel x...@bouyapop.org wrote:
> You also have to recompile vx library.

Doesn't help.

> Brian L. Stuart wrote:
> > With the -O3, the crashes are rare, and seem to be associated with heavy I/O.

When I run the s9fes (Scheme 9 from Empty Space) tests, some of them fail or 9vx crashes, and AFAIK they don't do much I/O. They all pass on Plan 9.

Just for kicks I compiled 9vx with clang-2.7. With -O3 it comes up fine. The initial acme window doesn't disappear (like it does with gcc -O3), but I couldn't compile anything. Probably I made a mistake. Will try this later.

Without -O it comes up, but the initial acme window disappears. But compiles do work now. The s9fes test "GC lists", which used to fail in random ways with gcc-compiled 9vx, now finishes without complaints, but "Hyper Operations" failed, and things went downhill from there. Rerunning yields the same result, so at least this is consistent!

So it seems there are multiple problems.
Re: [9fans] Ken's FS: WORM Superblock read failed
Thanks Erik.

Yeah, the f device is just a consistent brain tic for me. Once upon a time I thought for some reason that f always went with c unless you were using a real jukebox, and now I can't seem to get rid of that misapprehension when I write my configs.

I used h2 because I intended to have the drives on separate controllers. Are you saying it should be h1 even if there is no primary slave, or are you recommending that I put the drives on a single controller?

Getting optimistic about trying this. It seems to me from my lurking around the groups that Erik is almost single-handedly keeping Ken's FS alive. I would just like to voice my appreciation for that.

Greg

On Sun, May 30, 2010 at 12:53 PM, erik quanstrom quans...@quanstro.net wrote:
> > filsys main cph0.cachef{ph0.wormh2.worm}
> > filsys dump o
> > filsys other ph2.other
>
> should be
>
> 	filsys main cph0cachef{ph0wormh1worm}
> 	filsys dump o
> 	filsys other ph1other
>
> also recommend against using the f fake worm device. it makes changing the size of the device (say by 1 sector) excessively difficult, and i'm unclear on the actual upside; if the fake worm device saves you, things are already pear shaped.
>
> - erik
Re: [9fans] crashing 9vx
> You also have to recompile vx library.

I'm pretty sure I did. I did a gmake clean followed by gmake 9vx/9vx in vx32/src. I'm pretty sure I saw the libraries being compiled as the compile commands flew by on the screen.

BLS
Re: [9fans] crashing 9vx
> you may be right, but it seems too easy to blame gcc. a better fit for the facts so far would seem to me that 9vx' locking is broken. the optimization may just put more pressure on broken locking.

I would certainly agree that the variability of the crashes feels like a mutual exclusion problem. The wide variety of effects of changing optimization seems to be trying really hard to tell us something. Of course, after two days of house-hunting I could probably convince myself that the phase of the moon is involved.

BLS