Re: [9fans] crashing 9vx

2010-05-30 Thread Brian L. Stuart
 OK, somebody sent a hint that it
 might make sense to take the -O3 out
 of the make flags. Done.
 
 Result: I can now get through this command:
 hget -v http://plan9.bell-labs.com/plan9/download/plan9.iso.bz2 >/tmp/iso.bz2
 |[2]aux/statusbar plan9.iso
 
 without an explosion.

This is weird.  I just built 9vx on FreeBSD without
the -O3.  But instead of being more stable, that
one crashed on startup, like Charles reported.
Namely:

9vx panic: user fault: signo=11 addr=3850cb67 [useraddr=cb67] read=1 
eip=80b973c esp=493ffac0
aborting, to dump core.

With the -O3, the crashes are rare, and seem to be
associated with heavy I/O.

BLS




[9fans] Ken's FS: WORM Superblock read failed

2010-05-30 Thread Gregory Pavelcak
Hi,

I've been messing around with setting up a Ken's file server with old
hardware that's accumulated over the years.

Yesterday, I configured a file server with two IDE drives as ch0fh2. The
setup seemed to go OK, but one of the last things on the screen before it
brought the server up is in the subject line: WORM Superblock read failed.
It also said something about block 2 (I know the numeral `2' was in there).

When I booted this morning and let it go through the whole process, I got:
   tag = Tnone/0: expected Tbuck/2697 -- flushed (2697)
   panic: cwinit: ??? tag c bucket

(Embarrassing. The ??? indicates where I can't read my own handwriting.)

Any insight would be appreciated.

Here's some more setup info:

Motherboard is ECS L4S5MG with SiS 650/961 chipset
Hard Drives are both Western Digital WD800JB (80G ATA drives)
The kernel is 9fsfs64 from /sources/extra/kensfs.tgz. I don't know if Erik's
fs has mods that are relevant. I haven't tried his contrib version yet.

Thanks.

Happy long weekend to those of you who have one.

Greg


Re: [9fans] crashing 9vx

2010-05-30 Thread ron minnich
Here are your debug options:
	case '1':
		singlethread = 1;
		break;
	case 'A':
		doabort++;
		break;
	case 'B':
		abortonfault++;
		break;
	case 'K':
		tracekdev++;
		break;
	case 'F':
		nofork = 1;
		break;
	case 'M':
		tracemmu++;
		break;
	case 'P':
		traceprocs++;
		break;
	case 'S':
		tracesyscalls++;
		break;
	case 'U':
		nuspace = atoi(EARGF(usage()));
		break;
	case 'X':
		vx32_debugxlate++;
		break;

On Sun, May 30, 2010 at 2:46 PM, Brian L. Stuart blstu...@bellsouth.net wrote:
 This is weird.  I just built 9vx on FreeBSD without
 the -O3.  But instead of being more stable, that
 one crashed on startup, like Charles reported.
 Namely:

 9vx panic: user fault: signo=11 addr=3850cb67 [useraddr=cb67] read=1 
 eip=80b973c esp=493ffac0
 aborting, to dump core.

-X is super-slick. But you can probably see what can be done here.

I wonder if you could run -X with your immediate failure and put it on
pastebin.com or similar.

ron



Re: [9fans] crashing 9vx

2010-05-30 Thread Philippe Anel

You also have to recompile vx library.

Phil



Re: [9fans] Ken's FS: WORM Superblock read failed

2010-05-30 Thread erik quanstrom
 Yesterday, I configured a file server with two IDE drives as ch0fh2. The
 setup seemed to go OK, but one of the last things on the screen before it
 brought the server up is in the subject line: WORM Superblock read failed.
 It also said something about block 2 (I know the numeral `2' was in there).
 
 When I booted this morning and let it go through the whole process, I got:
tag = Tnone/0: expected Tbuck/2697 -- flushed (2697)
panic: cwinit: ??? tag c bucket

if you booted from one of these drives, you may have stepped on
the partition table and/or configuration.

have you tried recover main at the configuration prompt?

i'd recommend making the cache relatively small.  p10.10 might be
a good plan.

 
 Motherboard is ECS L4S5MG with SiS 650/961 chipset
 Hard Drives are both Western Digital WD800JB (80G ATA drives)
 The kernel is 9fsfs64 from /sources/extra/kensfs.tgz. I don't know if Erik's
 fs has mods that are relevant. I haven't tried his contrib version yet.

if you do use it, you can partition the drives with prep and/or
fdisk and use partition names.  percentage is pretty sloppy when
you get to an 80gb drive.  i found that absolute sector numbers
are error prone for humans to type and sharing an aoe target
makes sense.  here's my main configuration

filsys main cpe2.0kcachee2.1

- erik



Re: [9fans] PCI SATA Controller Cards

2010-05-30 Thread erik quanstrom
 Anyone have any recommendations for a PCI SATA controller card that will
 work with Plan 9?

if pci (not pcie) is the only option, you'd likely do
better, and cheaper, getting an atom motherboard
and using the onboard ports.

marvell 88sx pci/pci-x cards will work, but they're
relatively expensive.  ahci interfaced cards should
also work, but i didn't turn up any pci cards with
an ahci interface in a quick search.

- erik



Re: [9fans] crashing 9vx

2010-05-30 Thread erik quanstrom
On Sun May 30 12:00:25 EDT 2010, x...@bouyapop.org wrote:
 You also have to recompile vx library.
 
 Phil
 
 Brian L. Stuart wrote:
  OK, somebody sent a hint that it
  might make sense to take the -O3 out
  of the make flags. Done.
 
  Result: I can now get through this command:
  hget -v 
  http://plan9.bell-labs.com/plan9/download/plan9.iso.bz2 >/tmp/iso.bz2
  |[2]aux/statusbar plan9.iso
 
  without an explosion.
  
 
  This is weird.  I just built 9vx on FreeBSD without
  the -O3.  But instead of being more stable, that
  one crashed on startup, like Charles reported.
  Namely:
 
  9vx panic: user fault: signo=11 addr=3850cb67 [useraddr=cb67] read=1 
  eip=80b973c esp=493ffac0
  aborting, to dump core.
 
  With the -O3, the crashes are rare, and seem to be
  associated with heavy I/O.

you may be right, but it seems too easy to blame gcc.
a better fit for the facts so far would seem to me that
9vx' locking is broken.  the optimization may just put
more pressure on broken locking.

putting a couple of prints in the startup will also change
timing.  if you can eliminate or cause a crash by adding
or removing prints, then you can be sure that there is a
locking/timing problem in 9vx.  although that doesn't prove
that gcc is blameless, it would be a reasonable assumption.

- erik



Re: [9fans] Ken's FS: WORM Superblock read failed

2010-05-30 Thread Gregory Pavelcak
I didn't try recover, but since the config I really want is a partitioned
one like you suggest, and the errors I reported didn't make anyone
immediately think "dying hard drive", I think I'll try getting your (Erik's)
fs from contrib.

I assume I can do something like this:

Put the two hard drives destined to be on the file server on my existing
fossil-only machine (I have a hard drive with that somewhere from the last
few days of experimentation).

Diskpart and diskprep those drives with, for example, 9fat, cache, and worm
on the drive destined to be h0 on the server and 9fat, other, and worm on h2,
using some appropriate sizes (approx 10% of disk size going to cache and
other, with h0.worm and h2.worm the same size for mirroring).

Put 9load, plan9.ini, a kernel from fs on h0.9fat (Something missing? I
guess I just want to emulate what pc/bootfloppy would do for me, but I want
to copy those things to h0.9fat.)

Move the disks to the file server, boot, watch everything come up just fine,
and config something like:
  filsys main cph0.cachef{ph0.wormh2.worm}
  filsys dump o
  filsys other ph2.other

I'd appreciate it if someone would let me know if I'm way off base here.
Otherwise, I'll probably give this a shot this afternoon or tomorrow.

Thanks.

Greg




Re: [9fans] Ken's FS: WORM Superblock read failed

2010-05-30 Thread erik quanstrom
   filsys main cph0.cachef{ph0.wormh2.worm}
   filsys dump o
   filsys other ph2.other

should be

filsys main cph0cachef{ph0wormh1worm}
filsys dump o
filsys other ph1other

also recommend against using the f fake worm device.
it makes changing the size of the device (say by 1 sector)
excessively difficult, and i'm unclear on the actual upside;
if the fake worm device saves you, things are already
pear shaped.

- erik



Re: [9fans] crashing 9vx

2010-05-30 Thread Bakul Shah
On Sun, 30 May 2010 17:59:49 +0200 Philippe Anel x...@bouyapop.org  wrote:
 You also have to recompile vx library.

Doesn't help.

 Brian L. Stuart wrote:
  With the -O3, the crashes are rare, and seem to be
  associated with heavy I/O.

When I run s9fes (Scheme 9 from Empty Space) tests, some of
them fail or 9vx crashes and AFAIK they don't do much I/O.
They all pass on Plan9.

Just for kicks I compiled 9vx with clang-2.7.  With -O3 it
comes up fine. The initial acme window doesn't disappear
(like it does with gcc -O3) but I couldn't compile anything.
Probably I made a mistake. Will try this later.

Without -O it comes up but the initial acme window
disappears. But compiles do work now. s9fes test GC lists
that used to fail in random ways with gcc compiled 9vx
finishes now without complaints but Hyper Operations
failed, and things went downhill from there.  Rerunning
yields the same result so at least this is consistent!

So it seems there are multiple problems.



Re: [9fans] Ken's FS: WORM Superblock read failed

2010-05-30 Thread Gregory Pavelcak
Thanks Erik.

Yeah, the f device is just a consistent brain tic for me. Once upon a time
I thought for some reason that f always went with c unless you were
using a real jukebox, and now I can't seem to get rid of that
misapprehension when I write my configs.

I used h2 because I intended to have the drives on separate controllers.
Are you saying it should be h1 even if there is no primary slave, or are
you recommending that I put the drives on a single controller?

Getting optimistic about trying this.

It seems to me from my lurking around the groups that Erik is almost
single-handedly keeping Ken's FS alive. I would just like to voice my
appreciation for that.

Greg



Re: [9fans] crashing 9vx

2010-05-30 Thread Brian L. Stuart
 You also have to recompile vx
 library.

I'm pretty sure I did.  I did a gmake clean
followed by gmake 9vx/9vx in vx32/src.  I'm
pretty sure I saw the libraries being compiled
as the compile commands flew by on the screen.

BLS




Re: [9fans] crashing 9vx

2010-05-30 Thread Brian L. Stuart
 you may be right, but it seems too easy to blame gcc.
 a better fit for the facts so far would seem to me that
 9vx' locking is broken.  the optimization may just
 put
 more pressure on broken locking.

I would certainly agree that the variability of the
crashes feels like a mutual exclusion problem.  The
wide variety of effects of changing optimization
seems to be trying really hard to tell us something.
Of course, after two days of house-hunting I could
probably convince myself that the phase of the moon
is involved.

BLS