Re: FreeBSD 6.1 Tor issues (Once More, with Feeling) [Probably not a FreeBSD problem]

2007-02-03 Thread Fabian Keil
Robert Watson <[EMAIL PROTECTED]> wrote:

> On Tue, 27 Jun 2006, Fabian Keil wrote:
> 
> > There was a "request" for Tor related problem reports a while ago, I 
> > couldn't find the message again, but I believe it was posted here.
> 
> I'm very interested in tracking down this problem, but have had a lot of 
> trouble getting reliable reports of problems -- i.e., ones where I could get 
> any debugging information.

Two months ago I temporarily switched back to Debian GNU/Linux,
which is supported by my hoster, and the crashes/hangs/whatever
continued.

While this doesn't prove that there is no
Tor-related bug in FreeBSD, it certainly
looks like my problems were caused by faulty
hardware or other external factors.

I'll try FreeBSD again once the server is fixed.

Fabian


signature.asc
Description: PGP signature


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-27 Thread Fabian Keil
Fabian Keil <[EMAIL PROTECTED]> wrote:

> Fabian Keil <[EMAIL PROTECTED]> wrote:
> 
> > Peter Thoenen <[EMAIL PROTECTED]> wrote:
> > 
> > > Do you have pf running? If so can you turn it off for a bit and see
> > > if you still crash.  On my box I was getting all sorts of witness
> > > kdb backtraces on pf and since turning pf off (maybe a week ago),
> > > haven't crashed yet.  Going to let it keep running unmetered for
> > > another 2 weeks and see if I crash or not.

> > So far I didn't see a single PF related complaint from witness,
> > but I'll try disabling PF in a few days anyway.
> 
> It took a little longer than I thought, but I finally
> disabled PF today and switched to natd.

Uptime was slightly above 25 hours. Compiling HEAD right now. 

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-26 Thread Fabian Keil
Fabian Keil <[EMAIL PROTECTED]> wrote:

> Peter Thoenen <[EMAIL PROTECTED]> wrote:
> 
> > Do you have pf running? If so can you turn it off for a bit and see if
> > you still crash.  On my box I was getting all sorts of witness kdb
> > backtraces on pf and since turning pf off (maybe a week ago),
> > haven't crashed yet.  Going to let it keep running unmetered for
> > another 2 weeks and see if I crash or not.

How is it going, Peter, still running?
 
> I'm running Tor jailed and use PF for NAT, port forwarding and
> filtering: http://tor.fabiankeil.de/pf-stats/
> 
> So far I didn't see a single PF related complaint from witness,
> but I'll try disabling PF in a few days anyway.

It took a little longer than I thought, but I finally
disabled PF today and switched to natd.

> At the moment I'm still testing if enabling polling really
> increases the uptime.

I'm still not sure. Polling did make it possible to use
fxp0 without acpi, but the hangs still occur and the serial
console still becomes unresponsive.

On another wild guess I switched Tor's threading library
from libpthread to libthr. While it doesn't seem
to affect the uptime, it makes Tor's cpu usage visible
in top, so maybe it would be a good default for tor-devel?

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-15 Thread Fabian Keil
Peter Thoenen <[EMAIL PROTECTED]> wrote:

> Do you have pf running? If so can you turn it off for a bit and see if
> you still crash.  On my box I was getting all sorts of witness kdb
> backtraces on pf and since turning pf off (maybe a week ago), haven't
> crashed yet.  Going to let it keep running unmetered for another 2
> weeks and see if I crash or not.

I'm running Tor jailed and use PF for NAT, port forwarding and filtering:
http://tor.fabiankeil.de/pf-stats/

So far I didn't see a single PF related complaint from witness,
but I'll try disabling PF in a few days anyway. At the moment
I'm still testing if enabling polling really increases the uptime.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-15 Thread Peter Thoenen
Hey Fabian,

Do you have pf running? If so can you turn it off for a bit and see if
you still crash.  On my box I was getting all sorts of witness kdb
backtraces on pf and since turning pf off (maybe a week ago), haven't
crashed yet.  Going to let it keep running unmetered for another 2
weeks and see if I crash or not.

-Peter
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-15 Thread Fabian Keil
Fabian Keil <[EMAIL PROTECTED]> wrote:

> Robert Watson <[EMAIL PROTECTED]> wrote:
> 
> > On Wed, 28 Jun 2006, Fabian Keil wrote:
> 
> > > I just got:
> > >
> > > Jun 28 23:01:19 tor kernel: lock order reversal:
> > > Jun 28 23:01:19 tor kernel: 1st 0xc3795000 kqueue (kqueue) @ 
> > > /usr/src/sys/kern/kern_event.c:1053
> > > Jun 28 23:01:19 tor kernel: 2nd 0xc1043144 system map (system map) @ 
> > > /usr/src/sys/vm/
> 
> > > Looks similar to .
> > 
> > Could you run "vmstat -z", "netstat -m", and "vmstat -m" please?
> 
> I enabled polling three days ago and saw this lor two times
> since then. It may or may not be a coincidence.

> The system is still up at the moment, so the lor might
> have nothing to do with the crashes/hangs/whatever.

Actually I had to reset the box about two hours
ago, I just forgot and overlooked the few minutes
downtime in the logs.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-15 Thread Fabian Keil
Robert Watson <[EMAIL PROTECTED]> wrote:

> On Wed, 28 Jun 2006, Fabian Keil wrote:

> > I just got:
> >
> > Jun 28 23:01:19 tor kernel: lock order reversal:
> > Jun 28 23:01:19 tor kernel: 1st 0xc3795000 kqueue (kqueue) @ 
> > /usr/src/sys/kern/kern_event.c:1053
> > Jun 28 23:01:19 tor kernel: 2nd 0xc1043144 system map (system map) @ 
> > /usr/src/sys/vm/

> > Looks similar to .
> 
> Could you run "vmstat -z", "netstat -m", and "vmstat -m" please?

I enabled polling three days ago and saw this lor two times
since then. It may or may not be a coincidence.

I log:

top -S -d 2
pfctl -si
netstat -ss
sysctl -a
vmstat -z
netstat -m
vmstat -m 

every five minutes, the output before and after the lor
can be found at: http://www.fabiankeil.de/tmp/lor-185.txt
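
The five-minute logging described above could be scripted roughly as follows. This is only a sketch: the function name `snapshot_stats`, the log path and the cron entry are assumptions made for illustration, not the setup actually used on this box. Only the command list comes from this message.

```sh
#!/bin/sh
# Hypothetical helper (not from the original setup): append the output of
# each given command line to a log file, below a timestamped header.
snapshot_stats() {
    logfile="$1"
    shift
    {
        echo "=== snapshot $(date '+%Y-%m-%d %H:%M:%S') ==="
        for cmd in "$@"; do
            echo "--- $cmd ---"
            # ${cmd%% *} is the program name; skip it if it is not installed.
            if command -v "${cmd%% *}" >/dev/null 2>&1; then
                $cmd 2>&1
            else
                echo "(not available)"
            fi
        done
    } >> "$logfile"
}

# Example call with the commands listed above (log path is an assumption):
# snapshot_stats /var/log/tor-debug.log "top -S -d 2" "pfctl -si" \
#     "netstat -ss" "sysctl -a" "vmstat -z" "netstat -m" "vmstat -m"
```

Saved as a script that ends with the example call, it could be run from cron with an entry such as `*/5 * * * * /root/snapshot.sh` (path hypothetical).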

The system is still up at the moment, so the lor might
have nothing to do with the crashes/hangs/whatever.

I have the feeling that polling does increase the uptime,
but I'm not sure yet.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-07 Thread Fabian Keil
Fabian Keil <[EMAIL PROTECTED]> wrote:

> Fabian Keil <[EMAIL PROTECTED]> wrote:
> 
> > Robert Watson <[EMAIL PROTECTED]> wrote:
> 
> > > It sounds like your serial console server may not know how to map
> > > SSH break signals into remote serial break signals.  Try
> > > ALT_BREAK_TO_DEBUGGER.  Here's the description from NOTES:
> > > 
> > > # Solaris implements a new BREAK which is initiated by a character
> > > # sequence CR ~ ^b which is similar to a familiar pattern used on
> > > # Sun servers by the Remote Console.
> > > options ALT_BREAK_TO_DEBUGGER
> > 
> > It took me several attempts to get the character sequence right,
> > but yes, this one works. Thanks.
> 
> Unfortunately it didn't work while the system was hanging
> this morning.

Since then I got one or two hangs a day and entering
the debugger never worked out, even if my console connection
was opened a few minutes before the hang.

I no longer think it has anything to do with the terminal
server, but assume the hang takes the console with it.

sio0 is running on acpi0, so I tried to disable acpi
to see if it changes anything, but the only change I
got was that fxp0 stopped working (it is up but only
produces timeout warnings).

I tried to partly disable acpi subsystems as
described in acpi(4), but either I got the
syntax wrong, or it just isn't working.

Can someone on this list confirm or deny if
something like debug.acpi.disabled=isa in
/boot/loader.conf makes sense?

That's how I understand the man page, but I don't see any
reaction. I also tried /etc/sysctl.conf (which probably
is parsed too late anyway) but I just got a message that the
sysctl does not exist.

sysctl debug.acpi indeed only shows:
debug.acpi.do_powerstate: 1
debug.acpi.acpi_ca_version: 0x20041119
debug.acpi.semaphore_debug: 0

so maybe I need some special acpi options
or it just doesn't work if acpi is loaded as a module,
but at least the man page has no such hints.
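
For reference, a sketch of the loader.conf form the question refers to. Going by acpi(4), debug.acpi.disabled appears to be a loader tunable rather than a run-time sysctl, which would fit the observation that it is missing from the sysctl debug.acpi output and that /etc/sysctl.conf rejects it. That reading is an assumption, not something confirmed in this thread, and the subsystem name isa is simply the one from the question.

```sh
# /boot/loader.conf
# Loader tunable, read once at boot (assumption based on acpi(4)).
debug.acpi.disabled="isa"
```

After a reboot, `kenv debug.acpi.disabled` should print the value if the loader picked it up. Whether disabling only this part changes the sio0 behaviour is untested here.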

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-03 Thread Fabian Keil
Fabian Keil <[EMAIL PROTECTED]> wrote:

> Robert Watson <[EMAIL PROTECTED]> wrote:

> > It sounds like your serial console server may not know how to map
> > SSH break signals into remote serial break signals.  Try
> > ALT_BREAK_TO_DEBUGGER.  Here's the description from NOTES:
> > 
> > # Solaris implements a new BREAK which is initiated by a character
> > # sequence CR ~ ^b which is similar to a familiar pattern used on
> > # Sun servers by the Remote Console.
> > options ALT_BREAK_TO_DEBUGGER
> 
> It took me several attempts to get the character sequence right,
> but yes, this one works. Thanks.

Unfortunately it didn't work while the system was hanging
this morning. I wasn't logged in at the console before the
hang occurred, so it may be that the terminal server checked
the console for life signs, found none and neither
connected nor printed a warning (wild guess, I have no idea
if it does that).

It could also mean that I'm seeing the mysterious "power off" part
described in: 
but I have no way to tell the difference.

I will stay connected to the console until the system hangs
again to see if it changes anything.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-03 Thread Fabian Keil
Dan Nelson <[EMAIL PROTECTED]> wrote:

> In the last episode (Jul 02), Robert Watson said:
> > On Sun, 2 Jul 2006, Fabian Keil wrote:
> > >The ssh man page offers:
> > >
> > >|~B  Send a BREAK to the remote system (only useful for SSH
> > >|protocol version 2 and if the peer supports it).
> > >
> > >I am using ssh 2, but the only reaction I get is a new line.
> > >
> > >|FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)
> > >|
> > >|login: ~B
> 
> If you enter ~B and actually see a ~B printed to the screen, then ssh
> didn't process it because you didn't hit <Enter> first.  So <Enter>~B will
> tell ssh to send a break.

I am actually using ~B and I don't see just "~B",
but "~B
". The tilde is printed after I release B, therefore I
guess it is working.
 
> > It sounds like your serial console server may not know how to map
> > SSH break signals into remote serial break signals.  Try
> > ALT_BREAK_TO_DEBUGGER.  Here's the description from NOTES:
> > 
> > # Solaris implements a new BREAK which is initiated by a character
> > # sequence CR ~ ^b which is similar to a familiar pattern used on
> > # Sun servers by the Remote Console.
> > options ALT_BREAK_TO_DEBUGGER
> 
> ... and if you're sshing to your terminal server, remember that ssh
> will eat that tilde (because you sent <Enter>~), so you need to send
> <Enter>~~^B to pass the right characters to FreeBSD.  Or change ssh's
> escape character with the -e flag.

~^b works for me, without touching any ssh settings.
As ~. is still causing a disconnect, it doesn't look
like the escape character was changed either.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Dan Nelson
In the last episode (Jul 02), Robert Watson said:
> On Sun, 2 Jul 2006, Fabian Keil wrote:
> >The ssh man page offers:
> >
> >|~B  Send a BREAK to the remote system (only useful for SSH
> >|protocol version 2 and if the peer supports it).
> >
> >I am using ssh 2, but the only reaction I get is a new line.
> >
> >|FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)
> >|
> >|login: ~B

If you enter ~B and actually see a ~B printed to the screen, then ssh
didn't process it because you didn't hit <Enter> first.  So <Enter>~B will
tell ssh to send a break.

> It sounds like your serial console server may not know how to map SSH
> break signals into remote serial break signals.  Try
> ALT_BREAK_TO_DEBUGGER.  Here's the description from NOTES:
> 
> # Solaris implements a new BREAK which is initiated by a character
> # sequence CR ~ ^b which is similar to a familiar pattern used on
> # Sun servers by the Remote Console.
> options ALT_BREAK_TO_DEBUGGER

... and if you're sshing to your terminal server, remember that ssh
will eat that tilde (because you sent <Enter>~), so you need to send
<Enter>~~^B to pass the right characters to FreeBSD.  Or change ssh's
escape character with the -e flag.
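
The -e suggestion could look like the sketch below. The wrapper name `console_ssh` and the host name are placeholders. `-e none` is the ssh(1) setting that disables escape handling entirely, so a tilde typed after Enter reaches the serial console unchanged.

```sh
#!/bin/sh
# Hypothetical wrapper: connect with ssh's escape handling switched off,
# so a tilde typed after Enter is passed through to the serial console.
console_ssh() {
    ssh -e none "$@"
}

# Example (host name is a placeholder):
# console_ssh user@console-server.example.org
```

The trade-off is that `~.` no longer disconnects the session, so the connection has to be closed some other way.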

-- 
Dan Nelson
[EMAIL PROTECTED]


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Fabian Keil
Robert Watson <[EMAIL PROTECTED]> wrote:

> On Sun, 2 Jul 2006, Fabian Keil wrote:

> > I am using ssh 2, but the only reaction I get is a new line.
> >
> > |FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)
> > |
> > |login: ~B
> > |
> 
> It sounds like your serial console server may not know how to map SSH
> break signals into remote serial break signals.  Try
> ALT_BREAK_TO_DEBUGGER.  Here's the description from NOTES:
> 
> # Solaris implements a new BREAK which is initiated by a character
> # sequence CR ~ ^b which is similar to a familiar pattern used on
> # Sun servers by the Remote Console.
> options ALT_BREAK_TO_DEBUGGER

It took me several attempts to get the character sequence right,
but yes, this one works. Thanks.
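
For anyone else struggling with the sequence: per the NOTES text quoted above it is CR, then a tilde, then Ctrl-B. A small sketch that spells out those three bytes follows. The function name is invented for illustration, and normally the keys are simply typed on the console.

```sh
#!/bin/sh
# Hypothetical helper: emit the ALT_BREAK_TO_DEBUGGER sequence "CR ~ ^b"
# as raw bytes: carriage return (0x0d), tilde (0x7e), Ctrl-B (0x02).
alt_break_sequence() {
    printf '\r~\002'
}

# Inspect the bytes:
# alt_break_sequence | od -An -tx1
```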

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Robert Watson

On Sun, 2 Jul 2006, Fabian Keil wrote:

> Robert Watson <[EMAIL PROTECTED]> wrote:
>
> > On Sun, 2 Jul 2006, Fabian Keil wrote:
> >
> > > After manually triggering a test panic through debug.kdb.enter I
> > > could enter ddb and everything seemed to be working.
> > >
> > > However today I got another hang and couldn't enter the debugger by
> > > sending BREAK. It is the same BREAK ssh sends with ~B, right?
> > >
> > > Even after rebooting, sending break didn't trigger a panic, so
> > > either I'm sending the wrong BREAK, or my console settings are
> > > still messed up. Any ideas?
> >
> > What serial software are you using to reach the console?
>
> I use ssh to log in to a console server, hit enter and am connected to the
> console. I have no idea what kind of software is used between console server
> and console.

You probably need to find out in order to know what break sequence to
send.  Alternatively, you can use ALT_BREAK_TO_DEBUGGER, which defines an
alternative break sequence without relying on a serial break (which is an
out-of-band break signal).

> > The delivery mechanism for the break will depend on the software you're
> > using...
>
> The ssh man page offers:
>
> |~B  Send a BREAK to the remote system (only useful for SSH protocol
> |version 2 and if the peer supports it).
>
> I am using ssh 2, but the only reaction I get is a new line.
>
> |FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)
> |
> |login: ~B
> |

It sounds like your serial console server may not know how to map SSH break
signals into remote serial break signals.  Try ALT_BREAK_TO_DEBUGGER.  Here's
the description from NOTES:

# Solaris implements a new BREAK which is initiated by a character
# sequence CR ~ ^b which is similar to a familiar pattern used on
# Sun servers by the Remote Console.
options ALT_BREAK_TO_DEBUGGER

Robert N M Watson
Computer Laboratory
University of Cambridge


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Fabian Keil
Robert Watson <[EMAIL PROTECTED]> wrote:

> On Sun, 2 Jul 2006, Fabian Keil wrote:

> > After manually triggering a test panic through debug.kdb.enter I
> > could enter ddb and everything seemed to be working.
> >
> > However today I got another hang and couldn't enter the debugger by
> > sending BREAK. It is the same BREAK ssh sends with ~B, right?
> >
> > Even after rebooting, sending break didn't trigger a panic, so
> > either I'm sending the wrong BREAK, or my console settings are
> > still messed up. Any ideas?
> 
> What serial software are you using to reach the console?

I use ssh to log in to a console server, hit enter and
am connected to the console. I have no idea what kind
of software is used between console server and console.

> Do you have options BREAK_TO_DEBUGGER compiled into your kernel?

Yes, together with the other options you suggested:

makeoptions DEBUG=-g
options DDB
#options KDB_UNATTENDED
options KDB
options BREAK_TO_DEBUGGER
options WITNESS
options WITNESS_SKIPSPIN
options INVARIANTS
options INVARIANT_SUPPORT

> The delivery mechanism for the break will depend on the software
> you're using...

The ssh man page offers:

|~B  Send a BREAK to the remote system (only useful for SSH protocol
|version 2 and if the peer supports it).

I am using ssh 2, but the only reaction I get is a new line.

|FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)
|
|login: ~B
|

Maybe machdep.enable_panic_key would be another solution?
The description says "Enable panic via keypress
specified in kbdmap(5)", I'm just not sure if console
input qualifies as "keypress".

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Robert Watson


On Sun, 2 Jul 2006, Fabian Keil wrote:

> > I'm very interested in tracking down this problem, but have had a lot of
> > trouble getting reliable reports of problems -- i.e., ones where I could
> > get any debugging information.  I had a similar conversation on these lines
> > yesterday with Roger (Tor author) here at the WEIS conference.  If this is
> > easily reproducible, I would like you to do the following:
> >
> > - Does the hang occur?  If so, use a serial break to get into DDB, see the
> > above.
>
> I previously had the serial console misconfigured and I'm still not sure if
> the settings are correct now.
>
> So far I put "BOOT_COMCONSOLE_SPEED=57600" in /etc/make.conf, "options
> CONSPEED=57600" in the kernel and "console=comconsole" in /boot/loader.conf.
> Kernel and bootblock were recompiled and reinstalled. /boot.config contains
> the line: "-D -h -S57600" (speed setting through make.conf didn't work).

I don't use alternative console speeds, so can't comment on the specifics of
the above, but the output below looks right.

> The boot process now starts with:
>
> PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
> Booting from local disk...
>
> 1   Linux
> 2   FreeBSD
> 3   FreeBSD
>
> Default: 2
>
> /boot.config: -DConsoles: internal video/keyboard  serial port
> BIOS drive C: is disk0
> BIOS 639kB/523200kB available memory
>
> FreeBSD/i386 bootstrap loader, Revision 1.1
> [...]
>
> After manually triggering a test panic through debug.kdb.enter I could enter
> ddb and everything seemed to be working.
>
> However today I got another hang and couldn't enter the debugger by sending
> BREAK. It is the same BREAK ssh sends with ~B, right?
>
> Even after rebooting, sending break didn't trigger a panic, so either I'm
> sending the wrong BREAK, or my console settings are still messed up. Any
> ideas?

What serial software are you using to reach the console?  Do you have options
BREAK_TO_DEBUGGER compiled into your kernel?  The delivery mechanism for the
break will depend on the software you're using...


Robert N M Watson
Computer Laboratory
University of Cambridge


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Fabian Keil
Robert Watson <[EMAIL PROTECTED]> wrote:

> On Tue, 27 Jun 2006, Fabian Keil wrote:
> 
> > There was a "request" for Tor related problem reports a while ago,
> > I couldn't find the message again, but I believe it was posted here.
> 
> I'm very interested in tracking down this problem, but have had a lot
> of trouble getting reliable reports of problems -- i.e., ones where I
> could get any debugging information.  I had a similar conversation on
> these lines yesterday with Roger (Tor author) here at the WEIS
> conference.  If this is easily reproducible, I would like you to do
> the following:

> - Does the hang occur?  If so, use a serial break to get into DDB,
> see the above.

I previously had the serial console misconfigured and I'm still not
sure if the settings are correct now.

So far I put "BOOT_COMCONSOLE_SPEED=57600" in /etc/make.conf,
"options CONSPEED=57600" in the kernel and "console=comconsole"
in /boot/loader.conf. Kernel and bootblock were recompiled
and reinstalled. /boot.config contains the line:
"-D -h -S57600" (speed setting through make.conf didn't work).

The boot process now starts with:

PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
Booting from local disk...

1   Linux
2   FreeBSD
3   FreeBSD

Default: 2 

/boot.config: -DConsoles: internal video/keyboard  serial port  
BIOS drive C: is disk0
BIOS 639kB/523200kB available memory

FreeBSD/i386 bootstrap loader, Revision 1.1
[...]
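
Collected in one place, the serial console settings listed earlier in this message would look roughly like this. The values are the ones given above. The file grouping, and the quoting in loader.conf (which follows the usual loader.conf(5) style), are assumptions about the actual files.

```sh
# /etc/make.conf: console speed compiled into the boot blocks
BOOT_COMCONSOLE_SPEED=57600

# kernel configuration file
options CONSPEED=57600

# /boot/loader.conf
console="comconsole"

# /boot.config: dual console (-D), serial console (-h), speed (-S)
-D -h -S57600
```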

After manually triggering a test panic through debug.kdb.enter
I could enter ddb and everything seemed to be working.

However today I got another hang and couldn't enter the debugger
by sending BREAK. It is the same BREAK ssh sends with ~B, right?

Even after rebooting, sending break didn't trigger a panic,
so either I'm sending the wrong BREAK, or my console settings
are still messed up. Any ideas?

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-29 Thread Fabian Keil
Robert Watson <[EMAIL PROTECTED]> wrote:

> On Thu, 29 Jun 2006, Fabian Keil wrote:
> 
> > I wish I could. The machine died before I read your message.
> >
> > I was logged in on the serial console running tail
> > -f /var/log/messages. Last messages were:
> >
> > Jun 29 00:42:20 tor kernel: Memory modified after free
> > 0xc4275000(2048) val=a020c0de @ 0xc4275000 Jun 29 00:42:20 tor
> > kernel: Memory modified after free 0xc4055800(2048) val=a020c0de @

> > 0xc432a000 Jun 29 00:42:24 tor kernel: ad0: TIMEOUT - WRITE_DMA
> > retrying (1 retry left) LBA=34263674 Jun 29 00:42:24 tor kernel:
> > Memory modified after free 0xc3dff800(2048) val=a020c0d
> >
> > Ctrl+Alt+ESC didn't trigger any reaction, so I caused a reset
> > through the ISP's webinterface. Now the system appears to be hosed,
> > at least FreeBSD never reaches the login:
> >
> > PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
> > Booting from local disk...
> >
> > 1   Linux
> > 2   FreeBSD
> > 3   FreeBSD
> >
> > Default: 2
> >
> > [nothing]

> The ATA error above is a bit distressing, as is the fact that it
> won't boot. Is "[nothing]" normally the FreeBSD boot loader rather
> than nothing?

The "1 Linux ..." part already is the FreeBSD boot loader.
Normally it goes:

PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
Booting from local disk...

1   Linux
2   FreeBSD
3   FreeBSD

Default: 2 

FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)

login:

> I would suggest running some hardware diagnostics to
> make sure we're dealing with reliable hardware before continuing so
> that we're not chasing both hardware and software problems, since you
> can't reliably debug software problems in the presence of hardware
> failures.

I'll see what the ports collection has to offer (running
smartmontools right now) but so far it's the only ATA message I got.

> > Probably something which would be easy to resolve with keyboard
> > access and a screen, but I think I'm forced to use the
> > "RecoveryManager". Unfortunately "recovery" means reinstalling the
> > preconfigured GNU/Linux which I then can replace with FreeBSD
> > again. If there ever was a core dump it will be gone, and so will
> > be kernel.debug.

Lucky me. The "RecoveryManager" turned out to be a full featured
PXE-booted GNU/Linux system. It allowed me to fetch and replace
/dev/ad0s2a (/) through ssh. The system is online again. 

After fsck -y /dev/ad0s3d (/usr) the whole tor jail is gone,
but the rest of this slice seems to be ok, including kernel.debug.

I can't fsck /var:
[EMAIL PROTECTED] ~]$ sudo fsck /dev/ad0s3d
** /dev/ad0s3d
** Last Mounted on /var
** Phase 1 - Check Blocks and Sizes
fsck_4.2bsd: cannot alloc 1082190976 bytes for inoinfo

but it can still be mounted. No core dump though.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-29 Thread Robert Watson


On Thu, 29 Jun 2006, Fabian Keil wrote:

> I wish I could. The machine died before I read your message.
>
> I was logged in on the serial console running tail -f /var/log/messages.
> Last messages were:
>
> Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4275000(2048)
> val=a020c0de @ 0xc4275000
> Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4055800(2048)
> val=a020c0de @ 0xc4055800
> Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4ca(2048)
> val=a020c0de @ 0xc4ca
> Jun 29 00:42:20 tor kernel: Memory modified after free 0xc39ef000(2048)
> val=a020c0de @ 0xc39ef000
> Jun 29 00:42:24 tor kernel: Memory modified after free 0xc4bd7000(2048)
> val=a020c0de @ 0xc4bd7000
> Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3c8a000(2048)
> val=a020c0de @ 0xc3c8a000
> Jun 29 00:42:24 tor kernel: Memory modified after free 0xc33bd000(2048)
> val=a020c0de @ 0xc33bd000
> Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3f1d000(2048)
> val=a020c0de @ 0xc3f1d000
> Jun 29 00:42:24 tor kernel: Memory modified after free 0xc45dc800(2048)
> val=a020c0de @ 0xc45dc800
> Jun 29 00:42:24 tor kernel: Memory modified after free 0xc429e000(2048)
> val=a020c0de @ 0xc429e000
> Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3aef800(2048)
> val=a020c0de @ 0xc3aef800
> Jun 29 00:42:24 tor kernel: Memory modified after free 0xc432a000(2048)
> val=a020c0de @ 0xc432a000
> Jun 29 00:42:24 tor kernel: ad0: TIMEOUT - WRITE_DMA retrying (1 retry left)
> LBA=34263674
> Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3dff800(2048)
> val=a020c0d
>
> Ctrl+Alt+ESC didn't trigger any reaction, so I caused a reset through the
> ISP's webinterface. Now the system appears to be hosed, at least FreeBSD
> never reaches the login:
>
> PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
> Booting from local disk...
>
> 1   Linux
> 2   FreeBSD
> 3   FreeBSD
>
> Default: 2
>
> [nothing]
>
> Probably something which would be easy to resolve with keyboard access and a
> screen, but I think I'm forced to use the "RecoveryManager". Unfortunately
> "recovery" means reinstalling the preconfigured GNU/Linux which I then can
> replace with FreeBSD again. If there ever was a core dump it will be gone,
> and so will be kernel.debug.
>
> On the bright side you can choose the OS to go with. Should I use Current to
> see if the problem still exists?

The ATA error above is a bit distressing, as is the fact that it won't boot.
Is "[nothing]" normally the FreeBSD boot loader rather than nothing?  I would
suggest running some hardware diagnostics to make sure we're dealing with
reliable hardware before continuing so that we're not chasing both hardware
and software problems, since you can't reliably debug software problems in the
presence of hardware failures.


Robert N M Watson
Computer Laboratory
University of Cambridge


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Fabian Keil
Robert Watson <[EMAIL PROTECTED]> wrote:

> On Wed, 28 Jun 2006, Fabian Keil wrote:
> 
> > Robert Watson <[EMAIL PROTECTED]> wrote:
> >
> >> - Are there any warnings on the console from WITNESS or other
> >> debugging options?
> >
> > I just got:
> >
> > Jun 28 23:01:19 tor kernel: lock order reversal:
> > Jun 28 23:01:19 tor kernel: 1st 0xc3795000 kqueue (kqueue)

> > Looks similar to .
> 
> Could you run "vmstat -z", "netstat -m", and "vmstat -m" please?

I wish I could. The machine died before I read your message.

I was logged in on the serial console running tail -f /var/log/messages.
Last messages were:

Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4275000(2048) 
val=a020c0de @ 0xc4275000
Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4055800(2048) 
val=a020c0de @ 0xc4055800
Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4ca(2048) 
val=a020c0de @ 0xc4ca
Jun 29 00:42:20 tor kernel: Memory modified after free 0xc39ef000(2048) 
val=a020c0de @ 0xc39ef000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc4bd7000(2048) 
val=a020c0de @ 0xc4bd7000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3c8a000(2048) 
val=a020c0de @ 0xc3c8a000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc33bd000(2048) 
val=a020c0de @ 0xc33bd000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3f1d000(2048) 
val=a020c0de @ 0xc3f1d000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc45dc800(2048) 
val=a020c0de @ 0xc45dc800
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc429e000(2048) 
val=a020c0de @ 0xc429e000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3aef800(2048) 
val=a020c0de @ 0xc3aef800
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc432a000(2048) 
val=a020c0de @ 0xc432a000
Jun 29 00:42:24 tor kernel: ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) 
LBA=34263674
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3dff800(2048) 
val=a020c0d
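
To get a feel for how dense such a burst is, the reports can be counted per second. This is a minimal sketch, assuming standard syslog lines with the time in the third field. The function name `count_mmaf` is made up for this example and the log path is an assumption.

```sh
#!/bin/sh
# Hypothetical helper: read syslog-style lines on stdin and print
# "<HH:MM:SS> <count>" for every second that logged at least one
# "Memory modified after free" report.
count_mmaf() {
    awk '/Memory modified after free/ { n[$3]++ }
         END { for (t in n) print t, n[t] }' | sort
}

# Example (assumed log location):
# count_mmaf < /var/log/messages
```

Applied to the excerpt above, this would report four entries at 00:42:20 and nine at 00:42:24.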

Ctrl+Alt+ESC didn't trigger any reaction, so I caused a reset through
the ISP's webinterface. Now the system appears to be hosed, at least
FreeBSD never reaches the login:
   
PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
Booting from local disk...

1   Linux
2   FreeBSD
3   FreeBSD

Default: 2 

[nothing]

Probably something which would be easy to resolve with
keyboard access and a screen, but I think I'm forced to use
the "RecoveryManager". Unfortunately "recovery" means reinstalling
the preconfigured GNU/Linux which I then can replace with FreeBSD
again. If there ever was a core dump it will be gone, and so will
be kernel.debug.

On the bright side you can choose the OS to go with.
Should I use Current to see if the problem still exists?

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Robert Watson

On Wed, 28 Jun 2006, Fabian Keil wrote:

> Robert Watson <[EMAIL PROTECTED]> wrote:
>
> > - Are there any warnings on the console from WITNESS or other
> > debugging options?
>
> I just got:
>
> Jun 28 23:01:19 tor kernel: lock order reversal:
> Jun 28 23:01:19 tor kernel: 1st 0xc3795000 kqueue (kqueue) @
> /usr/src/sys/kern/kern_event.c:1053
> Jun 28 23:01:19 tor kernel: 2nd 0xc1043144 system map (system map) @
> /usr/src/sys/vm/vm_map.c:2390
> Jun 28 23:01:20 tor kernel: KDB: stack backtrace:
> Jun 28 23:01:20 tor kernel:
> kdb_backtrace(0,,c0711af0,c0713440,c06db624) at kdb_backtrace+0x29
> Jun 28 23:01:20 tor kernel: witness_checkorder(c1043144,9,c06b90a8,956) at
> witness_checkorder+0x578
> Jun 28 23:01:20 tor kernel: _mtx_lock_flags(c1043144,0,c06b90a8,956) at
> _mtx_lock_flags+0x5b
> Jun 28 23:01:20 tor kernel: _vm_map_lock(c10430c0,c06b90a8,956) at
> _vm_map_lock+0x26
> Jun 28 23:01:20 tor kernel:
> vm_map_remove(c10430c0,c3bc6000,c3bc8000,d6f55b30,c0623361) at
> vm_map_remove+0x1f
> Jun 28 23:01:20 tor kernel: kmem_free(c10430c0,c3bc6000,2000,d6f55b48,c062524f)
> at kmem_free+0x25
> Jun 28 23:01:20 tor kernel: page_free(c3bc6000,2000,22,2000,d6f55b60) at
> page_free+0x29
> Jun 28 23:01:20 tor kernel: uma_large_free(c3ba5140) at uma_large_free+0x7b
> Jun 28 23:01:20 tor kernel: free(c3bc6000,c06d8980,c3bc6000,c483,1400) at
> free+0xc5
> Jun 28 23:01:20 tor kernel: kqueue_expand(c3795000,c06d8a40,500,0) at
> kqueue_expand+0xd7
> Jun 28 23:01:20 tor kernel: kqueue_register(c3795000,d6f55bf4,c3a8f480,1,0) at
> kqueue_register+0x1b8
> Jun 28 23:01:20 tor kernel: kern_kevent(c3a8f480,3,19,200,d6f55cc8) at
> kern_kevent+0xc9
> Jun 28 23:01:20 tor kernel: kevent(c3a8f480,d6f55d04,6,2,212) at kevent+0x55
> Jun 28 23:01:20 tor kernel: syscall(2824003b,80e003b,bfbf003b,cb87000,80d5020)
> at syscall+0x22f
> Jun 28 23:01:20 tor kernel: Xint0x80_syscall() at Xint0x80_syscall+0x1f
> Jun 28 23:01:20 tor kernel: --- syscall (363, FreeBSD ELF32, kevent), eip =
> 0x282cc4af, esp = 0xbfbfe9fc, ebp = 0xbfbfea48 ---
>
> Looks similar to .

Could you run "vmstat -z", "netstat -m", and "vmstat -m" please?

Robert N M Watson
Computer Laboratory
University of Cambridge
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Fabian Keil
Robert Watson <[EMAIL PROTECTED]> wrote:

> - Are there any warnings on the console from WITNESS or other
> debugging options?

I just got:

Jun 28 23:01:19 tor kernel: lock order reversal:
Jun 28 23:01:19 tor kernel: 1st 0xc3795000 kqueue (kqueue) @ 
/usr/src/sys/kern/kern_event.c:1053
Jun 28 23:01:19 tor kernel: 2nd 0xc1043144 system map (system map) @ 
/usr/src/sys/vm/vm_map.c:2390
Jun 28 23:01:20 tor kernel: KDB: stack backtrace:
Jun 28 23:01:20 tor kernel: 
kdb_backtrace(0,,c0711af0,c0713440,c06db624) at kdb_backtrace+0x29
Jun 28 23:01:20 tor kernel: witness_checkorder(c1043144,9,c06b90a8,956) at 
witness_checkorder+0x578
Jun 28 23:01:20 tor kernel: _mtx_lock_flags(c1043144,0,c06b90a8,956) at 
_mtx_lock_flags+0x5b
Jun 28 23:01:20 tor kernel: _vm_map_lock(c10430c0,c06b90a8,956) at 
_vm_map_lock+0x26
Jun 28 23:01:20 tor kernel: 
vm_map_remove(c10430c0,c3bc6000,c3bc8000,d6f55b30,c0623361) at 
vm_map_remove+0x1f
Jun 28 23:01:20 tor kernel: kmem_free(c10430c0,c3bc6000,2000,d6f55b48,c062524f) 
at kmem_free+0x25
Jun 28 23:01:20 tor kernel: page_free(c3bc6000,2000,22,2000,d6f55b60) at 
page_free+0x29
Jun 28 23:01:20 tor kernel: uma_large_free(c3ba5140) at uma_large_free+0x7b
Jun 28 23:01:20 tor kernel: free(c3bc6000,c06d8980,c3bc6000,c483,1400) at 
free+0xc5
Jun 28 23:01:20 tor kernel: kqueue_expand(c3795000,c06d8a40,500,0) at 
kqueue_expand+0xd7
Jun 28 23:01:20 tor kernel: kqueue_register(c3795000,d6f55bf4,c3a8f480,1,0) at 
kqueue_register+0x1b8
Jun 28 23:01:20 tor kernel: kern_kevent(c3a8f480,3,19,200,d6f55cc8) at 
kern_kevent+0xc9
Jun 28 23:01:20 tor kernel: kevent(c3a8f480,d6f55d04,6,2,212) at kevent+0x55
Jun 28 23:01:20 tor kernel: syscall(2824003b,80e003b,bfbf003b,cb87000,80d5020) 
at syscall+0x22f
Jun 28 23:01:20 tor kernel: Xint0x80_syscall() at Xint0x80_syscall+0x1f
Jun 28 23:01:20 tor kernel: --- syscall (363, FreeBSD ELF32, kevent), eip = 
0x282cc4af, esp = 0xbfbfe9fc, ebp = 0xbfbfea48 ---

Looks similar to .

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Fabian Keil
Robert Watson <[EMAIL PROTECTED]> wrote:

> On Tue, 27 Jun 2006, Fabian Keil wrote:
> 
> > There was a "request" for Tor related problem reports a while ago,
> > I couldn't find the message again, but I believe it was posted here.
> 
> I'm very interested in tracking down this problem, but have had a lot
> of trouble getting reliable reports of problems -- i.e., ones where I
> could get any debugging information.  I had a similar conversation on
> these lines yesterday with Roger (Tor author) here at the WEIS
> conference.  If this is easily reproducible, I would like you to do
> the following:
> 
> - Compile in options DDB, options KDB, options BREAK_TO_DEBUGGER,
> options WITNESS, options WITNESS_SKIPSPIN, options INVARIANTS, options
> INVARIANT_SUPPORT.
> 
> - Make sure to have a kernel with debugging symbols for the kernel.
> 
> - Turn on core dumps.

Done. I expect to get a chance to test the settings in the next 24 hours.
 
> The above debugging options will have a significant performance
> impact, and may or may not affect the probability of the race or
> deadlock being exercised. The first question is:
> 
> - Are there any warnings on the console from WITNESS or other
> debugging options?  If so, please copy/paste them into an e-mail for
> me.

So far the logs show nothing unusual, but I
noticed that the ssh connection gets unresponsive
from time to time.

I did a few pings with "interesting" results:

[EMAIL PROTECTED] ~]$ ping 10.0.0.1 | grep 'time=[^0]'
64 bytes from 10.0.0.1: icmp_seq=25 ttl=64 time=1.104 ms
64 bytes from 10.0.0.1: icmp_seq=61 ttl=64 time=2.983 ms
64 bytes from 10.0.0.1: icmp_seq=167 ttl=64 time=1.112 ms
64 bytes from 10.0.0.1: icmp_seq=189 ttl=64 time=1.653 ms
64 bytes from 10.0.0.1: icmp_seq=222 ttl=64 time=1.748 ms
64 bytes from 10.0.0.1: icmp_seq=291 ttl=64 time=1.058 ms
64 bytes from 10.0.0.1: icmp_seq=334 ttl=64 time=1.020 ms
64 bytes from 10.0.0.1: icmp_seq=337 ttl=64 time=1.967 ms
64 bytes from 10.0.0.1: icmp_seq=562 ttl=64 time=1.027 ms
64 bytes from 10.0.0.1: icmp_seq=586 ttl=64 time=1.230 ms
[EMAIL PROTECTED] ~]$ ping tor.fabiankeil.de | grep 'time=[^0]'
64 bytes from 81.169.155.246: icmp_seq=70 ttl=64 time=1.920 ms
64 bytes from 81.169.155.246: icmp_seq=79 ttl=64 time=1.587 ms
64 bytes from 81.169.155.246: icmp_seq=402 ttl=64 time=1.062 ms
[EMAIL PROTECTED] ~]$ ping localhost | grep 'time=[^0]'
64 bytes from 127.0.0.1: icmp_seq=142 ttl=64 time=1.142 ms
64 bytes from 127.0.0.1: icmp_seq=497 ttl=64 time=1.227 ms
64 bytes from 127.0.0.1: icmp_seq=627 ttl=64 time=1.181 ms

10.0.0.1 is on lo1, 81.169.155.246 is on fxp0, both
are filtered with pf. lo0 is skipped. The pings were run
locally while tor was running, the usual ping response times
are below 0.2 ms.
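
The grep pattern used above keeps every reply whose time does not start with 0, which for these runs means 1 ms or more. The same check with an explicit threshold could look roughly like this; it is only a sketch that assumes ping output in the format shown, and the function name `slow_pings` and its default threshold are hypothetical.

```sh
#!/bin/sh
# slow_pings THRESHOLD_MS: read ping(8) output on stdin and print only the
# replies whose round-trip time is at or above the threshold (in ms).
# The function name and the 1.0 ms default are illustrative assumptions.
slow_pings() {
    awk -v limit="${1:-1.0}" '
        /time=/ {
            # Isolate the number that follows "time=".
            rtt = $0
            sub(/.*time=/, "", rtt)
            sub(/ .*/, "", rtt)
            # Force a numeric comparison on both sides.
            if (rtt + 0 >= limit + 0) print
        }'
}

# Usage (commented out so the sketch has no side effects):
# ping -c 600 10.0.0.1 | slow_pings 1.0
```

Compared with `grep 'time=[^0]'`, a numeric threshold also allows other cut-offs, for instance 0.5 ms, which the character-class pattern cannot express.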

I get even worse ping times if I ping
from home, but my net connection isn't the best.
I'd appreciate it if someone with a reliable net
connection could confirm the weirdness.

Thanks for your time, Robert, I hope to have real
information by tomorrow.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Robert Watson


On Tue, 27 Jun 2006, Peter Thoenen wrote:


> --- Fabian Keil <[EMAIL PROTECTED]> wrote:
>
> > There was a "request" for Tor related problem reports
> > a while ago, I couldn't find the message again, but I
> > believe it was posted here.
>
> > Is anyone on this list running a Tor node on FreeBSD 6.1-RELEASE
> > or later with similar or higher load?
>
> I am still hitting the same issue, Fabian.  I had that PR closed as "works
> for me" with insignificant testing.  I am still crashing (as before) but
> maybe only once every week or two instead of every couple hours with 6.1
> RELEASE.  The PR really should be reopened.  Couple other folk have emailed
> me with similar issues offline (and also spoke with me about it on IRC).

In the future, it would be helpful if you replied to the PR saying so.  It 
looks like it was closed at your request as you stated the problem had gone 
away, so I've been working under the assumption that the problem has gone 
away, as that's the last information I have.

> I am still 99% sure this is NOT A TOR ISSUE!!!  I have spoken with many tor
> users on other platforms and the actual developers and this is not seen by
> any of them.  I can also recreate this crash NOT running tor but just
> generating a heavy load with freenet and i2p.  My gut feeling is still a
> network code regression between 5.x -> 6.x with the stack rewrite. I am at a
> loss how to troubleshoot this anymore (as noted in the PR and my earlier
> email).  I truly hope somebody (e.g. a developer) can shed some light on
> this issue or troubleshoot it.

I have appealed a number of times on the freebsd-security mailing list and 
elsewhere for information from people who could reproduce the problem.  In 
general the replies I got were that people either had the problem go away with 
recent 6.x, or that they did not have time or were not interested in helping 
debug the problem.  If you are interested, please take a look at my recent 
reply to the reported problem, and work through the steps there.  I strongly 
recommend using a serial console on the box.  I can only debug a problem if I 
know it exists, and with enough information, and so far, there's been 
insufficient information to track down the problem.


Robert N M Watson
Computer Laboratory
University of Cambridge


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Chuck Swiger

Peter Thoenen wrote:

> --- Fabian Keil <[EMAIL PROTECTED]> wrote:

[ ... ]

> > Is anyone on this list running a Tor node on FreeBSD 6.1-RELEASE
> > or later with similar or higher load?
>
> I am still hitting the same issue, Fabian.  I had that PR closed as
> "works for me" with insignificant testing.  I am still crashing (as
> before) but maybe only once every week or two instead of every couple
> hours with 6.1 RELEASE.  The PR really should be reopened.  Couple
> other folk have emailed me with similar issues offline (and also spoke
> with me about it on IRC).

Well, having several people show similar problems will help track the issue 
down, if only by letting us examine common aspects (i.e., this happens on SMP 
systems, it happens when people are using PF, or IPFW, it only happens to 
people using vr0, or rl0, or some other specific NIC, etc).

> I am still 99% sure this is NOT A TOR ISSUE!!!  I have spoken with many
> tor users on other platforms and the actual developers and this is not
> seen by any of them.  I can also recreate this crash NOT running tor
> but just generating a heavy load with freenet and i2p.

It's probably not a TOR issue, no.  I gather that you've already run the 
manufacturer's hardware diagnostics and something like prime95 or memtest86 
overnight or longer than 24 hours (ideally)...

> My gut feeling is still a network code regression between 5.x -> 6.x
> with the stack rewrite. I am at a loss how to troubleshoot this anymore
> (as noted in the PR and my earlier email).  I truly hope somebody (e.g.
> a developer) can shed some light on this issue or troubleshoot it.

It would also be interesting to know whether you can revert to running FreeBSD 
5.5 on the same hardware under the same workload and have it stay up for longer.

Put your dmesg(s), kernel config files, /etc/make.conf, and best efforts at 
logging the issue (serial console, periodically running vmstat or sysctl -a 
kern from cron and logging to a file), on a webpage someplace, and try to 
cross-link with other people showing the same problem.  Post that URL to a PR 
and/or the mailing lists...
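
The periodic logging suggested above could be sketched along these lines. This is a minimal example, not a tested monitoring setup: the function name `log_snapshot` and the log path in the usage comment are hypothetical, and the command list simply reuses the commands already named in this thread.

```sh
#!/bin/sh
# log_snapshot FILE CMD...: append a timestamped header and the output of
# each command string to FILE. The function name is a hypothetical helper.
log_snapshot() {
    out=$1
    shift
    for cmd in "$@"; do
        {
            printf '=== %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$cmd"
            # Run the command string through sh so its arguments are kept;
            # a failing or missing command is recorded instead of aborting.
            sh -c "$cmd" 2>&1 || printf '(command failed: %s)\n' "$cmd"
        } >> "$out"
    done
}

# Example call from a script started by cron every few minutes
# (commented out; the log path is an assumption):
# log_snapshot /var/log/tor-debug-stats.log \
#     'vmstat -z' 'netstat -m' 'vmstat -m' 'sysctl -a kern'
```

Since the box stops logging when it hangs, the last complete snapshot in the file is the interesting one, so a short interval between runs is probably worth the extra disk space.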


--
-Chuck


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Peter Thoenen
--- Fabian Keil <[EMAIL PROTECTED]> wrote:
> There was a "request" for Tor related problem reports
> a while ago, I couldn't find the message again, but I
> believe it was posted here.

> Is anyone on this list running a Tor node on FreeBSD 6.1-RELEASE
> or later with similar or higher load?

I am still hitting the same issue, Fabian.  I had that PR closed as
"works for me" with insignificant testing.  I am still crashing (as
before) but maybe only once every week or two instead of every couple
hours with 6.1 RELEASE.  The PR really should be reopened.  Couple
other folk have emailed me with similar issues offline (and also spoke
with me about it on IRC).

I am still 99% sure this is NOT A TOR ISSUE!!!  I have spoken with many
tor users on other platforms and the actual developers and this is not
seen by any of them.  I can also recreate this crash NOT running tor
but just generating a heavy load with freenet and i2p.  My gut feeling
is still a network code regression between 5.x -> 6.x with the stack
rewrite. I am at a loss how to troubleshoot this anymore (as noted in
the PR and my earlier email).  I truly hope somebody (e.g. a developer)
can shed some light on this issue or troubleshoot it.


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Robert Watson


On Tue, 27 Jun 2006, Fabian Keil wrote:

> There was a "request" for Tor related problem reports a while ago, I
> couldn't find the message again, but I believe it was posted here.


I'm very interested in tracking down this problem, but have had a lot of 
trouble getting reliable reports of problems -- i.e., ones where I could get 
any debugging information.  I had a similar conversation on these lines 
yesterday with Roger (Tor author) here at the WEIS conference.  If this is 
easily reproducible, I would like you to do the following:


- Compile in options DDB, options KDB, options BREAK_TO_DEBUGGER, options
  WITNESS, options WITNESS_SKIPSPIN, options INVARIANTS, options
  INVARIANT_SUPPORT.

- Make sure to have a kernel with debugging symbols for the kernel.

- Turn on core dumps.

The above debugging options will have a significant performance impact, and 
may or may not affect the probability of the race or deadlock being exercised. 
The first question is:


- Are there any warnings on the console from WITNESS or other debugging
  options?  If so, please copy/paste them into an e-mail for me.

- Does a panic occur?  If so, the output of the following commands would be
  very useful:

  show pcpu
  show allpcpu
  ps
  show locks
  show alllocks
  show lockedvnods
  trace

  Then walk the list of all processes listed in 'show alllocks', and run trace
  on each pid.

- Does the hang occur?  If so, use a serial break to get into DDB, see the
  above.

In both of the last two cases, attempt to get a core dump.
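
Before rebuilding, it can help to confirm that the kernel config really contains every option from the list above. The following is only a sketch: the function name `missing_debug_options` is hypothetical, and the config path in the usage comment is an assumption based on the usual source tree layout.

```sh
#!/bin/sh
# missing_debug_options CONFIG: print each debugging option from the list
# above that has no "options NAME" line in the kernel config file CONFIG.
# The function name is a hypothetical helper, not a FreeBSD tool.
missing_debug_options() {
    conf=$1
    for opt in DDB KDB BREAK_TO_DEBUGGER WITNESS WITNESS_SKIPSPIN \
               INVARIANTS INVARIANT_SUPPORT; do
        # Require whitespace, a comment, or end of line after the name so
        # that WITNESS does not match WITNESS_SKIPSPIN.
        grep -Eq "^[[:space:]]*options[[:space:]]+${opt}([[:space:]]|#|\$)" "$conf" \
            || printf '%s\n' "$opt"
    done
}

# Usage (commented out; the path is an assumption for an i386 kernel
# config named BIGSLEEP):
# missing_debug_options /usr/src/sys/i386/conf/BIGSLEEP
```

No output means all seven options are present; the check says nothing about debugging symbols or core dump settings, which still need to be verified separately.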

Robert N M Watson
Computer Laboratory
University of Cambridge



Last week I installed:
FreeBSD tor.fabiankeil.de 6.1-RELEASE-p2 FreeBSD
6.1-RELEASE-p2 #0: Fri Jun 23 20:06:57 CEST 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/BIGSLEEP  i386.

At the moment it is only acting as a Tor node.

tor-devel (maintainer CC'd) is running jailed in a Geli image,
ntpd, named, cron and sshd are running in the host system
and that's about it. No mail or web server and nearly no traffic
besides the one caused by Tor.

I started Tor Friday night and had to reset the box three times
since then. The server just suddenly stops responding, the logs
stop as well, therefore I assume it either panics or hangs.

I only have remote access, a serial console is available,
but it becomes unresponsive as well. I didn't configure DDB yet,
so maybe that is to be expected?

cron creates some stats every five minutes. A few minutes
before a hang this morning, the load was:

last pid:  7996;  load averages:  0.40,  0.37,  0.36  up 0+18:38:25  05:55:02
83 processes:  2 running, 66 sleeping, 15 waiting
CPU states: 21.3% user,  0.0% nice, 17.8% system, 20.2% interrupt, 40.7% idle
Mem: 100M Active, 157M Inact, 102M Wired, 12K Cache, 60M Buf, 134M Free
Swap: 1024M Total, 1024M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   11 root        1 171   52     0K     8K RUN    857:30 53.61% idle
   12 root        1 -44 -163     0K     8K WAIT    45:22  6.54% swi1: net
   23 root        1 -68 -187     0K     8K WAIT    14:48  2.83% irq12: fxp0 fxp1
 7973 root        1  96    0  2264K  1544K RUN      0:00  0.51% top
   13 root        1 -32 -151     0K     8K WAIT     5:49  0.10% swi4: clock sio
   33 root        1 171   52     0K     8K pgzero   0:02  0.10% pagezero
    3 root        1  -8    0     0K     8K -        0:16  0.05% g_up
 1586 _tor       14  20    0    99M 97912K kserel 188:36  0.00% tor
   15 root        1 -16    0     0K     8K -        1:01  0.00% yarrow
 1443 root        1  -8    0     0K     8K geli:w   0:49  0.00% g_eli[0] md0
    4 root        1  -8    0     0K     8K -        0:21  0.00% g_down
   35 root        1  20    0     0K     8K syncer   0:17  0.00% syncer
 1439 root        1  -8    0     0K     8K mdwait   0:13  0.00% md0
   24 root        1 -64 -183     0K     8K WAIT     0:08  0.00% irq14: ata0
    2 root        1  -8    0     0K     8K -        0:07  0.00% g_event
   42 root        1 -16    0     0K     8K -        0:06  0.00% schedcpu
  453 root        1  96    0  2920K  1752K select   0:05  0.00% ntpd
  256 _pflogd     1 -58    0  1548K  1216K bpf      0:05  0.00% pflog

pfctl -si:
Status: Enabled for 0 days 18:37:52   Debug: Urgent

Hostid: 0x1ec3da6b

Interface Stats for fxp0              IPv4             IPv6
 Bytes In                      25077859159                0
 Bytes Out                     27498863362                0
 Packets In
   Passed                         36192760                0
   Blocked                           32213                0
 Packets Out
   Passed                         36871432                0
   Blocked                             265                0

State Table                          Total             Rate
 current entries                      5290
 searches                         73567507         1096.8/s
 inserts                            600068            8.9/s
 removals