Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-27 Thread Fabian Keil
Fabian Keil [EMAIL PROTECTED] wrote:

 Fabian Keil [EMAIL PROTECTED] wrote:
 
  Peter Thoenen [EMAIL PROTECTED] wrote:
  
   Do you have pf running? If so, can you turn it off for a bit and see
   if you still crash?  On my box I was getting all sorts of witness
   kdb backtraces on pf and since turning pf off (maybe a week ago),
   I haven't crashed yet.  Going to let it keep running unmetered for
   another 2 weeks and see if I crash or not.

  So far I didn't see a single PF related complaint from witness,
  but I'll try disabling PF in a few days anyway.
 
 It took a little longer than I thought, but I finally
 disabled PF today and switched to natd.

Uptime was slightly above 25 hours. Compiling HEAD right now. 

Fabian
-- 
http://www.fabiankeil.de/


signature.asc
Description: PGP signature


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-26 Thread Fabian Keil
Fabian Keil [EMAIL PROTECTED] wrote:

 Peter Thoenen [EMAIL PROTECTED] wrote:
 
  Do you have pf running? If so, can you turn it off for a bit and see if
  you still crash?  On my box I was getting all sorts of witness kdb
  backtraces on pf and since turning pf off (maybe a week ago),
  I haven't crashed yet.  Going to let it keep running unmetered for
  another 2 weeks and see if I crash or not.

How is it going, Peter, still running?
 
 I'm running Tor jailed and use PF for NAT, port forwarding and
 filtering: http://tor.fabiankeil.de/pf-stats/
 
 So far I didn't see a single PF related complaint from witness,
 but I'll try disabling PF in a few days anyway.

It took a little longer than I thought, but I finally
disabled PF today and switched to natd.

 At the moment I'm still testing if enabling polling really
 increases the uptime.

I'm still not sure. Polling did make it possible to use
fxp0 without acpi, but the hangs still occur and the serial
console still becomes unresponsive.

On another wild guess I switched Tor's threading library
from libpthread to libthr. While it doesn't seem
to affect the uptime, it makes Tor's cpu usage visible
in top, so maybe it would be a good default for tor-devel?

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-15 Thread Fabian Keil
Robert Watson [EMAIL PROTECTED] wrote:

 On Wed, 28 Jun 2006, Fabian Keil wrote:

  I just got:
 
  Jun 28 23:01:19 tor kernel: lock order reversal:
  Jun 28 23:01:19 tor kernel: 1st 0xc3795000 kqueue (kqueue) @ 
  /usr/src/sys/kern/kern_event.c:1053
  Jun 28 23:01:19 tor kernel: 2nd 0xc1043144 system map (system map) @ 
  /usr/src/sys/vm/

  Looks similar to http://sources.zabbadoz.net/freebsd/lor.html#185.
 
 Could you run vmstat -z, netstat -m, and vmstat -m please?

I enabled polling three days ago and saw this lor two times
since then. It may or may not be a coincidence.

I log:

top -S -d 2
pfctl -si
netstat -ss
sysctl -a
vmstat -z
netstat -m
vmstat -m 

every five minutes, the output before and after the lor
can be found at: http://www.fabiankeil.de/tmp/lor-185.txt
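
A minimal sketch of such a five-minute logger follows. It is not the script actually used on this box; the function name `snapshot`, the log path and the crontab line are assumptions made for illustration. Each command is run in turn and its output is appended to one file under a timestamped header, so the state just before a hang can be read off the end of the log.

```sh
#!/bin/sh
# Hypothetical snapshot logger (names and paths are assumptions).

# snapshot LOGFILE CMD...
# Run each command string and append its output to LOGFILE, preceded by
# a header line that carries the current time and the command itself.
snapshot() {
    logfile=$1
    shift
    for cmd in "$@"; do
        printf '=== %s: %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$cmd" >> "$logfile"
        # sh -c splits the command string into words; stderr is kept so
        # that a failing command is visible in the log as well.
        sh -c "$cmd" >> "$logfile" 2>&1
    done
}

# Example call with the commands listed above (log path is an assumption):
#   snapshot /var/log/hang-debug.log "top -S -d 2" "pfctl -si" \
#       "netstat -ss" "sysctl -a" "vmstat -z" "netstat -m" "vmstat -m"
#
# Example crontab entry, assuming the call is saved as a script:
#   */5 * * * * /usr/local/sbin/hang-snapshot.sh
```

Appending to a single file keeps the snapshots in order; the file grows quickly with `sysctl -a` in the list, so some rotation would be needed for longer runs.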

The system is still up at the moment, so the lor might
have nothing to do with the crashes/hangs/whatever.

I have the feeling that polling does increase the uptime,
but I'm not sure yet.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-15 Thread Fabian Keil
Fabian Keil [EMAIL PROTECTED] wrote:

 Robert Watson [EMAIL PROTECTED] wrote:
 
  On Wed, 28 Jun 2006, Fabian Keil wrote:
 
   I just got:
  
   Jun 28 23:01:19 tor kernel: lock order reversal:
   Jun 28 23:01:19 tor kernel: 1st 0xc3795000 kqueue (kqueue) @ 
   /usr/src/sys/kern/kern_event.c:1053
   Jun 28 23:01:19 tor kernel: 2nd 0xc1043144 system map (system map) @ 
   /usr/src/sys/vm/
 
   Looks similar to http://sources.zabbadoz.net/freebsd/lor.html#185.
  
  Could you run vmstat -z, netstat -m, and vmstat -m please?
 
 I enabled polling three days ago and saw this lor two times
 since then. It may or may not be a coincidence.

 The system is still up at the moment, so the lor might
 have nothing to do with the crashes/hangs/whatever.

Actually, I had to reset the box about two hours
ago; I had forgotten about it and overlooked the few
minutes of downtime in the logs.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-15 Thread Peter Thoenen
Hey Fabian,

Do you have pf running? If so, can you turn it off for a bit and see if
you still crash?  On my box I was getting all sorts of witness kdb
backtraces on pf and since turning pf off (maybe a week ago), I haven't
crashed yet.  Going to let it keep running unmetered for another 2
weeks and see if I crash or not.

-Peter
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-15 Thread Fabian Keil
Peter Thoenen [EMAIL PROTECTED] wrote:

 Do you have pf running? If so, can you turn it off for a bit and see if
 you still crash?  On my box I was getting all sorts of witness kdb
 backtraces on pf and since turning pf off (maybe a week ago), I haven't
 crashed yet.  Going to let it keep running unmetered for another 2
 weeks and see if I crash or not.

I'm running Tor jailed and use PF for NAT, port forwarding and filtering:
http://tor.fabiankeil.de/pf-stats/

So far I didn't see a single PF related complaint from witness,
but I'll try disabling PF in a few days anyway. At the moment
I'm still testing if enabling polling really increases the uptime.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-07 Thread Fabian Keil
Fabian Keil [EMAIL PROTECTED] wrote:

 Fabian Keil [EMAIL PROTECTED] wrote:
 
  Robert Watson [EMAIL PROTECTED] wrote:
 
   It sounds like your serial console server may not know how to map
   SSH break signals into remote serial break signals.  Try
   ALT_BREAK_TO_DEBUGGER.  Here's the description from NOTES:
   
   # Solaris implements a new BREAK which is initiated by a character
   # sequence CR ~ ^b which is similar to a familiar pattern used on
   # Sun servers by the Remote Console.
   options ALT_BREAK_TO_DEBUGGER
  
  It took me several attempts to get the character sequence right,
  but yes, this one works. Thanks.
 
 Unfortunately it didn't work while the system was hanging
 this morning.

Since then I got one or two hangs a day and entering
the debugger never worked out, even if my console connection
was opened a few minutes before the hang.

I no longer think it has anything to do with the terminal
server, but assume the hang takes the console with it.

sio0 is running on acpi0, so I tried to disable acpi
to see if it changes anything, but the only change I
got was that fxp0 stopped working (it is up but only
produces timeout warnings).

I tried to partly disable acpi subsystems as
described in acpi(4), but either I got the
syntax wrong, or it just isn't working.

Can someone on this list confirm or deny whether
something like debug.acpi.disabled=isa in
/boot/loader.conf makes sense?

That's how I understand the man page, but I don't see any
reaction. I also tried /etc/sysctl.conf (which is probably
parsed too late anyway), but I just got a message that the
sysctl does not exist.

sysctl debug.acpi indeed only shows:
debug.acpi.do_powerstate: 1
debug.acpi.acpi_ca_version: 0x20041119
debug.acpi.semaphore_debug: 0

so maybe I need some special acpi options,
or it just doesn't work if acpi is loaded as a module,
but at least the man page has no such hints.
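
For reference, a sketch of how the setting would look if debug.acpi.disabled is treated as a loader tunable rather than a run-time sysctl. That is an assumption based on a reading of acpi(4); it would explain why the name is missing from the sysctl debug.acpi output, but whether "isa" is accepted on this release is not confirmed here.

```
# /boot/loader.conf (sketch; assumes debug.acpi.disabled is read by the
# loader at boot and therefore never shows up under sysctl debug.acpi)
debug.acpi.disabled="isa"

# After a reboot, the kernel environment can be inspected from the shell:
#   kenv debug.acpi.disabled
```

If kenv prints the value back, the tunable at least reached the kernel environment; whether the acpi code acts on it is a separate question.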

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-03 Thread Fabian Keil
Dan Nelson [EMAIL PROTECTED] wrote:

 In the last episode (Jul 02), Robert Watson said:
  On Sun, 2 Jul 2006, Fabian Keil wrote:
  The ssh man page offers:
  
  |~B  Send a BREAK to the remote system (only useful for SSH
  |protocol version 2 and if the peer supports it).
  
  I am using ssh 2, but the only reaction I get is a new line.
  
  |FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)
  |
  |login: ~B
 
 If you enter ~B and actually see a ~B printed to the screen, then ssh
 didn't process it because you didn't hit <cr> first.  So <cr>~B will
 tell ssh to send a break.

I am actually using <cr>~B and I don't see just ~B, but ~B followed
by a new line. The tilde is printed after I release B, therefore I
guess it is working.
 
  It sounds like your serial console server may not know how to map
  SSH break signals into remote serial break signals.  Try
  ALT_BREAK_TO_DEBUGGER.  Here's the description from NOTES:
  
  # Solaris implements a new BREAK which is initiated by a character
  # sequence CR ~ ^b which is similar to a familiar pattern used on
  # Sun servers by the Remote Console.
  options ALT_BREAK_TO_DEBUGGER
 
 ... and if you're sshing to your terminal server, remember that ssh
 will eat that tilde (because you sent <cr>~), so you need to send
 <cr>~~^B to pass the right characters to FreeBSD.  Or change ssh's
 escape character with the -e flag.

<cr>~^b works for me, without touching any ssh settings.
As <cr>~. is still causing a disconnect, it doesn't look
like the escape character was changed either.
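
For setups where ssh does swallow the tilde, the escape character can also be changed persistently instead of passing -e on every call. A minimal sketch of an ssh_config entry; the host alias and host name are hypothetical and not taken from this thread:

```
# ~/.ssh/config (sketch; "console" and the HostName are made-up names)
Host console
    HostName consoleserver.example.net
    # Move ssh's escape from ~ to Ctrl-], so that a line starting with
    # ~ ^B reaches the FreeBSD console unmodified.
    EscapeChar ^]
```

With such an entry the disconnect sequence for that host becomes <cr>^]. instead of <cr>~. while other hosts keep the default.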

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-03 Thread Fabian Keil
Fabian Keil [EMAIL PROTECTED] wrote:

 Robert Watson [EMAIL PROTECTED] wrote:

  It sounds like your serial console server may not know how to map
  SSH break signals into remote serial break signals.  Try
  ALT_BREAK_TO_DEBUGGER.  Here's the description from NOTES:
  
  # Solaris implements a new BREAK which is initiated by a character
  # sequence CR ~ ^b which is similar to a familiar pattern used on
  # Sun servers by the Remote Console.
  options ALT_BREAK_TO_DEBUGGER
 
 It took me several attempts to get the character sequence right,
 but yes, this one works. Thanks.

Unfortunately it didn't work while the system was hanging
this morning. I wasn't logged in at the console before the
hang occurred, so it may be that the terminal server checked
the console for life signs, found none and neither connected
nor printed a warning (a wild guess; I have no idea whether
it does that).

It could also mean that I'm seeing the mysterious power off part
described in: http://www.freebsd.org/cgi/query-pr.cgi?pr=95180
but I have no way to tell the difference.

I will stay connected to the console until the system hangs
again to see if it changes anything.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Fabian Keil
Robert Watson [EMAIL PROTECTED] wrote:

 On Tue, 27 Jun 2006, Fabian Keil wrote:
 
  There was a request for Tor related problem reports a while ago,
  I couldn't find the message again, but I believe it was posted here.
 
 I'm very interested in tracking down this problem, but have had a lot
 of trouble getting reliable reports of problems -- i.e., ones where I
 could get any debugging information.  I had a similar conversation on
 these lines yesterday with Roger (Tor author) here at the WEIS
 conference.  If this is easily reproducible, I would like you to do
 the following:

 - Does the hang occur?  If so, use a serial break to get into DDB,
 see the above.

I previously had the serial console misconfigured and I'm still not
sure if the settings are correct now.

So far I put BOOT_COMCONSOLE_SPEED=57600 in /etc/make.conf,
options CONSPEED=57600 in the kernel and console=comconsole
in /boot/loader.conf. Kernel and bootblock were recompiled
and reinstalled. /boot.config contains the line:
-D -h -S57600 (speed setting through make.conf didn't work).
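
The settings just described live in four different files. Laid out as fragments, using only the values given above (whether all four are needed together is not established in this thread):

```
# /etc/make.conf (read when the boot blocks are rebuilt)
BOOT_COMCONSOLE_SPEED=57600

# kernel configuration file
options CONSPEED=57600

# /boot/loader.conf
console="comconsole"

# /boot.config
-D -h -S57600
```

The speed has to agree in every place where it is set, and with the terminal server, otherwise the console output turns to garbage at the point where one stage hands over to the next.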

The boot process now starts with:

PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
Booting from local disk...

1   Linux
2   FreeBSD
3   FreeBSD

Default: 2 

/boot.config: -DConsoles: internal video/keyboard  serial port  
BIOS drive C: is disk0
BIOS 639kB/523200kB available memory

FreeBSD/i386 bootstrap loader, Revision 1.1
[...]

After manually triggering a test panic through debug.kdb.enter
I could enter ddb and everything seemed to be working.

However today I got another hang and couldn't enter the debugger
by sending BREAK. It is the same BREAK ssh sends with ~B, right?

Even after rebooting, sending break didn't trigger a panic,
so either I'm sending the wrong BREAK, or my console settings
are still messed up. Any ideas?

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Robert Watson


On Sun, 2 Jul 2006, Fabian Keil wrote:

  I'm very interested in tracking down this problem, but have had a lot of
  trouble getting reliable reports of problems -- i.e., ones where I could
  get any debugging information.  I had a similar conversation on these lines
  yesterday with Roger (Tor author) here at the WEIS conference.  If this is
  easily reproducible, I would like you to do the following:

  - Does the hang occur?  If so, use a serial break to get into DDB, see the
  above.

 I previously had the serial console misconfigured and I'm still not sure if
 the settings are correct now.

 So far I put BOOT_COMCONSOLE_SPEED=57600 in /etc/make.conf, options
 CONSPEED=57600 in the kernel and console=comconsole in /boot/loader.conf.
 Kernel and bootblock were recompiled and reinstalled. /boot.config contains
 the line: -D -h -S57600 (speed setting through make.conf didn't work).


I don't use alternative console speeds, so can't comment on the specifics of 
the above, but the output below looks right.



 The boot process now starts with:

 PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
 Booting from local disk...

 1   Linux
 2   FreeBSD
 3   FreeBSD

 Default: 2

 /boot.config: -DConsoles: internal video/keyboard  serial port
 BIOS drive C: is disk0
 BIOS 639kB/523200kB available memory

 FreeBSD/i386 bootstrap loader, Revision 1.1
 [...]

 After manually triggering a test panic through debug.kdb.enter I could enter
 ddb and everything seemed to be working.

 However today I got another hang and couldn't enter the debugger by sending
 BREAK. It is the same BREAK ssh sends with ~B, right?

 Even after rebooting, sending break didn't trigger a panic, so either I'm
 sending the wrong BREAK, or my console settings are still messed up. Any
 ideas?


What serial software are you using to reach the console?  Do you have options 
BREAK_TO_DEBUGGER compiled into your kernel?  The delivery mechanism for the 
break will depend on the software you're using...


Robert N M Watson
Computer Laboratory
University of Cambridge


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Fabian Keil
Robert Watson [EMAIL PROTECTED] wrote:

 On Sun, 2 Jul 2006, Fabian Keil wrote:

  After manually triggering a test panic through debug.kdb.enter I
  could enter ddb and everything seemed to be working.
 
  However today I got another hang and couldn't enter the debugger by
  sending BREAK. It is the same BREAK ssh sends with ~B, right?
 
  Even after rebooting, sending break didn't trigger a panic, so
  either I'm sending the wrong BREAK, or my console settings are
  still messed up. Any ideas?
 
 What serial software are you using to reach the console?

I use ssh to log in to a console server, hit enter and
am connected to the console. I have no idea what kind
of software is used between console server and console.

 Do you have options BREAK_TO_DEBUGGER compiled into your kernel?

Yes, together with the other options you suggested:

makeoptions DEBUG=-g
options DDB
#options KDB_UNATTENDED
options KDB
options BREAK_TO_DEBUGGER
options WITNESS
options WITNESS_SKIPSPIN
options INVARIANTS
options INVARIANT_SUPPORT

 The delivery mechanism for the break will depend on the software
 you're using...

The ssh man page offers:

|~B  Send a BREAK to the remote system (only useful for SSH protocol
|version 2 and if the peer supports it).

I am using ssh 2, but the only reaction I get is a new line.

|FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)
|
|login: ~B
|

Maybe machdep.enable_panic_key would be another solution?
The description says Enable panic via keypress
specified in kbdmap(5), I'm just not sure if console
input qualifies as keypress.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Robert Watson

On Sun, 2 Jul 2006, Fabian Keil wrote:


 Robert Watson [EMAIL PROTECTED] wrote:

  On Sun, 2 Jul 2006, Fabian Keil wrote:

   After manually triggering a test panic through debug.kdb.enter I
   could enter ddb and everything seemed to be working.

   However today I got another hang and couldn't enter the debugger by
   sending BREAK. It is the same BREAK ssh sends with ~B, right?

   Even after rebooting, sending break didn't trigger a panic, so
   either I'm sending the wrong BREAK, or my console settings are
   still messed up. Any ideas?

  What serial software are you using to reach the console?

 I use ssh to log in to a console server, hit enter and am connected to the
 console. I have no idea what kind of software is used between console server
 and console.


You probably need to find that out in order to know which break sequence to
send.  Alternatively, you can use ALT_BREAK_TO_DEBUGGER, which defines an
alternative break sequence without relying on a serial break (which is an
out-of-band break signal).


  The delivery mechanism for the break will depend on the software you're
  using...

 The ssh man page offers:

 |~B  Send a BREAK to the remote system (only useful for SSH protocol
 |version 2 and if the peer supports it).

 I am using ssh 2, but the only reaction I get is a new line.

 |FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)
 |
 |login: ~B
 |


It sounds like your serial console server may not know how to map SSH break 
signals into remote serial break signals.  Try ALT_BREAK_TO_DEBUGGER.  Here's 
the description from NOTES:


# Solaris implements a new BREAK which is initiated by a character
# sequence CR ~ ^b which is similar to a familiar pattern used on
# Sun servers by the Remote Console.
options ALT_BREAK_TO_DEBUGGER

Robert N M Watson
Computer Laboratory
University of Cambridge


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Fabian Keil
Robert Watson [EMAIL PROTECTED] wrote:

 On Sun, 2 Jul 2006, Fabian Keil wrote:

  I am using ssh 2, but the only reaction I get is a new line.
 
  |FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)
  |
  |login: ~B
  |
 
 It sounds like your serial console server may not know how to map SSH
 break signals into remote serial break signals.  Try
 ALT_BREAK_TO_DEBUGGER.  Here's the description from NOTES:
 
 # Solaris implements a new BREAK which is initiated by a character
 # sequence CR ~ ^b which is similar to a familiar pattern used on
 # Sun servers by the Remote Console.
 options ALT_BREAK_TO_DEBUGGER

It took me several attempts to get the character sequence right,
but yes, this one works. Thanks.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-07-02 Thread Dan Nelson
In the last episode (Jul 02), Robert Watson said:
 On Sun, 2 Jul 2006, Fabian Keil wrote:
 The ssh man page offers:
 
 |~B  Send a BREAK to the remote system (only useful for SSH
 |protocol version 2 and if the peer supports it).
 
 I am using ssh 2, but the only reaction I get is a new line.
 
 |FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)
 |
 |login: ~B

If you enter ~B and actually see a ~B printed to the screen, then ssh
didn't process it because you didn't hit <cr> first.  So <cr>~B will
tell ssh to send a break.

 It sounds like your serial console server may not know how to map SSH
 break signals into remote serial break signals.  Try
 ALT_BREAK_TO_DEBUGGER.  Here's the description from NOTES:
 
 # Solaris implements a new BREAK which is initiated by a character
 # sequence CR ~ ^b which is similar to a familiar pattern used on
 # Sun servers by the Remote Console.
 options ALT_BREAK_TO_DEBUGGER

... and if you're sshing to your terminal server, remember that ssh
will eat that tilde (because you sent <cr>~), so you need to send
<cr>~~^B to pass the right characters to FreeBSD.  Or change ssh's
escape character with the -e flag.

-- 
Dan Nelson
[EMAIL PROTECTED]


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-29 Thread Robert Watson


On Thu, 29 Jun 2006, Fabian Keil wrote:


 I wish I could. The machine died before I read your message.

 I was logged in on the serial console running tail -f /var/log/messages.
 Last messages were:

 Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4275000(2048) val=a020c0de @ 0xc4275000
 Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4055800(2048) val=a020c0de @ 0xc4055800
 Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4ca(2048) val=a020c0de @ 0xc4ca
 Jun 29 00:42:20 tor kernel: Memory modified after free 0xc39ef000(2048) val=a020c0de @ 0xc39ef000
 Jun 29 00:42:24 tor kernel: Memory modified after free 0xc4bd7000(2048) val=a020c0de @ 0xc4bd7000
 Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3c8a000(2048) val=a020c0de @ 0xc3c8a000
 Jun 29 00:42:24 tor kernel: Memory modified after free 0xc33bd000(2048) val=a020c0de @ 0xc33bd000
 Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3f1d000(2048) val=a020c0de @ 0xc3f1d000
 Jun 29 00:42:24 tor kernel: Memory modified after free 0xc45dc800(2048) val=a020c0de @ 0xc45dc800
 Jun 29 00:42:24 tor kernel: Memory modified after free 0xc429e000(2048) val=a020c0de @ 0xc429e000
 Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3aef800(2048) val=a020c0de @ 0xc3aef800
 Jun 29 00:42:24 tor kernel: Memory modified after free 0xc432a000(2048) val=a020c0de @ 0xc432a000
 Jun 29 00:42:24 tor kernel: ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=34263674
 Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3dff800(2048) val=a020c0d

 Ctrl+Alt+ESC didn't trigger any reaction, so I caused a reset through the
 ISP's webinterface. Now the system appears to be hosed, at least FreeBSD
 never reaches the login:

 PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
 Booting from local disk...

 1   Linux
 2   FreeBSD
 3   FreeBSD

 Default: 2

 [nothing]

 Probably something which would be easy to resolve with keyboard access and a
 screen, but I think I'm forced to use the RecoveryManager. Unfortunately
 recovery means reinstalling the preconfigured GNU/Linux which I then can
 replace with FreeBSD again. If there ever was a core dump it will be gone,
 and so will be kernel.debug.

 On the bright side you can choose the OS to go with. Should I use Current to
 see if the problem still exists?


The ATA error above is a bit distressing, as is the fact that it won't boot. 
Is [nothing] normally the FreeBSD boot loader rather than nothing?  I would 
suggest running some hardware diagnostics to make sure we're dealing with 
reliable hardware before continuing so that we're not chasing both hardware 
and software problems, since you can't reliably debug software problems in the 
presence of hardware failures.


Robert N M Watson
Computer Laboratory
University of Cambridge


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-29 Thread Fabian Keil
Robert Watson [EMAIL PROTECTED] wrote:

 On Thu, 29 Jun 2006, Fabian Keil wrote:
 
  I wish I could. The machine died before I read your message.
 
  I was logged in on the serial console running tail
  -f /var/log/messages. Last messages were:
 
  Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4275000(2048) val=a020c0de @ 0xc4275000
  Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4055800(2048) val=a020c0de @

  0xc432a000
  Jun 29 00:42:24 tor kernel: ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=34263674
  Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3dff800(2048) val=a020c0d
 
  Ctrl+Alt+ESC didn't trigger any reaction, so I caused a reset
  through the ISP's webinterface. Now the system appears to be hosed,
  at least FreeBSD never reaches the login:
 
  PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
  Booting from local disk...
 
  1   Linux
  2   FreeBSD
  3   FreeBSD
 
  Default: 2
 
  [nothing]

 The ATA error above is a bit distressing, as is the fact that it
 won't boot. Is [nothing] normally the FreeBSD boot loader rather
 than nothing?

The 1 Linux ... part already is the FreeBSD boot loader.
Normally it goes:

PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
Booting from local disk...

1   Linux
2   FreeBSD
3   FreeBSD

Default: 2 

FreeBSD/i386 (tor.fabiankeil.de) (ttyd0)

login:

 I would suggest running some hardware diagnostics to
 make sure we're dealing with reliable hardware before continuing so
 that we're not chasing both hardware and software problems, since you
 can't reliably debug software problems in the presence of hardware
 failures.

I'll see what the ports collection has to offer (running
smartmontools right now) but so far it's the only ATA message I got.

  Probably something which would be easy to resolve with keyboard
  access and a screen, but I think I'm forced to use the
  RecoveryManager. Unfortunately recovery means reinstalling the
  preconfigured GNU/Linux which I then can replace with FreeBSD
  again. If there ever was a core dump it will be gone, and so will
  be kernel.debug.

Lucky me. The RecoveryManager turned out to be a full featured
PXE-booted GNU/Linux system. It allowed me to fetch and replace
/dev/ad0s2a (/) through ssh. The system is online again. 

After fsck -y /dev/ad0s3d (/usr) the whole tor jail is gone,
but the rest of this slice seems to be ok, including kernel.debug.

I can't fsck /var:
[EMAIL PROTECTED] ~]$ sudo fsck /dev/ad0s3d
** /dev/ad0s3d
** Last Mounted on /var
** Phase 1 - Check Blocks and Sizes
fsck_4.2bsd: cannot alloc 1082190976 bytes for inoinfo

but it can still be mounted. No core dump though.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Robert Watson


On Tue, 27 Jun 2006, Fabian Keil wrote:

 There was a request for Tor related problem reports a while ago, I
 couldn't find the message again, but I believe it was posted here.

I'm very interested in tracking down this problem, but have had a lot of
trouble getting reliable reports of problems -- i.e., ones where I could get
any debugging information.  I had a similar conversation on these lines
yesterday with Roger (Tor author) here at the WEIS conference.  If this is
easily reproducible, I would like you to do the following:


- Compile in options DDB, options KDB, options BREAK_TO_DEBUGGER, options
  WITNESS, options WITNESS_SKIPSPIN, options INVARIANTS, options
  INVARIANT_SUPPORT.

- Make sure to have a kernel with debugging symbols for the kernel.

- Turn on core dumps.

The above debugging options will have a significant performance impact, and 
may or may not affect the probability of the race or deadlock being exercised. 
The first question is:


- Are there any warnings on the console from WITNESS or other debugging
  options?  If so, please copy/paste them into an e-mail for me.

- Does a panic occur?  If so, the output of the following commands would be
  very useful:

  show pcpu
  show allpcpu
  ps
  show locks
  show alllocks
  show lockedvnods
  trace

  Then walk the list of all processes listed in 'show alllocks', and run trace
  on each pid.

- Does the hang occur?  If so, use a serial break to get into DDB, see the
  above.

In both of the last two cases, attempt to get a core dump.
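
For the "turn on core dumps" and "debugging symbols" steps above, a minimal sketch of the configuration involved. The dumpdev and dumpdir values are assumptions, not settings taken from this thread; rc.conf(5) for the release in use is the authority on what is accepted.

```
# /etc/rc.conf (sketch; values are assumptions)
dumpdev="AUTO"          # use a swap device from /etc/fstab as the dump device
dumpdir="/var/crash"    # where savecore(8) stores the dump at the next boot

# kernel configuration file: build the kernel with debugging symbols
makeoptions DEBUG=-g
```

The dump device needs to be at least as large as physical memory, and dumpdir needs enough free space to hold the saved copy. From the DDB prompt a dump can usually be requested with "call doadump"; treat that as an assumption to verify on this release.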

Robert N M Watson
Computer Laboratory
University of Cambridge



 Last week I installed:
 FreeBSD tor.fabiankeil.de 6.1-RELEASE-p2 FreeBSD
 6.1-RELEASE-p2 #0: Fri Jun 23 20:06:57 CEST 2006
 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/BIGSLEEP  i386.

 At the moment it is only acting as a Tor node:
 http://serifos.eecs.harvard.edu/cgi-bin/desc.pl?q=zwiebelsuppe
 tor-devel (maintainer CC'd) is running jailed in a Geli image;
 ntpd, named, cron and sshd are running in the host system
 and that's about it. No mail or web server and nearly no traffic
 besides that caused by Tor.

 I started Tor Friday night and had to reset the box three times
 since then. The server just suddenly stops responding and the logs
 stop as well, therefore I assume it either panics or hangs.

 I only have remote access. A serial console is available,
 but it becomes unresponsive as well. I didn't configure DDB yet,
 so maybe that is to be expected?

 cron creates some stats every five minutes. A few minutes
 before a hang this morning the load was:

 last pid:  7996;  load averages:  0.40,  0.37,  0.36  up 0+18:38:25  05:55:02
 83 processes:  2 running, 66 sleeping, 15 waiting
 CPU states: 21.3% user,  0.0% nice, 17.8% system, 20.2% interrupt, 40.7% idle
 Mem: 100M Active, 157M Inact, 102M Wired, 12K Cache, 60M Buf, 134M Free
 Swap: 1024M Total, 1024M Free

   PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
    11 root        1 171   52     0K     8K RUN    857:30 53.61% idle
    12 root        1 -44 -163     0K     8K WAIT    45:22  6.54% swi1: net
    23 root        1 -68 -187     0K     8K WAIT    14:48  2.83% irq12: fxp0 fxp1
  7973 root        1  96    0  2264K  1544K RUN      0:00  0.51% top
    13 root        1 -32 -151     0K     8K WAIT     5:49  0.10% swi4: clock sio
    33 root        1 171   52     0K     8K pgzero   0:02  0.10% pagezero
     3 root        1  -8    0     0K     8K -        0:16  0.05% g_up
  1586 _tor       14  20    0    99M 97912K kserel 188:36  0.00% tor
    15 root        1 -16    0     0K     8K -        1:01  0.00% yarrow
  1443 root        1  -8    0     0K     8K geli:w   0:49  0.00% g_eli[0] md0
     4 root        1  -8    0     0K     8K -        0:21  0.00% g_down
    35 root        1  20    0     0K     8K syncer   0:17  0.00% syncer
  1439 root        1  -8    0     0K     8K mdwait   0:13  0.00% md0
    24 root        1 -64 -183     0K     8K WAIT     0:08  0.00% irq14: ata0
     2 root        1  -8    0     0K     8K -        0:07  0.00% g_event
    42 root        1 -16    0     0K     8K -        0:06  0.00% schedcpu
   453 root        1  96    0  2920K  1752K select   0:05  0.00% ntpd
   256 _pflogd     1 -58    0  1548K  1216K bpf      0:05  0.00% pflog

 pfctl -si:
 Status: Enabled for 0 days 18:37:52           Debug: Urgent

 Hostid: 0x1ec3da6b

 Interface Stats for fxp0              IPv4             IPv6
   Bytes In                     25077859159                0
   Bytes Out                    27498863362                0
   Packets In
     Passed                        36192760                0
     Blocked                          32213                0
   Packets Out
     Passed                        36871432                0
     Blocked                            265                0

 State Table                          Total             Rate
   current entries                     5290
   searches                        73567507         1096.8/s
   inserts                           600068            8.9/s
   removals

Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Peter Thoenen
--- Fabian Keil [EMAIL PROTECTED] wrote:
 There was a request for Tor related problem reports
 a while ago, I couldn't find the message again, but I
 believe it was posted here.

 Is anyone on this list running a Tor node on FreeBSD 6.1-RELEASE
 or later with similar or higher load?

I am still hitting the same issue, Fabian.  I had that PR closed as
"works for me" with insufficient testing.  I am still crashing (as
before) but maybe only once every week or two instead of every couple
of hours with 6.1 RELEASE.  The PR really should be reopened.  A couple
of other folks have emailed me with similar issues offline (and also
spoke with me about it on IRC).

I am still 99% sure this is NOT A TOR ISSUE!!!  I have spoken with many
tor users on other platforms and the actual developers and this is not
seen by any of them.  I can also recreate this crash NOT running tor
but just generating a heavy load with freenet and i2p.  My gut feeling
is still a network code regression between 5.x - 6.x with the stack
rewrite. I am at a loss how to troubleshoot this anymore (as noted in
the PR and my earlier email).  I truly hope somebody (e.g. a developer)
can shed some light on this issue or troubleshoot it.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Chuck Swiger

Peter Thoenen wrote:

 --- Fabian Keil [EMAIL PROTECTED] wrote:

[ ... ]

  Is anyone on this list running a Tor node on FreeBSD 6.1-RELEASE
  or later with similar or higher load?

 I am still hitting the same issue, Fabian.  I had that PR closed as
 "works for me" after insufficient testing.  I am still crashing (as
 before) but maybe only once every week or two instead of every couple
 of hours with 6.1 RELEASE.  The PR really should be reopened.  A couple
 of other folks have emailed me offline with similar issues (and also
 spoke with me about it on IRC).


Well, having several people show similar problems will help track the issue 
down, if only by letting us examine common aspects (ie, this happens on SMP 
systems, it happens when people are using PF, or IPFW, it only happens to 
people using vr0, or rl0, or some other specific NIC, etc).



 I am still 99% sure this is NOT A TOR ISSUE!!!  I have spoken with many
 tor users on other platforms and the actual developers and this is not
 seen by any of them.  I can also recreate this crash NOT running tor
 but just generating a heavy load with freenet and i2p.


It's probably not a TOR issue, no.  I gather that you've already run the 
manufacturer's hardware diagnostics and something like prime95 or memtest86 
overnight or longer than 24 hours (ideally)...



 My gut feeling is still a network code regression between 5.x - 6.x with
 the stack rewrite. I am at a loss how to troubleshoot this anymore (as
 noted in the PR and my earlier email).  I truly hope somebody (e.g. a
 developer) can shed some light on this issue or troubleshoot it.


It would also be interesting to know whether you can revert to running FreeBSD 
5.5 on the same hardware under the same workload and have it stay up for longer.


Put your dmesg(s), kernel config files, /etc/make.conf, and best efforts at
logging the issue (serial console output, or vmstat or sysctl -a kern run
periodically from cron and appended to a file) on a webpage someplace, and
try to cross-link with other people showing the same problem.  Post that URL
to a PR and/or the mailing lists...
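
Periodic logging of this kind could be sketched as follows. This is only an illustration in Python, not something anyone in the thread used: the `snapshot` helper and the command list are assumptions, and the script would have to be started from cron every few minutes with a log path of your choosing.

```python
import os
import subprocess
import time

# Assumed command list; vmstat and netstat are the tools named in this thread.
COMMANDS = [["vmstat", "-z"], ["vmstat", "-m"], ["netstat", "-m"]]


def snapshot(commands, logfile):
    """Append the timestamped output of each command to logfile."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(logfile, "a") as log:
        for cmd in commands:
            log.write("=== %s: %s ===\n" % (stamp, " ".join(cmd)))
            try:
                result = subprocess.run(cmd, capture_output=True,
                                        text=True, timeout=30)
                log.write(result.stdout)
                log.write(result.stderr)
            except (OSError, subprocess.TimeoutExpired) as err:
                # A missing or stuck tool should not stop the other commands.
                log.write("failed: %s\n" % err)
            # Push the data to disk so it survives a hang right afterwards.
            log.flush()
            os.fsync(log.fileno())
```

Because each run appends and syncs to disk, the last snapshot written before a hang should still be in the file after the reset.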


--
-Chuck


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Robert Watson


On Tue, 27 Jun 2006, Peter Thoenen wrote:


 --- Fabian Keil [EMAIL PROTECTED] wrote:

  There was a request for Tor related problem reports
  a while ago, I couldn't find the message again, but I
  believe it was posted here.

  Is anyone on this list running a Tor node on FreeBSD 6.1-RELEASE
  or later with similar or higher load?

 I am still hitting the same issue, Fabian.  I had that PR closed as
 "works for me" after insufficient testing.  I am still crashing (as
 before) but maybe only once every week or two instead of every couple
 of hours with 6.1 RELEASE.  The PR really should be reopened.  A couple
 of other folks have emailed me offline with similar issues (and also
 spoke with me about it on IRC).


In the future, it would be helpful if you replied to the PR saying so.  It 
looks like it was closed at your request as you stated the problem had gone 
away, so I've been working under the assumption that the problem has gone 
away, as that's the last information I have.


 I am still 99% sure this is NOT A TOR ISSUE!!!  I have spoken with many tor
 users on other platforms and the actual developers and this is not seen by
 any of them.  I can also recreate this crash NOT running tor but just
 generating a heavy load with freenet and i2p.  My gut feeling is still a
 network code regression between 5.x - 6.x with the stack rewrite. I am at a
 loss how to troubleshoot this anymore (as noted in the PR and my earlier
 email).  I truly hope somebody (e.g. a developer) can shed some light on
 this issue or troubleshoot it.


I have appealed a number of times on the freebsd-security mailing list and
elsewhere for information from people who could reproduce the problem.  In
general the replies I got were that people either had the problem go away with 
recent 6.x, or that they did not have time or were not interested in helping 
debug the problem.  If you are interested, please take a look at my recent 
reply to the reported problem, and work through the steps there.  I strongly 
recommend using a serial console on the box.  I can only debug a problem if I 
know it exists, and with enough information, and so far, there's been 
insufficient information to track down the problem.


Robert N M Watson
Computer Laboratory
University of Cambridge


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Fabian Keil
Robert Watson [EMAIL PROTECTED] wrote:

 On Tue, 27 Jun 2006, Fabian Keil wrote:
 
  There was a request for Tor related problem reports a while ago,
  I couldn't find the message again, but I believe it was posted here.
 
 I'm very interested in tracking down this problem, but have had a lot
 of trouble getting reliable reports of problems -- i.e., ones where I
 could get any debugging information.  I had a similar conversation on
 these lines yesterday with Roger (Tor author) here at the WEIS
 conference.  If this is easily reproducible, I would like you to do
 the following:
 
 - Compile in options DDB, options KDB, options BREAK_TO_DEBUGGER,
 options WITNESS, options WITNESS_SKIPSPIN, options INVARIANTS, options
 INVARIANT_SUPPORT.
 
 - Make sure to have a kernel with debugging symbols for the kernel.
 
 - Turn on core dumps.

Done. I expect to get a chance to test the settings in the next 24 hours.
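
Whether a kernel configuration file really enables all of the options listed above can be checked mechanically. A minimal sketch in Python, assuming the usual `options NAME` lines of a FreeBSD kernel config; the function name is made up for illustration and the check says nothing about debugging symbols or core dumps:

```python
# Options requested in the message quoted above.
REQUIRED_OPTIONS = ["DDB", "KDB", "BREAK_TO_DEBUGGER", "WITNESS",
                    "WITNESS_SKIPSPIN", "INVARIANTS", "INVARIANT_SUPPORT"]


def missing_debug_options(config_text, required=REQUIRED_OPTIONS):
    """Return the required options that config_text does not enable."""
    enabled = set()
    for line in config_text.splitlines():
        # Drop comments, then look for "options NAME" or "options NAME=value".
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "options":
            enabled.add(fields[1].split("=", 1)[0])
    return [opt for opt in required if opt not in enabled]
```

Feeding it the contents of the kernel config (BIGSLEEP in this case) returns an empty list when nothing is missing.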
 
 The above debugging options will have a significant performance
 impact, and may or may not affect the probability of the race or
 deadlock being exercised. The first question is:
 
 - Are there any warnings on the console from WITNESS or other
 debugging options?  If so, please copy/paste them into an e-mail for
 me.

So far the logs show nothing unusual, but I
noticed that the ssh connection gets unresponsive
from time to time.

I did a few pings with interesting results:

[EMAIL PROTECTED] ~]$ ping 10.0.0.1 | grep 'time=[^0]'
64 bytes from 10.0.0.1: icmp_seq=25 ttl=64 time=1.104 ms
64 bytes from 10.0.0.1: icmp_seq=61 ttl=64 time=2.983 ms
64 bytes from 10.0.0.1: icmp_seq=167 ttl=64 time=1.112 ms
64 bytes from 10.0.0.1: icmp_seq=189 ttl=64 time=1.653 ms
64 bytes from 10.0.0.1: icmp_seq=222 ttl=64 time=1.748 ms
64 bytes from 10.0.0.1: icmp_seq=291 ttl=64 time=1.058 ms
64 bytes from 10.0.0.1: icmp_seq=334 ttl=64 time=1.020 ms
64 bytes from 10.0.0.1: icmp_seq=337 ttl=64 time=1.967 ms
64 bytes from 10.0.0.1: icmp_seq=562 ttl=64 time=1.027 ms
64 bytes from 10.0.0.1: icmp_seq=586 ttl=64 time=1.230 ms
[EMAIL PROTECTED] ~]$ ping tor.fabiankeil.de | grep 'time=[^0]'
64 bytes from 81.169.155.246: icmp_seq=70 ttl=64 time=1.920 ms
64 bytes from 81.169.155.246: icmp_seq=79 ttl=64 time=1.587 ms
64 bytes from 81.169.155.246: icmp_seq=402 ttl=64 time=1.062 ms
[EMAIL PROTECTED] ~]$ ping localhost | grep 'time=[^0]'
64 bytes from 127.0.0.1: icmp_seq=142 ttl=64 time=1.142 ms
64 bytes from 127.0.0.1: icmp_seq=497 ttl=64 time=1.227 ms
64 bytes from 127.0.0.1: icmp_seq=627 ttl=64 time=1.181 ms

10.0.0.1 is on lo1, 81.169.155.246 is on fxp0, both
are filtered with pf. lo0 is skipped. The pings were run
locally while tor was running, the usual ping response times
are below 0.2 ms.
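
The `grep 'time=[^0]'` filter keeps only replies whose round-trip time does not start with a zero, i.e. replies of 1 ms or more. The same filter with an explicit threshold could look like this in Python. It is a sketch: `slow_replies` is a hypothetical helper, and it assumes the reply layout shown in the output above.

```python
import re

# Matches the sequence number and round-trip time of one ping reply line.
REPLY = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def slow_replies(lines, threshold_ms=1.0):
    """Return (icmp_seq, time_ms) for replies at or above threshold_ms."""
    slow = []
    for line in lines:
        match = REPLY.search(line)
        if match and float(match.group(2)) >= threshold_ms:
            slow.append((int(match.group(1)), float(match.group(2))))
    return slow
```

Unlike the grep pattern, the threshold can be lowered, for example to 0.2 ms, to catch every reply slower than the usual response time.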

I get even more obscene ping times if I ping
from home, but my net connection isn't the best.
I'd appreciate it if someone with a reliable net
connection could confirm the weirdness.

Thanks for your time, Robert, I hope to have real
information by tomorrow.

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Fabian Keil
Robert Watson [EMAIL PROTECTED] wrote:

 - Are there any warnings on the console from WITNESS or other
 debugging options?

I just got:

Jun 28 23:01:19 tor kernel: lock order reversal:
Jun 28 23:01:19 tor kernel: 1st 0xc3795000 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1053
Jun 28 23:01:19 tor kernel: 2nd 0xc1043144 system map (system map) @ /usr/src/sys/vm/vm_map.c:2390
Jun 28 23:01:20 tor kernel: KDB: stack backtrace:
Jun 28 23:01:20 tor kernel: kdb_backtrace(0,,c0711af0,c0713440,c06db624) at kdb_backtrace+0x29
Jun 28 23:01:20 tor kernel: witness_checkorder(c1043144,9,c06b90a8,956) at witness_checkorder+0x578
Jun 28 23:01:20 tor kernel: _mtx_lock_flags(c1043144,0,c06b90a8,956) at _mtx_lock_flags+0x5b
Jun 28 23:01:20 tor kernel: _vm_map_lock(c10430c0,c06b90a8,956) at _vm_map_lock+0x26
Jun 28 23:01:20 tor kernel: vm_map_remove(c10430c0,c3bc6000,c3bc8000,d6f55b30,c0623361) at vm_map_remove+0x1f
Jun 28 23:01:20 tor kernel: kmem_free(c10430c0,c3bc6000,2000,d6f55b48,c062524f) at kmem_free+0x25
Jun 28 23:01:20 tor kernel: page_free(c3bc6000,2000,22,2000,d6f55b60) at page_free+0x29
Jun 28 23:01:20 tor kernel: uma_large_free(c3ba5140) at uma_large_free+0x7b
Jun 28 23:01:20 tor kernel: free(c3bc6000,c06d8980,c3bc6000,c483,1400) at free+0xc5
Jun 28 23:01:20 tor kernel: kqueue_expand(c3795000,c06d8a40,500,0) at kqueue_expand+0xd7
Jun 28 23:01:20 tor kernel: kqueue_register(c3795000,d6f55bf4,c3a8f480,1,0) at kqueue_register+0x1b8
Jun 28 23:01:20 tor kernel: kern_kevent(c3a8f480,3,19,200,d6f55cc8) at kern_kevent+0xc9
Jun 28 23:01:20 tor kernel: kevent(c3a8f480,d6f55d04,6,2,212) at kevent+0x55
Jun 28 23:01:20 tor kernel: syscall(2824003b,80e003b,bfbf003b,cb87000,80d5020) at syscall+0x22f
Jun 28 23:01:20 tor kernel: Xint0x80_syscall() at Xint0x80_syscall+0x1f
Jun 28 23:01:20 tor kernel: --- syscall (363, FreeBSD ELF32, kevent), eip = 0x282cc4af, esp = 0xbfbfe9fc, ebp = 0xbfbfea48 ---

Looks similar to http://sources.zabbadoz.net/freebsd/lor.html#185.
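
To compare a report like this one against a list of known lock order reversals, the two locks and their source locations can be pulled out of the log text. A hedged sketch in Python, assuming the `1st`/`2nd` line layout shown above; `parse_lor` is a hypothetical helper, and the pattern tolerates a line break after the `@`:

```python
import re

# One lock line: order, address, lock name, lock type in parentheses,
# then "@" followed by source file and line number.
LOCK = re.compile(r"(1st|2nd) (0x[0-9a-f]+) (.+?) \((.+?)\) @\s+(\S+):(\d+)")


def parse_lor(report):
    """Return one dict per lock mentioned in a WITNESS reversal report."""
    locks = []
    for order, addr, name, lock_type, path, line in LOCK.findall(report):
        locks.append({"order": order, "addr": addr, "name": name,
                      "type": lock_type, "file": path, "line": int(line)})
    return locks
```

The pair of lock names is what matters when looking for a matching entry, since addresses and line numbers change between builds.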

Fabian
-- 
http://www.fabiankeil.de/




Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Robert Watson

On Wed, 28 Jun 2006, Fabian Keil wrote:


 Robert Watson [EMAIL PROTECTED] wrote:

  - Are there any warnings on the console from WITNESS or other
  debugging options?

 I just got:

 Jun 28 23:01:19 tor kernel: lock order reversal:
 Jun 28 23:01:19 tor kernel: 1st 0xc3795000 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1053
 Jun 28 23:01:19 tor kernel: 2nd 0xc1043144 system map (system map) @ /usr/src/sys/vm/vm_map.c:2390
 Jun 28 23:01:20 tor kernel: KDB: stack backtrace:
 Jun 28 23:01:20 tor kernel: kdb_backtrace(0,,c0711af0,c0713440,c06db624) at kdb_backtrace+0x29
 Jun 28 23:01:20 tor kernel: witness_checkorder(c1043144,9,c06b90a8,956) at witness_checkorder+0x578
 Jun 28 23:01:20 tor kernel: _mtx_lock_flags(c1043144,0,c06b90a8,956) at _mtx_lock_flags+0x5b
 Jun 28 23:01:20 tor kernel: _vm_map_lock(c10430c0,c06b90a8,956) at _vm_map_lock+0x26
 Jun 28 23:01:20 tor kernel: vm_map_remove(c10430c0,c3bc6000,c3bc8000,d6f55b30,c0623361) at vm_map_remove+0x1f
 Jun 28 23:01:20 tor kernel: kmem_free(c10430c0,c3bc6000,2000,d6f55b48,c062524f) at kmem_free+0x25
 Jun 28 23:01:20 tor kernel: page_free(c3bc6000,2000,22,2000,d6f55b60) at page_free+0x29
 Jun 28 23:01:20 tor kernel: uma_large_free(c3ba5140) at uma_large_free+0x7b
 Jun 28 23:01:20 tor kernel: free(c3bc6000,c06d8980,c3bc6000,c483,1400) at free+0xc5
 Jun 28 23:01:20 tor kernel: kqueue_expand(c3795000,c06d8a40,500,0) at kqueue_expand+0xd7
 Jun 28 23:01:20 tor kernel: kqueue_register(c3795000,d6f55bf4,c3a8f480,1,0) at kqueue_register+0x1b8
 Jun 28 23:01:20 tor kernel: kern_kevent(c3a8f480,3,19,200,d6f55cc8) at kern_kevent+0xc9
 Jun 28 23:01:20 tor kernel: kevent(c3a8f480,d6f55d04,6,2,212) at kevent+0x55
 Jun 28 23:01:20 tor kernel: syscall(2824003b,80e003b,bfbf003b,cb87000,80d5020) at syscall+0x22f
 Jun 28 23:01:20 tor kernel: Xint0x80_syscall() at Xint0x80_syscall+0x1f
 Jun 28 23:01:20 tor kernel: --- syscall (363, FreeBSD ELF32, kevent), eip = 0x282cc4af, esp = 0xbfbfe9fc, ebp = 0xbfbfea48 ---

 Looks similar to http://sources.zabbadoz.net/freebsd/lor.html#185.


Could you run vmstat -z, netstat -m, and vmstat -m please?

Robert N M Watson
Computer Laboratory
University of Cambridge


Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-28 Thread Fabian Keil
Robert Watson [EMAIL PROTECTED] wrote:

 On Wed, 28 Jun 2006, Fabian Keil wrote:
 
  Robert Watson [EMAIL PROTECTED] wrote:
 
  - Are there any warnings on the console from WITNESS or other
  debugging options?
 
  I just got:
 
  Jun 28 23:01:19 tor kernel: lock order reversal:
  Jun 28 23:01:19 tor kernel: 1st 0xc3795000 kqueue (kqueue)

  Looks similar to http://sources.zabbadoz.net/freebsd/lor.html#185.
 
 Could you run vmstat -z, netstat -m, and vmstat -m please?

I wish I could. The machine died before I read your message.

I was logged in on the serial console running tail -f /var/log/messages.
Last messages were:

Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4275000(2048) val=a020c0de @ 0xc4275000
Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4055800(2048) val=a020c0de @ 0xc4055800
Jun 29 00:42:20 tor kernel: Memory modified after free 0xc4ca(2048) val=a020c0de @ 0xc4ca
Jun 29 00:42:20 tor kernel: Memory modified after free 0xc39ef000(2048) val=a020c0de @ 0xc39ef000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc4bd7000(2048) val=a020c0de @ 0xc4bd7000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3c8a000(2048) val=a020c0de @ 0xc3c8a000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc33bd000(2048) val=a020c0de @ 0xc33bd000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3f1d000(2048) val=a020c0de @ 0xc3f1d000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc45dc800(2048) val=a020c0de @ 0xc45dc800
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc429e000(2048) val=a020c0de @ 0xc429e000
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3aef800(2048) val=a020c0de @ 0xc3aef800
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc432a000(2048) val=a020c0de @ 0xc432a000
Jun 29 00:42:24 tor kernel: ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=34263674
Jun 29 00:42:24 tor kernel: Memory modified after free 0xc3dff800(2048) val=a020c0d
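
When there are many of these messages, a quick tally shows whether they share a buffer size and value. A rough Python sketch, assuming the message layout shown above (`summarize` is a hypothetical helper, not an existing tool):

```python
import re
from collections import Counter

# Address, buffer size in bytes, and the unexpected value that was found.
MODIFIED = re.compile(
    r"Memory modified after free (0x[0-9a-f]+)\((\d+)\)\s+val=([0-9a-f]+)")


def summarize(log_text):
    """Count the messages and tally the buffer sizes and values they report."""
    hits = MODIFIED.findall(log_text)
    return {
        "count": len(hits),
        "sizes": Counter(int(size) for _, size, _ in hits),
        "values": Counter(val for _, _, val in hits),
    }
```

For the excerpt above, every complete line reports a 2048-byte buffer and the value a020c0de; the final line is cut off mid-value, so it would be tallied under its partial value.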

Ctrl+Alt+ESC didn't trigger any reaction, so I caused a reset through
the ISP's webinterface. Now the system appears to be hosed, at least
FreeBSD never reaches the login:
   
PXELINUX 3.11 2005-09-02  Copyright (C) 1994-2005 H. Peter Anvin
Booting from local disk...

1   Linux
2   FreeBSD
3   FreeBSD

Default: 2 

[nothing]

Probably something which would be easy to resolve with
keyboard access and a screen, but I think I'm forced to use
the RecoveryManager. Unfortunately recovery means reinstalling
the preconfigured GNU/Linux, which I can then replace with FreeBSD
again. If there ever was a core dump it will be gone, and so will
kernel.debug.

On the bright side you can choose the OS to go with.
Should I use Current to see if the problem still exists?

Fabian
-- 
http://www.fabiankeil.de/




FreeBSD 6.1 Tor issues (Once More, with Feeling)

2006-06-27 Thread Fabian Keil
There was a request for Tor related problem reports
a while ago, I couldn't find the message again, but I
believe it was posted here.

Last week I installed:
FreeBSD tor.fabiankeil.de 6.1-RELEASE-p2 FreeBSD
6.1-RELEASE-p2 #0: Fri Jun 23 20:06:57 CEST 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/BIGSLEEP  i386.

At the moment it is only acting as a Tor node:
http://serifos.eecs.harvard.edu/cgi-bin/desc.pl?q=zwiebelsuppe
tor-devel (maintainer CC'd) is running jailed in a Geli image;
ntpd, named, cron and sshd are running in the host system
and that's about it. There is no mail or web server and nearly no
traffic besides that caused by Tor.

I started Tor Friday night and had to reset the box three times
since then. The server just suddenly stops responding, the logs
stop as well, therefore I assume it either panics or hangs.

I only have remote access, a serial console is available,
but it becomes unresponsive as well. I didn't configure DDB yet,
so maybe that is to be expected?

cron creates some stats every five minutes. A few minutes
before a hang this morning, the load was:

last pid:  7996;  load averages:  0.40,  0.37,  0.36  up 0+18:38:25  05:55:02
83 processes:  2 running, 66 sleeping, 15 waiting
CPU states: 21.3% user,  0.0% nice, 17.8% system, 20.2% interrupt, 40.7% idle
Mem: 100M Active, 157M Inact, 102M Wired, 12K Cache, 60M Buf, 134M Free
Swap: 1024M Total, 1024M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   11 root        1 171   52     0K     8K RUN    857:30 53.61% idle
   12 root        1 -44 -163     0K     8K WAIT    45:22  6.54% swi1: net
   23 root        1 -68 -187     0K     8K WAIT    14:48  2.83% irq12: fxp0 fxp1
 7973 root        1  96    0  2264K  1544K RUN      0:00  0.51% top
   13 root        1 -32 -151     0K     8K WAIT     5:49  0.10% swi4: clock sio
   33 root        1 171   52     0K     8K pgzero   0:02  0.10% pagezero
    3 root        1  -8    0     0K     8K -        0:16  0.05% g_up
 1586 _tor       14  20    0    99M 97912K kserel 188:36  0.00% tor
   15 root        1 -16    0     0K     8K -        1:01  0.00% yarrow
 1443 root        1  -8    0     0K     8K geli:w   0:49  0.00% g_eli[0] md0
    4 root        1  -8    0     0K     8K -        0:21  0.00% g_down
   35 root        1  20    0     0K     8K syncer   0:17  0.00% syncer
 1439 root        1  -8    0     0K     8K mdwait   0:13  0.00% md0
   24 root        1 -64 -183     0K     8K WAIT     0:08  0.00% irq14: ata0
    2 root        1  -8    0     0K     8K -        0:07  0.00% g_event
   42 root        1 -16    0     0K     8K -        0:06  0.00% schedcpu
  453 root        1  96    0  2920K  1752K select   0:05  0.00% ntpd
  256 _pflogd     1 -58    0  1548K  1216K bpf      0:05  0.00% pflog

pfctl -si:
Status: Enabled for 0 days 18:37:52   Debug: Urgent

Hostid: 0x1ec3da6b

Interface Stats for fxp0              IPv4             IPv6
  Bytes In                     25077859159                0
  Bytes Out                    27498863362                0
  Packets In
    Passed                        36192760                0
    Blocked                          32213                0
  Packets Out
    Passed                        36871432                0
    Blocked                            265                0

State Table                          Total             Rate
  current entries                     5290
  searches                        73567507         1096.8/s
  inserts                           600068            8.9/s
  removals                          594778            8.9/s
Counters
  match                             752600           11.2/s
  bad-offset                             0            0.0/s
  fragment                             102            0.0/s
  short                                  0            0.0/s
  normalize                              2            0.0/s
  memory                                68            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                              0            0.0/s
  proto-cksum                            0            0.0/s
  state-mismatch                     12655            0.2/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              2            0.0/s
  synproxy

Today's traffic graph:
http://www.fabiankeil.de/blog-surrogat/2006/06/27/tor.fabiankeil.de-dritter-ausfall-24-stunden-durchsatz-statistik-595x337.png
(The hang around 14:00 happened while I was logged in doing a buildworld)

At the moment I'm building RELENG_6 with DDB to see if it changes anything
and if I can get a core dump, but so far the problem seems to be
similar to: http://www.freebsd.org/cgi/query-pr.cgi?pr=95180 (closed)
and http://freebsd.rambler.ru/bsdmail/freebsd-questions_2006/msg08692.html.

Is anyone on this