Re: [tor-relays] Undiagnosable Crashes in Relays

2015-04-21 Thread teor

> On 26 Mar 2015, at 12:38 , teor  wrote:
> 
>> Date: Wed, 25 Mar 2015 17:21:50 +0800
>> From: Vincent Yu 
>> 
>> On Wed, Mar 25, 2015 at 4:26 PM, skyhighatrist 
>> wrote:
>>> I am wondering if anyone has had their relay randomly crash in the
>>> past week or so. Three of mine (I run 6, nonexits) have fallen over.
>>> One of them ~5 days ago, one of them ~4 days ago, and one of them
>>> earlier today.
>> 
>> This also happened to my two relays about three weeks ago. They are:
>> 
>> https://globe.torproject.org/#/relay/C309A31AD772FFDD0805C9FECB6D4748A7CBF684
>> https://globe.torproject.org/#/relay/18BE989663CF3351F73D33C672BB1C985E0EA5D0
>> 
>> They are both middle/guard relays (about 200 Mbps each) on the 0.2.6
>> branch and are on the same dedicated server. They went down at
>> different times, and as far as I can remember, there was nothing
>> notable in the Tor and system logs. No issues prior to this over the
>> past 12 months. I haven't had time to investigate this.
>> 
> 
> This also happened to my EC2 relay a few weeks ago, but I never had time to 
> investigate.
> It's quite possible it was a bad nightly tor version, or some sort of 
> misconfiguration on my end, but the timing matches.
> I didn't manage to get it to come back up in the limited time I had to try 
> and fix it.
> 
> https://globe.torproject.org/#/relay/425E55A8FA145ACFC01FA58CD4E6F46DD7762AAB
> (No record, it's been down for a while.)

As a follow-up, Amazon has now informed me that the hardware the instance was 
on has undergone "irreparable failure". So it appears likely that the tor 
crashes were hardware-related.


teor

teor2345 at gmail dot com
pgp 0xABFED1AC
https://gist.github.com/teor2345/d033b8ce0a99adbc89c5

teor at blah dot im
OTR D5BE4EC2 255D7585 F3874930 DB130265 7C9EBBC7



___
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays


Re: [tor-relays] Undiagnosable Crashes in Relays

2015-03-26 Thread skyhighatrist
Ok, another one went down. Here be the logs, last 30 lines of, pasted
into email, unredacted. Given I just got back from the fucking pub, if
someone else spots the weirdness in this... I'll buy them a pint the
next time I meet them. Now, keep a close eye on the timestamps.

You will note that tor "restarts" or "shuts down" 2 hours before it
seemingly crashes. The last line of the debug log was incomplete,
didn't even have a newline, just *DEAD*. I have restarted the node in
the hopes I can capture more data.

If anyone recognises this behavior, please let me know.
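
To put a number on the gap between the two logs, here is a minimal Python sketch (not part of the original post) that parses the leading timestamp of a Tor log line and subtracts two of them. The function names are made up for illustration, and the year 2015 is an assumption taken from the thread date, since Tor's log lines omit the year.

```python
from datetime import datetime, timedelta

# Tor log lines start with e.g. "Mar 26 11:25:04.000"; the year is not logged.
STAMP_LEN = len("Mar 26 11:25:04.000")


def parse_stamp(line: str, year: int = 2015) -> datetime:
    """Parse the leading timestamp of a Tor log line (the year is assumed)."""
    return datetime.strptime(f"{year} {line[:STAMP_LEN]}", "%Y %b %d %H:%M:%S.%f")


def gap(earlier_line: str, later_line: str) -> timedelta:
    """Time elapsed between two log lines."""
    return parse_stamp(later_line) - parse_stamp(earlier_line)


# Last line of the notice log and (the start of) the last line of the debug log.
last_notice = "Mar 26 11:25:04.000 [notice] Clean shutdown finished. Exiting."
last_debug = "Mar 26 13:03:01.000 [debug] circuit_get_by_circid_channel_impl():"

print(gap(last_notice, last_debug))  # 1:37:57
```

For the two excerpts in this message, the debug log keeps going for a little over an hour and a half after the notice log reports a clean shutdown.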

root@nationalcrime:/var/log/tor# tail -n 30 log
Mar 26 11:16:30.000 [notice] Bootstrapped 85%: Finishing handshake
with first hop.
Mar 26 11:16:30.000 [notice] Self-testing indicates your ORPort is
reachable from the outside. Excellent. Publishing server descriptor.
Mar 26 11:16:31.000 [notice] Bootstrapped 90%: Establishing a Tor circuit.
Mar 26 11:16:32.000 [notice] Tor has successfully opened a circuit.
Looks like client functionality is working.
Mar 26 11:16:32.000 [notice] Bootstrapped 100%: Done.
Mar 26 11:16:33.000 [notice] Self-testing indicates your DirPort is
reachable from the outside. Excellent.
Mar 26 11:16:34.000 [notice] Performing bandwidth self-test...done.
Mar 26 11:20:57.000 [notice] Interrupt: we have stopped accepting new
connections, and will shut down in 30 seconds. Interrupt again to exit
now.
Mar 26 11:21:27.000 [notice] Clean shutdown finished. Exiting.
Mar 26 11:21:46.000 [notice] Tor 0.2.4.26 (git-0b9fcb34f1996a74)
opening log file.
Mar 26 11:21:46.000 [notice] Parsing GEOIP IPv4 file /usr/share/tor/geoip.
Mar 26 11:21:46.000 [notice] Parsing GEOIP IPv6 file
/usr/share/tor/geoip6.
Mar 26 11:21:46.000 [notice] Configured to measure statistics. Look
for the *-stats files that will first be written to the data directory
in 24 hours from now.
Mar 26 11:21:47.000 [warn] I have no descriptor for the router named
"ChickenLiver" in my declared family; I'll use the nickname as is, but
this may confuse clients.
Mar 26 11:21:47.000 [warn] I have no descriptor for the router named
"Neuromancer" in my declared family; I'll use the nickname as is, but
this may confuse clients.
Mar 26 11:21:47.000 [warn] I have no descriptor for the router named
"Cyberia" in my declared family; I'll use the nickname as is, but this
may confuse clients.
Mar 26 11:21:47.000 [warn] I have no descriptor for the router named
"necronomicon" in my declared family; I'll use the nickname as is, but
this may confuse clients.
Mar 26 11:21:47.000 [warn] I have no descriptor for the router named
"SnowCrash" in my declared family; I'll use the nickname as is, but
this may confuse clients.
Mar 26 11:21:47.000 [notice] Your Tor server's identity key
fingerprint is 'NationalCrimeAgency
AC9803701F9EE18194D40B38E47CE4C68CF2F567'
Mar 26 11:21:53.000 [notice] We now have enough directory information
to build circuits.
Mar 26 11:21:53.000 [notice] Bootstrapped 80%: Connecting to the Tor
network.
Mar 26 11:21:53.000 [notice] Self-testing indicates your ORPort is
reachable from the outside. Excellent. Publishing server descriptor.
Mar 26 11:21:53.000 [notice] Bootstrapped 85%: Finishing handshake
with first hop.
Mar 26 11:21:54.000 [notice] Bootstrapped 90%: Establishing a Tor circuit.
Mar 26 11:21:55.000 [notice] Tor has successfully opened a circuit.
Looks like client functionality is working.
Mar 26 11:21:55.000 [notice] Bootstrapped 100%: Done.
Mar 26 11:22:55.000 [notice] Self-testing indicates your DirPort is
reachable from the outside. Excellent.
Mar 26 11:22:56.000 [notice] Performing bandwidth self-test...done.
Mar 26 11:24:34.000 [notice] Interrupt: we have stopped accepting new
connections, and will shut down in 30 seconds. Interrupt again to exit
now.
Mar 26 11:25:04.000 [notice] Clean shutdown finished. Exiting.


DEBUG LOG BEGINS

root@nationalcrime:/var/log/tor# tail -n 30 debug.log
Mar 26 13:03:01.000 [debug] circuit_receive_relay_cell(): Passing on
unrecognized cell.
Mar 26 13:03:01.000 [debug] connection_or_process_cells_from_inbuf():
553: starting, inbuf_datalen 1576 (0 pending in tls object).
Mar 26 13:03:01.000 [debug] channel_queue_cell(): Directly handling
incoming cell_t 0x7fff332939b0 for channel 0x7f238f3dbc80 (global ID 572)
Mar 26 13:03:01.000 [debug] circuit_get_by_circid_channel_impl():
circuit_get_by_circid_channel_impl() returning circuit 0x7f238fc15550
for circ_id 2147484893, channel ID 572 (0x7f238f3dbc80)
Mar 26 13:03:01.000 [debug] circuit_receive_relay_cell(): Passing on
unrecognized cell.
Mar 26 13:03:01.000 [debug] connection_or_process_cells_from_inbuf():
553: starting, inbuf_datalen 1062 (0 pending in tls object).
Mar 26 13:03:01.000 [debug] channel_queue_cell(): Directly handling
incoming cell_t 0x7fff332939b0 for channel 0x7f238f3dbc80 (global ID 572)
Mar 26 13:03:01.000 [debug] circuit_get_by_circid_channel_impl():
circuit_get_by_circid_channel_impl() returning circuit 0x7

Re: [tor-relays] Undiagnosable Crashes in Relays

2015-03-26 Thread skyhighatrist
Hmmm. "NationalCrimeAgency" and "SnowCrash" fell over again sometime
last night it would appear. I have now enabled debug logging on all
hosts, and updated them to the latest available (in the repositories
those hosts are using) version of Tor.

I also note the issue others have mentioned (namely the "CircuitStorm"
issue in the tracker): the relay which also runs a HS has been coming
under incredibly high load, with the CPU maxed out at 100%. Given my HS
is rarely used, such load is ... odd. Other relay and HS operators I have
spoken with privately have mentioned that their HS nodes have also
locked up at 100% recently.

I'll provide debug logs if this happens again post-updates being
applied, which might help with getting to the bottom of this mess.

.d

On 26/03/15 01:38, teor wrote:
>> Date: Wed, 25 Mar 2015 17:21:50 +0800
>> From: Vincent Yu 
>>
>> On Wed, Mar 25, 2015 at 4:26 PM, skyhighatrist 
>> wrote:
>>> I am wondering if anyone has had their relay randomly crash in the
>>> past week or so. Three of mine (I run 6, nonexits) have fallen over.
>>> One of them ~5 days ago, one of them ~4 days ago, and one of them
>>> earlier today.
>>
>> This also happened to my two relays about three weeks ago. They are:
>>
>> https://globe.torproject.org/#/relay/C309A31AD772FFDD0805C9FECB6D4748A7CBF684
>> https://globe.torproject.org/#/relay/18BE989663CF3351F73D33C672BB1C985E0EA5D0
>>
>> They are both middle/guard relays (about 200 Mbps each) on the 0.2.6
>> branch and are on the same dedicated server. They went down at
>> different times, and as far as I can remember, there was nothing
>> notable in the Tor and system logs. No issues prior to this over the
>> past 12 months. I haven't had time to investigate this.
>>
> 
> This also happened to my EC2 relay a few weeks ago, but I never had time to 
> investigate.
> It's quite possible it was a bad nightly tor version, or some sort of 
> misconfiguration on my end, but the timing matches.
> I didn't manage to get it to come back up in the limited time I had to try 
> and fix it.
> 
> https://globe.torproject.org/#/relay/425E55A8FA145ACFC01FA58CD4E6F46DD7762AAB
> (No record, it's been down for a while.)
> 
> teor
> 
> teor2345 at gmail dot com
> pgp 0xABFED1AC
> https://gist.github.com/teor2345/d033b8ce0a99adbc89c5
> 
> teor at blah dot im
> OTR C3C57B23 349825DE 929A1DEF C3531C25 A32287ED
> 
> 
> 


Re: [tor-relays] Undiagnosable Crashes in Relays

2015-03-25 Thread teor
> Date: Wed, 25 Mar 2015 17:21:50 +0800
> From: Vincent Yu 
> 
> On Wed, Mar 25, 2015 at 4:26 PM, skyhighatrist 
> wrote:
>> I am wondering if anyone has had their relay randomly crash in the
>> past week or so. Three of mine (I run 6, nonexits) have fallen over.
>> One of them ~5 days ago, one of them ~4 days ago, and one of them
>> earlier today.
> 
> This also happened to my two relays about three weeks ago. They are:
> 
> https://globe.torproject.org/#/relay/C309A31AD772FFDD0805C9FECB6D4748A7CBF684
> https://globe.torproject.org/#/relay/18BE989663CF3351F73D33C672BB1C985E0EA5D0
> 
> They are both middle/guard relays (about 200 Mbps each) on the 0.2.6
> branch and are on the same dedicated server. They went down at
> different times, and as far as I can remember, there was nothing
> notable in the Tor and system logs. No issues prior to this over the
> past 12 months. I haven't had time to investigate this.
> 

This also happened to my EC2 relay a few weeks ago, but I never had time to 
investigate.
It's quite possible it was a bad nightly tor version, or some sort of 
misconfiguration on my end, but the timing matches.
I didn't manage to get it to come back up in the limited time I had to try and 
fix it.

https://globe.torproject.org/#/relay/425E55A8FA145ACFC01FA58CD4E6F46DD7762AAB
(No record, it's been down for a while.)

teor

teor2345 at gmail dot com
pgp 0xABFED1AC
https://gist.github.com/teor2345/d033b8ce0a99adbc89c5

teor at blah dot im
OTR C3C57B23 349825DE 929A1DEF C3531C25 A32287ED





Re: [tor-relays] Undiagnosable Crashes in Relays

2015-03-25 Thread Speak Freely
My apologies for my lack of memory...

It's still fuzzy, but Roger helped jog the gerbil into action.

There was a line regarding "assert failure" in my logs.

I could not get Tor to start again until I followed these instructions:
https://trac.torproject.org/projects/tor/ticket/13111

My secret onion keys had 0 values, so I removed them and restarted Tor
and it went on its merry way.
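
For anyone wanting to check for the same symptom, here is a rough Python sketch (not something from the ticket). It assumes that "0 values" means a key file that is either empty or filled with NUL bytes, and that the files of interest are `secret_onion_key` and `secret_onion_key_ntor` under the `keys` directory of your DataDirectory. Both are assumptions, so adjust them to your setup.

```python
import tempfile
from pathlib import Path

# Assumed file names under <DataDirectory>/keys; adjust to your setup.
KEY_FILES = ("secret_onion_key", "secret_onion_key_ntor")


def is_blank(path: Path) -> bool:
    """True if the file is empty or contains only NUL bytes."""
    return path.read_bytes().strip(b"\x00") == b""


def blank_keys(keys_dir: Path) -> list[str]:
    """Names of the known key files in keys_dir that exist but are blank."""
    return [
        name
        for name in KEY_FILES
        if (keys_dir / name).is_file() and is_blank(keys_dir / name)
    ]


# Demonstration on a throwaway directory rather than a real DataDirectory.
with tempfile.TemporaryDirectory() as tmp:
    keys = Path(tmp)
    (keys / "secret_onion_key").write_bytes(b"\x00" * 64)
    (keys / "secret_onion_key_ntor").write_bytes(b"not-a-real-key")
    demo_result = blank_keys(keys)
    print(demo_result)  # ['secret_onion_key']
```

On a real relay you would call `blank_keys()` on the `keys` directory itself (as a user allowed to read it) while Tor is stopped.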


It's quite possible the cause of my crash was completely different from
Vincent's and skyhighatrist's, as neither of them indicated they had to do
that to get their relays back up and running.


Matt
Speak Freely




Re: [tor-relays] Undiagnosable Crashes in Relays

2015-03-25 Thread Roger Dingledine
On Wed, Mar 25, 2015 at 08:26:20AM +, skyhighatrist wrote:
> I have no idea why they all fell over. The last thing in the logs was
> the usual "current status" output with some traffic measuring. Seemingly
> immediately afterwards, the process killed itself for no reason.

You might also enjoy
https://www.torproject.org/docs/faq#TorCrash

My first guess is that the out-of-memory killer killed it.
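
One way to test that guess is to search the kernel log for OOM-killer reports. The Python sketch below is only an illustration: the regular expression targets the "Killed process <pid> (<name>)" wording that Linux kernels log, which can differ between kernel versions, and the sample lines are invented rather than taken from any relay in this thread.

```python
import re

# Matches the kernel's "Killed process <pid> (<name>)" report, which the
# OOM killer logs on Linux; exact wording can vary between kernel versions.
KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def oom_kills(lines, name="tor"):
    """Return PIDs of processes called `name` that the kernel reports killing."""
    pids = []
    for line in lines:
        match = KILLED.search(line)
        if match and match.group(2) == name:
            pids.append(int(match.group(1)))
    return pids


# Illustrative lines only, not taken from any relay in this thread.
sample = [
    "Mar 26 13:03:01 host kernel: Out of memory: Kill process 4242 (tor) score 901 or sacrifice child",
    "Mar 26 13:03:01 host kernel: Killed process 4242 (tor) total-vm:2097152kB, anon-rss:1048576kB, file-rss:0kB",
    "Mar 26 13:05:10 host kernel: Killed process 777 (mysqld) total-vm:1024kB, anon-rss:512kB, file-rss:0kB",
]
print(oom_kills(sample))  # [4242]
```

In practice the lines would come from `dmesg` output or from the system's kernel log file, whose location depends on the distribution.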

--Roger



Re: [tor-relays] Undiagnosable Crashes in Relays

2015-03-25 Thread skyhighatrist
I'll do an upgrade across all my relays tonight; hopefully that
resolves the issue in the future.

Interesting that Speak Freely's relay was affected, given it's on the
patched version...

On 25/03/15 13:53, Speak Freely wrote:
> Vincent Yu         is Tor 0.2.6.5-rc on Linux
> skyhighatrist      is Tor 0.2.4.24 on Linux
> my affected relay  is Tor 0.2.5.11 on Linux
>
> Cool.
>
> Matt
> Speak Freely
>
> Nick Mathewson:
>> What version of Tor did these relays run?  Is it possible that
>> one of the crash bugs fixed in 0.2.5.11 is to blame?


Re: [tor-relays] Undiagnosable Crashes in Relays

2015-03-25 Thread Speak Freely
Vincent Yu         is Tor 0.2.6.5-rc on Linux
skyhighatrist      is Tor 0.2.4.24 on Linux
my affected relay  is Tor 0.2.5.11 on Linux

Cool.



Matt
Speak Freely


Nick Mathewson:
> What version of Tor did these relays run?  Is it possible that one of
> the crash bugs fixed in 0.2.5.11 is to blame?
> 


Re: [tor-relays] Undiagnosable Crashes in Relays

2015-03-25 Thread Nick Mathewson
On Wed, Mar 25, 2015 at 4:26 AM, skyhighatrist  wrote:
> Hello List,
> I am wondering if anyone has had their relay randomly crash in the past
> week or so. Three of mine (I run 6, nonexits) have fallen over. One of
> them ~5 days ago, one of them ~4 days ago, and one of them earlier today.
>
> The logs indicate everything was operating normally, and the only reason
> I noticed they had crashed was that, when I was checking what bandwidth
> globe.torproject was measuring for them, it said some were down.
>
> The affected relays are as follows:
> AC9803701F9EE18194D40B38E47CE4C68CF2F567
> 73067CD4ADD8A294BDA913DF45B63190A52B5F9F
> D76252B1A6E9F01FC6772CFFB651056A2B54F92B
>
> I have no idea why they all fell over. The last thing in the logs was
> the usual "current status" output with some traffic measuring. Seemingly
> immediately afterwards, the process killed itself for no reason.

What version of Tor did these relays run?  Is it possible that one of
the crash bugs fixed in 0.2.5.11 is to blame?

-- 
Nick


Re: [tor-relays] Undiagnosable Crashes in Relays

2015-03-25 Thread skyhighatrist
As promised, here's my utterly hideous monitoring script for
checking/restarting relays. It's a work in progress; any advice/comments
would be greatly appreciated.

https://github.com/0x27/relaycheck
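
For readers who just want the basic idea, the following Python sketch shows one simple liveness check: trying to open a TCP connection to the relay's ORPort. It is not the script linked above and makes no claim about how that script works. The function name, the address, and port 9001 are placeholders.

```python
import socket


def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical relay address and ORPort; replace with your own.
    relay = ("127.0.0.1", 9001)
    print("up" if port_open(*relay) else "down")
```

A successful connect only shows that something is listening on the port. It says nothing about whether the relay is publishing a descriptor or carrying traffic.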

Hopefully we can get to the bottom of this weird relay-collapse problem;
it's a bit of a nuisance!

On 25/03/15 12:53, Speak Freely wrote:
> One of my relays went down a few weeks ago, and I didn't notice until a
> few days ago.
> 
> https://atlas.torproject.org/#details/4CA46581FB3C82102565B02C1ECB6DD38EF6654A
> 
> I did find what caused it, but thus far I cannot remember what it was.
> If I remember, I'll post again.
> 


Re: [tor-relays] Undiagnosable Crashes in Relays

2015-03-25 Thread Speak Freely
One of my relays went down a few weeks ago, and I didn't notice until a
few days ago.

https://atlas.torproject.org/#details/4CA46581FB3C82102565B02C1ECB6DD38EF6654A

I did find what caused it, but thus far I cannot remember what it was.
If I remember, I'll post again.



Re: [tor-relays] Undiagnosable Crashes in Relays

2015-03-25 Thread Vincent Yu
On Wed, Mar 25, 2015 at 4:26 PM, skyhighatrist wrote:
> I am wondering if anyone has had their relay randomly crash in the
> past week or so. Three of mine (I run 6, nonexits) have fallen over.
> One of them ~5 days ago, one of them ~4 days ago, and one of them
> earlier today.


This also happened to my two relays about three weeks ago. They are:

https://globe.torproject.org/#/relay/C309A31AD772FFDD0805C9FECB6D4748A7CBF684
https://globe.torproject.org/#/relay/18BE989663CF3351F73D33C672BB1C985E0EA5D0

They are both middle/guard relays (about 200 Mbps each) on the 0.2.6 
branch and are on the same dedicated server. They went down at 
different times, and as far as I can remember, there was nothing 
notable in the Tor and system logs. No issues prior to this over the 
past 12 months. I haven't had time to investigate this.


Vincent