#22602: CollecTor's relaydescs module freezes while downloading from directory authorities
-----------------------------------+--------------------------
     Reporter:  karsten            |          Owner:  metrics-team
         Type:  defect             |         Status:  new
     Priority:  High               |      Milestone:
    Component:  Metrics/CollecTor  |        Version:
     Severity:  Normal             |       Keywords:
Actual Points:                     |      Parent ID:
       Points:                     |       Reviewer:
      Sponsor:                     |
-----------------------------------+--------------------------
 This morning, 2017-06-14 ~07:00, I noticed that the latest consensus retrieved by CollecTor was valid after 2017-06-13 17:00.
 The last log lines from the relaydescs module were:

 {{{
 2017-06-13 17:05:00,001 INFO o.t.c.c.CollecTorMain:66 Starting relaydescs module of CollecTor.
 2017-06-13 17:05:26,184 INFO o.t.c.r.CachedRelayDescriptorReader:255 Finished importing relay descriptors from local Tor data directories:
     cached-consensus: 2017-06-13 17:00:00
     cached-descriptors: parsed 0, skipped 24560 server descriptors
     cached-descriptors.new: parsed 608, skipped 8585 server descriptors
     cached-extrainfo: parsed 0, skipped 24543 extra-info descriptors
     cached-extrainfo.new: parsed 607, skipped 8239 extra-info descriptors
     v3-status-votes: parsed 8, skipped 0 votes
 }}}

 All other modules continued as usual. Here's a stack trace obtained using `jcmd`:

 {{{
 "CollecTor-Scheduled-Thread-8" daemon prio=10 tid=0x00007fedd8006800 nid=0x6411 runnable [0x00007fee023fd000]
    java.lang.Thread.State: RUNNABLE
         at java.net.SocketInputStream.socketRead0(Native Method)
         at java.net.SocketInputStream.read(SocketInputStream.java:153)
         at java.net.SocketInputStream.read(SocketInputStream.java:122)
         at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
         at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
         at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
         - locked <0x000000078fd3b3d8> (a java.io.BufferedInputStream)
         at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:707)
         at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:650)
         at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1371)
         - locked <0x000000078fd3b418> (a sun.net.www.protocol.http.HttpURLConnection)
         at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
         at org.torproject.collector.relaydescs.RelayDescriptorDownloader.downloadResourceFromAuthority(RelayDescriptorDownloader.java:869)
         at org.torproject.collector.relaydescs.RelayDescriptorDownloader.downloadDescriptors(RelayDescriptorDownloader.java:817)
         at org.torproject.collector.relaydescs.ArchiveWriter.startProcessing(ArchiveWriter.java:176)
         at org.torproject.collector.cron.CollecTorMain.run(CollecTorMain.java:67)
         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
         at java.lang.Thread.run(Thread.java:745)
 }}}

 I stopped and restarted CollecTor and am now working on filling the gap of relay descriptors published in these ~16 hours by syncing from the backup instance.

 I guess the fix is to start using a timeout somewhere. It's just curious that we didn't run into this case before. We didn't change anything there recently, did we?

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/22602>
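
For illustration only, here is a minimal sketch of what "using a timeout" could look like around the `HttpURLConnection.getResponseCode()` call where the thread in the stack trace is blocked. It is not the actual CollecTor code or fix: the class name `TimeoutSketch`, the helper `fetchStatus`, and the timeout values are hypothetical. The only real API calls it relies on are `setConnectTimeout` and `setReadTimeout` from `java.net.URLConnection`. Without a read timeout, a peer that accepts the connection and then goes silent can leave the reading thread in `socketRead0` indefinitely. The second method simulates such a peer with a local socket that never answers.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URL;

class TimeoutSketch {

  /**
   * Hypothetical helper: requests the URL and returns the HTTP status code.
   * Throws SocketTimeoutException if connecting or reading takes longer
   * than the given limits, instead of blocking forever.
   */
  static int fetchStatus(URL url, int connectTimeoutMillis,
      int readTimeoutMillis) throws IOException {
    HttpURLConnection huc = (HttpURLConnection) url.openConnection();
    huc.setConnectTimeout(connectTimeoutMillis); // limit for TCP connect
    huc.setReadTimeout(readTimeoutMillis);       // limit for each blocking read
    huc.setRequestMethod("GET");
    try {
      return huc.getResponseCode();
    } finally {
      huc.disconnect();
    }
  }

  /**
   * Simulates a stalled server: the local socket completes the TCP handshake
   * (via the listen backlog) but never sends a response. Returns true if the
   * request was aborted by the read timeout.
   */
  static boolean timesOutAgainstSilentServer(int readTimeoutMillis)
      throws IOException {
    try (ServerSocket silent = new ServerSocket(0)) {
      URL url = URI.create(
          "http://127.0.0.1:" + silent.getLocalPort() + "/").toURL();
      try {
        fetchStatus(url, 1000, readTimeoutMillis);
        return false; // unexpected: the silent server answered
      } catch (SocketTimeoutException e) {
        return true;  // the read timeout fired as intended
      }
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("timed out: " + timesOutAgainstSilentServer(500));
  }
}
```

A caller would still have to decide what to do when the timeout fires, for example log the failure and move on to the next authority. That choice is left open here.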