[tor-relays] Suspicious activity
Hello.

Did anyone notice unusual connection count spikes on their relays?
My relay encountered several spikes of about 1k connections, each with a rise time of roughly 10 minutes:
https://imgur.com/a/6JvB7gp
Maybe someone is trying to fool the anti-DDoS protection?

--
Vort
___
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays
Re: [tor-relays] 100K circuit request per minute for hours killed my relay
>> Jul 27 18:08:31.000 [notice] Circuit handshake stats since last time:
>> 5198/5200 TAP, 3994625/3995090 NTor.
> TAP is used for hidden services to connect to intro and rendezvous
> points, and you're not seeing many extra TAP connections.
> So *if* this is related to hidden services, it is not connecting to the
> hidden service directly. Instead, it is sending (exit?) traffic through
> the relays in the hidden service circuit.

I have found two patterns that are associated with "assign_to_cpuworker failed" errors.
First one: heavy overload, millions of NTor handshakes, weight decreases severalfold, and the relay can lose its Guard flag.
Second one: moderate overload, TAP handshakes slightly increased, weight is not affected.

Normal stats:
Jul 24 18:08:29.000 [notice] Circuit handshake stats since last time: 4892/4892 TAP, 61208/61208 NTor.
Jul 25 00:08:29.000 [notice] Circuit handshake stats since last time: 3753/3753 TAP, 61775/61775 NTor.
Jul 25 06:08:29.000 [notice] Circuit handshake stats since last time: 3218/3218 TAP, 57756/57756 NTor.
Jul 25 12:08:29.000 [notice] Circuit handshake stats since last time: 3538/3538 TAP, 56631/56631 NTor.
Jul 25 18:08:29.000 [notice] Circuit handshake stats since last time: 4188/4188 TAP, 60672/60672 NTor.

Overload #1 stats:
Jul 27 12:08:31.000 [notice] Circuit handshake stats since last time: 4715/4715 TAP, 100785/100785 NTor.
Jul 27 18:08:31.000 [notice] Circuit handshake stats since last time: 5198/5200 TAP, 3994625/3995090 NTor.
Jul 28 00:08:31.000 [notice] Circuit handshake stats since last time: 2771/2773 TAP, 4172331/4174404 NTor.
Jul 28 06:08:31.000 [notice] Circuit handshake stats since last time: 1304/1305 TAP, 3899551/3899941 NTor.
Jul 28 12:08:32.000 [notice] Circuit handshake stats since last time: 1415/1416 TAP, 3802487/3803824 NTor.
Jul 28 18:08:32.000 [notice] Circuit handshake stats since last time: 1895/1895 TAP, 843496/843724 NTor.
Jul 29 00:08:32.000 [notice] Circuit handshake stats since last time: 1948/1948 TAP, 34055/34055 NTor.

Overload #2 stats:
Jul 30 06:08:33.000 [notice] Circuit handshake stats since last time: 9288/9288 TAP, 60425/60425 NTor.
Jul 30 12:08:33.000 [notice] Circuit handshake stats since last time: 31739/32038 TAP, 37301/37307 NTor.
Jul 30 18:08:33.000 [notice] Circuit handshake stats since last time: 40316/40993 TAP, 36967/36972 NTor.
Jul 31 00:08:34.000 [notice] Circuit handshake stats since last time: 36414/36830 TAP, 36726/36730 NTor.
Jul 31 06:08:31.000 [notice] Circuit handshake stats since last time: 21715/21801 TAP, 40564/40564 NTor.

I'm not sure what these differences mean, but maybe these stats can help to distinguish the sources of overload (or prove that they are the same).

--
Vort
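The two overload patterns described above could be told apart automatically from the "Circuit handshake stats" lines. The following is only a minimal sketch: the function names and both thresholds are hypothetical (they are not Tor defaults), and reading each pair as handled/requested is an assumption about the log format.

```python
import re

# Matches the periodic notice quoted above. Treating the first number of each
# pair as "handled" and the second as "requested" is an assumption.
STATS_RE = re.compile(
    r"Circuit handshake stats since last time: "
    r"(\d+)/(\d+) TAP, (\d+)/(\d+) NTor"
)


def parse_handshake_stats(line):
    """Return (tap_handled, tap_requested, ntor_handled, ntor_requested), or None."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def classify(line, ntor_limit=1_000_000, tap_limit=20_000):
    """Label one six-hour stats line; both limits are illustrative guesses."""
    stats = parse_handshake_stats(line)
    if stats is None:
        return None
    _, tap_requested, _, ntor_requested = stats
    if ntor_requested >= ntor_limit:
        return "ntor-heavy"      # resembles "Overload #1"
    if tap_requested >= tap_limit:
        return "tap-elevated"    # resembles "Overload #2"
    return "normal"
```

Feeding a log file line by line through `classify` would give a rough timeline of which pattern was active; the limits would need tuning to the relay's normal load.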
Re: [tor-relays] 100K circuit request per minute for hours killed my relay
> Each onion service has six relays each day that serve as the place for
> fetching its onion descriptor, and some onion services are super popular

Exactly 24 hours later, the connection count dropped to 2200 and the "assign_to_cpuworker failed" error stopped appearing.
Thanks!

--
Vort
Re: [tor-relays] 100K circuit request per minute for hours killed my relay
> This sort of thing has been going on for many years. I used to refer
> to it as "mobbing". As nearly as I was ever able to determine, the behavior
> is an unintended consequence of hidden services.

The same thing started to happen today, and I have noticed that 100% CPU usage spikes happen every hour and last for several minutes.
During these spikes, all CPU cores are used and the stack trace points somewhere in the worker_thread_main() function.
Also, today the relay has more connections than usual (5500 vs 2000-3000).
Does this pattern match the characteristics of hidden service activity?

Jul 27 16:09:12.000 [warn] assign_to_cpuworker failed. Ignoring.
...
Jul 27 17:09:13.000 [warn] assign_to_cpuworker failed. Ignoring.
...
Jul 27 18:08:31.000 [notice] Circuit handshake stats since last time: 5198/5200 TAP, 3994625/3995090 NTor.
...
Jul 27 18:09:11.000 [warn] assign_to_cpuworker failed. Ignoring.
...
Jul 27 19:09:11.000 [warn] assign_to_cpuworker failed. Ignoring.
...
Jul 27 20:10:11.000 [warn] assign_to_cpuworker failed. Ignoring.

--
Vort
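The hourly rhythm of the warnings above can be checked by computing the gaps between their timestamps. This is a small sketch; `warning_intervals` is a hypothetical helper, and it assumes every log line starts with a 19-character timestamp in the form shown above (no year, zero-padded day).

```python
from datetime import datetime


def warning_intervals(lines, marker="assign_to_cpuworker failed"):
    """Return the gaps, in seconds, between consecutive lines containing marker."""
    times = [
        # "Jul 27 16:09:12.000" is 19 characters; the year defaults to 1900,
        # which is fine because only differences are used.
        datetime.strptime(line[:19], "%b %d %H:%M:%S.%f")
        for line in lines
        if marker in line
    ]
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


log = [
    "Jul 27 16:09:12.000 [warn] assign_to_cpuworker failed. Ignoring.",
    "Jul 27 17:09:13.000 [warn] assign_to_cpuworker failed. Ignoring.",
    "Jul 27 18:09:11.000 [warn] assign_to_cpuworker failed. Ignoring.",
    "Jul 27 19:09:11.000 [warn] assign_to_cpuworker failed. Ignoring.",
    "Jul 27 20:10:11.000 [warn] assign_to_cpuworker failed. Ignoring.",
]
print(warning_intervals(log))
```

For the five warnings quoted above, the gaps all fall within about a minute of 3600 seconds, which matches the "every hour" observation.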
Re: [tor-relays] 100K circuit request per minute for hours killed my relay
> Your message prompted me to check logs, and on one relay I see the following:

Similar thing for me:
Jul 19 00:08:27.000 [notice] Circuit handshake stats since last time: 3571/3571 TAP, 41180/41180 NTor.
Jul 19 06:08:27.000 [notice] Circuit handshake stats since last time: 2054/2054 TAP, 29181/29181 NTor.
Jul 19 12:08:28.000 [notice] Circuit handshake stats since last time: 2773/2773 TAP, 26497/26497 NTor.
Jul 19 18:08:28.000 [notice] Circuit handshake stats since last time: 3970/3970 TAP, 31344/31344 NTor.
Jul 20 00:08:28.000 [notice] Circuit handshake stats since last time: 4096/4096 TAP, 41730/41730 NTor.
Jul 20 06:08:28.000 [notice] Circuit handshake stats since last time: 18285/18285 TAP, 54102/54102 NTor.
Jul 20 12:08:28.000 [notice] Circuit handshake stats since last time: 61136/61386 TAP, 378196/378339 NTor.
Jul 20 18:08:29.000 [notice] Circuit handshake stats since last time: 73297/73688 TAP, 566708/566892 NTor.
Jul 21 00:08:29.000 [notice] Circuit handshake stats since last time: 67165/67830 TAP, 572685/572851 NTor.
Jul 21 06:08:29.000 [notice] Circuit handshake stats since last time: 31988/32138 TAP, 521455/521536 NTor.
Jul 21 12:08:29.000 [notice] Circuit handshake stats since last time: 5523/5523 TAP, 222378/222432 NTor.

Also, there are too many "[warn] assign_to_cpuworker failed. Ignoring." lines in the logs.

--
Vort
[tor-relays] Windows relay performance: svchost process
Hello.

Since the connection count for my Windows 7 relay began to grow, I have started to notice an unusual CPU load pattern: periodic spikes every ~8 seconds.
At first I thought that these spikes were generated only by the tor.exe process, but then I found that the svchost.exe process has identical spikes (I guess they belong to the NlaSvc system service).
Does anyone know why this can happen?
Here is the screenshot:
https://s8.hostingkartinok.com/uploads/images/2017/07/6fe01ab8601548f95ba47602de0f3739.png

--
Vort
Re: [tor-relays] Consensus Weight calculation
Finally, I have written my test program and found that I was wrong about two things:
1. Low-weight relays (< 30) rarely give fast speeds (> 150 KiB/s) on two-hop circuits. With three hops, fast speeds are even rarer.
2. The Windows version of Tor really does have some problems.

I don't quite understand the factors that influence circuit speed, but at least I have found how my relay can end up with a low bandwidth estimate.

I selected two entry nodes and two exit nodes:
refEntry1 D665C959571041972EA8C0DD77559EF5579BA112
refEntry2 13B2354C74CCE29815B4E1F692F2F0E86C7F13DD
refExit1 5CECC5C30ACC4B3DE462792323967087CC53D947
refExit2 07C05ED4825F51D5BE4CDBBAA80BFA484132A2F5

Then I launched four circuits:
1. refEntry1, myNode, refExit1
2. refEntry1, myNode, refExit2
3. refEntry2, myNode, refExit1
4. refEntry2, myNode, refExit2

And measured their bandwidth:
1. 117 KiB/s
2. 122 KiB/s
3. 59 KiB/s
4. 51 KiB/s

That was pretty strange. Previous tests with speedtest.net showed that speeds of 500-1000 KiB/s are not a problem for my connection.

The next idea was to measure a neighbor relay (from the same city and ISP), with that relay taking the place of myNode. This resulted in the following speeds:
1. 356 KiB/s
2. 392 KiB/s
3. 375 KiB/s
4. 271 KiB/s

Not too fast, but definitely better than the result for my relay.
So one of the limiting factors is located somewhere on my computer.
The connection count is fine, and RAM and CPU are also good enough.
The only difference left is the operating system.

I could boot from a USB stick with some Linux, but first I decided to run a test in a virtual machine.
I set up port forwarding, launched Ubuntu with a new Tor relay, and here is the result:
1. 452 KiB/s
2. 375 KiB/s
3. 141 KiB/s
4. 163 KiB/s

Usually adding a virtual machine leads to worse results. But not this time.

So the next question is: how can the Linux version, running inside Windows, perform three times better than the Windows version alone?
And what is the real limit for my configuration?
Even if the 100 KiB/s speed changes to 300 KiB/s, this will still be far from the 1 MiB/s, 5 MiB/s, 10 MiB/s that are possible with my connection.

--
Vort
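The "three times better" figure can be reproduced from the per-circuit speeds listed above. A minimal sketch follows; the variable and function names are hypothetical, and averaging the four circuits with a plain mean is an assumption about how to summarize them.

```python
from statistics import mean

# Per-circuit speeds in KiB/s, copied from the three test runs above.
windows_relay = [117, 122, 59, 51]
neighbor_relay = [356, 392, 375, 271]
linux_vm_relay = [452, 375, 141, 163]


def speedup(baseline, candidate):
    """Ratio of the mean circuit speed of candidate to that of baseline."""
    return mean(candidate) / mean(baseline)


print(round(speedup(windows_relay, linux_vm_relay), 2))
print(round(speedup(windows_relay, neighbor_relay), 2))
```

With these numbers the Linux VM averages a little over three times the Windows relay, and the neighbor relay roughly four times.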
Re: [tor-relays] Consensus Weight calculation
> We tried to unstick some of the lowest bandwidth relays (below 1000),

You mean 0..1 weight (1 KiB/s and lower)? They may be really bad.
I guess that a good test range is 10..20 (or 5..30).

> and our initial results are:
> * most (15) of relays we tried are actually very slow, or down,
> * some (3) relays that we tried went down before we could see if we had
> changed anything,

Relays can be additionally filtered by the Stable flag (or uptime).
If a relay hasn't rebooted for weeks, there is a good chance that it will stay online during the tests too.

> We are trying on a larger set now.

I hope this will give more successful attempts.

> But these results indicate that most relays
> that are measured slow are actually slow for tor clients.
> (Which is what matters.)

If the larger set does not give better results, I will try to write my own test program and launch it from my location.
Maybe a different approach will give different results.
I am still sure that the low weight estimate is hiding many fast relays.

--
Vort
Re: [tor-relays] Consensus Weight calculation
In order to estimate the effect of the relay-unsticking measures, I have made some graphs.

The first graph shows what share of relays in the network have weight < 20 (in percent, relative to the total count of measured, valid and running relays):
https://s8.hostingkartinok.com/uploads/images/2017/06/e905280414853a800031e7ffbeba02c0.png
I see a drop on June 15: from ~7.4% to ~6.6%.
This corresponds to the increase of my relay's weight from ~10 to ~20.
It looks like this is the effect of increasing the minimum test file size.

The second graph is the same, but for weight < 100:
https://s8.hostingkartinok.com/uploads/images/2017/06/7768344880b81c80442aaa383550e117.png
No effect can be seen at this scale.

(I don't know if this analysis can help, but, anyway, here it is.)

--
Vort
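The percentage plotted in those graphs can be computed with a few lines. This is a sketch only: `low_weight_share` is a hypothetical helper, the input is assumed to be already filtered to measured, valid and running relays, and the sample weights are made up for illustration (they are not consensus data).

```python
def low_weight_share(weights, threshold=20):
    """Percent of relays whose consensus weight is strictly below threshold."""
    if not weights:
        return 0.0
    below = sum(1 for weight in weights if weight < threshold)
    return 100.0 * below / len(weights)


# Made-up sample, for illustration only.
sample_weights = [5, 10, 19, 20, 100, 5000, 34, 250, 1200, 15]
print(low_weight_share(sample_weights))        # threshold 20
print(low_weight_share(sample_weights, 100))   # threshold 100
```

Running this once per archived consensus and plotting the result over time would give curves of the kind linked above.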
Re: [tor-relays] Consensus Weight calculation
>> Windows does not limit connection count for processes and users.
>> There are also no system-wide limit for sockets.
>> Except for available dynamic port range (1025-64510 on my computer).
> Depending on your Windows version, the limit may be around 2000-4000,
> check this article:
> http://smallvoid.com/article/winnt-tcpip-max-limit.html

1. This article is from 2004. Most of the described parameters have been removed on modern systems.
2. The available port range on my computer is 64510 - 1025 = 63485 ports. But this limits only the outbound connection count. The inbound connection count is unlimited.
https://msdn.microsoft.com/en-us/library/windows/desktop/ms739169(v=vs.85).aspx :
"The Microsoft Winsock provider limits the maximum number of sockets supported only by available memory on the local computer."
3. The half-open connections limit mentioned in the article was removed starting from Windows 7.

> You should also check how many connections your relay is actually
> making.

Jun 28 01:30:51.000 [notice] Since startup, we have initiated 0 v1 connections, 0 v2 connections, 0 v3 connections, and 384 v4 connections; and received 26 v1 connections, 0 v2 connections, 0 v3 connections, and 639 v4 connections.
Jun 28 07:30:51.000 [notice] Since startup, we have initiated 0 v1 connections, 0 v2 connections, 0 v3 connections, and 664 v4 connections; and received 46 v1 connections, 0 v2 connections, 0 v3 connections, and 1200 v4 connections.
Jun 28 13:30:51.000 [notice] Since startup, we have initiated 0 v1 connections, 0 v2 connections, 0 v3 connections, and 725 v4 connections; and received 63 v1 connections, 1 v2 connections, 0 v3 connections, and 1469 v4 connections.

> Ok, so if your relay is in the 16MB bucket, it should be measured
> at at least 200 after a few weeks. But it's hard to tell which
> bucket each relay is in, that depends on the bandwidth authority.

If my relay resurrects, that would be great.
But the more important goal is to prevent relays from getting stuck like this in the first place.
Instead of + 1 MiB/s this can yield + 1 GiB/s.

> That might unstick your relay.
> We need to know if this happens, because it helps us to know what to do
> to fix stuck relays.

Yes, it can. Some relays are already unstuck (9FC2673BB2704C2AAB851F8334938565DF1D0819, 143BC876D403003FBEF2AA843942DC4D248E3872 for example).
But some are stuck even deeper (B918EB3FA4D03A4F9F632AA17F217A6C04044EF7, BD4354E76929C90B7004FF149A3C52189A3B4634).
So my fear is that some relays get unstuck at the expense of others getting stuck.
If this is not the case, that is great!

> We are working on it a few different ways:
> * increasing the minimum bandwidth authority file size
> * making an automatic process to un-stick stuck relays
> * getting more bandwidth authorities in more places
> * re-writing the bandwidth authority code

I saw some changes and was wondering whether they were random or not.
Thanks for your work.

--
Vort
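To follow how many connections the relay has made over time, the "Since startup" notices quoted above can be summed per direction. A rough sketch, with hypothetical names; it assumes the line keeps the exact wording shown above, with the initiated counts before "and received" and the received counts after it.

```python
import re

# One "<count> v<N> connections" item from the notice quoted above.
COUNT_RE = re.compile(r"(\d+) v\d connections")


def connection_totals(line):
    """Return (initiated, received) totals from one 'Since startup' notice."""
    initiated_part, separator, received_part = line.partition("and received")
    if not separator:
        raise ValueError("not a connection summary line")
    initiated = sum(int(count) for count in COUNT_RE.findall(initiated_part))
    received = sum(int(count) for count in COUNT_RE.findall(received_part))
    return initiated, received


line = (
    "Jun 28 13:30:51.000 [notice] Since startup, we have initiated "
    "0 v1 connections, 0 v2 connections, 0 v3 connections, and 725 v4 connections; "
    "and received 63 v1 connections, 1 v2 connections, 0 v3 connections, "
    "and 1469 v4 connections."
)
print(connection_totals(line))
```

Note that these are cumulative counts since startup, not the number of connections open at one moment, so they only bound the concurrent connection count from above.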
Re: [tor-relays] Consensus Weight calculation
> This could be part of your issue.
> The code for tor relays on Windows is not maintained very well.

There are many relays on Windows that are not stuck, and many relays on Linux that are stuck.

> What is the connection / handle limit on the tor process and the user
> you are using for the tor process?
> For a non-exit relay, it needs to be around 10,000.
> For an large exit relay, it needs to be 50,000 or so.

Windows does not limit the connection count for processes and users.
There is also no system-wide limit for sockets, except for the available dynamic port range (1025-64510 on my computer).

> Now check the latency and bandwidth to these directory authorities.
> But only do to once, they have a lot of load already.
> Also, use gabelmoobwauth, rather than gabelmoo.
> And check Faravahar.

Latency (ping):
longclaw / 199.254.238.53 : 187 ms
gabelmoobwscan / 131.188.40.189 : 44 ms
moria1 / 128.31.0.34 : 128 ms
faravahar / 154.35.175.225 : 147 ms

Bandwidth (via PrivacyRepublic0001 and 16M file from 38.229.72.16):
longclaw : 285 KiB/s
gabelmoobwscan : 1195 KiB/s
moria1 : 404 KiB/s
faravahar : 141 KiB/s

> Ok, the next limit will be the observed bandwidth.

After yesterday's test #5, the observed bandwidth changed to 1.12 MiB/s.

> You need to be patient.

That's not a problem if I know that something will definitely change in the future.

--
Vort
Re: [tor-relays] Consensus Weight calculation
> Before we investigate the measurements:
> * We need to know if anything on your relay or at your provider
> is making your relay slow,
> * We need to be know which measurement of your relay is slow.
> I made a wiki page to tell people how to do that:
> https://trac.torproject.org/projects/tor/wiki/doc/MyRelayIsSlow
> Please go through the steps, and let us know what results you get.
> Then someone can help you more.

> 1. Check RAM, CPU, and socket/file descriptor usage on your relay

The private bytes of the tor.exe process amount to 116 MiB; 3.4 GiB of system memory is available (out of 8 GiB total).
The log shows:
"Based on detected system memory, MaxMemInQueues is set to 2048 MB. You can override this by setting MaxMemInQueues by hand."

Tor uses 0-1% of CPU on average.
Sometimes it consumes 25% of CPU (100% of 1 core) for 10 seconds and then returns to the normal 0-1% usage.

The Tor process has 573 handles open and about 380 established TCP connections.
But this is unusual activity, related to the Faravahar downtime and, as a consequence, to obtaining the Fast and HSDir flags.
Usually, it has only 20-30 connections established.

> 2. Check the internet peering (bandwidth, latency) from your relay's
> provider to other relays.

Latency (ping):
hviv104 / 192.42.116.16 : 40 ms
PrivacyRepublic0001 / 178.32.181.96 : 39 ms
Unnamed / 185.170.41.8 : 36 ms
McCormickRecipes / 18.85.22.204 : 135 ms
PhantomTrain4 / 65.19.167.131 : 184 ms

Bandwidth (via dopper / 192.42.113.102 and bwauth's 16M file):
hviv104 : 50 KiB/s
PrivacyRepublic0001 : 1.3 MiB/s
Unnamed : 155 KiB/s
McCormickRecipes : 947 KiB/s
PhantomTrain4 : 899 KiB/s

> 3. Check each of the votes for your relay on consensus-health
> (large page), and check the median:

Consensus was published 2017-06-27 12:00:00.
longclaw : bw=34
gabelmoo : bw=41
moria1 : bw=23
*median* : bw=34

> 4. Check your relay's observed bandwidth and bandwidth rate (limit).

Bandwidth rate : 1 MiB/s
Bandwidth burst : 3 MiB/s
Observed bandwidth : 250.77 KiB/s

> 5. Run a test using tor to see how fast tor can get on your network/CPU:

This will alter the observed bandwidth. But okay.
Depending on the exit node, the result varies from 117 KiB/s to 1 MiB/s.
Example:
$ curl --socks5-hostname localhost:9050 --insecure -O https://38.229.72.16/bwauth.torproject.org/16M
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 16.0M  100 16.0M    0     0   857k      0  0:00:19  0:00:19 --:--:--  967k

> 6. Run a test using tor and chutney to find out how fast tor can get on
> your CPU. Keep increasing the data volume until the bandwidth stops
> increasing:

As this tool is designed for Linux, I can run it only within a virtual machine.
But the results are still good:
$ CHUTNEY_DATA_BYTES=104857600 ./chutney verify networks/basic-min
...
Single Stream Bandwidth: 99.46 MBytes/s
Overall tor Bandwidth: 397.85 MBytes/s
$ CHUTNEY_DATA_BYTES=1048576000 ./chutney verify networks/basic-min
...
Single Stream Bandwidth: 43.52 MBytes/s
Overall tor Bandwidth: 174.07 MBytes/s

> It might not be me that helps you.
> So please talk to the list when you write back.

But no one else has shown interest in answering this topic.

--
Vort
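The median in step 3 above can be checked mechanically from the three votes. This is just a sketch: `consensus_bandwidth` is a hypothetical name, and choosing the low median for an even number of votes is an assumption made here, not something stated in this thread.

```python
from statistics import median_low

# Bandwidth votes from the 2017-06-27 12:00:00 consensus, as listed above.
votes = {"longclaw": 34, "gabelmoo": 41, "moria1": 23}


def consensus_bandwidth(votes_by_authority):
    """Median of the per-authority bw votes (low median for even counts: an assumption)."""
    return median_low(votes_by_authority.values())


print(consensus_bandwidth(votes))
```

With an odd number of votes, as here, the low median and the ordinary median coincide, so the assumption does not affect this particular result.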
Re: [tor-relays] Consensus Weight calculation
Hello, teor.

Is it worth waiting until you have time to investigate the stuck relays problem?

--
Vort
Re: [tor-relays] Consensus Weight calculation
>> Are there any news regarding that problem?
> It's a complicated problem.
> We need to work out why it is happening before we can fix it.

I am not sure if I can really help with the BwAuth's algorithms.
But here is what I would do in such a situation:

First, the most important thing is a clear goal.
As the BwAuth's task is to make measurements, the result of those measurements must be clearly defined.
So the question is: what does an ideal "w" ([Consensus] Weight) look like?
Suppose we have some perfectly measured parameters for a relay: perfect_measured_bandwidth (KiB/sec) and perfect_measured_latency (sec).
Their exact definition can be omitted for now for simplicity.
Should ideal_w then be equal to:
ideal_w = perfect_measured_bandwidth
ideal_w = k * perfect_measured_bandwidth
ideal_w = k * perfect_measured_bandwidth / perfect_measured_latency
or, maybe, something else?

Once this question is answered, it is possible to see how far the existing implementation is from this ideal.
And here is part two: why is it different?
Is it some noise that can't be eliminated?
Or does the algorithm, for some reason, converge to incorrect values?
BwAuths have some debug information. This can help to investigate the problem.
There is also the possibility of doing manual checks, which can then be compared to the BwAuth's debug logs.

Maybe my approach is not so good and you will choose another.
But, anyway, thanks for working on this problem.

--
Vort
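The three candidate definitions of ideal_w listed above can be written as one function, which makes it easier to compare them against real measurements. This is a sketch of the question being asked, not of any existing BwAuth code; the function name and the default k = 1 are assumptions.

```python
def ideal_weight(bandwidth_kib_s, latency_s=None, k=1.0):
    """Candidate definitions of ideal_w from the message above.

    Without a latency:  ideal_w = k * perfect_measured_bandwidth
                        (k = 1 gives the first candidate).
    With a latency:     ideal_w = k * perfect_measured_bandwidth
                                  / perfect_measured_latency
    """
    if latency_s is None:
        return k * bandwidth_kib_s
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return k * bandwidth_kib_s / latency_s


# Example: a relay measured at 1024 KiB/s with 0.2 s latency.
print(ideal_weight(1024))
print(ideal_weight(1024, latency_s=0.2))
```

Comparing any of these against the published weight for the same relay would show how far the existing estimate is from the chosen ideal.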
Re: [tor-relays] Consensus Weight calculation
ay was slow in
> the past.

My relay was never slow.
The possibility of a relay randomly getting stuck like this is something that needs to be eliminated.

--
Vort
Re: [tor-relays] Consensus Weight calculation
> We know of relays that have improved their bandwidth measurements by
> changing their keys (this resets the measurements).

1. It is not possible to change keys for relays which you don't control.
2. It is better to have algorithms that can't get stuck.

> But most relays get low weights because:
> * they can not get enough CPU or RAM,
> * they can not keep enough connections open,

This is not the case for relays with 1 KiB/s load.

> * they go up and down a lot,

My examples were stable relays.

> * they change IP address a lot,

Or ExoneraTor is lagging, but 3 of the 4 example relays were using the same addresses a month ago.

> * they do not get good bandwidth over time,
> * they do not get good bandwidth to the rest of the tor network,
> * they have high latency to the rest of the tor network,

This can be measured.
For example, BD4354E76929C90B7004FF149A3C52189A3B4634 is capable of serving 1 MiB/s (I built a circuit through it this morning):
r Hedgehog vUNU52kpyQtwBP8UmjxSGJo7RjQ BG894JEWmT0pcLmWTabGYlWT5Iw 2017-06-13 06:08:30 212.26.140.81 443 0
...
w Bandwidth=1024 Measured=5

> * some other reasons that make them less useful to clients.

It looks like clients have no influence on the BwAuth's decisions.

> But this isn't enough information to work out what the problem is.
> Maybe there is a problem with the relay, not the measurements.
> We just can't tell.

What additional information can help?

> Maybe the relay has low CPU, international bandwidth, or connection
> limits. We just don't know.

If it can relay a lot of traffic, then it has no such problems.

> We would need to talk to the operator to find out.

I would not have raised this question if I weren't such an operator.

> These measurements are updated over time.
> Please check again after a few weeks.

They already show that the relay is more capable than it is rated.

> I think this spike means:
> "You think your provider is giving you 100 Mbps, but they are
> actually giving you much less. Talk to them about it."
> Usually this is because the provider only tries to give everyone
> 100Mbps, or they limit everyone and don't tell them, or they don't
> pay enough to get good international bandwidth.

The exact number does not matter.
The problem is that the weight histogram has no equivalent spike.
Here is another histogram:
https://s8.hostingkartinok.com/uploads/images/2017/06/749e7e3be806c22f3dd5c0e9586304ab.png
(x, y and colors are the same)
I just filtered the relays so that their Advertised Bandwidth is in the range 110..135.
I wouldn't say these values are "proportional" enough.

--
Vort
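The gap between the two values on the "w" line quoted above (Bandwidth=1024 versus Measured=5) can be extracted with a small parser. This is a minimal sketch: `parse_w_line` is a hypothetical helper, and it assumes every item after the leading "w" has the form Key=integer, as in the quoted line.

```python
def parse_w_line(line):
    """Parse a 'w Key=Value ...' line into a dict of integers."""
    parts = line.split()
    if not parts or parts[0] != "w":
        raise ValueError("not a 'w' line")
    return {key: int(value) for key, value in (item.split("=", 1) for item in parts[1:])}


fields = parse_w_line("w Bandwidth=1024 Measured=5")
# How many times larger the Bandwidth value is than the Measured value.
print(fields["Bandwidth"] / fields["Measured"])
```

For the relay above this gives a factor of about 200 between the two values.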
Re: [tor-relays] Consensus Weight calculation
> Thanks for writing to us.

Thanks for the answers.

> This is a question that gets asked a lot:
> "Many people set up new fast relays and then wonder why their bandwidth
> is not fully loaded instantly…"
> https://blog.torproject.org/blog/lifecycle-of-a-new-relay

Maybe this is correct in many cases, but definitely not in all of them.
For example, this line:
"once the bwauths have measured you and the directory authorities lift the 20KB cap, you'll attract more and more traffic"
Events can go the other way: the bwauths assign a lower weight, and the relay gets less and less traffic.

> It can take a week or two for the bandwidth authorities to measure a
> relay.

A relay that hits the problem can stay underloaded for months.

> I'm not sure if this is a problem. And I'm not sure how many relays it
> impacts.

Hundreds, I guess.
Here are some examples:

https://atlas.torproject.org/#details/9FC2673BB2704C2AAB851F8334938565DF1D0819
Now used bandwidth: 1KiB/s
Advertised Bandwidth: 131.38 KiB/s
Top used bandwidth: 250KiB/s
Bandwidth rate: 4000KiB/s

https://atlas.torproject.org/#details/B918EB3FA4D03A4F9F632AA17F217A6C04044EF7
Now used bandwidth: 1KiB/s
Advertised Bandwidth: 82.65 KiB/s
Top used bandwidth: 245KiB/s
Bandwidth rate: 800KiB/s

https://atlas.torproject.org/#details/DF1C6C645C5854780778A3E81D12F2A8FF65744B
Now used bandwidth: 1KiB/s
Advertised Bandwidth: 62.29 KiB/s
Top used bandwidth: 7KiB/s
Bandwidth rate: 3000KiB/s

https://atlas.torproject.org/#details/E2AF5879F39FF40DF8994E9B8FAEAB2518AEEBA4
Now used bandwidth: 1KiB/s
Advertised Bandwidth: 70.94 KiB/s
Top used bandwidth: 916KiB/s
Bandwidth rate: 1000KiB/s

As you can see, most of them can handle a lot more traffic: 50x-4000x.
I also don't see why they would have high latency.
Good relays, in my opinion.

> But we know there is a bias in Tor's measurements towards North America
> and Europe, because that's where most of the measurements are made from:

No, this has no impact in this case.
I have launched my own instance of BwAuthority and I see that the measured "filt_bw" values are pretty close to "Advertised Bandwidth":
node_id=$9FC2673BB2704C2AAB851F8334938565DF1D0819 nick=qq strm_bw=52732 filt_bw=77967 circ_fail_rate=0.0 desc_bw=134537 ns_bw=13000
node_id=$9FC2673BB2704C2AAB851F8334938565DF1D0819 nick=qq strm_bw=61278 filt_bw=70430 circ_fail_rate=0.0 desc_bw=85495 ns_bw=13000
node_id=$B918EB3FA4D03A4F9F632AA17F217A6C04044EF7 nick=TranTor strm_bw=40485 filt_bw=47052 circ_fail_rate=0.0 desc_bw=84635 ns_bw=12000
The problem is in the next step, I think.

>> The result has revealed some anomalies:
>>
>> https://s8.hostingkartinok.com/uploads/images/2017/06/fed1cf8b57fc027223c8eaf3deb0d28a.png
>> First, and most important, - a lot of relays have bandwidth estimate
>> in range 0-50: 1082 of them.
> I don't know what each axis is on this graph.

x is KiB/s, y is count (yellow bars are for "Advertised Bandwidth", blue for "Consensus Weight", grey means both values).

> 20 is the default, 50 is the maximum for a relay's self-test.
> If a relay isn't measured, or measures very low, it usually gets a
> figure in this range.

I have excluded non-measured relays from this histogram.

>> Second - there are incorrect estimates
>> for popular bandwidths of 5, 10 and 20 MBits.
> I don't understand what you mean here. The advertised bandwidth is in
> kilobytes per second, and the consensus weight is dimensionless (but
> scaled from kilobytes per second).
> Can you point out the lines you mean?

Look at the yellow spike at x = ~1200.
The low blue bars at the same point mean that the "Consensus Weight" model did not take into account that there are many 1200 KiB/s nodes on the network, which will result in their underload.

--
Vort
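The comparison between "filt_bw", "desc_bw" and "ns_bw" in the lines above can be made explicit by parsing the key=value pairs. This is a rough sketch with hypothetical helper names; it assumes each whitespace-separated item has the form key=value, as in the quoted output.

```python
def parse_bwauth_line(line):
    """Split 'key=value' items into a dict, converting the *_bw fields to int."""
    fields = dict(item.split("=", 1) for item in line.split())
    for key in ("strm_bw", "filt_bw", "desc_bw", "ns_bw"):
        fields[key] = int(fields[key])
    return fields


def bandwidth_ratios(line):
    """Return (filt_bw / desc_bw, filt_bw / ns_bw) for one measurement line."""
    fields = parse_bwauth_line(line)
    return fields["filt_bw"] / fields["desc_bw"], fields["filt_bw"] / fields["ns_bw"]


sample = (
    "node_id=$9FC2673BB2704C2AAB851F8334938565DF1D0819 nick=qq strm_bw=52732 "
    "filt_bw=77967 circ_fail_rate=0.0 desc_bw=134537 ns_bw=13000"
)
print(bandwidth_ratios(sample))
```

For the first quoted line, filt_bw is a bit under 60% of desc_bw but about six times ns_bw, which is the sense in which the measurement sits much closer to the advertised value than to the consensus value.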
[tor-relays] Consensus Weight calculation
Hello.

Recently I decided to create a new relay.
After several days of waiting, I realized that the Bandwidth Authorities' decision that my bandwidth is 1000 times lower than it should be is pretty stable.
That is bad on its own, but I was wondering: how many other relays suffer from the same problem?
Since all network data is open to analysis, I decided to calculate some statistics.

As "Consensus Weight" should, theoretically, correspond to a relay's bandwidth, my first thought was to compare it with the "Advertised Bandwidth" value (assuming there are not too many liars on the network).
The result has revealed some anomalies:
https://s8.hostingkartinok.com/uploads/images/2017/06/fed1cf8b57fc027223c8eaf3deb0d28a.png
First, and most important: a lot of relays have a bandwidth estimate in the range 0-50: 1082 of them.
Second: there are incorrect estimates for the popular bandwidths of 5, 10 and 20 MBits.

The next question was: what estimates were actually assigned to those bandwidth spikes? Maybe all zeroes?
This led me to two more charts:
https://s8.hostingkartinok.com/uploads/images/2017/06/8cefb70fce667a1b89c783ed2bfc9442.png
https://s8.hostingkartinok.com/uploads/images/2017/06/2e42634ea3f9b71df8a7fd17c27660d9.png
Here x is "Advertised Bandwidth" and y is "Consensus Weight".
I expected to see something close to the x = y line, but the result was much worse.
The first problem (not too important) is a lot of randomness. A 5 MiB relay can easily be detected as 1 MiB or 10 MiB.
The second one is something that probably steals a lot of available network bandwidth: relays with low "Advertised Bandwidth" get much less traffic than they can handle.
Almost no relay with speed < 500 KiB is rated correctly.
Similarly, high-speed relays have a higher weight than needed.

If all 0-50KiB-estimated relays are capable of serving at least 100 KiB, fixing this problem will lead to an increase of network bandwidth of about (100-25)*1082 = 81150 KiB/s, or roughly 79 MiB/s.
But they have even more potential, I think.

Does anyone have ideas on how to solve this problem?

--
Vort
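The back-of-the-envelope estimate (100-25)*1082 above can be checked with a few lines. In this sketch the function name is hypothetical, and 25 KiB/s is taken, as in the message, to be the midpoint of the 0-50 range.

```python
def potential_gain_kib_s(relay_count, capable_kib_s, current_kib_s):
    """Extra bandwidth if each relay went from current to capable throughput."""
    return relay_count * (capable_kib_s - current_kib_s)


gain = potential_gain_kib_s(relay_count=1082, capable_kib_s=100, current_kib_s=25)
print(gain)                     # KiB/s
print(round(gain / 1024, 1))    # MiB/s
```

The product comes to 81150 KiB/s, which is about 79 MiB/s when divided by 1024 (or about 81 MB/s in decimal units).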