Re: [tor-relays] Very high CPU Load and low Traffic since Sunday

2019-08-20 Thread Rob Jansen


> On Aug 20, 2019, at 2:05 AM, teor  wrote:
> 
>> On 20 Aug 2019, at 12:56, John Ricketts  wrote:
>> 
>> reduction++;
> 
> This could be a result of load balancing changes due to Rob's bandwidth 
> experiment.
> 
> CPU overloads could also be a result of load balancing changes. The tests 
> only used a few large bandwidth circuits, but the CPU usage of lots of small 
> circuits is much higher.
> 
> I've cc'd Rob to get his opinion.

I scanned each relay once for less than 30 seconds. I don't plan on repeating 
the experiment. It looks like the advertised BWs have reverted, at least on a 
global level (it's possible that individual relays may still notice an 
adjustment though).

https://metrics.torproject.org/bandwidth.html?start=2019-08-01&end=2019-08-20

Sorry I don't have anything more informative to say at the moment.

Peace, love, and positivity,
Rob
___
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays


Re: [tor-relays] Measuring the Accuracy of Tor Relays' Advertised Bandwidths

2019-08-09 Thread Rob Jansen


> On Aug 6, 2019, at 5:31 PM, Rob Jansen  wrote:
> 
> Over the last 2 days I tested my speedtest on 4 test relays and verified that 
> it does in fact increase relays' advertised bandwidth on Tor metrics.
> 
> Today, I started running the speedtest on all relays in the network. So far, 
> I have finished about 100 relays (and counting). I expect that the advertised 
> bandwidths reported by metrics will increase over the next few days.

Update: the measurement finished around 0100 UTC on 2019-08-09. I attempted to 
measure each relay that appeared in the latest consensus over time. Due to 
relay churn, this resulted in more measurements than the number of relays in a 
single consensus.

I attempted 7001 measurements:
- 4867 relays were successfully measured for 20 seconds each.
- 2134 relays timed out while trying to build the 10 speedtest circuits.

The measurement should be reflected in most server descriptors of successfully 
measured relays within 36 hours, at about 1300 UTC on 2019-08-10.

Peace, love, and positivity,
Rob


Re: [tor-relays] Measuring the Accuracy of Tor Relays' Advertised Bandwidths

2019-08-08 Thread Rob Jansen


> On Aug 6, 2019, at 5:48 PM, Roger Dingledine  wrote:
> 
> On Tue, Aug 06, 2019 at 05:31:39PM -0400, Rob Jansen wrote:
>> Today, I started running the speedtest on all relays in the network. So far, 
>> I have finished about 100 relays (and counting). I expect that the 
>> advertised bandwidths reported by metrics will increase over the next few 
>> days. For this to happen, the bandwidth histories observed by a relay during 
>> my speedtest are first committed to the bandwidth history table (within 24 
>> hours), and then reported in the server descriptors (within 18-36 hours, 
>> depending on when the bandwidth history commit happens).
> 
> Great.
> 
> There will be another confusing (confounding) factor, which is that the
> weights in the consensus are chosen by the bandwidth authorities, so
> even if the relay's self-reported bandwidth goes up (because it now sees
> that it can handle more traffic), that doesn't mean that the consensus
> weight will necessarily go up. In theory it ought to, but with a day or
> so delay, as the bwauths catch on to the larger value in the descriptor;
> but in practice, I am not willing to make bets on whether it will behave
> as intended. :) So, call it another thing to keep an eye out for during
> the experiment.


Another wrinkle to keep in mind is that my script measures one relay at a time. 
If there are multiple relays running on the same NIC, after my measurement each 
of them will think they have the full capacity of the NIC. So if we just add up 
all of the advertised bandwidths after my measurement without considering that 
some of them share a NIC, that will result in an over-estimate of the available 
capacity of the network.

To avoid over-estimating network capacity, we could use IP-based heuristics to 
guess which relays share a machine (e.g., if they share an IP address, or have 
a nearby IP address). In the long term, it would be nice if Tor would collect 
and report some sort of machine ID the same way it reports the platform.
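The IP-based heuristic above could be sketched roughly as follows. This is only an illustration, not the method used in the experiment: the function name, the (address, bandwidth) input format, and the choice of a /24 prefix as the meaning of "nearby" are all assumptions. Relays that fall in the same prefix are treated as sharing one NIC, and only the largest advertised bandwidth in each group is counted.

```python
import ipaddress
from collections import defaultdict


def capacity_estimate(relays, prefix_len=24):
    """Sum advertised bandwidth, counting each IPv4 prefix group once.

    relays: iterable of (ipv4_address_string, advertised_bandwidth).
    prefix_len: assumed definition of "nearby" (24 is a guess, not a
    value from the experiment). Use 32 to group only identical addresses.
    """
    groups = defaultdict(list)
    for addr, bw in relays:
        # strict=False masks off the host bits, giving the enclosing network.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[net].append(bw)
    # Relays assumed to share a NIC each saw its full capacity in turn,
    # so count the largest value in a group rather than the sum.
    return sum(max(bws) for bws in groups.values())
```

Because nearby addresses can belong to different machines, a wide prefix may under-count capacity, while grouping only identical addresses may still over-count; the two results can be read as rough lower and upper bounds.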

Whe!
Rob


Re: [tor-relays] Running gigabit relay

2019-08-07 Thread Rob Jansen


> On Aug 6, 2019, at 8:17 PM, teor  wrote:
> 
> Hi,
> 
>> On 6 Aug 2019, at 20:12, Mitar  wrote:
>> 
>> Hi!
>> 
>> I have deployed it:
>> 
>> https://metrics.torproject.org/rs.html#details/567E9785458C605E59202755C74898E3C96FB1CC
>> 
>> On gigabit fiber, using this NUC:
>> 
>> https://ark.intel.com/content/www/us/en/ark/products/126140/intel-nuc-kit-nuc8i7beh.html
>> 
>> Now I just have to wait for traffic to build-up to see if it can
>> really achieve gigabit. Is there any way to speed this process up? Is
>> there anyone who would like to do some circuits through the node to
>> exercise its bandwidth a bit?
> 
> Rob might like to test your relay, because it should show a big speed
> increase if his tests are working.

The result over the 20 second measurement is below. You should see Tor metrics 
get updated within about 24 hours.

Peace, love, and positivity,
Rob

---

432.7944107055664 mbit/s
895.6919174194336 mbit/s
874.1641845703125 mbit/s
887.5885925292969 mbit/s
893.9453659057617 mbit/s
889.0330429077148 mbit/s
890.8471984863281 mbit/s
894.3734512329102 mbit/s
895.8067245483398 mbit/s
900.4853210449219 mbit/s
900.6352462768555 mbit/s
890.7599563598633 mbit/s
895.5899658203125 mbit/s
897.7090148925781 mbit/s
890.3787384033203 mbit/s
897.9648971557617 mbit/s
897.6359939575195 mbit/s
886.5202789306641 mbit/s
876.5180358886719 mbit/s
893.7390213012695 mbit/s
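
As a rough way to read these numbers, the sketch below computes the overall mean and the highest mean over any 10 consecutive samples. It assumes each line above is one second of the 20 second measurement, which the message does not state explicitly; the names are illustrative only.

```python
# The 20 samples listed above, in mbit/s (assumed to be one per second).
samples_mbit = [
    432.7944107055664, 895.6919174194336, 874.1641845703125,
    887.5885925292969, 893.9453659057617, 889.0330429077148,
    890.8471984863281, 894.3734512329102, 895.8067245483398,
    900.4853210449219, 900.6352462768555, 890.7599563598633,
    895.5899658203125, 897.7090148925781, 890.3787384033203,
    897.9648971557617, 897.6359939575195, 886.5202789306641,
    876.5180358886719, 893.7390213012695,
]


def best_window_mean(samples, window=10):
    """Highest mean over any `window` consecutive samples."""
    return max(
        sum(samples[i:i + window]) / window
        for i in range(len(samples) - window + 1)
    )


overall_mean = sum(samples_mbit) / len(samples_mbit)
best_10s = best_window_mean(samples_mbit)
print(f"overall mean: {overall_mean:.1f} mbit/s")
print(f"best 10-sample mean: {best_10s:.1f} mbit/s")
```

The slow first sample pulls the overall mean down, which is one reason to measure for longer than the 10 second window a relay uses for its observed bandwidth.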


Re: [tor-relays] Measuring the Accuracy of Tor Relays' Advertised Bandwidths

2019-08-06 Thread Rob Jansen

> On Jul 26, 2019, at 10:35 AM, Roger Dingledine  wrote:
> 
> On Fri, Jul 26, 2019 at 10:18:24AM -0400, Rob Jansen wrote:
>> I am planning on performing an experiment on the Tor network to try to gauge 
>> the accuracy of the advertised bandwidths that relays report in their server 
>> descriptors. Briefly, the experiment involves running a speed test on every 
>> relay for a short time (about 20 seconds).
> 
> Thanks Rob!
> 
> For context, I asked Rob to do this experiment, because we know that
> the current bandwidth authority design is mis-measuring relays, but we
> don't know how wrong things are. Giving every relay a short burst of
> load should give us some insight into how much traffic that relay can
> handle, which will in turn tell us how much room for improvement there
> is in our bandwidth estimation.
> 
> And as a bonus, for this one time, fast relays should actually be
> consistently seen as fast, and the Tor network should be better balanced
> and the user experience should be better. If we like how it works,
> our follow-up task will be to change things so we get this result all
> the time. :)

Over the last 2 days I tested my speedtest on 4 test relays and verified that 
it does in fact increase relays' advertised bandwidth on Tor metrics.

Today, I started running the speedtest on all relays in the network. So far, I 
have finished about 100 relays (and counting). I expect that the advertised 
bandwidths reported by metrics will increase over the next few days. For this 
to happen, the bandwidth histories observed by a relay during my speedtest are 
first committed to the bandwidth history table (within 24 hours), and then 
reported in the server descriptors (within 18-36 hours, depending on when the 
bandwidth history commit happens).

Peace, love, and positivity,
Rob


Re: [tor-relays] Measuring the Accuracy of Tor Relays' Advertised Bandwidths

2019-08-01 Thread Rob Jansen


> On Jul 30, 2019, at 2:02 PM, Michael Gerstacker 
>  wrote:
> 
> Hi!
> 
> Good to hear that you guys try to solve the problem of slow measured relays.
> For example when i measure my relay
> 
> 40108FDFA40EDB013F7291F3B4DA3D412ED3A5EF
> 
> with the speedtest from tele2 i get about 90 MiB download and about 50 MiB 
> upload but Tor measures it with about 15 MiB.
> Some of my relays are measured very accurate but other ones are measured with 
> only about 1/5 of what my results are.
> 

Cool, I hope my experiment yields good results for your relay.

> I read the sbws documentation about how the measuring process is working and 
> i am curious about how the experiment is measuring relays.
> 
> if possible please publish a little more info about the experiment or at 
> least the results somewhere. 
> Thanks

Note that I am not using sbws for this experiment, but rather a custom 
measurement process. The plan is to use multiple Tor clients to create multiple 
sockets to the target relay, and then each client will extend a circuit through 
the target and then back to one of a set of relays running on the same machine 
as the client. I'm hoping the use of multiple sockets will help mitigate the 
effects of packet loss.

The results will be published when possible, after they have been analyzed and 
understood.

Peace, love, and positivity,
Rob


Re: [tor-relays] Measuring the Accuracy of Tor Relays' Advertised Bandwidths

2019-08-01 Thread Rob Jansen


> On Jul 31, 2019, at 7:34 PM, teor  wrote:
> 
> Hi Rob,
> 

Hey there!

> Can you define "goodput"?

Application-level throughput, i.e., bytes transferred in packet payloads but 
not counting packet headers or retransmissions. In our case I mean the number 
of bytes that Tor reports in the BW controller event.

> How is it different to the bandwidth reported by a standard speed test?

I believe that iperf also reports goodput as defined above.

> How is it different to the bandwidth measured by sbws?

I am not an expert on sbws, but I believe it also measures goodput.

> Where is your server?

West coast US.

> How do you expect the location of your server to affect your results?

I expect that the packet loss that occurs between my measurement machine and 
the target may limit the goodput I am able to achieve, and packet loss tends to 
occur more frequently on links with higher latency. I plan to use multiple 
sockets (as standard speed testing tools like iperf do) and multiple circuits 
to try to mitigate the effects.

Note that this is meant to be a fairly simple experiment, not a complete 
measurement system. Of course I won't be able to measure more than the 
bandwidth capacity of my measurement machine, but many relays already carry 
significant load so I'll just be giving them a boost.

Peace, love, and positivity,
Rob


[tor-relays] Measuring the Accuracy of Tor Relays' Advertised Bandwidths

2019-07-26 Thread Rob Jansen
Hello relay operators!

I am planning on performing an experiment on the Tor network to try to gauge 
the accuracy of the advertised bandwidths that relays report in their server 
descriptors. Briefly, the experiment involves running a speed test on every 
relay for a short time (about 20 seconds). Details follow.

I plan to run the experiment in about 1 week. Relay operators can opt-out of 
the speed test by replying on this thread, and we will remove you from the list 
of relays to scan.

Peace, love, and positivity,
Rob

---
Measuring the Accuracy of Tor Relays' Advertised Bandwidths

Motivation
--
The capacity of Tor relays (maximum available goodput) is an important metric. 
Combined with mean goodput, it allows us to compute the bandwidth utilization 
of individual relays as well as the entire network in aggregate. Generally, 
capacity is used to help balance client load across relays, and relay 
utilization rates help Tor make informed decisions about how to allocate 
resources and prioritize performance and scalability improvements.

Problem
---
Currently, Tor uses a heuristic measure of unknown accuracy to estimate Tor 
relay capacity. Each relay keeps track of the maximum goodput it has achieved 
over any 10 second window in a 24 hour period. This is called the "observed 
bandwidth". Relays take the minimum of their "observed bandwidth" and their 
bandwidth rate-limiting configuration and report the result as the "advertised 
bandwidth" in their server descriptors. We do not know how well the advertised 
bandwidth estimates the true relay capacity, but we do know that it represents 
a lower bound on capacity.
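
The heuristic described above can be sketched as follows. This is a simplified illustration under stated assumptions, not Tor's actual implementation: it assumes one goodput sample per second, and the function names and the single `rate_limit` parameter are made up for the example.

```python
def observed_bandwidth(bytes_per_second, window=10):
    """Highest mean rate (bytes/s) over any `window` consecutive seconds.

    bytes_per_second: per-second goodput samples for the period
    (for example, the last 24 hours).
    """
    if len(bytes_per_second) < window:
        return 0.0
    current = sum(bytes_per_second[:window])
    best = current
    # Slide the window one second at a time, updating the running sum.
    for i in range(window, len(bytes_per_second)):
        current += bytes_per_second[i] - bytes_per_second[i - window]
        best = max(best, current)
    return best / window


def advertised_bandwidth(bytes_per_second, rate_limit):
    """Minimum of the observed bandwidth and the configured rate limit."""
    return min(observed_bandwidth(bytes_per_second), rate_limit)
```

For example, a relay that normally carries 100 bytes/s and sees a single 5 second burst of 1000 bytes/s gets an observed bandwidth of 550 bytes/s, even though it demonstrably handled 1000. This is why the value is only a lower bound on capacity.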

Hypothesis
--
The advertised bandwidth significantly underestimates the true capacity of Tor 
relays. On average, capacity underestimation will be larger for relays with 
higher true capacities (because fast relays are less likely to have sustained 
their full capacity over a 10 second period).

Experiment
--
A relay reports its advertised bandwidth in its server descriptor. To test how 
well these reported numbers represent the true capacity of a relay, we can 
manually perform a speed test on the relay by initiating the simultaneous 
download of several large data streams for a period that exceeds 10 seconds. In 
the report following our test, the relay will report its advertised bandwidth 
in its server descriptor and the results will be collected and reported by 
metrics.torproject.org.

The experiment involves two steps: running the speed test on a relay under our 
control, and running the speed test on all relays in the Tor network.

We will first run the speed test on at least one relay that we control, in 
order to test that the method is effective and that we can in fact observe a 
change in the advertised bandwidth reported on metrics.torproject.org. Once we 
have confidence that our speed test is functioning correctly, and that the 
metrics pipeline will allow us to gather the results, we will repeat it on all 
relays in the network.

We will conduct the speed tests while minimizing network overhead. We will use 
a custom client that builds 2-relay circuits. The first relay will be the 
target relay we are speed testing, and the second relay will be a fast exit 
relay that we control. We will initiate data streams between a speedtest client 
and server running on the same machine as our exit relay.

The setup will look like:

speedtest-client <--> tor-client <--> target-relay <--> exit-relay <--> speedtest-server

All components will run on the same machine that we control except for the 
target-relay, which will rotate as we test different relays in the network. For 
each target relay, we plan to run the speedtest for 20 seconds in order to 
increase the probability that the 10 second mean goodput will reach the true 
capacity. We will measure each relay over a few days to ensure that our 
speedtest effects are reported by every relay.

Although we believe that the overhead of this speed test is in line with 
regular usage, relay operators can opt-out of the speed test by replying on 
this thread. Those that opt out will be removed from our list of relays to scan.

Analysis
--
Following our speedtest, we will analyze the data collected and reported by Tor 
metrics. We will compare the advertised bandwidth that each relay reports 
before our experiment to that reported during our experiment. This will help 
us test our hypothesis that relays' advertised bandwidth underestimates the 
true capacity of relays. We will run a statistical correlation analysis on the 
data to test the strength of the correlation between the previously reported 
(estimated) relay capacity and relay capacity underestimation. We will report 
our results to the Tor community.
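
One possible shape for the correlation analysis is sketched below. It is an assumption about the method, not a description of what was run: it defines underestimation as the advertised bandwidth during the experiment minus the value before it, and uses a plain Pearson coefficient. The function names are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def underestimation_correlation(before, during):
    """Correlate previously advertised bandwidth with underestimation.

    before, during: per-relay advertised bandwidths, in the same relay
    order, taken before and during the experiment.
    """
    underestimation = [d - b for b, d in zip(before, during)]
    return pearson(before, underestimation)
```

A rank-based coefficient such as Spearman's may suit the data better if relay bandwidths are heavily skewed; that choice is left open here.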

We expect that the results of our experiment will help Tor decide how to 
allocate resources and will help them plan and prioritize performance 
improvements. It will