Re: [tor-dev] design for a Tor router without anonymity compromises

2015-05-06 Thread coderman
On 5/4/15, Mike Perry mikepe...@torproject.org wrote:
 ...
 In my opinion, the most interesting use case for these devices is where
 Tor Launcher implements a peering mechanism whereby the user can click a
 button at some point in the initial connection wizard that says "My
 Router Knows My Tor Configuration."

hi Mike,

i called this "Device Driven Configuration" in the updated document,
and added two FAQ entries regarding device public key verification for
use in the JSON based device driven configuration between Tor Launcher
and Tor enforcing device.

thanks again!


P.S. additional edits continue; any and all feedback still solicited.
it's not too late... :)
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


Re: [tor-dev] Brainstorming Domain Fronted Bridge Distribution (was meek costs)

2015-05-06 Thread Mike Perry
isis:
 Mike Perry transcribed 5.1K bytes:
  […]
 
  2. Perhaps cleaner: if BridgeDB itself were accessible through a domain
  front, we could export its captcha and bridge distribution through an
  API on this domain front. Once your IP forwarding in
  https://trac.torproject.org/projects/tor/ticket/13171 is solved,
  BridgeDB even could still make use of its IP-based hashring logic.
 
 Maybe don't set the HTTP header name for the forwarded client IP to
 X-Forwarded-For.  Otherwise, it will probably get overridden by the Apache
 server which acts as a reverse proxy in front of BridgeDB's Twisted servers.
 Just set it to something else, e.g. X-Domain-Fronted-For.
 
 Then, on the BridgeDB side, it's easy: I'd need to add logic to BridgeDB to
 handle preferring X-Domain-Fronted-For, X-Forwarded-For, then request IP,
 in that order.
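
A minimal sketch of that preference order, in Python. The function name `client_ip`, the plain-dict header lookup, and the choice to trust the last entry of a comma-separated chain are all assumptions made for illustration; this is not BridgeDB's actual code. `X-Domain-Fronted-For` is the header name suggested above.

```python
import ipaddress

# Header names checked in order of preference. The first is the name
# proposed in this thread, not an established standard.
PREFERRED_HEADERS = ("X-Domain-Fronted-For", "X-Forwarded-For")


def client_ip(headers, request_ip):
    """Return the best guess at the client's IP address.

    `headers` is a plain dict of header name -> value (hypothetical
    shape); `request_ip` is the address of the TCP peer.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in PREFERRED_HEADERS:
        value = lowered.get(name.lower())
        if not value:
            continue
        # A forwarding header may hold a comma-separated chain. Taking
        # the last entry (added by the nearest proxy) is an assumption.
        candidate = value.split(",")[-1].strip()
        try:
            return str(ipaddress.ip_address(candidate))
        except ValueError:
            continue  # malformed value, fall through to the next source
    return request_ip
```

Malformed header values fall through to the next source, so a garbage header never prevents the request IP from being used.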
 
  If we make use of this API in Tor Launcher (and we will, as soon as it
  exists — I'd even pull a crazy and roll it out in the middle of a
  stable, given the rapid rate of increase in these costs), users would
  not need to know the magic incantations to access this front, and new
  bridges could be obtained behind the scenes for them. All they would
  have to do is keep solving captchas until something worked (until we
  also implement some kind of fancy crypto like RBridge).
 
 Perhaps the BridgeDB API part of what you want is the Tor Browser bridge
 distributor that I mentioned in §3.1, SOW.9., in my Statement of Work [0] for
 OTF?

Yes, this is exactly what I want. With respect to SOW.9.1, consider it
feasible! Mission Accomplished! ;)

 Additionally, SOW.9. is actually the chronological precursor to SOW.10., the
 latter of which is implementing rBridge (or at least getting started on it).
 (Work on this is still waiting on OTF to officially grant me the fellowship,
 along with the other prerequisite tasks getting finished.)
 
 But just to be clear — since it sounds like you've asked for several new
 things in that last paragraph :) — which do you want:
 
   1. Tor Browser users use meek to get to BridgeDB, to get non-meek bridges 
 by:
1.a. Retrieving and solving a CAPTCHA inside Tor Launcher.
1.b. Solving a CAPTCHA on a BridgeDB web page.
 
   2. Tor Browser users use BridgeDB's domain front, to get non-meek bridges 
 by:
2.a. Retrieving and solving a CAPTCHA inside Tor Launcher.
2.b. Solving a CAPTCHA on a BridgeDB web page.

 If you want #2, then we're essentially transferring the domain-fronting costs
 (and the DDoS risks) from meek to BridgeDB, and we'd need to decide who is
 going to maintain that service, and who is going to pay for it.  Could The
 Tor Project fund BridgeDB domain fronting?

I proposed two things in my original email. My #1 is your #1.b. My #2 is
your #2.a.

For my #2 (your #2.a), what I want is a separate domain front for
BridgeDB. It makes the most sense to me for Tor to run its own domain
front for this.

If for some reason #2.a can't be done, we could do #1.a and use all of
meek+Tor, but this seems excessive, slow, and potentially confusing for
users (their Tor client would have to bootstrap twice for each bridge
set they test).

I only consider my #1 and #1.b emergency stopgaps, though. In fact, if
any aspect of this process is too slow and/or confusing, we won't
take any load off of meek (unless the browser also starts regularly
yelling at meek users to donate or something).

 As far as maintenance goes, the threat to any of our domain fronts, including
 meek and any BridgeDB domain fronts, from China's Great Cannon waging economic
 counter-counter-warfare by attacking us (like they did to GreatFire.org) is
 something which must be taken into account.  Will the maintainer of this
 service need to wake up to emergency, the-request-rate-is-skyrocketing, emails
 at 4AM to shut the service down? 

I would love to hear how David deals with this risk since the Great
Cannon incident.

Honestly, though, I think this is less likely now. If China wasn't
somehow discouraged from this behavior via some diplomatic backchannel
or just general public backlash, GreatFire.org would probably still be
under attack right now.

Either way, it does seem wise to structure this such that multiple
people can respond to emergencies here, and that individuals like you
and/or David aren't on the hook for the financial damages.

 Or do we already have technical measures to detect DDoS and prevent
 $30,000+/day CDN bills?  Further, what happens when #2 is being
 DDoS-ed?  Should we fallback to #1?  Should we have both, and some
 strategy for balancing between the two?

I think trying to fall back or balance between the two is unlikely to
save us much, and will just introduce excessive implementation
complexity.

If they're going to attack domain fronting usage of Tor, it seems to me
that they will attack both meek and BridgeDB.

  Now that we have a browser updater, I think it is also OK for us to
  provide autoprobing options for Tor 

Re: [tor-dev] Brainstorming Domain Fronted Bridge Distribution (was meek costs)

2015-05-06 Thread isis

WARNING: much text. so email. very long.


Mike Perry transcribed 13K bytes:
 isis:
  Additionally, SOW.9. is actually the chronological precursor to SOW.10., the
  latter of which is implementing rBridge (or at least getting started on it).
  (Work on this is still waiting on OTF to officially grant me the fellowship,
  along with the other prerequisite tasks getting finished.)
  
  But just to be clear — since it sounds like you've asked for several new
  things in that last paragraph :) — which do you want:
  
1. Tor Browser users use meek to get to BridgeDB, to get non-meek bridges 
  by:
 1.a. Retrieving and solving a CAPTCHA inside Tor Launcher.
 1.b. Solving a CAPTCHA on a BridgeDB web page.
  
2. Tor Browser users use BridgeDB's domain front, to get non-meek bridges 
  by:
 2.a. Retrieving and solving a CAPTCHA inside Tor Launcher.
 2.b. Solving a CAPTCHA on a BridgeDB web page.
 
  If you want #2, then we're essentially transferring the domain-fronting 
  costs
  (and the DDoS risks) from meek to BridgeDB, and we'd need to decide who is
  going to maintain that service, and who is going to pay for it.  Could The
  Tor Project fund BridgeDB domain fronting?
 
 I proposed two things in my original email. My #1 is your #1.b. My #2 is
 your #2.a.
 
 For my #2 (your #2.a), what I want is a separate domain front for
 BridgeDB. It makes the most sense to me for Tor to run its own domain
 front for this.

Got it.


 If for some reason #2.a can't be done, we could do #1.a and use all of
 meek+Tor, but this seems excessive, slow, and potentially confusing for
 users (their Tor client would have to bootstrap twice for each bridge
 set they test).

Well… the cost of the second bootstrap *could* be cut down by persisting the
state file from the first bootstrap… but I see what you mean.  The user
experience doesn't seem like it'd be as smooth.


 I only consider my #1 and #1.b emergency stopgaps, though. In fact, if
 any aspect of this process is too slow and/or confusing, we won't
 take any load off of meek (unless the browser also starts regularly
 yelling at meek users to donate or something).

Agreed, except for the part about yelling at users to donate.  Asking nicely
and suggesting once or twice, I could get behind. :)


 Honestly, though, I think this is less likely now. If China wasn't
 somehow discouraged from this behavior via some diplomatic backchannel
 or just general public backlash, GreatFire.org would probably still be
 under attack right now.

It seems more likely that China was firing the Great Cannon at GreatFire.org
as a demonstration/warning.


   Now that we have a browser updater, I think it is also OK for us to
   provide autoprobing options for Tor Launcher, so long as the user is
   informed what this means before they select it, and it only happens
   once.
  
  Probing all of the different Pluggable Transport types simultaneously 
  provides
  an excellent training classifier for DPI boxes to learn what new Pluggable
  Transport traffic looks like.
  
  As long as it happens only once, and only uses the bridges bundled in Tor
  Browser, I don't see any issue with auto-selecting from the drop-down of
  transport methodnames in a predefined order.  It's what users do anyway.
 
 Oh, yes. I am still against "connect to all of the things at the same
 time". The probing I had in mind was to cycle through the transport list
 and try each type, except also obtain the bridges for each type from
 BridgeDB.

But why does Tor Browser need to get bridges from BridgeDB, if it doesn't know
yet which ones will work?  Why not autoprobe with the bundled bridges, then
ask BridgeDB for some more of the kind that works?


 I also think we should be careful about the probing order. I want to
 probe the most popular and resilient transports (such as obfs4) first.

Currently, obfs4 isn't blocked anywhere… so why probe at all when we know
definitely that the first thing we try is going to work?


   The autoprobing could then keep asking for non-meek bridges for either a
   given type of the user's choice, or optionally all non-meek types (with
   an additional warning that this increases their risk of being discovered
   as a Tor user).
  
  If the autoprobing is going to include asking BridgeDB (multiple times?) for
  different types of bridges in the process, whether through a BridgeDB domain
  front or not, then I think there needs to be more discussion…
  
* Do you think you could explain more about the steps this autoprobing
  entails?
 
 1. User starts a fresh Tor Browser (or one that fails to bootstrap)
 2. User clicks Configure instead of Connect
 3. User says they are censored
 4. User selects a third radio button on the bridge dialog
"Please help me obtain bridges."
 5. Tor Browser launches a JSON-RPC request to BridgeDB's domain front
for bridges of type $TYPE
 6. BridgeDB responds with a Captcha
 7. User solves captcha; response is 

Re: [tor-dev] Summary of meek's costs, April 2015

2015-05-06 Thread David Fifield
On Wed, May 06, 2015 at 12:56:04PM -0700, Arthur D. Edelstein wrote:
 Maybe you could rig up something that shuts down the instance? Or does
 Amazon charge you even then?

That might work. I found some documentation on an API for CloudFront web
distributions:
https://docs.aws.amazon.com/AmazonCloudFront/latest/APIReference/Actions_Dist.html
https://docs.aws.amazon.com/AmazonCloudFront/latest/APIReference/PutConfig.html

It looks like you can PUT an <enabled>true</enabled>.

It would be nice if this were a separate process apart from meek-server
that counts requests and bytes and keeps track of estimated costs. It
should be able to send a message to another process somewhere that
controls the AWS API keys.
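
A rough sketch of such a counting process, in Python. The class name `CostMeter`, the `notify` hook, and the pricing parameters are hypothetical; the rates would have to come from the CDN's actual price list, and the hook stands in for whatever message reaches the process holding the AWS API keys.

```python
class CostMeter:
    """Accumulate request and byte counts and estimate the CDN bill."""

    def __init__(self, usd_per_million_requests, usd_per_gb, daily_limit_usd):
        self.usd_per_million_requests = usd_per_million_requests
        self.usd_per_gb = usd_per_gb
        self.daily_limit_usd = daily_limit_usd
        self.requests = 0
        self.bytes = 0

    def record(self, nbytes):
        """Count one request that transferred `nbytes` bytes."""
        self.requests += 1
        self.bytes += nbytes

    def estimated_cost(self):
        """Estimated spend in USD from the counts seen so far."""
        request_cost = self.requests / 1_000_000 * self.usd_per_million_requests
        bandwidth_cost = self.bytes / 1e9 * self.usd_per_gb
        return request_cost + bandwidth_cost

    def over_limit(self):
        return self.estimated_cost() >= self.daily_limit_usd


def check_and_notify(meter, notify):
    """Call `notify` (a hypothetical hook to the key-holding process)
    once the estimate reaches the limit; return whether it fired."""
    if meter.over_limit():
        notify(meter.estimated_cost())
        return True
    return False
```

Keeping the meter free of any AWS credentials matches the separation described above: it only counts and signals, and a different process decides what to do with the distribution.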


Re: [tor-dev] Summary of meek's costs, April 2015

2015-05-06 Thread Arthur D. Edelstein
 Amazon sucks and they don't have any automatic way to shut down a
 service. I emailed them and they were very clear about that. The best
 you can do is set up an email alert at different cost threshold (which I
 have done). But that requires someone with credentials to be awake and
 online when it happens. This is the main reason I want to drop Amazon.
 (Apart from the billing concerns, Amazon's CDN, technically, is nice and
 fast and reliable.)

Would it make sense to add some code to your meek server to monitor
bandwidth usage and automatically shut off if a limit is reached?


Re: [tor-dev] Summary of meek's costs, April 2015

2015-05-06 Thread David Fifield
On Tue, May 05, 2015 at 11:04:58PM -0400, Griffin Boyce wrote:
 Mike Perry wrote:
 David Fifield:
 Here's the summary of meek's CDN fees for April 2015.
 
 total by CDN  $3292.25 + $3792.79 + $0.00 = $7085.04 grand total
 https://metrics.torproject.org/userstats-bridge-transport.html?graph=userstats-bridge-transport&start=2015-02-01&end=2015-04-30&transport=meek
 
 Yikes! Are these costs covered by a grant or anything? Should we be
 running a donations campaign?
 
 If you want to help reduce costs, you can
  1. Use meek-azure; it's still covered through a grant for the next four
 months.
  2. Set up your own App Engine or CDN account. Then you can pay for your
 own usage (it might even be free depending on how much you use).
 Here are instructions on how to set up your own:
 https://gitweb.torproject.org/pluggable-transports/meek.git/tree/appengine/README
 https://trac.torproject.org/projects/tor/wiki/doc/meek#AmazonCloudFront
 https://trac.torproject.org/projects/tor/wiki/doc/meek#MicrosoftAzure
 Then you will have to enter a bridge line manually. Follow the
 instructions at
 https://trac.torproject.org/projects/tor/wiki/doc/meek#Howtochangethefrontdomain
 but instead of changing the front= part, change the url= part.
 For example,
   bridge meek 0.0.2.0:1 url=https://myappname.appspot.com/ front=www.google.com
 
 Please let me know if anyone takes you up on this!
 
 I am happy to add the meek bridges of anyone who does this as an option
 in Tor Browser. We can add logic to round robin or randomly select
 between the set of meek providers for a given meek type upon first
 install, or even for every browser startup.
 
   If there were some randomization logic included, I'd be happy to
 contribute an App Engine or Amazon meek access point.  If a few people did
 that, the costs might be more manageable.  But also the stats might be a bit
 harder to aggregate (which might be important if David is writing a
 thesis/paper/etc).

Thanks Griffin. At this point we'd need, what, 60 operators in order to
cost on average $30/month? At current usage rates.

I think we already have plenty of aggregated stats. It's been nice to be
able to separate cleanly amazon/azure/google, but we shouldn't try to
keep that forever at the expense of becoming more scalable.


Re: [tor-dev] Summary of meek's costs, April 2015

2015-05-06 Thread David Fifield
On Wed, May 06, 2015 at 11:56:36AM -0700, Arthur D. Edelstein wrote:
  Amazon sucks and they don't have any automatic way to shut down a
  service. I emailed them and they were very clear about that. The best
  you can do is set up an email alert at different cost threshold (which I
  have done). But that requires someone with credentials to be awake and
  online when it happens. This is the main reason I want to drop Amazon.
  (Apart from the billing concerns, Amazon's CDN, technically, is nice and
  fast and reliable.)
 
 Would it make sense to add some code to your meek server to monitor
 bandwidth usage and automatically shut off if a limit is reached?

I don't think that helps because I think you will still get charged for
requests+bandwidth even if the origin server is unresponsive or returns
an error. I could be wrong about this. Yawning wrote some such code a
while back.

Even if you cut off all abusive use of bandwidth, if the adversary can
figure out how to charge you for requests, they cost $1 per million on
Amazon.
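
To put the per-request price in perspective, here is a small worked calculation in Python. The $1 per million figure is the one quoted above; the function names are made up for illustration, and real billing may include other line items.

```python
SECONDS_PER_DAY = 86_400


def daily_request_cost(requests_per_second, usd_per_million=1.0):
    """USD charged for a request rate sustained over one full day."""
    requests = requests_per_second * SECONDS_PER_DAY
    return requests / 1_000_000 * usd_per_million


def rate_for_daily_cost(usd_per_day, usd_per_million=1.0):
    """Sustained requests per second needed to run up a given daily bill."""
    requests = usd_per_day / usd_per_million * 1_000_000
    return requests / SECONDS_PER_DAY
```

For example, 1000 requests per second sustained for a day is 86.4 million requests, or about $86.40 in request charges alone, before any bandwidth.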


Re: [tor-dev] Summary of meek's costs, April 2015

2015-05-06 Thread Arthur D. Edelstein
Maybe you could rig up something that shuts down the instance? Or does
Amazon charge you even then?

On Wed, May 6, 2015 at 12:16 PM, David Fifield da...@bamsoftware.com wrote:
 On Wed, May 06, 2015 at 11:56:36AM -0700, Arthur D. Edelstein wrote:
  Amazon sucks and they don't have any automatic way to shut down a
  service. I emailed them and they were very clear about that. The best
  you can do is set up an email alert at different cost threshold (which I
  have done). But that requires someone with credentials to be awake and
  online when it happens. This is the main reason I want to drop Amazon.
  (Apart from the billing concerns, Amazon's CDN, technically, is nice and
  fast and reliable.)

 Would it make sense to add some code to your meek server to monitor
 bandwidth usage and automatically shut off if a limit is reached?

 I don't think that helps because I think you will still get charged for
 requests+bandwidth even if the origin server is unresponsive or returns
 an error. I could be wrong about this. Yawning wrote some such code a
 while back.

 Even if you cut off all abusive use of bandwidth, if the adversary can
 figure out how to charge you for requests, they cost $1 per million on
 Amazon.


[tor-dev] exitmap feature requests?

2015-05-06 Thread nusenu
Hi Philipp,

do you consider feature requests via [1] or would you recommend
forking and implementing it oneself?

thanks,
nusenu


[1] https://github.com/NullHypothesis/exitmap/issues


Re: [tor-dev] Summary of meek's costs, April 2015

2015-05-06 Thread David Fifield
On Wed, May 06, 2015 at 04:36:48AM +, isis wrote:
 But just to be clear — since it sounds like you've asked for several new
 things in that last paragraph :) — which do you want:
 
   1. Tor Browser users use meek to get to BridgeDB, to get non-meek bridges 
 by:
1.a. Retrieving and solving a CAPTCHA inside Tor Launcher.
1.b. Solving a CAPTCHA on a BridgeDB web page.
 
   2. Tor Browser users use BridgeDB's domain front, to get non-meek bridges 
 by:
2.a. Retrieving and solving a CAPTCHA inside Tor Launcher.
2.b. Solving a CAPTCHA on a BridgeDB web page.
 
 If you want #2, then we're essentially transferring the domain-fronting costs
 (and the DDoS risks) from meek to BridgeDB, and we'd need to decide who is
 going to maintain that service, and who is going to pay for it.  Could The
 Tor Project fund BridgeDB domain fronting?

You still have the DoS risk, but in normal usage the costs will be way
way less because you're only paying for bootstrapping and not for
GNU/Linux ISO downloads or whatever it is people do with Tor. Bandwidth
costs across all CDNs are between $0.10 and $0.20 per GB. To reach even
one GB would take a million 1K bootstraps.
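
The arithmetic behind that estimate, as a short Python sketch. It assumes "1K" means 1000 bytes per bootstrap and uses the upper $0.20 per GB figure by default; the function name is illustrative.

```python
def bootstrap_cost_usd(bootstraps, bytes_per_bootstrap=1000, usd_per_gb=0.20):
    """Bandwidth cost in USD of serving `bootstraps` small responses."""
    gigabytes = bootstraps * bytes_per_bootstrap / 1e9
    return gigabytes * usd_per_gb
```

Under those assumptions a million bootstraps transfer one GB and cost between $0.10 and $0.20 in bandwidth, depending on the CDN's rate.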

 As far as maintenance goes, the threat to any of our domain fronts, including
 meek and any BridgeDB domain fronts, from China's Great Cannon waging economic
 counter-counter-warfare by attacking us (like they did to GreatFire.org) is
 something which must be taken into account.  Will the maintainer of this
 service need to wake up to emergency, the-request-rate-is-skyrocketing, emails
 at 4AM to shut the service down?  Or do we already have technical measures to
 detect DDoS and prevent $30,000+/day CDN bills?  Further, what happens when #2
 is being DDoS-ed?  Should we fallback to #1?  Should we have both, and some
 strategy for balancing between the two?

App Engine is nice because you can set a daily cost limit, and the
service shuts down after that. It's currently set at $45/day (after we
bumped into the previous $40/day limit one day last week :/). It's nice
because the maximum damage a DoS can cause (besides shutting down the
service) is O(1).

Amazon sucks and they don't have any automatic way to shut down a
service. I emailed them and they were very clear about that. The best
you can do is set up an email alert at different cost threshold (which I
have done). But that requires someone with credentials to be awake and
online when it happens. This is the main reason I want to drop Amazon.
(Apart from the billing concerns, Amazon's CDN, technically, is nice and
fast and reliable.)


Re: [tor-dev] Summary of meek's costs, April 2015

2015-05-06 Thread David Fifield
On Tue, May 05, 2015 at 06:22:47PM -0700, Mike Perry wrote:
 David Fifield:
  Here's the summary of meek's CDN fees for April 2015.
  
  total by CDN  $3292.25 + $3792.79 + $0.00 = $7085.04 grand total
  https://metrics.torproject.org/userstats-bridge-transport.html?graph=userstats-bridge-transport&start=2015-02-01&end=2015-04-30&transport=meek
 
 Yikes! Are these costs covered by a grant or anything? Should we be
 running a donations campaign?

It's partly covered by grants but not fully.

I'd be happy with donations but I don't want to handle any money. We
also need to think about long-term sustainability: usage and costs will
continue to increase (at least until the world changes), and donations
will need to increase too.

Look at the 1 year bandwidth graph for meek-google. It's pretty close
to linear since October 2014, increasing 400 KB/s/month.
https://globe.torproject.org/#/bridge/88F745840F47CE0C6A4FE61D827950B06F9E4534

  If you want to help reduce costs, you can
   1. Use meek-azure; it's still covered through a grant for the next four
  months.
   2. Set up your own App Engine or CDN account. Then you can pay for your
  own usage (it might even be free depending on how much you use).
  Here are instructions on how to set up your own:

  https://gitweb.torproject.org/pluggable-transports/meek.git/tree/appengine/README

  https://trac.torproject.org/projects/tor/wiki/doc/meek#AmazonCloudFront
https://trac.torproject.org/projects/tor/wiki/doc/meek#MicrosoftAzure
  Then you will have to enter a bridge line manually. Follow the
  instructions at

  https://trac.torproject.org/projects/tor/wiki/doc/meek#Howtochangethefrontdomain
  but instead of changing the front= part, change the url= part.
  For example,
  bridge meek 0.0.2.0:1 url=https://myappname.appspot.com/ front=www.google.com
 
 Please let me know if anyone takes you up on this!
 
 I am happy to add the meek bridges of anyone who does this as an option
 in Tor Browser. We can add logic to round robin or randomly select
 between the set of meek providers for a given meek type upon first
 install, or even for every browser startup. 
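
Both selection strategies mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, not Tor Browser code; the function names and the `startup_count` parameter are assumptions.

```python
import random


def pick_random(bridge_lines, rng=random):
    """Pick one provider uniformly at random (e.g. on first install)."""
    return rng.choice(bridge_lines)


def pick_round_robin(bridge_lines, startup_count):
    """Rotate through providers, one per browser startup.

    `startup_count` is a hypothetical persisted counter of how many
    times the browser has started.
    """
    return bridge_lines[startup_count % len(bridge_lines)]
```

Usage with two bridge lines, the second of which uses a made-up app name:

```python
lines = [
    "meek 0.0.2.0:1 url=https://myappname.appspot.com/ front=www.google.com",
    "meek 0.0.2.0:2 url=https://otherapp.appspot.com/ front=www.google.com",
]
chosen = pick_round_robin(lines, startup_count=3)  # second entry
```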

Thanks.

In recommending that people run their own reflectors, I actually had a
different use case in mind: that they would run one for themself or for
their friends, and not announce it publicly. So basically like setting
up any other private proxy, except it works in more places.

 Given your costs, it also seems worthwhile for us to fund development to
 improve this situation, so that meek remains a transport of last resort
 rather than people's first choice.

I don't have the feeling that it's people's first choice. Rather I think
we're seeing new users who were not being served by any of the other
transports. It's going to be slower than other transports. On the other
hand, not needing to find bridges is a big distinction, and once you
have something that works there's little incentive to change it.

 Here's a couple options:
 
 1. We can add a browser notification box for meek users that either
 tells them about meek-azure, or tells them that now that Tor Browser
 works, they can use it to visit https://bridges.torproject.org to get a
 bridge type that doesn't cost so much money.

I don't want to lean too hard on meek-azure, because its grant runs out
in four months and I don't have a plan to keep it going.

I wouldn't want people to feel guilty when they manage to circumvent
censorship, especially if nothing else works for them. But yes, we can
probably make some UI and backend changes that make the default options
less costly.

 2. Perhaps cleaner: if BridgeDB itself were accessible through a domain
 front, we could export its captcha and bridge distribution through an
 API on this domain front. Once your IP forwarding in
 https://trac.torproject.org/projects/tor/ticket/13171 is solved,
 BridgeDB even could still make use of its IP-based hashring logic.

For this purpose you wouldn't even need the full power of BridgeDB. The
list of bridges doesn't need to be kept secret for blocking resistance,
so you could even just put the list on a web page and domain-front to
download it. (It still might make sense to keep the list secret to
hinder financial DoS on the operators, but unless there are a ton of
operators, they'll still be enumerable and vulnerable.)

I gave a talk about domain fronting at Stanford and that's what the
audience suggested: use a centrally paid-for account only to get a
bridge on someone else's account, and then use that bridge for all your
data transfer. Then the central costs are limited to bootstrapping.

We'll need some added code for robustness, as we can't expect a large
number of bridges to individually be as reliable as the handful of
curated ones we have now. Like, if someone turned off their account or
they reached their daily cost quota.
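
One simple form that robustness could take is to try each bridge in turn and skip the ones that are down. A Python sketch, where `try_bootstrap` is a hypothetical callable that returns True on success and may raise `OSError` when the bridge is unreachable:

```python
def first_working_bridge(bridge_lines, try_bootstrap):
    """Return the first bridge line that bootstraps, or None.

    A bridge whose operator turned off the account or hit a daily
    quota is treated the same as any other failure: skip it and
    move on to the next one.
    """
    for line in bridge_lines:
        try:
            if try_bootstrap(line):
                return line
        except OSError:
            pass  # unreachable bridge, try the next one
    return None
```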

 Would you and/or Isis be able to work on this on the backend? If not,
 can 

Re: [tor-dev] Brainstorming Domain Fronted Bridge Distribution (was meek costs)

2015-05-06 Thread Mike Perry
isis:
 WARNING: much text. so email. very long.

Right. If I cut your previous text, assume I'm in agreement, not
ignoring it.

 Mike Perry transcribed 13K bytes:
  isis:
Now that we have a browser updater, I think it is also OK for us to
provide autoprobing options for Tor Launcher, so long as the user is
informed what this means before they select it, and it only happens
once.
   
   Probing all of the different Pluggable Transport types simultaneously 
   provides
   an excellent training classifier for DPI boxes to learn what new Pluggable
   Transport traffic looks like.
   
   As long as it happens only once, and only uses the bridges bundled in Tor
   Browser, I don't see any issue with auto-selecting from the drop-down of
   transport methodnames in a predefined order.  It's what users do anyway.
  
  Oh, yes. I am still against "connect to all of the things at the same
  time". The probing I had in mind was to cycle through the transport list
  and try each type, except also obtain the bridges for each type from
  BridgeDB.
 
 But why does Tor Browser need to get bridges from BridgeDB, if it doesn't know
 yet which ones will work?  Why not autoprobe with the bundled bridges, then
 ask BridgeDB for some more of the kind that works?

Sure, the first autoprobe can (and should) test the local bridges before
asking for more, but I expect people are using meek because none of those
actually work.

We can do better about trying to sneak fresh bridges into the TBB
distribution immediately before every release, but I doubt that will
help much, since the adversary can scrape them with just a wget from our
git repos.

  I also think we should be careful about the probing order. I want to
  probe the most popular and resilient transports (such as obfs4) first.
 
 Currently, obfs4 isn't blocked anywhere… so why probe at all when we know
 definitely that the first thing we try is going to work?

Mostly because of IP blocking. 

The autoprobing could then keep asking for non-meek bridges for either a
given type of the user's choice, or optionally all non-meek types (with
an additional warning that this increases their risk of being discovered
as a Tor user).
   
   If the autoprobing is going to include asking BridgeDB (multiple times?) 
   for
   different types of bridges in the process, whether through a BridgeDB 
   domain
   front or not, then I think there needs to be more discussion…
   
 * Do you think you could explain more about the steps this autoprobing
   entails?
  
  1. User starts a fresh Tor Browser (or one that fails to bootstrap)
  2. User clicks Configure instead of Connect
  3. User says they are censored
  4. User selects a third radio button on the bridge dialog
 "Please help me obtain bridges."
  5. Tor Browser launches a JSON-RPC request to BridgeDB's domain front
 for bridges of type $TYPE
  6. BridgeDB responds with a Captcha
  7. User solves captcha; response is posted back to BridgeDB.
  8. BridgeDB responds with bridges (or a captcha error)
  9. Tor Launcher attempts to bootstrap with these bridges.
  10. If bootstrap fails, goto step 5.
  
  The number of loops for steps 5-10 for each $TYPE probably require some
  intuition on how frequently we expect bridges that we hand out to be
  blocked due to scraping, and how many bridge addresses we really want to
  hand out per Captcha+IP address combination.
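
Steps 5 through 10 above amount to a bounded retry loop. A Python sketch under stated assumptions: all four callables are hypothetical stand-ins for the JSON-RPC request, the user solving the captcha, the post back to BridgeDB, and the Tor Launcher bootstrap attempt, and `max_rounds` is an arbitrary cap, since the thread leaves the number of loops open.

```python
def obtain_working_bridges(transport, request_captcha, solve_captcha,
                           submit_solution, try_bootstrap, max_rounds=3):
    """Loop over steps 5-10 for one transport type.

    Returns the bridges that bootstrapped, or None if every round
    failed (captcha error or unusable bridges).
    """
    for _ in range(max_rounds):
        challenge = request_captcha(transport)        # steps 5 and 6
        answer = solve_captcha(challenge)             # step 7, user input
        bridges = submit_solution(challenge, answer)  # step 8
        if not bridges:
            continue  # captcha error: start over from step 5
        if try_bootstrap(bridges):                    # step 9
            return bridges
        # step 10: bootstrap failed, go back to step 5
    return None
```

Capping the loop keeps a user behind a thorough censor from solving captchas indefinitely; the caller can then move on to the next transport type.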
 
 Currently, you get the same bridges every time you ask (for some arbitrary
 period).  This would definitely require a new Distributor on the backend (not
 a problem, and not difficult, just saying).
 
 Why ask multiple times?  Why not just get three bridges per request, and if
 that appears to be failing to get 99% of users connected to a bridge, increase
 it to four?

Handing out four at a time might work better than requesting the same
type again and again. Then again, simply trying the other transport
types might also work.

How hard is it to get analytics on the requests to BridgeDB? If we give
you a special request parameter (like justfailed=obfs4) that means
"Hey, I'm asking you again for this new transport type because the obfs4
bridges you just gave me didn't work", can you count that? Can you break
down that count by GeoIP country for the requesting IP?

That metric will be useful for hints if obfs4 is suddenly blocked in
some country, or if by some other mechanism the censor has discovered
all/most of the IP addresses of the obfs4 bridges.
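
Counting such reports per country could look roughly like this in Python. The `justfailed` parameter is the one proposed above; the function name and the assumption that a GeoIP lookup has already produced a country code for the requesting IP are illustrative, not existing BridgeDB behaviour.

```python
from collections import Counter
from urllib.parse import parse_qs


def record_request(query_string, country, counts):
    """Tally any `justfailed` transports reported in a request.

    `country` is the GeoIP country code of the requesting IP, looked
    up elsewhere (hypothetical); `counts` is a Counter keyed by
    (country, transport).
    """
    params = parse_qs(query_string)
    for transport in params.get("justfailed", []):
        counts[(country, transport)] += 1
```

A sudden jump in one `(country, "obfs4")` bucket relative to its usual level would be the hint described above.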

 FWIW, the number of suspicious attempts to the HTTPS Distributor has dropped
 substantially in the last four months, and to the email distributor has stayed
 about the same.  Off the top of my head, this is likely, hopefully,
 something to do with:
 
  1. actually distributing separate bridges to Tor/proxy users, and actually
 rate limiting them (!!), [0] [1]
 
  2. actually rotating available bridges, such that large amounts of both time
 and IP space are required to effectively