#25985: Snowflake rendezvous using AMP cache -----------------------------------+-------------------------------- Reporter: twim | Owner: (none) Type: project | Status: needs_revision Priority: Medium | Milestone: Component: Obfuscation/Snowflake | Version: Severity: Normal | Resolution: Keywords: | Actual Points: Parent ID: | Points: Reviewer: | Sponsor: -----------------------------------+-------------------------------- Changes (by dcf):
* status: needs_review => needs_revision Comment: This is some review of commit [https://github.com/nogoegst/snowflake/commit/b75253ba55bde140be733c4cf5425fc1aab161c4 b75253ba55bde140be733c4cf5425fc1aab161c4]. I had trouble testing this change locally because `amper.Client` was changing my http into https. This is what I did: {{{ broker$ ./broker -disable-tls -addr 127.0.0.1:8080 proxy-go$ ./proxy-go -broker http://127.0.0.1:8080/ client$ TOR_PT_MANAGED_TRANSPORT_VER=1 TOR_PT_CLIENT_TRANSPORTS=snowflake ./client -codec amp -url http://127.0.0.1:8080/ }}} The error I get from the client is a result of attempting to parse HTTP as TLS: {{{ BrokerChannel Error: tls: oversized record received with length 20527 }}} (On further reflection, what I was trying to do wouldn't have worked anyway, because the AMP client code would have sent a URL path of `/c/s/.../amp/client/...`, not `/amp/client/...`, but I could have worked around that for testing.) It looks like the cause is [https://github.com/nogoegst/amper/blob/6dbbc42eb44c1f389591c83a268524d5eceed1d1/client.go#L55 here in amper/client.go], with a hardcoded `"https"`. Looking at that, I also noticed a [https://github.com/nogoegst/amper/blob/6dbbc42eb44c1f389591c83a268524d5eceed1d1/client.go#L23 CDNDomain = "cdn.ampproject.org"]. I have to say, it skeeves me a bit to see hardcoded constants like that in a library; it seems like the kind of thing that belongs in a configuration file. ---- I like the idea behind [https://github.com/nogoegst/snowflake/commit/f358a7696045336ef5e15ff6f8ee2467d8d5f53a the commit that adds the -codec flag]: generalize a currently hardcoded interface, then provide alternate implementations in a later commit. I must say, though, that I don't like the layering of [https://github.com/nogoegst/snowflake/blob/f358a7696045336ef5e15ff6f8ee2467d8d5f53a/broker/broker.go#L222 clientOffersPOST]→[https://github.com/nogoegst/snowflake/blob/f358a7696045336ef5e15ff6f8ee2467d8d5f53a/broker/broker.go#L186 ctx.handleClientOffers]→[https://github.com/nogoegst/snowflake/blob/f358a7696045336ef5e15ff6f8ee2467d8d5f53a/broker/broker.go#L169 DecodeError]. Instead of returning an `error` directly, `ctx.handleClientOffers` writes a JSON representation of the error into a buffer provided by the caller, which the caller must then convert back into an `error` by calling `DecodeError`. It seems unnecessarily convoluted just to enable the side effect of writing directly to the response body in the AMP case. This is a case where I would prefer to see duplicated code first, because I suspect we can find a better factoring. In my testing, the `DecodeError` error recovery didn't work. It decodes to an `Error` that has the same `E` string value but is unequal to `ErrNoSnowflakes`, for example. When I run a [https://gitweb.torproject.org/pluggable- transports/snowflake.git/tree/client?id=25b304a9a856f8c791882ad523df26ffc8fa629c master branch client] (with POST) against an [https://github.com/nogoegst/snowflake/tree/b75253ba55bde140be733c4cf5425fc1aab161c4/broker amp branch broker], it fails because the broker returns status 200 when it should return status 503. The culprit seems to be [https://github.com/nogoegst/snowflake/blob/b75253ba55bde140be733c4cf5425fc1aab161c4/broker/broker.go#L229-L234 this error-matching switch]. If I add a `default` case with a panic, I see that the `ErrNoSnowflakes` case isn't being matched, even though that is the intention: {{{ http: panic serving 127.0.0.1:56174: {"error":"No snowflakes proxies available"} }}} ---- Here's an example of what the AMP responses look like: {{{ $ wget -q -O- http://127.0.0.1:8080/client/amp <!doctype html> <html amp> <head> <meta charset="utf-8"> <title>amp</title> <link rel="canonical" href="https://ampproject.org" /> <meta name="viewport" content="width=device-width,minimum-scale=1 ,initial-scale=1"> <style>body {opacity: 0}</style><noscript><style>body {opacity: 1}</style></noscript> <script async src="https://cdn.ampproject.org/v0.js"></script> </head> <body> <p>In varietate concordia</p> <pre id="data">eyJlcnJvciI6Ik5vIHNub3dmbGFrZXMg cHJveGllcyBhdmFpbGFibGUifQo</pre> </body> </html> [The base64 decodes to '{"error":"No snowflakes proxies available"}'.] }}} The overall AMP container encoding looks pretty reasonable. A few tiny questions: * [https://www.ampproject.org/docs/getting_started/create/basic_markup basic_markup] says the `<script async>` has to be the second child of `<head>`, right after `<meta charset="utf-8">`. * The `<style>body {opacity: 0}</style><noscript><style>body {opacity: 1}</style></noscript>` doesn't match the documented [https://www.ampproject.org/docs/reference/spec/amp-boilerplate.html AMP boilerplate code], which is much longer. Is there a reason? * `<link rel="canonical" href="https://ampproject.org" />`: it would be better not to refer to any third parties. The AMP guide says this can be a relative URL, so I would prefer one of these if they work: * `<link rel="canonical" href="">` * `<link rel="canonical" href="#">` * `<link rel="canonical" href="?">` ---- About the encoded data inside the AMP HTML container: I'm wondering if we should always put the data payload ''inside'' the same JSON object that can signal an error, rather than making the payload be sent ''instead of'' the error object when there is no error. This is how it currently works: 1. Try to parse the response body as JSON with an `"error"` field. If the JSON parsing is successful, handle the error and stop. 2. Otherwise (JSON parsing failed), return the entire response body as a blob of bytes. That is, under the base64 encoding, a message can look either like this: {{{ {"error": "No snowflake proxies available"} }}} or like this: {{{ {"type": "answer", "sdp": "..."} }}} One problem with this design, as far as amper is concerned, is that it's impossible for the server to send a payload that happens to parse as JSON with an `"error"` field—it will be interpreted as an error, not a payload. This consideration doesn't affect Snowflake, because Snowflake's payloads don't have an `"error"` field even though they are JSON. But another consideration is that you can't tell the difference between a non-error response and an improperly encoded error response (if the body is truncated, for example). I propose to remove this ambiguity by including the data payload inside the JSON object that represents an error. Just like in an HTTP response, we have two things—a status code and a message—inside one package. A message would either look like this: {{{ {"error": "No snowflake proxies available", "data": null} }}} or like this: {{{ {"error": null, "data": "eyJ0eXBlIjogImFuc3dlciIsICJzZ..."} }}} (The `null` fields are included for clarity and could be omitted in practice.) To handle this, a client needs to `json.Unmarshal` on a `struct { Error, Data: string }`. Any JSON decoding error signifies an error in the AMP transport, not a Snowflake-level error. Assuming JSON decoding is successful, the `"error"` field indicates whether there was a Snowflake- level error. About signaling errors, I would prefer to use machine-readable strings (e.g. `"ErrNoSnowflakes"`) rather than human-readable strings (e.g. `"No snowflake proxies available"`). The latter seems more likely to get broken accidentally by someone adding punctuation, for example. The variable names you've chosen are good, but I would just them as the actual values, not just the identifiers. We can set up a canonical mapping, e.g. 200⇒`""`, 503⇒`"ErrNoSnowflakes"`, 504⇒`"ErrAnswerTimeout"`, to map the status codes from POST into the error strings for common handling. ---- Does [https://github.com/nogoegst/snowflake/blob/b75253ba55bde140be733c4cf5425fc1aab161c4/client/rendezvous.go#L69 NewBrokerChannel] need to instantiate an `amper.Client` when it's not using `-codec amp`? -- Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25985#comment:24> Tor Bug Tracker & Wiki <https://trac.torproject.org/> The Tor Project: anonymity online
_______________________________________________ tor-bugs mailing list tor-bugs@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs