Mike Perry <mikepe...@torproject.org> writes:

> George Kadianakis:
>> Hello Mike,
>>
>> I had a talk with Marc and Mohsen today about WTF-PAD. I now understand
>> much more about WTF-PAD and how it works with regards to histograms. I
>> think I might even understand enough to start some sort of conversation
>> about it:
>>
>> Here are some takeaways:
>>
>> 1) Marc and Mohsen think that WTF-PAD might not be the way forward
>>    because of its various drawbacks and its complexity. Apparently there
>>    are various attacks on WTF-PAD that Roger has discovered (SENDME
>>    cells side-channels?) and also the deep learning crowd has done some
>>    pretty good damage to the WTF-PAD padding (90%-60% accuracy?). They
>>    also told me that achieving needed precision on the timings might be
>>    a PITA.
>
> Are there citations for any of this? Last I heard Matt Wright was
> working on a deep learning study but the results were mixed.
>

I think this is the best we have in terms of public results:
https://arxiv.org/abs/1801.02265

>> 2) From what I understand you are also hoping to use WTF-PAD to protect
>>    against circuit fingerprinting and not just website
>>    fingerprinting. They told me that while this might be plausible,
>>    there is no current research on how well it can achieve that. Are we
>>    hoping to do that? And what research remains here? How can I help?
>>    Which parts of the Tor circuit protocol are we hoping to hide?
>
> I am designing WTF-PAD to be a framework for deploying padding against
> arbitrary traffic analysis attacks. It is meant to allow us to define
> histograms on the fly (in the Tor consensus) as these are studied. The
> fact that they have not yet been studied is not super relevant to
> deploying the framework for it now.
>

ACK. What other traffic analysis attacks are we looking at addressing
here? I'm thinking of stuff like "circuit fingerprinting of onion
services", but I wonder if histograms and random sampling are too crude
to actually help against sophisticated attacks. I don't currently have
a suggestion for something better.

On that topic, has it been decided whether the adaptive padding of
WTF-PAD will also happen during circuit construction, or only after that?

>> 3) Marc and Mohsen suggested using application-layer defences because
>>    the application-layer has much better view of the actual structures
>>    that are sent on the wire, instead of the black box view that the
>>    network layer has.
>>
>>    In particular they were mainly concerned about onion services
>>    fingerprinting because they are part of a restricted closed world,
>>    whereas they were less concerned about the entire internet because of
>>    its vast size.
>>
>>    They suggested that we could investigate using the service-side
>>    "alpaca" library for onion services (e.g. as part of securedrop?)
>>    which should resolve the most pressing concern of HS identification.
>
> I mean yeah application-layer defenses are useful for website traffic
> fingerprinting, but that is a very narrow slice of the traffic analysis
> problems that I want this framework to solve.
>
> WTF-PAD also doesn't rule out hidden service operators using alpaca,
> either.
>

Agreed.

>> 4) They also told me of research by Tobias Pulls which eliminates the
>>    needs for histograms in WTF-PAD and instead it samples from the
>>    probability distribution directly. They think that this can simplify
>>    things somewhat. Any thoughts on this?
>
> Yes this is actually exactly what I want to do with the next iteration
> of WTF-PAD! The question is what form/model to use for these probability
> distributions. Right now we're encoding inter-burst and inter-packet
> timings with some weird geometric distribution determining how long
> these bursts should go on for, when it might be more natural to encode
> and sample from length-based distributions/histograms.
>
> (Histograms vs distribution is not the problem -- its what they encode
> and how they encode it that matters).
>
> I don't see this paper on Tobias's website. Is it up anywhere yet?
>

Hmm. Looking at the README of wtfpad (see the APE section), I think this
blog post is the best resource we have on this:
https://www.cs.kau.se/pulls/hot/thebasketcase-ape/
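
To make the histogram-versus-distribution question in point 4 a bit more
concrete, here is a minimal Python sketch of the two sampling styles. It is
not the WTF-PAD or APE code: the bin edges, the token counts, the choice of
a log-normal distribution and its parameters, and both function names are
assumptions made up for illustration. It also leaves out things a real
padding machine would need, such as token removal and an "infinity" bin.

```python
import random

# Hypothetical histogram over inter-packet delays (seconds).
# BIN_EDGES has one more entry than TOKENS; bin i covers
# [BIN_EDGES[i], BIN_EDGES[i + 1]].
BIN_EDGES = [0.0, 0.001, 0.01, 0.1, 1.0]
TOKENS = [5, 20, 10, 2]


def sample_delay_from_histogram(edges, tokens, rng):
    """Pick a bin with probability proportional to its token count,
    then draw a delay uniformly inside that bin."""
    if len(edges) != len(tokens) + 1:
        raise ValueError("need exactly one more edge than bins")
    if sum(tokens) == 0:
        raise ValueError("histogram has no tokens left")
    bin_index = rng.choices(range(len(tokens)), weights=tokens)[0]
    return rng.uniform(edges[bin_index], edges[bin_index + 1])


def sample_delay_from_distribution(mu, sigma, rng):
    """Draw a delay directly from a parametric distribution.
    Log-normal is an arbitrary stand-in here, not a claim about
    what any existing design uses."""
    return rng.lognormvariate(mu, sigma)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so runs are repeatable
    print(sample_delay_from_histogram(BIN_EDGES, TOKENS, rng))
    print(sample_delay_from_distribution(-5.0, 1.0, rng))
```

The histogram version needs the whole table shipped around (e.g. in the
consensus), while the parametric version only needs a couple of numbers.
Either way, as Mike says above, the harder question is what the sampled
quantity should encode.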