Re: packet traffic analysis
> I very much doubt it. Where did that factor of "half" come from?

During lulls, you are constantly sending chaff packets. On average, you're halfway through transmitting a chaff packet when you want to send a real one. The system has to wait for the chaff packet to finish before sending the real one. QED.

> Ah, but if you generate unequal-length packets then they are
> vulnerable to length-analysis, which is a form of traffic analysis.

I'm talking about a stream, with packets embedded in it. For circuit-switched links, this is no problem. For a packet-switched network, you must packetize the stream, and that packetization is unrelated to the packets embedded in the stream. This is somewhat inefficient, which is why I suggested that it is more applicable to something like PPP, SSH, or OpenVPN links, which are already virtual circuits. This is a fair criticism, but just think of the number of such circuit/packet conversions when someone uses a TCP virtual circuit over packet-based IP over an analog POTS link, which is itself a virtual circuit that is packetized and sent over a circuit (long-haul wirepair or fiber) in the telco network.

If you explain to me how an eavesdropper can tell where a plaintext packet begins or ends, then I'll agree with you that it is indeed vulnerable to length analysis.

> A better solution would be to leave the encryption on and use constants
> (not PRNG output) for the chaff, as previously discussed.

That might or might not be a problem. With ECB, it's vulnerable to analysis (chaff is constant, so encryption of it is constant). With some modes, the amount you can transmit is limited (e.g. CTR mode). Modes that are based on a small window of previous plaintext, such as OFB, would be vulnerable too. It could very well be that it's a bad idea to send a lot of constant plaintext under other modes, as well. For example, if most of the data is constant, then you have a close approximation of known-plaintext.

> The notion of synchronized PRNGs is IMHO crazy -- complicated as well as
> utterly unnecessary.

It's not necessary to run a PRNG on the receiver. You just have to be able to tell whether you're looking at random data or at an encrypted escape sequence followed by a valid packet, which can be recognized as per your point 4a. If you find that it's not a legitimate packet, you treat it as PRNG data and go back to looking for the encrypted escape sequence. However, with a 32-bit escape sequence, the chances of getting such a false positive are low.

I personally think sending encrypted versions of constant data under the same key you use for real data is not crazy, but somewhat imprudent. Do you know what the unicity distance is? Have you read of attacks that require a large amount of ciphertext encrypted under the same key?

--
http://www.lightconsulting.com/~travis/ -><-
"We already have enough fast, insecure systems." -- Schneier & Ferguson
GPG fingerprint: 50A1 15C5 A9DE 23B9 ED98 C93E 38E9 204A 94C2 641B
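
[Editorial aside] The half-packet figure argued at the top of this message can be checked numerically. The following is a minimal Python sketch, not anything posted in the thread. It assumes fixed-length chaff packets sent back-to-back with no gaps, and a real packet that becomes ready at a moment uniformly distributed within the chaff packet currently on the wire. The name mean_wait and the unit packet time are illustrative choices.

```python
import random

def mean_wait(packet_time=1.0, trials=100_000, seed=1):
    """Estimate the average delay before a real packet can be sent.

    Assumption: the link carries back-to-back chaff packets, each taking
    packet_time to transmit, and a real packet becomes ready at a moment
    uniformly distributed within the chaff packet currently being sent.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        elapsed = rng.uniform(0.0, packet_time)  # how far into the chaff packet we are
        total += packet_time - elapsed           # time left until that packet finishes
    return total / trials

if __name__ == "__main__":
    # Under the stated assumptions this comes out close to packet_time / 2.
    print(mean_wait())
```

If chaff is instead sent as an unstructured filler stream that can be interrupted at any byte, the residual wait in this model shrinks to at most one byte time, which is the point of the stream proposal.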
Re: packet traffic analysis
> Modes that are based on a small window of previous plaintext, such as
> OFB, would be vulnerable too.

My mistake, OFB does not have this property. I thought there was a common mode with this property, but it appears that I am mistaken.

If it makes you feel any better, you can consider the PRNG output to be the encryption of constant text, perhaps using the real datastream as some kind of IV. The content of the chaff is not relevant; ideally you would use a high-bandwidth HWRNG such as Quantis.
Re: packet traffic analysis
Good catch on the encryption. I feel silly for not thinking of it.

> If your plaintext consists primarily of small packets, you should set the MTU
> of the transporter to be small. This will cause fragmentation of the
> large packets, which is the price you have to pay. Conversely, if your
> plaintext consists primarily of large packets, you should make the MTU large.
> This means that a lot of bandwidth will be wasted on padding if/when there
> are small packets (e.g. keystrokes, TCP acks, and voice cells) but that's
> the price you have to pay to thwart traffic analysis.

I'm not so sure. If we're talking about thwarting traffic analysis on the link level (real circuit) or on the virtual-circuit level, then you're adding, on average, a half-packet latency whenever you want to send a real packet. And then there's the bandwidth tradeoff you mention, which is probably of larger concern (although bandwidth will increase over time, whereas the speed of light will not).

I don't see any reason why it's necessary to pay these costs if you abandon the idea of generating only equal-length packets and creating all your chaff as packets. Let's assume the link is encrypted as before. Then you merely introduce your legitimate packets with a certain escape sequence, and pad between these packets with either zeroes or, if you're more paranoid, the output of some kind of PRNG. In this way, if the link is idle, you can stop generating chaff and start sending a packet at any time. I assume that the length is explicitly encoded in the legitimate packet. Then the peer for the link ignores everything until the next "escape sequence" introducing a legitimate packet.

This is not a tiny hack, but avoids much of the overhead in your technique. It could easily be applied to something like openvpn, which can operate over a TCP virtual circuit, or ppp.

It'd be a nice optimization if you could avoid retransmits of segments that contained only chaff, but that may or may not be possible to do without giving up some TA resistance (esp. in the presence of an attacker who may prevent transmission of segments).
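
[Editorial aside] The escape-sequence framing proposed in this message can be sketched in a few lines of Python. This is only an illustration of the plaintext framing that would sit underneath the link encryption. The 4-byte ESCAPE value, the 16-bit length field, and the function names are all hypothetical, and zero bytes are used as filler so that the filler itself can never contain the escape sequence.

```python
import struct

ESCAPE = b"\xde\xad\xbe\xef"   # hypothetical 32-bit escape sequence
LEN_FMT = ">H"                 # hypothetical 16-bit big-endian length field

def frame(packet: bytes) -> bytes:
    """Prefix a packet with the escape sequence and an explicit length."""
    return ESCAPE + struct.pack(LEN_FMT, len(packet)) + packet

def build_stream(packets, gap=8, filler=b"\x00"):
    """Sender side: interleave framed packets with runs of filler bytes."""
    out = bytearray(filler * gap)
    for p in packets:
        out += frame(p)
        out += filler * gap
    return bytes(out)

def extract(stream: bytes):
    """Receiver side: ignore everything up to each escape sequence,
    then use the length field to pull out exactly one packet."""
    packets = []
    pos = 0
    hdr = len(ESCAPE) + struct.calcsize(LEN_FMT)
    while True:
        start = stream.find(ESCAPE, pos)
        if start < 0 or start + hdr > len(stream):
            break
        (length,) = struct.unpack_from(LEN_FMT, stream, start + len(ESCAPE))
        body = start + hdr
        if body + length > len(stream):
            break  # incomplete packet; a real receiver would wait for more bytes
        packets.append(stream[body:body + length])
        pos = body + length  # resume scanning after the payload
    return packets

if __name__ == "__main__":
    pkts = [b"hello", ESCAPE + b" inside a payload"]
    print(extract(build_stream(pkts)) == pkts)
```

Because the length is explicit, a payload that happens to contain the escape bytes is not misparsed; the receiver only searches for the escape sequence between packets. With PRNG filler instead of zeroes, the receiver would additionally need some validity check on whatever follows a match.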
Re: packet traffic analysis
In the context of:

>>If your plaintext consists primarily of small packets, you should set the MTU
>>of the transporter to be small. This will cause fragmentation of the
>>large packets, which is the price you have to pay. Conversely, if your
>>plaintext consists primarily of large packets, you should make the MTU large.
>>This means that a lot of bandwidth will be wasted on padding if/when there
>>are small packets (e.g. keystrokes, TCP acks, and voice cells) but that's
>>the price you have to pay to thwart traffic analysis.

Travis H. wrote:

> I'm not so sure. If we're talking about thwarting traffic analysis on
> the link level (real circuit) or on the virtual-circuit level, then
> you're adding, on average, a half-packet latency whenever you want to
> send a real packet.

I very much doubt it. Where did that factor of "half" come from?

> I don't see any reason why it's necessary to pay these costs if you
> abandon the idea of generating only equal-length packets

Ah, but if you generate unequal-length packets then they are vulnerable to length-analysis, which is a form of traffic analysis. I've seen analysis systems that do exactly this. So the question is, are you trying to thwart traffic analysis, or not?

> I should point out that encrypting PRNG output may be pointless,

*is* pointless, as previously discussed.

> and perhaps one optimization is to stop encrypting when switching on
> the chaff.

A better solution would be to leave the encryption on and use constants (not PRNG output) for the chaff, as previously discussed.

> Some minor details involving resynchronizing when the PRNG happens to

The notion of synchronized PRNGs is IMHO crazy -- complicated as well as utterly unnecessary.
Re: packet traffic analysis
> I assume that the length is
> explicitly encoded in the legitimate packet. Then the peer for the
> link ignores everything until the next "escape sequence" introducing a
> legitimate packet.

I should point out that encrypting PRNG output may be pointless, and perhaps one optimization is to stop encrypting when switching on the chaff. The peer can then encrypt the escape sequence as it would appear in the encrypted stream, and do a simple string match on that. In this manner the peer does not have to do any decryption until the [encrypted] escape sequence re-appears. Another benefit of this is to limit the amount of material encrypted under the key to legitimate traffic and the escape sequences prefixing them.

Some minor details involving resynchronizing when the PRNG happens to produce the same output as the expected encrypted escape sequence are left as an exercise for the reader.
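
[Editorial aside] How often the PRNG would "happen to produce" the expected escape sequence can be estimated with simple arithmetic. The Python sketch below assumes the chaff bytes are uniformly random and independent, that the escape sequence is 32 bits long, and that the receiver tests for a match at every byte offset. The function names and the example rate are illustrative only.

```python
def expected_false_matches(chaff_bytes: int, escape_bits: int = 32) -> float:
    """Expected number of accidental escape-sequence matches in a run of
    uniformly random chaff, checking every byte offset (an assumption)."""
    escape_bytes = escape_bits // 8
    offsets = max(chaff_bytes - escape_bytes + 1, 0)  # positions where a match could start
    return offsets / 2 ** escape_bits

def mean_seconds_between_false_matches(bytes_per_second: float,
                                       escape_bits: int = 32) -> float:
    """Average time between accidental matches at a given chaff rate."""
    return 2 ** escape_bits / bytes_per_second

if __name__ == "__main__":
    # Hypothetical link sending 1,000,000 bytes of random chaff per second.
    print(mean_seconds_between_false_matches(1_000_000))
```

Under these assumptions, a link carrying 1,000,000 bytes of random chaff per second would see an accidental match roughly every 4295 seconds, a little over an hour, so the receiver needs some way to reject a match that is not followed by a valid packet.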
packet traffic analysis
Travis H. wrote:

> Part of the problem is using a packet-switched network; if we had
> circuit-based, then thwarting traffic analysis is easy; you just fill
> the link with random garbage when not transmitting packets.

OK so far ...

> There are two problems with this; one, getting enough random data, and
> two, distinguishing the padding from the real data in a
> computationally efficient manner on the remote side without giving
> away anything to someone analyzing your traffic. I guess both problems
> could be solved by using synchronized PRNGs on both ends to generate
> the chaff.

This is a poor statement of the problem(s), followed by a "solution" that is neither necessary nor sufficient.

1) Let's assume we are encrypting the messages. If not, the adversary can read the messages without bothering with traffic analysis, so the whole discussion of traffic analysis is moot.

2) Let's assume enough randomness is available to permit encryption of the traffic ... in particular, enough randomness is available _steady-state_ (without stockpiling) to meet even the _peak_ demand. This is readily achievable with available technology.

3) As a consequence of (1) and (2), we can perfectly well use _nonrandom_ chaff. If the encryption (item 1) is working, the adversary cannot tell constants from anything else. If we use chaff so that the steady-state traffic is indistinguishable from the peak traffic, then (item 2) we have enough randomness available; TA-thwarting doesn't require anything more.

4) Let's consider -- temporarily -- the scenario where the encryption is being done using IPsec. This will serve to establish terminology and expose some problems heretofore not mentioned.

4a) IPsec tunnel mode has "inner headers" that are more than sufficient to distinguish chaff from other traffic. (Addressing the chaff to UDP port 9 will do nicely.)

4b) What is not so good is that IPsec is notorious for "leaking" information about packet-length.
Trying to make chaff with a distribution of packet sizes indistinguishable from your regular traffic is rarely feasible, so we must consider other scenarios, somewhat like IPsec but with improved TA-resistance.

5) Recall that IPsec tunnel mode can be approximately described as IPIP encapsulation carried by IPsec transport mode. If we abstract away the details, we are left with a packet (called an "envelope") that looks like

  ---------------++++++++++++++++++++++++++
  | outer header | inner header | payload |            [1]
  ---------------++++++++++++++++++++++++++

where the inner header and payload (together called the "contents" of the envelope) are encrypted. (The "+" signs are meant to be opaque to prying eyes.) The same picture can be used to describe not just IPsec tunnel mode (i.e. IPIP over IPsec transport) but also GRE over IPsec transport, and even PPPoE over IPsec transport.

Note: All the following statements apply *after* any necessary fragmentation has taken place.

The problem is that the size of the envelope (as described by the length field in the outer header) is conventionally chosen to be /just/ big enough to hold the contents. This problem is quite fixable ... we just need constant-sized envelopes! The resulting picture is:

  ---------------------------------------------------
  | outer header | inner header | payload | padding |  [2]
  ---------------------------------------------------

where padding is conceptually different from chaff: chaff means packets inserted where there would have been no packet, while padding adjusts the length of a packet that would have been sent anyway.

The padding is not considered part of the contents. The decoding is unambiguous, because the size of the contents is specified by the length field in the inner header, which is unaffected by the padding.

This is a really, really tiny hack on top of existing protocols.

If your plaintext consists primarily of small packets, you should set the MTU of the transporter to be small. This will cause fragmentation of the large packets, which is the price you have to pay.
Conversely, if your plaintext consists primarily of large packets, you should make the MTU large. This means that a lot of bandwidth will be wasted on padding if/when there are small packets (e.g. keystrokes, TCP acks, and voice cells) but that's the price you have to pay to thwart traffic analysis. (Sometimes you can have two virtual circuits, one for big packets and one for small packets. This degrades the max performance in both cases, but raises the minimum performance in both cases.)

Remark: FWIW, the MTU (max transmission unit) should just be called the TU in this case, because all transmissions have the same size now!
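
[Editorial aside] The constant-size envelope idea in item 5 can be illustrated with a small Python sketch. It is not IPsec and it omits both the outer header and the encryption. The TU of 64 bytes, the 3-byte inner header (payload length plus a more-fragments flag), and the function names are all hypothetical. The sketch shows only that fragmenting and padding to a fixed size still decodes unambiguously, and what the padding costs for small packets.

```python
import struct

TU = 64                   # hypothetical fixed transmission unit, in bytes
HDR_FMT = ">HB"           # hypothetical inner header: payload length, more-fragments flag
HDR_LEN = struct.calcsize(HDR_FMT)
MAX_PAYLOAD = TU - HDR_LEN

def to_envelopes(packet: bytes):
    """Fragment a packet as needed and pad every piece to exactly TU bytes."""
    chunks = [packet[i:i + MAX_PAYLOAD]
              for i in range(0, len(packet), MAX_PAYLOAD)] or [b""]
    envelopes = []
    for n, chunk in enumerate(chunks):
        more = 1 if n < len(chunks) - 1 else 0
        contents = struct.pack(HDR_FMT, len(chunk), more) + chunk
        envelopes.append(contents + b"\x00" * (TU - len(contents)))  # padding follows contents
    return envelopes

def from_envelopes(envelopes):
    """Reassemble packets; padding is skipped because the inner header
    states how long the payload is."""
    packets, current = [], bytearray()
    for env in envelopes:
        length, more = struct.unpack_from(HDR_FMT, env)
        current += env[HDR_LEN:HDR_LEN + length]
        if not more:
            packets.append(bytes(current))
            current = bytearray()
    return packets

def padding_overhead(packet_len: int) -> float:
    """Fraction of transmitted bytes that are header or padding rather than payload."""
    n_env = max(-(-packet_len // MAX_PAYLOAD), 1)  # ceiling division, at least one envelope
    return 1 - packet_len / (n_env * TU)

if __name__ == "__main__":
    keystroke, bulk = b"k", bytes(200)
    envs = to_envelopes(keystroke) + to_envelopes(bulk)
    print(len(envs), {len(e) for e in envs})
    print(from_envelopes(envs) == [keystroke, bulk])
    print(padding_overhead(len(keystroke)))
```

With these made-up numbers a one-byte keystroke occupies a full 64-byte envelope, which is the bandwidth price described above, while the 200-byte packet is fragmented across four envelopes.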