"René G. Eberhard" wrote:
> 
> Hi
> 
> > TLS defines random padding only when the parties agree on CipherSuites
> > that use block ciphers. If the CipherSuite chosen by the parties uses
> > a stream cipher, there is no random padding: the ciphertext is larger
> > than the application-layer data by only a constant amount, so there is
> > no padding protection when stream ciphers are used in TLS.
> 
> Agreed. The problem still occurs with stream ciphers.
> 
> > However, random padding of records is not the only thing needed to stop
> > traffic analysis attacks. There is a lot of contextual data transmitted
> > in an HTTPS request/reply.
> > Using statistical methods, the attacker can guess which HTML file I'm
> > viewing by keeping a record of the last page, the current page, and the
> > next page I go to (by clicking on a hyperlink).
> > Also, both the number of TLS connections and the lengths of the resources
> > embedded in an HTML page (combined with the guessed length of the HTML
> > page itself) are useful for the attack (if the browser cache is disabled).
> > Keeping the cache enabled is a common configuration in web browsing, but
> > it might not be in other application protocols; hence the idea of
> > developing an extra layer below the application protocol and above the
> > TLS protocol, so that it is reusable by other applications, not only web
> > browsers and servers.
> 
> I see your point. I don't know whether you'll always have success
> with traffic analysis using statistical methods. In my opinion this is
> possible if the server only provides a few, completely different pages
> (or other resources). There you can easily analyze the traffic.
> But you'll have a problem analyzing the traffic if the server has a
> huge number of pages (or other resources).

HTML files are essentially words (you could call them text files, if you
like). It is common practice for the Web designer to write things by hand
in the file, and there are not too many HTML files with the <same> length,
even on big web sites. So, since the file lengths are rarely equal, the
attacker can create a database of (URL, Length) pairs, then look up the
approximate length of the transferred file (taking into account the
previous link and the embedded objects transferred, such as .gif files,
...) and choose the "most probable" URL from the database.
I agree that more tests have to be done, but I think that a big Web site
(1000+ HTML files) has few files with the same length.
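
Just to make the idea concrete, here is a rough sketch in Python of that
lookup (the URLs and the tolerance value are invented; a real attack would
also have to model the embedded objects):

    # Hypothetical sketch of the (URL, Length) database attack described
    # above; the URLs and the matching tolerance are made-up examples.
    import urllib.request

    def build_fingerprint_db(urls):
        """Fetch each page once and record its body length."""
        db = {}
        for url in urls:
            with urllib.request.urlopen(url) as resp:
                db[url] = len(resp.read())
        return db

    def guess_url(db, observed_length, tolerance=32):
        """Return candidate URLs, closest recorded length first."""
        near = [(abs(length - observed_length), url)
                for url, length in db.items()
                if abs(length - observed_length) <= tolerance]
        return [url for _, url in sorted(near)]

    db = build_fingerprint_db(["https://example.com/a.html",
                               "https://example.com/b.html"])
    print(guess_url(db, observed_length=5120))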

> The problem of analyzing the traffic
> gets bigger if the pages are generated dynamically.

It is possible, but sometimes I believe it does not get more complicated.
Example: I see a person entering the bank's web site to find out how much
she has spent on her credit card. So I turn on a sniffer on a nearby
computer and do the traffic analysis. I correlate the traffic to infer
that she has reached the page with the form which asks for the credit
card number, etc. When she presses Send on the form, I only have to check
whether the amount of information that comes back from the server is "a
few" bytes or "a lot" of bytes (of course, it depends on what the
attacker interprets as "few" and "a lot") to know whether she has bought
something with the credit card (of course, I don't know what she bought).
That's what traffic analysis is all about: inferring information from
message lengths, frequencies, and delays.
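
In code, that "few vs. a lot" decision is nothing more than a threshold
(the value below is invented; a real attacker would calibrate it by
fetching both possible replies once himself):

    # Toy classifier for the "few bytes" / "a lot of bytes" decision above.
    # The threshold is a made-up value, not taken from any real bank site.
    def classify_reply(total_bytes, threshold=2048):
        """Label an observed encrypted reply by its size alone."""
        if total_bytes < threshold:
            return "short reply (e.g. a confirmation or error page)"
        return "long reply (e.g. a full statement page)"

    for observed in (512, 9300):
        print(observed, "->", classify_reply(observed))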

> A real problem is the browser's request. If you assume
> that all the header variables are the same across several
> requests, you could analyze the GET request. As mentioned
> in another mail, you could add some additional headers.
> You might also send some OPTIONS * 'no-op' requests
> to frustrate an attacker.

I think that "OPTIONS *" can only be sent once per request, but I'm not
really sure.
Anyway, if "OPTIONS *" can only be sent once per request, then I have to
send several requests at nearly the same time, which is not a "normal"
browsing pattern. OPTIONS * is like a <ping> to the web server; in
this case, the server will send a short reply. I don't know if the
attacker could get information from this, but I think so, because the
length information is not masked. Adding some padding here would be good.
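
Something like this, for example (the X-Pad header name and the padding
bounds are my own invention; any header the server ignores would do):

    # Hypothetical no-op "OPTIONS *" request with a random-length padding
    # header, as suggested above. X-Pad and the 16..256 range are made up.
    import http.client, os, random

    def noop_options(host):
        conn = http.client.HTTPSConnection(host)
        pad = os.urandom(random.randint(16, 256)).hex()  # random filler
        conn.request("OPTIONS", "*", headers={"X-Pad": pad})
        resp = conn.getresponse()
        resp.read()  # drain the (short) reply
        conn.close()
        return resp.status

    print(noop_options("example.com"))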

Also, the idea is for the protocol to be independent of the application
that uses it; I don't know whether embedding all the functionality into
the HTTP protocol is the right decision... Any ideas?
 
> We have to consider where traffic analysis is useful to
> gain relevant information. You are welcome to complete
> the list.
> 
> 1.  Backing up or transferring data from one biz to another biz?
>     What kind of information can you get with traffic analysis?
>     You know that biz A transferred something to biz B. But you
>     have no idea what.

That's right, I have no idea what was transferred.
For HTTPS transfers, it is what we have been talking about until now.
For SMTP over TLS, this protocol alone does not do a good job, because
the email frequency is still visible. For secure email to be protected
against traffic analysis, time delays can be considered (which is not
the case with HTTPS).
I think that anonymizers do a better job than this protocol. Even
better, anonymizers (see www.anonymizer.com) could do the best job
with email transmission if they used HTTPS with this protocol.
 
> 2. Doing internet banking?
>    Can you gain relevant information using statistical analysis?
>    You possibly find out that someone did a wire transfer. But
>    (if the system is designed well) you don't know who.

If the attacker has an account at the same bank, he/she can retrieve
those HTML files protected with HTTPS at least once, and then could know
what kind of transactions or queries the client is doing.
Who? Very probably the bank is using strong crypto (RC4 with a 128-bit
key, for example), and maybe the bank gives him/her a free client
certificate for being a customer (I don't know if any bank in the world
offers this service, but it would be nice anyway :).
As the SSL handshake Certificate message travels in the clear, the
attacker may have a good chance of knowing who did the transactions.
Even worse, because the SSL handshake comes before the web transactions,
the attacker could even decide whether or not to mount an active attack
against that customer.
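
To see how little work that is for the sniffer: until ChangeCipherSpec,
the SSL 3.0 / TLS 1.0 handshake records are unencrypted, so spotting a
Certificate message is just header scanning. A rough sketch (here `raw`
is assumed to hold one direction of a captured TCP stream, and handshake
messages coalesced into one record are ignored):

    # Scan cleartext SSL/TLS records for a Certificate handshake message.
    # Record header: type (1 byte), version (2), length (2); the first
    # byte of a handshake fragment is its message type (11 = Certificate).
    def find_certificate_messages(raw):
        hits, i = [], 0
        while i + 5 <= len(raw):
            rec_type = raw[i]
            rec_len = int.from_bytes(raw[i + 3:i + 5], "big")
            body = raw[i + 5:i + 5 + rec_len]
            if rec_type == 22 and body and body[0] == 11:
                hits.append(i)  # offset of a record carrying a Certificate
            i += 5 + rec_len
        return hits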

> 3. Doing shopping over the internet.
>     Can you gain relevant information using statistical analysis?
> 
> I know there are lots of other scenarios. But it's late in the evening =).

Although some sites (Amazon.com, for example) only support SSL 2.0!!
...anyway...
It always depends on what privacy means to you. You could add a book
about sex to Amazon.com's shopping cart, and the attacker can actually
see which book you buy (because the site does NOT use HTTPS for that);
that single piece of information can be used by an attacker to mount a
(physical!) attack on the buyer, because the attacker knows the buyer's
physical "vulnerabilities" :)

> 
> > Even with all that stuff, the "random" padding thing with stream
> > ciphers is not too much work. It could be added to a TLS
> > implementation in a couple of hours by any experienced TLS developer.
> 
> Doing that needs a revision of the RFC... which would make sense anyway =).

The padding, yes. But the ClientHello escape character (in the .txt
file from a few emails ago) could wait, because it is transparent to
current SSL 3.0 and TLS 1.0 implementations.
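
For the padding itself, a thin shim above TLS is really all it takes when
a stream cipher leaves the lengths exposed. A minimal sketch (the 2-byte
framing and the 0..255 padding bounds are my assumptions, not anything
from the RFC):

    # Length-hiding shim for stream-cipher suites: prefix the real length,
    # append random filler, and strip it again on the other side.
    import os, random, struct

    MAX_PAD = 255

    def pad_message(data):
        pad = os.urandom(random.randint(0, MAX_PAD))
        return struct.pack("!HH", len(data), len(pad)) + data + pad

    def unpad_message(framed):
        # pad_len is kept in the header so the frame is self-describing
        data_len, pad_len = struct.unpack("!HH", framed[:4])
        return framed[4:4 + data_len]

    msg = b"GET /account.html HTTP/1.0\r\n\r\n"
    assert unpad_message(pad_message(msg)) == msg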

> Would it be possible to add a compression layer with different compression
> levels?

Do you mean in SSL and TLS implementations, or in the protocol we are
discussing now?
Anyway, I don't know much about compression, but I think it must be a
"one-to-one" (i.e. lossless, fully reversible) method.

Regards,
Gabriel