Hi Michael,


Michael Peters wrote:
Raymond Wan wrote:
I had looked at the effect compression has on web pages a while ago. Though not relevant to modperl, there is obviously a cost to compression and since most HTML pages are small, sometimes it is hard to justify.

Not to discredit the work you did researching this, but a lot of people are studying the same thing and coming to different conclusions:

http://developer.yahoo.com/performance/rules.html

Yes, backend performance matters, but more and more we realize that the front end tweaks we can make give a better performance for users.

Take google as an example. The overhead of compressing their content and decompressing it on the browser takes less time than sending the same content uncompressed over the network. I'd say the same is true for most other applications too.


It's ok; I don't consider another opinion as discrediting my work. :-) Actually, it was a while ago and it was only one aspect of my work and in a smaller test bed. My fault for handwaving in my reply, though.

The point is actually the "sometimes"... My research was on compression in general, and web compression was only one aspect of it. My point is that if you take a one-byte file and run gzip -9 on it (again, the same algorithm as deflate), you get a 24-byte file, because the fixed gzip header and trailer cost more than the data itself. As the file size increases, you reach a point where compressing becomes beneficial. Though my example is both silly and pathological, it shows that there are cases where compression may not be beneficial. One can imagine the average file size of a web site as a kind of knob: as it turns (average file size increases as you go from site to site), the benefits become more and more evident.
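A rough sketch of the one-byte point in Python, in case anyone wants to try it. It uses the standard gzip module rather than the command-line tool, so the exact one-byte size need not match my 24 bytes (the module does not store a file name in the header). SAMPLE and both function names are made up for illustration; the break-even point depends entirely on the data you feed it.

```python
import gzip


def gzip_size(data: bytes) -> int:
    """Size in bytes of data after gzip at the highest compression level."""
    return len(gzip.compress(data, compresslevel=9))


def break_even(sample: bytes) -> int:
    """Smallest prefix length of sample whose gzip output beats the input.

    Returns -1 if no prefix of the sample ever comes out smaller.
    """
    for n in range(1, len(sample) + 1):
        if gzip_size(sample[:n]) < n:
            return n
    return -1


# Hypothetical sample: a short HTML-like fragment repeated to get some length.
SAMPLE = b"<li><a href='/docs/modperl'>apache2 modperl</a></li>\n" * 20

if __name__ == "__main__":
    print("1 byte ->", gzip_size(b"a"), "bytes")
    print("break-even at", break_even(SAMPLE), "bytes for this sample")
```

For real pages the interesting number is where break_even lands relative to the typical response size of the site.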

For example, compressing an already compressed file is generally pointless (if it was done right the first time). MP3, JPEG, GIF, etc. are all file formats that have or may have compression incorporated. PDFs can be compressed too if someone selected that option when creating them. English text compresses well (to around 25% of its original size, in general?) but two-byte encodings such as Chinese and Japanese (I think) only get down to around 40-50% [handwaving again :-) there are more updated numbers out there]. Also, compression works best on a uniform file; if a web page has a mix of text, images, etc., then each one has to be compressed individually.
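The "already compressed" case is easy to check with a small sketch. The assumptions here: zlib stands in for deflate, a made-up word list stands in for English text, and random bytes stand in for JPEG/MP3-like payloads. It is not a benchmark, just a way to see the three cases side by side.

```python
import os
import random
import zlib


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Made-up vocabulary; a stand-in for ordinary English-like text.
WORDS = ("the server sends a page to the browser and the browser "
         "renders it while the network waits for more data").split()
rng = random.Random(0)  # fixed seed so runs are repeatable
text = " ".join(rng.choice(WORDS) for _ in range(5000)).encode("ascii")

once = zlib.compress(text, 9)   # output of a first compression pass
noise = os.urandom(len(text))   # stand-in for data with no redundancy left

if __name__ == "__main__":
    print(f"plain text:       {ratio(text):.2f}")
    print(f"already deflated: {ratio(once):.2f}")
    print(f"random bytes:     {ratio(noise):.2f}")
```

A ratio at or above 1.0 means the second pass only added overhead.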

As for Google, you are right -- I can imagine why it would work well for Google. However, I can also hypothesize that it might be a special case. I presume you mean the results of a query. What we get back is a list of results that are all related to each other; i.e., if you searched for "apache2 modperl", we can expect those two words to be in every result and the type of words to be similar from result to result [they would all be computer-oriented]. As compression aims to reduce redundancy, their results are perfect for it. Especially if

Anyway, what I wanted to say is that there ought to be instances when compression is beneficial and when it isn't. I think it is fine to do what the Yahoo site says and have it "on" by default; but if someone examines the traffic and data and realizes it should be "off", that isn't beyond reason.
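As a sketch of what "on by default, off where the data says so" could look like, here is a small decision function. Everything in it is hypothetical: the content-type list, the MIN_SIZE threshold and the function name are placeholders, and the real values would have to come from looking at a site's own traffic.

```python
# Content types assumed to be compressed already; the list is illustrative only.
ALREADY_COMPRESSED = {"image/jpeg", "image/gif", "audio/mpeg", "application/zip"}

# Hypothetical threshold in bytes; the right value should come from measurement.
MIN_SIZE = 256


def should_compress(content_type: str, body: bytes, enabled: bool = True) -> bool:
    """Decide whether a response body is worth compressing.

    enabled is the site-wide default switch; the other checks are the
    per-response exceptions.
    """
    if not enabled:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    base = content_type.split(";", 1)[0].strip().lower()
    if base in ALREADY_COMPRESSED:
        return False
    return len(body) >= MIN_SIZE
```

For instance, should_compress("text/html; charset=utf-8", page) is True for any page of 256 bytes or more, while a JPEG of any size is skipped.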


As for dialup, if I remember from those dark modem days :-)

Even non dialup customers can benefit. Many "broadband" connections aren't very fast, especially in rural places (I'm thinking large portions of the US).

But all this talk is really useless in the abstract. Take a tool like YSlow for a spin and see how your sites perform with and without compression. Especially looking at the waterfall display.


Well, one good thing about deflate is that it is *fast*. Very fast. So, while my silly one-byte file example shows there are exceptions, the break-even point might be quite close to one byte. :-)

One cost saving might be to pre-compress files, i.e., have them reside on the server in compressed form, since with deflate it is more time-consuming to compress than to decompress. Of course, that poses many problems and is one reason why things like Stacker didn't really catch on (much)...
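A minimal sketch of the pre-compression idea, assuming static files under some document root and a ".gz" copy written next to each original. The function names and the suffix list are my own placeholders, and how the server is told to send the ".gz" copy is left out entirely.

```python
import gzip
import shutil
from pathlib import Path


def precompress(path: Path) -> Path:
    """Write a gzip copy named <name>.gz next to path and return its location."""
    target = path.with_name(path.name + ".gz")
    with path.open("rb") as src, gzip.open(target, "wb", compresslevel=9) as dst:
        shutil.copyfileobj(src, dst)
    return target


def precompress_tree(root: Path, suffixes=(".html", ".css", ".js")) -> list[Path]:
    """Pre-compress every file under root whose suffix is in suffixes."""
    # sorted() materialises the listing first, so the new .gz files
    # are not picked up while the tree is still being walked.
    return [precompress(p) for p in sorted(root.rglob("*"))
            if p.is_file() and p.suffix in suffixes]
```

One of the problems shows up right away: the compressed copy goes stale whenever the original changes, so something has to re-run this after every update.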

Ray



