Peter Crowther wrote:
...
Jeffrey's since confirmed it's not a CPU issue - thanks Jeffrey! - so I
agree that bandwidth/latency is the one to solve, as the rest of the
thread's been working on while I've been asleep!

Elaborating a bit on what I mentioned before, and sorry Jeffrey if this sounds elementary. But my experience is that this kind of thing is often overlooked, and people spend time looking at all kinds of esoteric solutions when the real problem stares them in the face.

It does not take much to check this, and you need do nothing at the server side. Get Firefox, and the HttpFox add-on (or any similar combination that allows you to see what is really going back and forth between browser and server).

- open the browser, and start HttpFox in its own window
- clear the HttpFox trace, and start capturing
- in the browser, call up the first page of your site
- look at the HttpFox trace, to see how many requests/responses this really generated, and for what. Pay particular attention to all the "accessory" things, like images, stylesheets, images called up by stylesheets, javascript libraries, etc.
You also see the size of each of these things.

- also pay attention to any 4xx status responses (like 404 not found). It is often the case that, as an application is developed, people change the names of images, stylesheets etc. without adapting the links in the pages which load these things. Each 404 means that one request went to the server, the server did not find the object, and sent back a 404 response. Over a long/slow link, these round trips count.
(Another good source for this is the server access logs.)
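
For the access log route, here is a minimal Python sketch of the kind of tally I mean. It assumes the log uses the usual "common" layout (request in quotes, then status, then bytes sent); if your access log pattern is different, the regular expression has to be adapted. The names summarize and LOG_LINE are just mine, for illustration.

```python
import re
from collections import Counter

# Assumed layout ("common" log format), for example:
# 1.2.3.4 - - [10/Oct/2009:13:55:36 +0200] "GET /images/logo.gif HTTP/1.1" 404 1012
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarize(lines):
    """Return (requests per status, total bytes sent, 4xx count per path)."""
    by_status = Counter()
    client_errors = Counter()
    total_bytes = 0
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # line is not in the assumed format, skip it
        status = int(match.group("status"))
        by_status[status] += 1
        if match.group("size") != "-":  # "-" means no body was sent
            total_bytes += int(match.group("size"))
        if 400 <= status < 500:
            client_errors[match.group("path")] += 1
    return by_status, total_bytes, client_errors
```

Feed it the lines of one access log file (an open file object will do), then look at client_errors.most_common(20) to see which missing objects are requested most often.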

That was all for your first page, but I'd bet it may already be an eye-opener as to what is really going on.

- now call the second page, and do the same examination.
If the design and the caching are correct, then you should see quite a few 304 responses ("not modified"). That means that the browser sent a request to the server for some object, conditional upon that object having been modified since the browser last got it, and the server just answered "no, it was not modified, use your cached version". That saves bandwidth when it works as it should, because instead of resending the same object to the browser (an image, a stylesheet, a javascript library, a java applet), the server just sends a short response with a status line and a few headers, and no body.
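
To make the conditional request a bit more concrete, here is a small Python sketch of the decision the server makes. It is a simplified model, not how any particular server implements it: it only looks at the If-Modified-Since date, while real servers may also compare ETag values. The function name answer and its parameters are made up for the example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def answer(resource_modified, body, if_modified_since=None):
    """Return (status, body_sent) for a GET, honouring If-Modified-Since.

    resource_modified: timezone-aware datetime of the object's last change.
    if_modified_since: raw header value sent by the browser, or None.
    """
    if if_modified_since is not None:
        try:
            cached_at = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            cached_at = None  # unparsable header: treat as a plain GET
        if cached_at is not None:
            if cached_at.tzinfo is None:
                cached_at = cached_at.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if resource_modified.replace(microsecond=0) <= cached_at:
                return 304, b""  # nothing to resend, browser uses its cache
    return 200, body  # first visit, or the object changed since


# The browser echoes back the Last-Modified date it received earlier.
stylesheet_changed = datetime(2009, 10, 10, 12, 0, 0, tzinfo=timezone.utc)
header_value = format_datetime(stylesheet_changed, usegmt=True)
first_visit = answer(stylesheet_changed, b"x" * 20000)
second_visit = answer(stylesheet_changed, b"x" * 20000, header_value)
```

In this toy case first_visit carries the full 20000 bytes with a 200, and second_visit is a 304 with an empty body, which is the saving you want to see in the HttpFox trace.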

If you do not see a lot of 304 responses, but a lot of new requests for images, stylesheets, background images, etc. with 200 OK responses, then maybe ask yourself why this second page needs so many things different from the first page. Maybe the server has 5 identical (in content) stylesheets stored, but under different names.

- it is also often the case that people endlessly duplicate the same style and graphic elements in many directories and sub-directories, because it is easier to have links like "images/image1.jpg" in all your pages (and it is also easier for the graphic designers). If these images (or other things) have different URL paths on the server, then for the browser they all look different, and have to be fetched and cached separately. If a given image is only in one (URL) place on the server, then it is retrieved and cached only once. (If such is the case and you do not want to revise all your pages, then there are things that can be done at the server side to mitigate the effects, like aliases and URL rewrite rules.)
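
If you have access to the webapp's files, a quick way to find such duplicates is to group the files by a hash of their content. A minimal sketch in Python follows, assuming the static files sit under one directory tree that you can read; find_duplicates is just an illustrative name, and the script reads each file fully into memory, which is fine for images and stylesheets but not for huge files.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root):
    """Map content hash -> sorted paths, for content stored more than once."""
    by_digest = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            by_digest[digest].append(path)
    # Keep only content that exists under two or more paths.
    return {
        digest: sorted(paths)
        for digest, paths in by_digest.items()
        if len(paths) > 1
    }
```

Each entry in the result is one object that browsers have to download and cache once per path, although the bytes are identical.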


And so on..

No kidding, I have seen cases where the graphic designer of the site thought it nice to have a different background picture for each page, à 250 KB or more per picture. That may look very nice, and be justified for some kinds of websites where graphics are a main concern, but for most business applications it is less important than latency. YMMV.

Now when you do that, and tell these designers and programmers to clean up their act, you are not going to be loved. Nobody likes to clean up. But you may be able to save 50% of your bandwidth and reclaim a significant percentage of duplicate files on your servers.


