Hi, and welcome to Squid-dev.
On Thursday 20 February 2003 20.03, Flemming Frandsen wrote:

> A) Race conditions exist in the web application (not that uncommon I
> guess) that means that having two identical requests running at the
> same time in different apache processes will either result in one
> of them blowing up or simply returning the wrong result.

Hmm. This is a new twist on an old problem. Basically a variant of the
problem that Squid may initiate multiple requests for the same URL, but
now with the added twist that you only want to limit it per user.

The general problem that Squid may request the same URL multiple times
before knowing that the result is cacheable should be addressed for
accelerator setups where the resulting content is expected to be
cacheable. I am not entirely sure about your twist, however. It smells
more of a band-aid fix in the reverse proxy to work around an inherently
broken application. For this to work it must be very carefully specified
how to identify that a request is exactly the same and should be allowed
to take over the pending result of a previously aborted identical
request, and when to keep waiting for responses to aborted requests in
the hope that the user simply retried the same request.

The exact same problem will be seen if the application is published
directly on the Internet with no reverse proxy in front of it, even for
light loads which the server can perfectly well handle on its own.

> B) When a client hits a webserver it's more or less random what
> webserver he hits, now my application does a lot of caching so the
> first time a client hits another apache process it's a much harder
> hit than if the client had hit a recently used one.

This unfortunately is a bit harder to do anything about. Squid has no
means of indicating which web server process should accept a request on
the same port. If the connections are kept alive then sure, it can be
done.
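To make the idea concrete, here is a minimal sketch (not Squid code; all
names are hypothetical) of per-user request collapsing: when two
identical in-flight requests share the same (user, URL) key, only one is
forwarded to the backend and the other waits for its result.

```python
import threading

class Collapser:
    """Hypothetical per-user request collapser. Identical concurrent
    requests (same user + same URL) share one backend fetch."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}  # (user, url) -> (Event, result holder dict)

    def fetch(self, user, url, do_request):
        key = (user, url)
        with self._lock:
            entry = self._pending.get(key)
            if entry is None:
                # No identical request in flight: this caller leads.
                entry = (threading.Event(), {})
                self._pending[key] = entry
                leader = True
            else:
                leader = False
        event, holder = entry
        if leader:
            try:
                holder["result"] = do_request(url)
            finally:
                event.set()
                with self._lock:
                    del self._pending[key]
        else:
            # Identical request already running: wait for its result.
            event.wait()
        return holder.get("result")
```

Note that a real implementation would also need the careful rules
discussed above about aborted requests and when waiting is safe.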
Such binding of server connections to clients (users or connections) is
also needed for proxying of NTLM authentication and is of interest. It
will however increase the demand on your backend servers, as more
connections will be needed between Squid and the web server, but maybe a
good balance can be found, allowing sharing of connections while keeping
good user locality per connection.

Intuitively, however, I feel this is better solved by having a better
per-user information cache in the application on the web server, and by
being persistent about which web server (if you have more than one) each
user is sent to. The latter (per-user persistent selection of a web
server from a farm of servers) can be implemented in many ways. In our
eMARA reverse proxy we have a simple weighted hashing based scheme,
using either the username or the source IP address as key, which has
proven very effective for the purpose of always sending the same user to
the same web server in a farm of web servers.

> C) When the backlog is long enough clients will get impatient and
> abort the connection, but squidie seems more than happy to keep
> serving the request (I don't quite know if this is true or the
> clients just give up when the request is being run).

Well, this actually already has a partial solution. See squid.conf
(hint: half_closed_clients). Also related to 'A', I think.

> D) Almost 100% of the content on the site is dynamically generated,
> the only static bits are css files and a tiny bit of graphics on
> very few pages, so very few different requests will be cache hits,
> so all this writing everything to disk business seems a little
> wasted.

Objects which are not cacheable should not be written to disk. What
makes you think they are? I have seen no evidence that they are in all
my hacking on Squid.
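The per-user weighted-hash server selection mentioned above (the eMARA
scheme) could look roughly like this sketch. It is an illustration
under assumed server names and weights, not the actual eMARA code: the
key (username or source IP) hashes to a stable integer, which is mapped
into weighted buckets so the same user always lands on the same server.

```python
import hashlib

def pick_server(key, servers):
    """servers: list of (name, weight) pairs. Deterministic pick:
    the same key always maps to the same server."""
    total = sum(weight for _, weight in servers)
    # Hash the key to a stable integer, then map it into the
    # weighted buckets, one bucket range per server.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16) % total
    for name, weight in servers:
        if h < weight:
            return name
        h -= weight
    return servers[-1][0]  # defensive; not reached

# Example farm (hypothetical names/weights):
# farm = [("web1", 2), ("web2", 1), ("web3", 1)]
# pick_server("alice", farm) returns the same server on every call.
```

A scheme like this only breaks user locality when the server list or
weights change, which is the usual trade-off of hash-based selection.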
> B) When users are identified by their session id it's relatively
> easy to maintain a list of the 5-10 latest server processes that
> the client has talked to (this calls for the server connections to
> be kept alive, but squid already does this, right?). The number of
> open server connections will need to be limited, I haven't found
> that option anywhere.

Yes, Squid keeps connections persistent where possible within HTTP/1.0
+ keep-alive. By default all open server connections act as a pool from
which requests can be forwarded to that server, and only if there is no
idle connection does Squid open a new one.

Regards
Henrik
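The pool behaviour described above (reuse an idle kept-alive connection
if one exists, otherwise open a new one) can be sketched as follows;
this is an illustrative model, not Squid's actual data structures:

```python
class ConnPool:
    """Hypothetical model of a persistent-connection pool per server."""

    def __init__(self, open_conn):
        self._idle = {}         # server -> list of idle connections
        self._open = open_conn  # callable that opens a new connection

    def acquire(self, server):
        idle = self._idle.get(server, [])
        if idle:
            return idle.pop()       # reuse a kept-alive connection
        return self._open(server)   # no idle connection: open a new one

    def release(self, server, conn):
        # Keep-alive: return the connection to the pool for reuse
        # by the next request to the same server.
        self._idle.setdefault(server, []).append(conn)
```

Limiting the number of open server connections would then be a matter
of capping how many connections acquire() may have outstanding per
server.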