Hi, and welcome to Squid-dev.
On Thursday 20 February 2003 20.03, Flemming Frandsen wrote:

> A) Race conditions exist in the web application (not that uncommon I
> guess) that means that having two identical requests running at the
> same time in different apache processes will either result in one
> of them blowing up or simply returning the wrong result.

Hmm. This is a new twist on an old problem. Basically a variant of the
problem that Squid may initiate multiple requests for the same URL, but
now with the added twist that you only want to limit it per user.

The general problem that Squid may request the same URL multiple times
before knowing that the result is cacheable should be addressed for
accelerator setups where the resulting content is expected to be
cacheable. I am not entirely sure about your twist, however. It smells
more of a band-aid fix in the reverse proxy to work around an inherently
broken application. For this to work it must be very carefully specified
how to identify that a request is exactly the same and should be allowed
to take over the pending result of a previously aborted identical
request, and when to keep waiting for responses to aborted requests in
the hope that the user simply retried the same request.

The exact same problem will be seen if the application is published
directly on the Internet with no reverse proxy in front of it, even for
light loads which the server can perfectly well handle on its own.

> B) When a client hits a webserver it's more or less random what
> webserver he hits, now my application does a lot of caching so the
> first time a client hits another apache process it's a much harder
> hit than if the client had hit a recently used one.

This unfortunately is a bit harder to do anything about. Squid has no
means of indicating which web server process should accept a request on
the same port. If the connections are kept alive then sure, it can be
done.
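To make the idea concrete, here is a minimal sketch (not Squid code; all
names are hypothetical) of per-user request collapsing: when two
identical in-flight requests share the same (user, URL) key, only one is
forwarded to the backend and the other waits for its result.

```python
import threading

class Collapser:
    """Hypothetical per-user request collapser. Identical concurrent
    requests (same user + same URL) share one backend fetch."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}  # (user, url) -> (Event, result holder dict)

    def fetch(self, user, url, do_request):
        key = (user, url)
        with self._lock:
            entry = self._pending.get(key)
            if entry is None:
                # No identical request in flight: this caller leads.
                entry = (threading.Event(), {})
                self._pending[key] = entry
                leader = True
            else:
                leader = False
        event, holder = entry
        if leader:
            try:
                holder["result"] = do_request(url)
            finally:
                event.set()
                with self._lock:
                    del self._pending[key]
        else:
            # Identical request already running: wait for its result.
            event.wait()
        return holder.get("result")
```

Note that a real implementation would also need the careful rules
discussed above about aborted requests and when waiting is safe.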
Such binding of server connections to clients (users or connections) is
also needed for proxying of NTLM authentication and is of interest. It
will however increase the demand on your backend servers, as more
connections will be needed between Squid and the web server, but maybe a
good balance can be found, allowing sharing of connections while keeping
good user locality per connection.

Intuitively, however, I feel this is better solved by having a better
per-user information cache in the application on the web server, and by
being persistent about which web server (if you have more than one) each
user is sent to. The latter (per-user persistent selection of a web
server from a farm of servers) can be implemented in many ways. In our
eMARA reverse proxy we have a simple weighted hashing based scheme,
using either the username or the source IP address as key, which has
proven very effective for the purpose of always sending the same user to
the same web server in a farm of web servers.

> C) When the backlog is long enough clients will get impatient and
> abort the connection, but squidie seems more than happy to keep
> serving the request (I don't quite know if this is true or the
> clients just give up when the request is being run).

Well, this actually already has a partial solution. See squid.conf
(hint: half_closed_clients). Also related to 'A', I think.

> D) Almost 100% of the content on the site is dynamically generated,
> the only static bits are css files and a tiny bit of graphics on
> very few pages, so very few different requests will be cache hits,
> so all this writing everything to disk business seems a little
> wasted.

Objects which are not cacheable should not be written to disk. What
makes you think they are? I have seen no evidence that they are in all
my hacking on Squid.
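The per-user weighted-hash server selection mentioned above (the eMARA
scheme) could look roughly like this sketch. It is an illustration
under assumed server names and weights, not the actual eMARA code: the
key (username or source IP) hashes to a stable integer, which is mapped
into weighted buckets so the same user always lands on the same server.

```python
import hashlib

def pick_server(key, servers):
    """servers: list of (name, weight) pairs. Deterministic pick:
    the same key always maps to the same server."""
    total = sum(weight for _, weight in servers)
    # Hash the key to a stable integer, then map it into the
    # weighted buckets, one bucket range per server.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16) % total
    for name, weight in servers:
        if h < weight:
            return name
        h -= weight
    return servers[-1][0]  # defensive; not reached

# Example farm (hypothetical names/weights):
# farm = [("web1", 2), ("web2", 1), ("web3", 1)]
# pick_server("alice", farm) returns the same server on every call.
```

A scheme like this only breaks user locality when the server list or
weights change, which is the usual trade-off of hash-based selection.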
> B) When users are identified by their session id it's relatively
> easy to maintain a list of the 5-10 latest server processes that
> the client has talked to (this calls for the server connections to
> be kept alive, but squid already does this, right?). The number of
> open server connections will need to be limited, I haven't found
> that option anywhere.

Yes, Squid keeps connections persistent where possible within HTTP/1.0
+ keep-alive. By default all open server connections act as a pool from
which requests can be forwarded to that server, and only if there is no
idle connection does Squid open a new one.

Regards
Henrik
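The pool behaviour described above (reuse an idle kept-alive connection
if one exists, otherwise open a new one) can be sketched as follows;
this is an illustrative model, not Squid's actual data structures:

```python
class ConnPool:
    """Hypothetical model of a persistent-connection pool per server."""

    def __init__(self, open_conn):
        self._idle = {}         # server -> list of idle connections
        self._open = open_conn  # callable that opens a new connection

    def acquire(self, server):
        idle = self._idle.get(server, [])
        if idle:
            return idle.pop()       # reuse a kept-alive connection
        return self._open(server)   # no idle connection: open a new one

    def release(self, server, conn):
        # Keep-alive: return the connection to the pool for reuse
        # by the next request to the same server.
        self._idle.setdefault(server, []).append(conn)
```

Limiting the number of open server connections would then be a matter
of capping how many connections acquire() may have outstanding per
server.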