WT> It's not only a matter of caching the request to replay it, it is that
WT> you're simply not allowed to. I know a guy who ordered a book at a
WT> large well-known site. His order was processed twice. Maybe there is
WT> something on this site which grants itself the right to replay a user's
WT> request when a server connection suddenly closes on keep-alive timeout
WT> or count.
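For reference, the client behaviour at issue here is the one the HTTP/1.1 RFC describes for persistent connections: if the server closes the connection before a response arrives, the client may transparently reopen a connection and retry - but non-idempotent requests (like an order) must not be retried automatically, which is exactly how a double order can happen. A minimal sketch of that retry loop (all names are mine, not from any real client; `send_request` stands in for a real transport):

```python
class ConnectionClosed(Exception):
    """Raised when the server closes a kept-alive connection mid-request."""

def fetch_with_retry(send_request, request, max_retries=1):
    """Send `request`; on a premature close, replay it once on a fresh
    connection. Safe only for idempotent requests - replaying a POST that
    the server already processed is how an order gets placed twice."""
    attempts = 0
    while True:
        try:
            return send_request(request)
        except ConnectionClosed:
            if attempts >= max_retries:
                raise
            attempts += 1  # open a new connection and replay the request

# Demo: the first attempt hits a connection the server just timed out;
# the retry succeeds transparently, as the client never shows the error.
calls = []
def flaky_send(req):
    calls.append(req)
    if len(calls) == 1:
        raise ConnectionClosed()  # keep-alive timeout hit mid-request
    return "200 OK"

print(fetch_with_retry(flaky_send, "GET /book"))  # -> 200 OK (two attempts)
```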
That's more of an issue with the site than a (proxy-based) load balancer -
the LB would be doing the exact same thing as the client. According to the
RFC, if a connection is prematurely closed, then the client would (silently)
retry the request. In our case the LB just emulated the client's behaviour
towards the servers. Unfortunately for your friend, it could mean the code
on the site didn't do any duplicate-order checking. A corner case taken
care of by their support department, I guess.

WT> So probably that a reasonable balance can be found but it is
WT> clear that from time to time a user will get an error.

That sounds like the mantra of the internet in general. :-)

WT> Maybe your LB was regularly sending dummy requests on the connections
WT> to keep them alive, but since there is no NOP instruction in HTTP, you
WT> have to send real work anyway.

Well, the site was busy enough that it didn't need the equivalent of a NOP
to keep connections open. :-) And the need for NOPs can be mitigated by
adjusting the timeouts on stale connections.

My understanding was that the load balancer just used a pool of open TCP
sessions, and would send the next request (from any of its clients) down
the next open TCP connection that wasn't busy. If none were free, a new
connection was established, which would eventually time out and close
naturally. I don't believe it was pipelining the requests.

This means that multiple requests from clients A, B and C may go down TCP
connections X, Y and Z in a 'random' order (e.g. TCP connection "X" may
carry requests from A, B, A, A, C, B). Sounds rather chaotic, but it
actually worked fine.

>> Last time I looked into it, the squid people had made some progress into
>> it, but hadn't gotten it to successfully proxy.

After checking, I stand corrected - it looks like Squid has a working
proxy helper application to make NTLM authentication work.

WT> Was it really just an issue with the TCP stack ? maybe there was a firewall
WT> loaded on the machine ? Maybe IIS was logging connections and not requests,
WT> so that it almost stopped logging ?

There were additional security measures on the machines, so no, the stack
wasn't the whole issue; but even once those were disabled in testing, we
definitely still saw better performance than before.

WT> It depends a lot on what the server does behind. File serving will not
WT> change, it's generally I/O bound. However if the server was CPU-bound,
WT> you might have won something, especially if there was a firewall on
WT> the server.

CPU was our main issue. As this was quite a while ago, things have since
improved dramatically, with better offload support in drivers and on
network cards, plus much profiling done by OS vendors in their kernels
with regard to network performance. So I doubt people would see the same
level of performance increase these days that we saw back then.

Cheers,
Ross.
--
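PS: the connection-reuse scheme described above - a pool of open backend
TCP connections, with each incoming request sent down the first idle one
and a new connection opened only when none are free, no pipelining - might
be sketched roughly like this (all class and method names are hypothetical,
not taken from any real load balancer):

```python
class BackendConnection:
    def __init__(self, conn_id):
        self.conn_id = conn_id
        self.busy = False   # held for the whole request/response exchange

    def exchange(self, client_id, request):
        self.busy = True
        try:
            # Stand-in for writing the request and reading the response.
            return (self.conn_id, client_id, request)
        finally:
            self.busy = False  # connection goes back to the idle pool

class ConnectionPool:
    def __init__(self):
        self.conns = []

    def dispatch(self, client_id, request):
        # Requests from clients A, B, C land on whichever connections
        # happen to be idle, so one connection ("X") may end up carrying
        # requests from A, B, A, A, C, B in arrival order.
        conn = next((c for c in self.conns if not c.busy), None)
        if conn is None:
            conn = BackendConnection("conn-%d" % len(self.conns))
            self.conns.append(conn)
        return conn.exchange(client_id, request)
```

Since each exchange frees the connection before the next dispatch, a quiet
pool serves everyone over one connection; under real concurrent load, busy
connections stay marked busy and the pool grows until requests stop
queueing, then idle connections time out and close naturally.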