WT> It's not only a matter of caching the request to replay it, it is that
WT> you're simply not allowed to. I know a guy who ordered a book at a
WT> large well-known site. His order was processed twice. Maybe there is
WT> something on this site which grants itself the right to replay a user's
WT> request when a server connection suddenly closes on keep-alive timeout
WT> or count.

That's more of an issue with the site than with a (proxy-based) load
balancer - the LB would be doing the exact same thing as the client.

According to the RFC, if a connection is prematurely closed, the
client would (silently) retry the request. In our case the LB just
emulated the client's behavior towards the servers.
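
To make that concrete, here's a minimal sketch (in Python, not
anything the LB actually ran) of that client-side behavior: if a
kept-alive connection turns out to be dead when we try to use it,
reopen it and resend - but only for methods that are safe to repeat:

    import http.client

    IDEMPOTENT = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}

    def send_with_retry(host, method, path, body=None, retries=1):
        conn = http.client.HTTPConnection(host)
        for attempt in range(retries + 1):
            try:
                conn.request(method, path, body)
                return conn.getresponse()
            except (http.client.RemoteDisconnected, BrokenPipeError,
                    ConnectionResetError):
                # The server closed the kept-alive connection under us.
                conn.close()
                if method not in IDEMPOTENT or attempt == retries:
                    raise
                conn = http.client.HTTPConnection(host)

A proxy that replays POSTs the same way is exactly how you end up
with two copies of an order.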

Unfortunately for your friend, it could mean the code on the site
didn't do any duplicate order checking.  A corner case taken care of
by their support department, I guess.
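
For what it's worth, the usual fix on the application side is some
kind of idempotency token.  A rough, purely illustrative sketch (none
of these names come from the real site):

    # Each checkout form carries a one-time token; a replayed
    # submission with an already-used token returns the original
    # order instead of creating a second one.  A real shop would keep
    # this in a database rather than a dict.
    processed = {}   # token -> order id

    def place_order(token, items):
        if token in processed:
            return processed[token]      # duplicate submission: reuse result
        order_id = len(processed) + 1    # stand-in for real order creation
        processed[token] = order_id
        return order_id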

WT> So probably a reasonable balance can be found, but it is clear
WT> that from time to time a user will get an error.

That sounds like the mantra of the internet in general.  :-)

WT> Maybe your LB was regularly sending dummy requests on the connections
WT> to keep them alive, but since there is no NOP instruction in HTTP, you
WT> have to send real work anyway.

Well, the site was busy enough that it never needed the equivalent of
a NOP to keep connections open. :-) And the need for NOPs can be
avoided anyway by adjusting the idle timeouts on stale connections.

My understanding was that the load balancer actually just used a pool
of open TCP sessions, and would send the next request (from any of
its clients) down the next open TCP connection that wasn't busy. If
none were free, a new connection was established, which would
eventually time out and close naturally. I don't believe it was
pipelining the requests.

This would mean that multiple requests from clients A, B, C may go
down TCP connections X, Y, Z in a 'random' order (e.g. TCP connection
"X" may carry requests from A, B, A, A, C, B).

Sounds rather chaotic, but actually worked fine.
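
If it helps, here is roughly how I picture that pool working - a
sketch of my understanding in Python, not the LB's actual code, with
the backend host and pool size as assumed parameters:

    import http.client, queue

    class BackendPool:
        def __init__(self, host, size=3):
            self.host = host
            self.idle = queue.Queue()
            for _ in range(size):
                self.idle.put(http.client.HTTPConnection(host))

        def forward(self, method, path):
            try:
                conn = self.idle.get_nowait()   # any idle connection will do
            except queue.Empty:
                # None free: open another; it joins the pool afterwards.
                conn = http.client.HTTPConnection(self.host)
            conn.request(method, path)
            body = conn.getresponse().read()    # drain reply so conn is reusable
            self.idle.put(conn)                 # free for the next client's request
            return body

Whichever client's request arrives next simply takes whichever
connection happens to be idle, which is how you get the A, B, A, A,
C, B ordering on a single backend connection.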

>> Last time I looked into it, the Squid people had made some progress on
>> it, but hadn't gotten it to successfully proxy.

After checking, I stand corrected - it looks like Squid has a
working proxy helper application to make NTLM authentication work.

WT> Was it really just an issue with the TCP stack? Maybe there was a firewall
WT> loaded on the machine? Maybe IIS was logging connections and not requests,
WT> so that it almost stopped logging?

There were additional security measures on the machines, so yes, I
should say the stack wasn't the whole issue, but once those measures
were disabled in testing, we definitely still had better performance
than before.

WT> It depends a lot on what the server does behind it. File serving will not
WT> change, it's generally I/O bound. However if the server was CPU-bound,
WT> you might have gained something, especially if there was a firewall on
WT> the server.

CPU was our main issue - as this was quite a while ago, things have
since dramatically improved, with better offload support in drivers
and on network cards, plus a lot of profiling done by OS vendors in
their kernels with regard to network performance.  So I doubt people
would get the same level of performance increase these days that we
saw back then.

Cheers,
  Ross.



