Hi Amit,

On Fri, Dec 24, 2010 at 12:24:55PM +0530, Amit Nigam wrote:
(...)

I see nothing wrong in your configs which could justify your issues.

> Now in new stats page I noticed one thing which was not in 1.3.22 is
> LastChk, but I wonder tc1 is showing L7OK/302 in 324ms and tc2 is showing
> L7OK/302 in 104ms while currently haproxy is running on LB1 and there are
> 13 retries at TC2.

The only explanation I can see is a network connection issue. What you
describe looks like packet loss over the wire. It's possible that one of
your NICs is dying, or that the network cable or switch port is defective.
You should try to perform a file transfer between the machine showing
issues and another one from the local network to verify this hypothesis.
If you can't achieve wire speed, it's possible you're having such a problem.
Then you should first move to another switch port (generally easy), then
swap the cable with another one (possibly swap the cables between your two
LBs if they're close), then try another port on the machine.

Another possible explanation, which has become quite rare nowadays, would be
that you're using a forced 100Mbps full duplex port on your switch with a
gigabit port on your server, which would negotiate half duplex. You can
check for that with "ethtool eth0" on your LBs and TCs.

> Also can this issue be due to time differences between cluster nodes? as I
> have seen there is a time difference of around 2 minutes between physical
> machine 1 vms and physical machine 2 vms.

While it's a bad thing to have machines running at different times, I don't
see why it could cause any such issue.

Regards,
Willy
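
The duplex check mentioned above can be scripted when there are several LBs
and TCs to look at. What follows is only a minimal sketch in Python, under
stated assumptions: the function names (parse_ethtool, link_warnings,
run_ethtool) and the SAMPLE text are hypothetical, and it assumes the usual
"Speed:", "Duplex:" and "Auto-negotiation:" lines printed by "ethtool eth0".
It only flags a half duplex link; it does not detect packet loss by itself.

```python
import subprocess

# Illustrative sample only: a hand-written excerpt in the usual layout of
# `ethtool eth0` output, not captured from the machines in this thread.
SAMPLE = """\
Settings for eth0:
        Speed: 100Mb/s
        Duplex: Half
        Auto-negotiation: on
        Link detected: yes
"""


def parse_ethtool(text):
    """Pick the speed, duplex and autoneg fields out of ethtool output."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon; continuation lines without one are skipped.
        key, sep, value = line.strip().partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {
        "speed": fields.get("Speed"),
        "duplex": fields.get("Duplex"),
        "autoneg": fields.get("Auto-negotiation"),
    }


def link_warnings(info):
    """Return a list of human-readable warnings for a parsed link state."""
    warnings = []
    if (info["duplex"] or "").lower() == "half":
        warnings.append(
            "half duplex link: check for a forced-speed port on the switch"
        )
    return warnings


def run_ethtool(iface="eth0"):
    """Run `ethtool <iface>` and return its output (may require root)."""
    result = subprocess.run(
        ["ethtool", iface], capture_output=True, text=True, check=True
    )
    return result.stdout


# On a real machine (assumes ethtool is installed):
#   print(link_warnings(parse_ethtool(run_ethtool("eth0"))))
```

Running it against the sample text reports the half duplex condition; an
empty list means nothing suspicious was found in the duplex field, which
still leaves the cable, switch port and NIC checks to be done by hand.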