Hello Jay,

On Mar 19, 2013, at 2:09, Jay Oster <j...@kodewerx.org> wrote:
> Hi again!
>
> On Sun, Mar 17, 2013 at 2:17 AM, Jason Oster <j...@kodewerx.org> wrote:
> Hello Andrew,
>
> On Mar 16, 2013, at 8:05 AM, Andrew Alexeev <and...@nginx.com> wrote:
>> Jay,
>>
>> You mean you keep seeing SYN-ACK loss through loopback?
>
> That appears to be the case, yes. I've captured packets with tcpdump, and
> load them into Wireshark for easier visualization. I can see a very clear gap
> where no packets are transmitting for over 500ms, then a burst of ~10 SYN
> packets. When I look at the TCP stream flow on these SYN bursts, it shows an
> initial SYN packet almost exactly 1 second earlier without a corresponding
> SYN-ACK. I'm taking the 1-second delay to be the RTO. I can provide some
> pieces of the tcpdump capture log on Monday, to help illustrate.
>
> I double-checked, and the packet loss does *not* occur on loopback interface.
> It does occur when hitting the network with a machine's own external IP
> address, however. This is within Amazon's datacenter; the packets bounce
> through their firewall before returning to the VM.

If I understand you correctly, the issue can be reproduced in the following
cases:

1) client and server are on different EC2 instances, public IPs are used;
2) client and server are on different EC2 instances, private IPs are used;
3) client and server are on a single EC2 instance, public IP is used.

And there are no problems when:

1) client and server are on a single EC2 instance, either loopback or private
   IP is used.

Please correct me if I'm wrong.

What about the EC2 security group: do the client and the server use the same
group? How many rules are present in this group? Have you tried either
reducing the number of rules, or creating a simple "pass any to any"
configuration?

And just to clarify: by "external IP address", do you mean the EC2 instance's
public IP, or maybe an Elastic IP?

>
>> That might sound funny, but what's the OS and the overall environment of
>> that strangely behaving machine with nginx? Is it a virtualized one? Is the
>> other machine any different? The more details you can provide, the better :)
>
> It's a 64-bit Ubuntu 12.04 VM, running on an AWS m3.xlarge. Both VMs are
> configured the same.
>
>> Can you try the same tests on the other machine, where you originally didn't
>> have any problems with your application? That is, can you repeat nginx+app
>> on the other machine and see if the above strange behavior persists?
>
> Same configuration. I'm investigating this issue because it is common across
> literally dozens of servers we have running in AWS. It occurs in all regions,
> and on all instance types. This "single server" test is the first time the
> software has been run with nginx load balancing to upstream processes on the
> same machine.
>
> Here is some additional information in the form of screenshots from Wireshark!
>
> 10.245.2.254 is the VM's eth0 address. 50.112.82.196 is the VM's external IP,
> as assigned by Amazon. All of these packets are being routed through Amazon's
> firewall.
>
> This first screenshot shows the "gap" that ends with a SYN burst. This was
> all captured during a single run of AB.
>
> <Screen Shot 2013-03-18 at 11.58.49 AM.png>
>
> The gap is about 500ms where the server is idle. :(
>
> If I use "follow TCP stream" on the highlighted packet, I get this:
>
> <Screen Shot 2013-03-18 at 11.59.18 AM.png>
>
> The initial SYN packet was sent almost exactly 1 second prior, and a SYN-ACK
> was not received for it.
> _______________________________________________
> nginx mailing list
> nginx@nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx
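
To put a number on how often a SYN goes unanswered and is retransmitted, the
check described above (an earlier SYN on the same stream with no SYN-ACK in
between) could be sketched roughly as below. This is only an illustration,
not a tested tool: the Packet record and the find_retransmitted_syns function
are hypothetical names, and it assumes the capture has already been exported
into per-packet fields by whatever means is convenient. It also assumes that a
SYN-ACK appears in the capture with the address/port pair of its SYN reversed.

```python
from typing import Iterable, List, NamedTuple, Tuple

class Packet(NamedTuple):
    """One captured TCP packet (hypothetical record layout, not a tcpdump format)."""
    time: float   # capture timestamp, seconds
    src: str
    sport: int
    dst: str
    dport: int
    syn: bool
    ack: bool

Flow = Tuple[str, int, str, int]

def find_retransmitted_syns(packets: Iterable[Packet]) -> List[Tuple[Flow, float, float]]:
    """Return (flow, first_syn_time, delay) for every SYN that is repeated
    on the same flow before any SYN-ACK has been seen for it.
    The delay is measured from the first unanswered SYN."""
    pending = {}       # flow -> time of the first unanswered SYN
    retransmits = []
    for p in sorted(packets, key=lambda pkt: pkt.time):
        if p.syn and not p.ack:
            flow = (p.src, p.sport, p.dst, p.dport)
            if flow in pending:
                retransmits.append((flow, pending[flow], p.time - pending[flow]))
            else:
                pending[flow] = p.time
        elif p.syn and p.ack:
            # A SYN-ACK answers the SYN of the reversed flow.
            pending.pop((p.dst, p.dport, p.src, p.sport), None)
    return retransmits

# Made-up sample: one connection answered at once, one whose SYN is repeated.
sample = [
    Packet(0.00, "10.245.2.254", 40001, "50.112.82.196", 80, True, False),
    Packet(0.01, "50.112.82.196", 80, "10.245.2.254", 40001, True, True),
    Packet(0.25, "10.245.2.254", 40002, "50.112.82.196", 80, True, False),
    Packet(1.25, "10.245.2.254", 40002, "50.112.82.196", 80, True, False),
]

for flow, first_seen, delay in find_retransmitted_syns(sample):
    print(f"{flow}: SYN at {first_seen:.3f}s repeated after {delay:.3f}s")
```

If most of the reported delays cluster around 1 second, that would be
consistent with the initial SYN being dropped and resent after the
retransmission timeout, as you suspect.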