Hello Jay,

On Mar 19, 2013, at 2:09 , Jay Oster <j...@kodewerx.org> wrote:

> Hi again!
> 
> On Sun, Mar 17, 2013 at 2:17 AM, Jason Oster <j...@kodewerx.org> wrote:
> Hello Andrew,
> 
> On Mar 16, 2013, at 8:05 AM, Andrew Alexeev <and...@nginx.com> wrote:
>> Jay,
>> 
>> You mean you keep seeing SYN-ACK loss through loopback?
> 
> That appears to be the case, yes. I've captured packets with tcpdump, and 
> load them into Wireshark for easier visualization. I can see a very clear gap 
> where no packets are transmitting for over 500ms, then a burst of ~10 SYN 
> packets. When I look at the TCP stream flow on these SYN bursts, it shows an 
> initial SYN packet almost exactly 1 second earlier without a corresponding 
> SYN-ACK. I'm taking the 1-second delay to be the RTO. I can provide some 
> pieces of the tcpdump capture log on Monday, to help illustrate.
> 
> I double-checked, and the packet loss does *not* occur on loopback interface. 
> It does occur when hitting the network with a machine's own external IP 
> address, however. This is within Amazon's datacenter; the packets bounce 
> through their firewall before returning to the VM.

If I understand you correctly, the issue can be reproduced in the following cases:

1) client and server are on different EC2 instances, public IPs are used;
2) client and server are on different EC2 instances, private IPs are used;
3) client and server are on a single EC2 instance, public IP is used.

And there are no problems when:

1) client and server are on a single EC2 instance, either loopback or private 
IP is used.

Please correct me if I'm wrong.
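
If it helps to compare these cases side by side, here is a minimal sketch in
Python that times TCP connects to a given address and picks out the ones that
took about a second or more, which is what a lost SYN followed by a retransmit
would look like from the client side. The function names and the 0.9-second
threshold are my own assumptions for illustration (the threshold is taken from
the roughly 1-second delay you describe); adjust them as needed.

```python
import socket
import time


def connect_times(host, port, attempts=100, timeout=5.0):
    """Return the time in seconds taken by each TCP connect to host:port."""
    samples = []
    for _ in range(attempts):
        start = time.monotonic()
        # create_connection() returns once the TCP handshake has completed,
        # so the elapsed time includes any SYN retransmission delay.
        with socket.create_connection((host, port), timeout=timeout):
            samples.append(time.monotonic() - start)
    return samples


def slow_connects(samples, threshold=0.9):
    """Return the samples at or above the threshold (assumed ~1 s SYN retry)."""
    return [s for s in samples if s >= threshold]
```

Running connect_times() once per case from the lists above (loopback, private
IP, public IP) and comparing the output of slow_connects() should show which
paths are affected, without needing a full packet capture for each one.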

What about the EC2 security group: do the client and the server use the same
group? How many rules are in this group? Have you tried either reducing the
number of rules, or creating a simple "pass any to any" configuration?

And just to clarify: by "external IP address", do you mean the EC2 instance's
public IP, or an Elastic IP?


>  
>> That might sound funny, but what's the OS and the overall environment of 
>> that strangely behaving machine with nginx? Is it a virtualized one? Is the 
>> other machine any different? The more details you can provide, the better :)
> 
> It's a 64-bit Ubuntu 12.04 VM, running on an AWS m3.xlarge. Both VMs are 
> configured the same.
> 
>> Can you try the same tests on the other machine, where you originally didn't 
>> have any problems with your application? That is, can you repeat nginx+app 
>> on the other machine and see if the above strange behavior persists?
> 
> Same configuration. I'm investigating this issue because it is common across 
> literally dozens of servers we have running in AWS. It occurs in all regions, 
> and on all instance types. This "single server" test is the first time the 
> software has been run with nginx load balancing to upstream processes on the 
> same machine.
> 
> Here is some additional information in the form of screenshots from Wireshark!
> 
> 10.245.2.254 is the VM's eth0 address. 50.112.82.196 is the VM's external IP, 
> as assigned by Amazon. All of these packets are being routed through Amazon's 
> firewall.
> 
> This first screenshot shows the "gap" that ends with a SYN burst. This was 
> all captured during a single run of AB.
> 
> 
> <Screen Shot 2013-03-18 at 11.58.49 AM.png>
> 
> The gap is about 500ms where the server is idle. :(
> 
> If I use "follow TCP stream" on the highlighted packet, I get this:
> 
> <Screen Shot 2013-03-18 at 11.59.18 AM.png>
> 
> The initial SYN packet was sent almost exactly 1 second prior, and a SYN-ACK 
> was not received for it.
> _______________________________________________
> nginx mailing list
> nginx@nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx

