Hi Amit,

A few final observations:

o You should be seeing TCP segments whose sequence numbers track the
IP packets, particularly when the client sends 2 TCP packets, which
become three IP packets on the wire. Look for signs of retransmission,
which in TCP show up as duplicate ACKs where the acknowledgment number
doesn't advance (IOW the closest thing TCP has to NAKs)

o Compare router configurations in your lab and in the field,
particularly if this is a service provided using a virtual private
network

o Look for indications of mismanaged TCP flow control (window size)
and congestion control
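
If it helps with the capture comparison, here is a minimal Python sketch
of the bookkeeping. It assumes you export each payload-carrying segment
from Wireshark as a (relative sequence number, length) pair; the function
names and the sample numbers (taken from your 969 + 454 byte example) are
only illustrative, and sequence-number wraparound is ignored.

```python
def find_retransmissions(segments):
    """Return (seq, length) pairs that appear more than once in one capture."""
    seen = set()
    repeats = []
    for seq, length in segments:
        if length == 0:
            continue  # pure ACKs carry no payload, so skip them
        key = (seq, length)
        if key in seen:
            repeats.append(key)
        seen.add(key)
    return repeats


def covered_ranges(segments):
    """Merge (seq, length) pairs into sorted, non-overlapping [start, end) ranges."""
    ranges = sorted((seq, seq + length) for seq, length in segments if length > 0)
    merged = []
    for start, end in ranges:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def missing_ranges(sent, received):
    """Return byte ranges the sender transmitted that the receiver never saw."""
    got = covered_ranges(received)
    gaps = []
    for start, end in covered_ranges(sent):
        cursor = start
        for g_start, g_end in got:
            if g_end <= cursor or g_start >= end:
                continue  # this received range does not overlap what is left
            if g_start > cursor:
                gaps.append((cursor, g_start))
            cursor = max(cursor, g_end)
        if cursor < end:
            gaps.append((cursor, end))
    return gaps


# Assumed sample data: client sends 969 + 454 bytes, server sees 969 + 363.
client = [(1, 969), (970, 454)]
server_bad = [(1, 969), (970, 363)]
server_good = [(1, 969), (970, 363), (1333, 91)]

print(missing_ranges(client, server_bad))   # the 91 bytes that never arrived
print(missing_ranges(client, server_good))  # nothing missing
```

Running find_retransmissions over the client-side capture would then tell
you whether the client ever resent the bytes reported as missing.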

Cheers,
jec

On Dec 18, 12:52 am, Amit Kasher <amitkas...@gmail.com> wrote:
> I really appreciate your attempts to help.
> In reply to your last post, jec:
> * We can't reproduce this in the lab...
> * We see a combination of FIN, FIN-ACK and RST.
> * We haven't seen any suspicious traceroutes... nothing
> differentiating suffering clients from non-suffering clients.
> * We don't do anything "special" - these are normal GWT service call
> requests from browser to server.
> * We tried that, as well as different MTU sizes... no clue
> * This occurs without any SSL involved, and regardless of which
> browser is being used (IE, FF).
>
> Unfortunately we gave up on the persistent attempt to get to the root
> of that issue. We now just assume that it's some low level network
> issue (level 1-2) that causes some of the packets not to arrive, in an
> unexplained combination with a higher level network issue (level 3-6)
> that causes the packet's data to split at exactly 80%.
>
> In order to deal with this situation, we implemented a high level
> (GWT) configurable retry mechanism, with timeout support. This
> resolves the symptoms, and in effect solves the problem.
>
> We don't mind contributing this mechanism (both client and server
> code), if someone is interested or believes GWT needs this kind of
> mechanism.
>
> Thanks again,
> Amit
>
> On Dec 5, 8:09 pm, jchimene <jchim...@gmail.com> wrote:
>
> > Hi Amit,
>
> > You don't make this easy, do you...
>
> > o     Just to be clear: goodness happens when the client sends 2 TCP
> > packets; which become three IP packets on the wire; which are
> > reassembled by the server into 2 TCP packets.
> > Badness happens when the client sends 2 TCP packets; which become
> > three IP packets on the wire; which are reassembled into one
> > complete TCP packet and 1 incomplete TCP packet.
> > Can you reproduce this in your lab? I'm guessing "no", otherwise
> > you would not have deployed the app...
>
> > o     Do you see a NAK at the client after the dropped fragment?
>
> > o     Pls. try traceroute from your lab and from the client box. What
> > are the differences?
>
> > o     It's now appearing to be an IP issue. The fact that the
> > fragmentation doesn't occur on the larger packet is interesting.
>
> > o     The two separate TCP packets lead to an assumption that you can
> > identify requests from the same client box at the server. IOW, you
> > have an application-level protocol that lets you reassemble the two
> > packets into a single request. I'm sure this is the case, but such a
> > design isn't explicitly stated in your message. Your server
> > application never sees the 2 -> 3 split, since the normal case is
> > that your server app only sees 2 packets from the client. I'm
> > reluctant to say this, but part of this process may require proof
> > that the protocol design is resilient to network transmission errors.
>
> > o     I'd start playing around w/ different packet sizes and
> > transmission rates (via ping) to see if you can trip any triggers. It
> > may be a combination of buffering/congestion between the client and
> > the server. Did you try ping w/ different packet sizes? I realize
> > that you have different servers. Does the connection between the
> > client and server occur over the public switched network or does it
> > use a private circuit?
>
> > o     There have been posts in this thread w/r/t/ SSL and IE. Are they
> > relevant?
>
> > Cheers,
> > jec
>
> > On Dec 5, 1:21 am, Amit Kasher <amitkas...@gmail.com> wrote:
>
> > > Hi,
> > > We have spent the past 2 days working on this, and have some new
> > > findings.
>
> > > We have made contact with one of our customers who is encountering this
> > > issue more frequently than others, and he granted us access to his
> > > computer (using logmein). We installed WireShark on his computer, as
> > > well as on the server. We managed to reproduce the problem with both
> > > sniffers in action, and analyze the exact correlating TCP segments
> > > according to their sequence and ack numbers. Here are the results.
>
> > > This is what happens in the valid state:
> > > The client sends 2 TCP segments for a GWT service call, which are
> > > supposed to be reassembled to a single PDU which is the entire single
> > > HTTP request. The first segment always contains the HTTP request
> > > header, and the second TCP segment always contains the HTTP request
> > > body. For instance, we see that the client sends a first segment of
> > > size 969 bytes, and a second segment of size 454 bytes. In the server
> > > we see that these 2 segments become 3 segments. The first is still 969
> > > bytes and contains the HTTP request header; the second is 363 bytes
> > > (80% of the original second segment), and the third is the remaining
> > > 91 bytes (20% of the original 454 bytes).
>
> > > In the invalid state, when the problem occurs, the third segment
> > > simply does not arrive in the server. It seems that something in the
> > > way has split the second 454 bytes segment to 2 segments, and only
> > > sent the first one to the server.
>
> > > 1. If this is something in the client's machine, how come we don't see
> > > it in the sniffer? (we even tried removing all firewall/antivirus
> > > software, reinstalling the network card driver)
> > > 2. If this is not something in the client's machine, how come some
> > > clients encounter this much more than others, who never encounter
> > > it?
>
> > > Can it be some kind of network equipment that some of our clients
> > > (reminder - different ISPs) go through, and others don't?
>
> > > Unfortunately, this new info still leaves us clueless...
>
> > > On Dec 3, 5:16 pm, jchimene <jchim...@gmail.com> wrote:
>
> > > > On Dec 2, 11:20 pm, Amit Kasher <amitkas...@gmail.com> wrote:
>
> > > > > Hi and thanks again for your responses.
>
> > > > No Prob.
>
> > > > If this "opportunity for excellence" is as pervasive as you suspect,
> > > > installing software on a client's computer should be a non-starter
> > > > from the perspective that installing it on *any* computer *anywhere on
> > > > the planet* should reliably reproduce the issue. You say that tcpdump
> > > > shows the packet truncation, so I'm not sure I understand the
> > > > requirement to install something on a client machine. My goal in these
> > > > past responses has been to absolutely prove that it's the
> > > > serialization code (by factoring out the serialization code using
> > > > ping), not something peculiar to the transport or session layers.
>
> > > > Are you using the public switched network to provide client/server
> > > > connectivity? If not, nothing you've said so far would eliminate your
> > > > network transport service.
>
> > > > I find it hard to believe it's GWT, as the cargo size is so small as
> > > > to be insignificant, and others would have reported this issue by now.
> > > > I have to admit that I'm not a user of Java serialization, so there
> > > > may have been reports of these serialization issues of which I'm
> > > > blissfully unaware. From everything you're saying, it really looks
> > > > like the problem is in user-space. It may be a certain code path that
> > > > leads to the same serialization invocation logic. I'd start pulling
> > > > this code apart, instrumenting the hell out of it and running it
> > > > through JUnit or some such automated testing environment. Again, I
> > > > understand you've probably done this...
>
> > > > I'm wondering if there's a specific byte-pattern that's causing this.
> > > > Have you tried reordering the structure members? Also, have you
> > > > eliminated buffer corruption issues? Since it's cross-browser, what
> > > > does the -pretty flag + Firebug reveal? Esp. when profiling the code?
> > > > (Although I must admit that you've probably tried all that type of
> > > > debugging by now).
>
> > > > Good luck,
> > > > jec
>
> > > > > A few more subtle observations and insights:
> > > > > 1. It's probably not the server. There are several reasons that lead
> > > > > us to believe that the server is not the cause of this issue: (a) We
> > > > > switched hosting providers. (b) These providers reside in completely
> > > > > different geographical locations - countries. (c) We have always been
> > > > > using JBoss on CentOS, but this issue occurs both when we work with
> > > > > Apache as a front end using mod_jk to tomcat, as well as when
> > > > > eliminating this tier and having clients go directly to tomcat - using
> > > > > it as an HTTP server. (d) tcpdump sniffer explicitly shows that the
> > > > > server receives ALWAYS EXACTLY 80% of the request payload. Unless this
> > > > > is something even lower level in that machine (the VPS software used -
> > > > > virtuozzo, the network card/driver, etc.), these observations pretty
> > > > > much provide an alibi for the server... I think we'd better focus on
> > > > > other places.
> > > > > 2. There are indications that this is not inside the browser as well:
> > > > > (a) It happens in several GWT versions. (b) It happens "to" all
> > > > > browsers, which provides a strong clue, since this code is completely
> > > > > different from browser to browser - GWT uses MsXMLHTTP activeX in IE,
> > > > > while using completely other objects in other browsers. Since this is
> > > > > the underlying mechanism used to perform RPC, it seems that if it
> > > > > happens for more than one of them, chances are low that it is the
> > > > > cause.
> > > > > Still it seems that this MUST be the GWT/client code, since these
> > > > > clients, to whom this issue occurs much more often, don't have
> > > > > problems in any other websites (we managed to talk to several of
> > > > > them).
> > > > > One thing that comes to mind is perhaps the GWT serialization code? I
> > > > > don't know...
>
> > > > > Therefore, currently, aside from the possibility that there's a bug in
> > > > > the GWT serialization code, there's also the possibility that it's
> > > > > something in the network, even though these clients are from various
> > > > > ISPs,
>
> ...