Re: Client did not send nnn bytes as expected
I really appreciate your attempts to help. In reply to your last post, jec:

* We can't reproduce this in the lab.
* We see a combination of FIN, FIN-ACK and RST.
* We haven't seen any suspicious traceroutes; nothing differentiates suffering clients from non-suffering clients.
* We don't do anything "special" - these are normal GWT service call requests from browser to server.
* We tried that, as well as different MTU sizes, and found no clue.
* This occurs without any SSL involved, and regardless of which browser is used (IE, FF).

Unfortunately, we gave up on the persistent attempt to get to the root of this issue. We now just assume that some low-level network issue (layers 1-2) causes some of the packets not to arrive, in an unexplained combination with a higher-level network issue (layers 3-6) that causes the packet's data to be split at exactly 80%.

To deal with this situation, we implemented a high-level (GWT) configurable retry mechanism with timeout support. This resolves the symptoms and, in effect, solves the problem. We don't mind contributing this mechanism (both client and server code) if someone is interested or believes GWT needs this kind of mechanism.

Thanks again,
Amit

On Dec 5, 8:09 pm, jchimene wrote:
> Hi Amit,
>
> You don't make this easy, do you...
>
> o Just to be clear: goodness happens when the client sends 2 TCP packets, which become three IP packets on the wire, which are reassembled by the server into 2 TCP packets.
>   Badness happens when the client sends 2 TCP packets, which become three IP packets on the wire, which are reassembled into one complete TCP packet and 1 incomplete TCP packet.
>   Can you reproduce this in your lab? I'm guessing "no", otherwise you would not have deployed the app...
>
> o Do you see a NAK at the client after the dropped fragment?
>
> o Pls. try traceroute from your lab and from the client box. What are the differences?
>
> o It's now appearing to be an IP issue. The fact that the fragmentation doesn't occur on the larger packet is interesting.
>
> o The two separate TCP packets lead to an assumption that you can identify requests from the same client box at the server. IOW, you have an application-level protocol that lets you reassemble the two packets into a single request. I'm sure this is the case, but such a design isn't explicitly stated in your message. Your server application never sees the 2 -> 3 split, since the normal case is that your server app only sees 2 packets from the client. I'm reluctant to say this, but part of this process may require proof that the protocol design is resilient to network transmission errors.
>
> o I'd start playing around w/ different packet sizes and transmission rates (via ping) to see if you can trip any triggers. It may be a combination of buffering/congestion between the client and the server.
>   Did you try ping w/ different packet sizes? I realize that you have different servers. Does the connection between the client and server occur over the public switched network or does it use a private circuit?
>
> o There have been posts in this thread w/r/t SSL and IE. Are they relevant?
>
> Cheers,
> jec
>
> On Dec 5, 1:21 am, Amit Kasher wrote:
> > Hi,
> > We have spent the past 2 days working on this, and have some new findings.
> > [...]
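A note for readers who want to try the workaround described at the top of this message: the mechanism itself is not shown in this thread, so the following is only a minimal sketch of a configurable retry with a per-attempt timeout, written in plain Java. The class name RetryingCaller and its two parameters are hypothetical and are not part of GWT; real GWT client code runs single-threaded in the browser and would need timers and callbacks rather than a thread pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not a GWT class): retries a call that may stall or fail.
final class RetryingCaller {
    private final int maxAttempts;     // configurable number of tries
    private final long timeoutMillis;  // timeout applied to each attempt

    RetryingCaller(int maxAttempts, long timeoutMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
        this.timeoutMillis = timeoutMillis;
    }

    // Runs the request, retrying after a timeout or a failure.
    // Rethrows the last failure once all attempts are used up.
    <T> T call(Callable<T> request) throws Exception {
        ExecutorService worker = Executors.newCachedThreadPool();
        try {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> pending = worker.submit(request);
                try {
                    return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    pending.cancel(true); // give up on the stalled attempt
                    last = e;
                } catch (ExecutionException e) {
                    last = e;             // the request itself failed
                }
            }
            throw last;
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        RetryingCaller caller = new RetryingCaller(3, 2000);
        System.out.println(caller.call(() -> "response from server"));
    }
}
```

Retrying is only safe when the server treats a repeated request as harmless, since an attempt that timed out on the client may still have been processed.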
Re: Client did not send nnn bytes as expected
Hi,

We have spent the past 2 days working on this, and have some new findings.

We contacted one of our customers who encounters this issue more frequently than others, and he granted us access to his computer (using LogMeIn). We installed Wireshark on his computer, as well as on the server. We managed to reproduce the problem with both sniffers running, and to match up the exact corresponding TCP segments by their sequence and ack numbers. Here are the results.

This is what happens in the valid state:
The client sends 2 TCP segments for a GWT service call, which are supposed to be reassembled into a single PDU: the entire HTTP request. The first segment always contains the HTTP request header, and the second TCP segment always contains the HTTP request body. For instance, we see that the client sends a first segment of 969 bytes and a second segment of 454 bytes. On the server we see that these 2 segments become 3 segments. The first is still 969 bytes and contains the HTTP request header; the second is 363 bytes (80% of the original second segment), and the third is the remaining 91 bytes (20% of the original 454 bytes).

In the invalid state, when the problem occurs, the third segment simply does not arrive at the server. It seems that something along the way has split the second, 454-byte segment into 2 segments, and only sent the first one to the server.

1. If this is something in the client's machine, how come we don't see it in the sniffer? (We even tried removing all firewall/antivirus software and reinstalling the network card driver.)
2. If this is not something in the client's machine, how come some clients encounter this much more than others, who never encounter it? Can it be some kind of network equipment that some of our clients (reminder - different ISPs) go through, and others don't?

Unfortunately, this new info still leaves us clueless...

On Dec 3, 5:16 pm, jchimene <[EMAIL PROTECTED]> wrote:
> On Dec 2, 11:20 pm, Amit Kasher <[EMAIL PROTECTED]> wrote:
>
> > Hi and thanks again for your responses.
>
> No Prob.
>
> If this "opportunity for excellence" is as pervasive as you suspect, installing software on a client's computer should be a non-starter from the perspective that installing it on *any* computer *anywhere on the planet* should reliably reproduce the issue. You say that tcpdump shows the packet truncation, so I'm not sure I understand the requirement to install something on a client machine. My goal in these past responses has been to absolutely prove that it's the serialization code (by factoring out the serialization code using ping), not something peculiar to the transport or session layers.
>
> Are you using the public switched network to provide client/server connectivity? If not, nothing you've said so far would eliminate your network transport service.
>
> I find it hard to believe it's GWT, as the cargo size is so small as to be insignificant, and others would have reported this issue by now. I have to admit that I'm not a user of Java serialization, so there may have been reports of these serialization issues of which I'm blissfully unaware. From everything you're saying, it really looks like the problem is in user-space. It may be a certain code path that leads to the same serialization invocation logic. I'd start pulling this code apart, instrumenting the hell out of it and running it through JUnit or some such automated testing environment. Again, I understand you've probably done this...
>
> I'm wondering if there's a specific byte-pattern that's causing this. Have you tried reordering the structure members? Also, have you eliminated buffer corruption issues? Since it's cross-browser, what does the -pretty flag + Firebug reveal? Esp. when profiling the code? (Although I must admit that you've probably tried all that type of debugging by now).
>
> Bueno Suerte,
> jec
>
> > A few more subtle observations and insights:
> > [...]
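The segment sizes reported in the message above can be checked with a few lines of arithmetic. This is a small sketch in Java; the class and method names are made up for illustration, and it assumes the 454-byte second segment is the whole request body.

```java
// Hypothetical helper for checking the sizes seen in the captures.
final class TruncationCheck {

    // Number of body bytes that never reached the server.
    static int missingBytes(int sentBytes, int receivedBytes) {
        return sentBytes - receivedBytes;
    }

    // Share of the body that arrived, rounded to the nearest whole percent.
    static long receivedPercent(int sentBytes, int receivedBytes) {
        return Math.round(100.0 * receivedBytes / sentBytes);
    }

    public static void main(String[] args) {
        int sent = 454;      // second segment as sent by the client
        int received = 363;  // what the server saw when the problem occurred
        System.out.println(missingBytes(sent, received) + " bytes missing, "
                + receivedPercent(sent, received) + "% received");
        // prints "91 bytes missing, 80% received"
    }
}
```

Note that 80% of 454 is 363.2, so the observed 363/91 split is the nearest whole-byte split to 80/20.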
Re: Client did not send nnn bytes as expected
Hi and thanks again for your responses.

A few more subtle observations and insights:

1. It's probably not the server. Several reasons lead us to believe that the server is not the cause of this issue:
   (a) We switched hosting providers.
   (b) These providers reside in completely different geographical locations - different countries.
   (c) We have always been using JBoss on CentOS, but this issue occurs both when we work with Apache as a front end using mod_jk to Tomcat, and when we eliminate this tier and have clients go directly to Tomcat, using it as an HTTP server.
   (d) The tcpdump sniffer explicitly shows that the server ALWAYS receives EXACTLY 80% of the request payload.
   Unless this is something even lower level in that machine (the VPS software used - Virtuozzo, the network card/driver, etc.), these observations pretty much provide an alibi for the server... I think we'd better focus on other places.

2. There are indications that this is not inside the browser either:
   (a) It happens in several GWT versions.
   (b) It happens "to" all browsers, which is a strong clue, since this code is completely different from browser to browser - GWT uses the MsXMLHTTP ActiveX object in IE, and completely different objects in other browsers. Since this is the underlying mechanism used to perform RPC, if the problem happens with more than one of them, chances are low that this mechanism is the cause.

Still, it seems that this MUST be the GWT/client code, since the clients to whom this issue occurs much more often don't have problems with any other websites (we managed to talk to several of them). One thing that comes to mind is perhaps the GWT serialization code? I don't know...

Therefore, currently, aside from the possibility that there's a bug in the GWT serialization code, there's also the possibility that it's something in the network, even though these clients are from various ISPs and geographical locations. Yes, I notice the dead end as well...

These observations somewhat reduce the anticipated benefit (let alone the feasibility...) of several of your (MUCH APPRECIATED, THOUGH) suggestions:
1. ping from the lab
2. Perl HTTP server

Despite that, we ARE happy about any suggestion and willing to put in the required effort, so we'll try to make progress in these directions. Our current working assumption is that the data arrives corrupted at the server, so we should look at how this data leaves the client. Therefore we will also try to install a sniffer on a client computer on which this occurs (though we have been trying to do that for quite a long time now).

On Dec 2, 10:29 pm, jchimene <[EMAIL PROTECTED]> wrote:
> Hi Amit,
>
> One other thing:
>
> I'm getting the impression that you also have a custom server. If it's an identical configuration across all server instances, then you also have to prove that it's not the server. Again, I'd code a simple HTTP server in Perl (because there's no problem so intractable that it can't be made worse with a Perl application) and use it to test against your application.
>
> Cheers,
> jec
>
> On Dec 2, 9:11 am, Amit Kasher <[EMAIL PROTECTED]> wrote:
> > Hi,
> > Thanks for your reply. Answers are inline.
> > [...]
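jec suggests a throwaway HTTP server in Perl; the sketch below does the same job in Java instead, using the JDK's built-in com.sun.net.httpserver package. The class name BodyLengthProbe, the port 8080 and the log format are arbitrary choices, not anything from the thread. It simply logs the declared Content-Length next to the number of body bytes that actually arrived, so a short request shows up as a mismatch or a failed read.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in server: reports declared vs. received body length.
final class BodyLengthProbe {

    // Reads the stream to its end and returns how many bytes it delivered.
    static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        for (int n; (n = in.read(buffer)) != -1; ) {
            total += n;
        }
        return total;
    }

    // One log line per request; "declared" is the raw Content-Length header.
    static String report(String declared, long received) {
        return "declared=" + declared + " received=" + received;
    }

    static void handle(HttpExchange exchange) throws IOException {
        String declared = exchange.getRequestHeaders().getFirst("Content-Length");
        String line;
        try (InputStream body = exchange.getRequestBody()) {
            line = report(declared, countBytes(body));
        } catch (IOException e) {
            line = "declared=" + declared + " read failed: " + e.getMessage();
        }
        System.out.println(exchange.getRemoteAddress() + " " + line);
        byte[] reply = line.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(200, reply.length);
        exchange.getResponseBody().write(reply);
        exchange.close();
    }

    // Starts the probe on the given port (0 lets the OS pick a free one).
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", BodyLengthProbe::handle);
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(8080); // arbitrary port for this sketch
    }
}
```

How a truncated body surfaces (an exception, or a read that stalls until the connection drops) depends on the server and the network in between, so the output is a hint about where bytes go missing rather than proof.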
Re: Client did not send nnn bytes as expected
Hi,
Thanks for your reply. Answers are inline.

On Dec 2, 5:50 pm, jchimene <[EMAIL PROTECTED]> wrote:
> Hi,
>
> A few questions:
>
> o Are all packets sent to the server the same size?

No, they are not.

> o What is that size?

This depends on the service call - somewhere between 150 and 2000 bytes.
I will mention again that, according to a sniffer (tcpdump), EVERY time this issue occurs, the actual packets the server receives are ALWAYS EXACTLY 80% of what it should have received. This, again, was very encouraging to find as a clue, but unfortunately led me nowhere.

> o Have you checked for other types of congestion?

Congestion? Unfortunately, I don't have any control over the client's environment since this is an internet application and I can't reproduce it.

> o Is this entirely TCP/IP? Have you checked maxrss?

maxrss? I'm not sure I understood the relevance... TCP/IP is obviously used; it is the underlying protocol of HTTP...

> o Have you enabled logging on intermediate nodes to see if there are
> congestion issues?

I wish I could... I don't have any control over any node before the server. It is a CentOS VPS hosted internet application. I will state that this occurred with several hosting providers, in several countries and geographical locations.

> o Is this related to a specific time of day (although it probably
> happens between 10:00 and 14:00...)

I didn't find any correlation between the time of day and the occurrence of this. Obviously, this is normalized to the usage load, as you implied.

> o Do you have a world-wide net? If so, does the problem travel across
> time zones?

My users are not from around the world, but as I stated - this issue occurred when using hosting providers around the world.

> Cheers,
> jec
>
> On Dec 2, 2:13 am, Amit Kasher <[EMAIL PROTECTED]> wrote:
> > Hi,
> > Does anyone have any new insights about this issue? We've been
> > investigating for over a year(!), and we seem to not be the only
> > ones...
> >
> > http://tinyurl.com/5rqfp5
> >
> > Thanks.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "Google Web Toolkit" group.
To post to this group, send email to Google-Web-Toolkit@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/Google-Web-Toolkit?hl=en
-~--~~~~--~~--~--~---
Re: Client did not send nnn bytes as expected
We will try Wireshark. BTW, the built-in Linux sniffer, tcpdump, is pretty advanced, and we used its filtering feature to pinpoint this packet reduction. However, the disruption seems to occur somewhere at a lower level in the server OS, or more likely before the server machine altogether - in some network equipment or in client-side code / the browser.

Thanks again for your help.
Amit

On Dec 2, 12:05 pm, Lothar Kimmeringer <[EMAIL PROTECTED]> wrote:
> Amit Kasher wrote:
>
> > I have been trying tcpdump sniffer in the server side, and discovered
> > that the server always receives 80% of the byte content (I described
> > it here: http://tinyurl.com/5rqfp5). This is very interesting, but
> > unfortunately led me nowhere.
>
> I just read the first post (shame on me ;-) but I still think that Wireshark might help here. When the problem occurs, you can reduce the view of the packets to the one session by simply applying a filter on it. That way it should be possible to see what was happening _before_ the packets got reduced.
>
> > I don't manage to reproduce it, for over a year now, so I can't run a
> > sniffer in the client. Also, this is a high capacity internet
> > application, not intranet, therefore contacting the users even just
> > for a question is rather difficult, let alone installing a sniffer in
> > the client side.
>
> The sniffer on the client side would be a next step to be considered. In the first place I think that it should be enough to have one on the server side (listening only to HTTP traffic).
>
> Regards, Lothar
Re: Client did not send nnn bytes as expected
Thanks.

I have been running the tcpdump sniffer on the server side, and discovered that the server always receives 80% of the byte content (I described it here: http://tinyurl.com/5rqfp5). This is very interesting, but unfortunately led me nowhere.

I haven't managed to reproduce it for over a year now, so I can't run a sniffer on the client. Also, this is a high-capacity internet application, not an intranet one, so contacting the users even just for a question is rather difficult, let alone installing a sniffer on the client side.

Amit

On Dec 2, 11:40 am, Lothar Kimmeringer <[EMAIL PROTECTED]> wrote:
> Amit Kasher wrote:
>
> > Does anyone has any new insights about this issue? We've been
> > investigating for over a year(!), and we seem to not be the only
> > ones...
> >
> > http://tinyurl.com/5rqfp5
>
> I have no insights, but what about firing up Wireshark and logging the packets that are exchanged between client and server? At the moment the problem occurs you should be able to come up with the log of that specific HTTP session. Maybe that helps to track down where the problem is.
>
> Regards, Lothar
Client did not send nnn bytes as expected
Hi,

Does anyone have any new insights about this issue? We've been investigating for over a year(!), and we seem to not be the only ones...

http://tinyurl.com/5rqfp5

Thanks.
Re: IE6 ImageBundle + Anchor problem.
I encounter this problem as well. Has anyone found a solution?

Thanks

On Sep 25, 8:28 pm, Jean-Lou Dupont <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I've got the following custom widget:
>
> public class GearsStatus extends Anchor {
>
>     Image img = null;
>
>     public GearsStatus() {
>         super();
>
>         final WidgetMessages MSG = (WidgetMessages) GWT.create(WidgetMessages.class);
>
>         WidgetImageBundle bundle = (WidgetImageBundle) GWT.create(WidgetImageBundle.class);
>
>         AbstractImagePrototype p = null;
>
>         // Pick the image, link target and tooltip according to Gears availability.
>         if (isGearsInstalled()) {
>             p = bundle.gears();
>             this.setHref(MSG.gears_href_installed());
>             this.setTitle(MSG.gears_title_installed());
>         } else {
>             p = bundle.gears_grey();
>             this.setHref(MSG.gears_href_not_installed());
>             this.setTitle(MSG.gears_title_not_installed());
>         }
>
>         img = p.createImage();
>
>         // Attach the bundled image directly to the anchor's DOM element.
>         this.getElement().appendChild(img.getElement());
>     }
>
> which works fine on Chrome, FF and Safari, but it cannot be clicked on IE6.
>
> Any hint?
>
> Thanks!