Re: Client did not send nnn bytes as expected

jchimene Wed, 03 Dec 2008 07:16:23 -0800

On Dec 2, 11:20 pm, Amit Kasher <[EMAIL PROTECTED]> wrote:
> Hi and thanks again for your responses.


No Prob.

If this "opportunity for excellence" is as pervasive as you suspect,
you shouldn't need to install software on a client's computer: if the
bug is that widespread, *any* computer *anywhere on the planet* should
reliably reproduce the issue. You say that tcpdump
shows the packet truncation, so I'm not sure I understand the
requirement to install something on a client machine. My goal in these
past responses has been to absolutely prove that it's the
serialization code (by factoring out the serialization code using
ping), not something peculiar to the transport or session layers.
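
To make that separation concrete, here is a minimal sketch (Python, not anything GWT-specific) of a throwaway endpoint that only compares the declared Content-Length with the bytes that actually arrive, with no deserialization involved. The names read_declared_body and LengthCheckHandler and the port 8080 are my own placeholders, not part of GWT or your setup:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def read_declared_body(stream, declared_length):
    """Read up to declared_length bytes; return (body, shortfall in bytes)."""
    chunks = []
    remaining = declared_length
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # EOF before the declared length arrived
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks), remaining


class LengthCheckHandler(BaseHTTPRequestHandler):
    """Hypothetical test endpoint: reports declared vs. received byte counts."""

    def do_POST(self):
        declared = int(self.headers.get("Content-Length", 0))
        body, shortfall = read_declared_body(self.rfile, declared)
        report = f"declared={declared} received={len(body)} missing={shortfall}\n"
        self.send_response(200 if shortfall == 0 else 400)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(report)))
        self.end_headers()
        self.wfile.write(report.encode("ascii"))


def serve(port=8080):
    """Run the checker until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("", port), LengthCheckHandler).serve_forever()
```

If a stock client posting to something like this still comes up short, the serialization code is off the hook. Note that a stalled connection will block in read() rather than hit EOF, so real use would need a socket timeout, which I've left out.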

Are you using the public switched network to provide client/server
connectivity? If not, nothing you've said so far would eliminate your
network transport service.

I find it hard to believe it's GWT, as the cargo size is so small as
to be insignificant, and others would have reported this issue by now.
I have to admit that I'm not a user of Java serialization, so there
may have been reports of such serialization issues of which I'm
blissfully unaware. From everything you're saying, it really looks
like the problem is in user-space. It may be a certain code path that
leads to the same serialization invocation logic. I'd start pulling
this code apart, instrumenting the hell out of it and running it
through JUnit or some such automated testing environment. Again, I
understand you've probably done this...

I'm wondering if there's a specific byte-pattern that's causing this.
Have you tried reordering the structure members? Also, have you
eliminated buffer corruption issues? Since it's cross-browser, what
does the -pretty flag + Firebug reveal? Esp. when profiling the code?
(Although I must admit that you've probably tried all that type of
debugging by now).
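
If you can pair each truncated capture with its declared length, a few lines like the following might show whether the cut always lands at the same ratio or right after the same bytes. This is only a sketch; the function names and any payloads you feed it are assumptions for illustration, not something taken from your captures:

```python
def truncation_report(declared_length, received):
    """Summarise one capture: how much arrived and the last bytes seen."""
    ratio = len(received) / declared_length if declared_length else 1.0
    return {
        "declared": declared_length,
        "received": len(received),
        "ratio": round(ratio, 3),
        "tail_hex": received[-8:].hex(),  # bytes just before the cut
    }


def common_suffix(payloads):
    """Return the longest byte suffix shared by every truncated payload."""
    if not payloads:
        return b""
    shortest = min(len(p) for p in payloads)
    n = 0
    # Walk backwards from the cut point while all payloads agree.
    while n < shortest and len({p[len(p) - 1 - n] for p in payloads}) == 1:
        n += 1
    first = payloads[0]
    return first[len(first) - n:]
```

A non-empty common suffix across many failures would point at a byte pattern; a constant ratio with unrelated tails would point elsewhere.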

Good luck,
jec

>
> A few more subtle observations and insights:
> 1. It's probably not the server. There are several reasons that lead
> us to believe that the server is not the cause of this issue: (a) We
> switched hosting providers. (b) These providers reside in completely
> different geographical locations - countries. (c) We have always been
> using JBoss on CentOS, but this issue occurs both when we work with
> Apache as a front end using mod_jk to tomcat, as well as when
> eliminating this tier and having clients go directly to tomcat - using
> it as an HTTP server. (d) tcpdump sniffer explicitly shows that the
> server receives ALWAYS EXACTLY 80% of the request payload. Unless this
> is something even lower level in that machine (the VPS software used -
> virtuozzo, the network card/driver, etc.), these observations pretty
> much provide an alibi for the server... I think we'd better focus on
> other places.
> 2. There are indications that this is not inside the browser as well:
> (a) It happens in several GWT versions. (b) It happens "to" all
> browsers, which provides a strong clue, since this code is completely
> different from browser to browser - GWT uses MsXMLHTTP activeX in IE,
> while using entirely different objects in other browsers. Since this is
> the underlying mechanism used to perform RPC, if it happens with more
> than one of them, chances are low that this is the cause.
> Still it seems that this MUST be the GWT/client code, since these
> clients, to whom this issue occurs much more often, don't have
> problems in any other websites (we managed to talk to several of
> them).
> One thing that comes to mind is perhaps the GWT serialization code? I
> don't know...
>
> Therefore, currently, aside from the possibility that there's a bug in
> the GWT serialization code, there's also the possibility that it's
> something in the network, even though these clients are from various
> ISPs, and geographical locations. Yes, I notice the dead end as
> well...
>
> These observations somewhat reduce the anticipated benefit (let alone
> the feasibility...) of several of your (MUCH APPRECIATED, THOUGH)
> suggestions:
> 1. ping from the lab
> 2. perl HTTP server
>
> Despite that, we ARE happy about any suggestion and willing to put the
> required effort, so we'll try to make progress in these directions.
>
> Our situation now is that we assume that the data arrives corrupted to
> the server, and we should see how this data comes out of the client.
> Therefore we will also try to install a sniffer in a client computer
> in which this occurs (though we have been trying to do that for quite
> a long time now).
>
> On Dec 2, 10:29 pm, jchimene <[EMAIL PROTECTED]> wrote:
>
> > Hi Amit,
>
> > One other thing:
>
> > I'm getting the impression that you also have a custom server. If it's
> > an identical configuration across all server instances, then you also
> > have to prove that it's not the server. Again, I'd code a simple HTTP
> > server in Perl (because there's no problem so intractable that it
> > can't be made worse with a Perl application) and use it to test
> > against your application.
>
> > Cheers,
> > jec
>
> > On Dec 2, 9:11 am, Amit Kasher <[EMAIL PROTECTED]> wrote:
>
> > > Hi,
> > > Thanks for your reply. Answers are inline.
>
> > > > On Dec 2, 5:50 pm, jchimene <[EMAIL PROTECTED]> wrote:
>
> > > > > Hi,
>
> > > > A few questions:
>
> > > > o Are all packets sent to the server the same size?
>
> > > No, they are not.
>
> > > > o What is that size?
>
> > > This depends on the service call - somewhere between 150 and 2000
> > > bytes.
> > > I will mention again that by using a sniffer (tcpdump), it seems that
> > > EVERY time this issue occurs, the actual packets the server receives
> > > are ALWAYS EXACTLY 80% of what it should have received. This, again,
> > > was very encouraging to find as a clue, but unfortunately led me
> > > nowhere.
>
> > > > o Have you checked for other types of congestion?
>
> > > Congestion? Unfortunately, I don't have any control over the client's
> > > environment since this is an internet application and I can't
> > > reproduce it.
>
> > > > o Is this entirely TCP/IP? Have you checked maxrss?
>
> > > maxrss? I'm not sure I understood the relevance... TCP/IP is obviously
> > > used, it is the underlying protocol of HTTP...
>
> > > > o Have you enabled logging on intermediate nodes to see if there are
> > > > congestion issues?
>
> > > I wish I could... I don't have any control over any node before the
> > > server. It is a CentOS VPS hosted internet application. I will state
> > > that this occurred in several hosting providers, in several countries
> > > and geographical locations.
>
> > > > o Is this related to a specific time of day (although it probably
> > > > happens between 10:00 and 14:00...)
>
> > > I didn't find any correlation between the time of day and the
> > > occurrence of this. Obviously, this is normalized to the usage load,
> > > as you implied.
>
> > > > o Do you have a world-wide net? If so, does the problem travel across
> > > > time zones?
>
> > > My users are not from around the world, but as I stated - this issue
> > > occurred when using hosting providers around the world.
>
> > > > Cheers,
> > > > jec
>
> > > > On Dec 2, 2:13 am, Amit Kasher <[EMAIL PROTECTED]> wrote:
>
> > > > > Hi,
> > > > > > Does anyone have any new insights about this issue? We've been
> > > > > investigating for over a year(!), and we seem to not be the only
> > > > > ones...
>
> > > > > > http://tinyurl.com/5rqfp5
>
> > > > > Thanks.
