Re: Single POST request being handled twice

2011-11-27 Thread Rainer Jung

On 19.11.2011 06:07, Jeremy wrote:

OK, we figured it out.  It's a case of too many timeout settings and not
having a real DevOps person on hand.  There was an obvious error message in
Apache's mod_jk.log that I failed to correlate with the problem because I
misread the timestamp on one of the many log entries.  Doh!

[info] ajp_connection_tcp_get_message::jk_ajp_common.c (1150): (node5)
can't receive the response header message from tomcat, network problems or
tomcat (10.xx.xx.xx:8009) is down (errno=11)
[error] ajp_get_reply::jk_ajp_common.c (1962): (node5) Tomcat is down or
refused connection. No response has been sent to the client (yet)
[info] ajp_service::jk_ajp_common.c (2447): (node5) sending request to
tomcat failed (recoverable),  (attempt=1)

There is, I now see, a socket_timeout and a socket_connect_timeout that do
not show up in the jkmanager status page, in addition to
connection_pool_timeout, connect_timeout,  prepost_timeout, and
reply_timeout which are listed by jkmanager.

We had socket_timeout set to 10 seconds and I didn't know it. Our
transactions only take longer than 10 seconds a few times a week, so that's
why we weren't seeing it that often.

We'll fix by setting:

 socket_timeout=90
 socket_connection_timeout=5000
 retry_options=25

unless someone has a better idea.


Have a look at the example configuration conatined in the mod_jk source 
download tarball. It provides a pretty decent default configuration.


It is "recovery_options" not "retry_options". Furthermore I personally 
do not recommend the general "socket_timeout", but I *do* recommend to 
use all other timeouts with appropriate values. There's more info on 
Timeouts at


http://tomcat.apache.org/connectors-doc/generic_howto/timeouts.html

Regards,

Rainer


-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: Single POST request being handled twice

2011-11-18 Thread Jeremy
OK, we figured it out.  It's a case of too many timeout settings and not
having a real DevOps person on hand.  There was an obvious error message in
Apache's mod_jk.log that I failed to correlate with the problem because I
misread the timestamp on one of the many log entries.  Doh!

[info] ajp_connection_tcp_get_message::jk_ajp_common.c (1150): (node5)
can't receive the response header message from tomcat, network problems or
tomcat (10.xx.xx.xx:8009) is down (errno=11)
[error] ajp_get_reply::jk_ajp_common.c (1962): (node5) Tomcat is down or
refused connection. No response has been sent to the client (yet)
[info] ajp_service::jk_ajp_common.c (2447): (node5) sending request to
tomcat failed (recoverable),  (attempt=1)

There is, I now see, a socket_timeout and a socket_connect_timeout that do
not show up in the jkmanager status page, in addition to
connection_pool_timeout, connect_timeout,  prepost_timeout, and
reply_timeout which are listed by jkmanager.

We had socket_timeout set to 10 seconds and I didn't know it. Our
transactions only take longer than 10 seconds a few times a week, so that's
why we weren't seeing it that often.

We'll fix by setting:

socket_timeout=90
socket_connection_timeout=5000
retry_options=25

unless someone has a better idea.

Thanks!
=Jeremy=


On Wed, Nov 16, 2011 at 5:29 PM, Jeremy  wrote:

> OK, I know this seems crazy, but I've looked long and hard and cannot
> explain this as other than a Tomcat bug.  I'm more than willing to dig up
> extra information where I can, but this is a very rare problem seen in
> production and not reproducible at will.  Also, because this is in
> production, there is very little I can change, and even if I did change it,
> it would take a month or more to verify that it is really a fix.  So what
> I'm really looking for is a detailed understanding of the problem so we can
> detect it when it happens and clean up the after-effects.
>
> What we are seeing is one POST request being handled twice by our web app.
>  It looks as though somehow the request is duplicated by Tomcat.  Here's
> our setup:
>
> Apache 2.2.14 using APR 1.3.8 and mod_jk 1.2.28 running on Ubuntu 10.04.3
> as front end, load balancing.
> Tomcat 6.0.29 using built-in AJP/1.3 connector running on Ubuntu 9.04 with
> 64-bit Sun Java 1.5.0_19.  Tomcat is not using APR.
> Tomcat mod_jk connector has keepAliveTimeout=5000 and
> connectionTimeout=6 (60 seconds).
>
> Here is a simplified timeline of what we see in various logs.  I'm going
> to leave some fields out and change others to protect confidentiality and
> make it easier to follow, but I'll try to keep from changing anything
> important.  Timestamps are just minutes and seconds.
>
>
> 51:48 POST request received by Apache over SSL.  We see the request logged
> in Apache's access.log with this timestamp, but the log entry is actually
> written at 52:08 (or seems to be, since that is the time stamp of the next
> entry in the access.log, though the entries immediately before this one
> have timestamps of 52:02).
>
> 51:48 log4j application logs show processing of the request by the
> application in thread TP-Processor18.   Processing continues with updates
> to database and external systems until:
>
> 51:55 last application log entry from TP-Processor18 shows normal
> completion of processing, which should then return from handleRequest().
>  The database and three external systems the app communicates with all show
> normal processing.
>
> 52:00 log4j application log entries from TP-Processor6 show it beginning
> to handle the request.  Processing continues normally, just as it did in
> TP-Processor18 until:
>
> 52:08 last application log entry from TP-Processor6.  Again external
> systems show normal processing.
>
> 52:08 Tomcat access log shows entry for post request stating it was
> handled by TP-Processor6 in 7283ms
>
> 52:08 mod_jk.log: loadbalancer www.site.com 19.400742
>
> 53:06 Tomcat access log shows entry for post request stating it was
> handled by TP-Processor18 in 77440ms
>
>
> How is it possible that Tomcat has 2 threads handling the same request?
> How can we detect this as opposed to the user legitimately sending the
> same request twice?
>
> Thanks for your help!
> =Jeremy=
>
>
>
>


Re: Single POST request being handled twice

2011-11-17 Thread Mark Thomas
On 17/11/2011 08:48, Jeremy wrote:
> Mark, thanks for taking a crack at this.  I'm not that familiar with how
> mod_jk and AJP work under the covers, but based on my level of
> understanding, your scenario comes close but does not exactly match the
> logging we see.  My detailed comments below.
> 
> On Wed, Nov 16, 2011 at 11:53 PM, Mark Thomas  wrote:
> 
>> On 17/11/2011 01:29, Jeremy wrote:
>>> OK, I know this seems crazy, but I've looked long and hard and cannot
>>> explain this as other than a Tomcat bug.  I'm more than willing to dig up
>>> extra information where I can, but this is a very rare problem seen in
>>> production and not reproducible at will.  Also, because this is in
>>> production, there is very little I can change, and even if I did change
>> it,
>>> it would take a month or more to verify that it is really a fix.  So what
>>> I'm really looking for is a detailed understanding of the problem so we
>> can
>>> detect it when it happens and clean up the after-effects.
>>>
>>> What we are seeing is one POST request being handled twice by our web
>> app.
>>>  It looks as though somehow the request is duplicated by Tomcat.  Here's
>>> our setup:
>>>
>>> Apache 2.2.14 using APR 1.3.8 and mod_jk 1.2.28 running on Ubuntu 10.04.3
>>> as front end, load balancing.
>>> Tomcat 6.0.29 using built-in AJP/1.3 connector running on Ubuntu 9.04
>> with
>>> 64-bit Sun Java 1.5.0_19.  Tomcat is not using APR.
>>> Tomcat mod_jk connector has keepAliveTimeout=5000 and
>>> connectionTimeout=6 (60 seconds).
>>>
>>> Here is a simplified timeline of what we see in various logs.  I'm going
>> to
>>> leave some fields out and change others to protect confidentiality and
>> make
>>> it easier to follow, but I'll try to keep from changing anything
>> important.
>>>  Timestamps are just minutes and seconds.
>>>
>>>
>>> 51:48 POST request received by Apache over SSL.  We see the request
>> logged
>>> in Apache's access.log with this timestamp, but the log entry is actually
>>> written at 52:08 (or seems to be, since that is the time stamp of the
>> next
>>> entry in the access.log, though the entries immediately before this one
>>> have timestamps of 52:02).
>>>
>>> 51:48 log4j application logs show processing of the request by the
>>> application in thread TP-Processor18.   Processing continues with updates
>>> to database and external systems until:
>>>
>>> 51:55 last application log entry from TP-Processor18 shows normal
>>> completion of processing, which should then return from handleRequest().
>>>  The database and three external systems the app communicates with all
>> show
>>> normal processing.
>>>
>>> 51:55 log4j application log entries from TP-Processor6 show it beginning
>> to
>>> handle the request.  Processing continues normally, just as it did in
>>> TP-Processor18 until:
>>>
>>> 52:08 last application log entry from TP-Processor6.  Again external
>>> systems show normal processing.
>>>
>>> 52:08 Tomcat access log shows entry for post request stating it was
>> handled
>>> by TP-Processor6 in 7283ms
>>>
>>> 52:08 mod_jk.log: loadbalancer www.site.com 19.400742
>>>
>>> 53:06 Tomcat access log shows entry for post request stating it was
>> handled
>>> by TP-Processor18 in 77440ms
>>>
>>>
>>> How is it possible that Tomcat has 2 threads handling the same request?
>>
>> Because I think there are two requests here, rather than one.
>>
>> My guess as to what is happening is as follows:
>> - httpd sends request to Tomcat
>> - Tomcat processes it
>> - network glitch or similar means httpd never receives the response
>>
> 
> I'm with you up to here.  However, if the response were sent, then I'd
> expect to see  TP-Processor18 log its response in the Tomcat access log at
> 51:55.  Not some 71 seconds later,

It depends. If the response wasn't fully sent you could be seeing a
retry timeout of some form here.

> Also we have mod_jk set up with a reply timeout of 0, which means a retry
> would be triggered by something other than a timeout.

Ah. OK. That probably rules out that option then.

> We do have mod_jk
> retry set to 2 on he Apache servers; not sure how that is logged or what
> protections are in place against creating this problem.

Someone more familiar with the mod_jk code will hopefully suggest an
appropriate log level to get some insight into this.

> 
> - httpd resends request to Tomcat
>> - Tomcat processes it
>> - httpd receives the response
>> - httpd sends response to user
>>
>> Examining the mod_jk logs on httpd should show this although I don't
>> know what level of logging you'll need to see it.
>>
> 
> Me neither.  We're using cping/cpong to test the connectors before sending
> requests.   Those do sometimes fail, but I think that keeps the request
> from being sent in the first place.  The mod_jk log just shows the one
> request taking 19 seconds, doesn't show anything about retrying.
> 
> I'm particularly curious about how TP-Processor6 manages to take over just
> as TP-Processor18 should 

Re: Single POST request being handled twice

2011-11-17 Thread Jeremy
Mark, thanks for taking a crack at this.  I'm not that familiar with how
mod_jk and AJP work under the covers, but based on my level of
understanding, your scenario comes close but does not exactly match the
logging we see.  My detailed comments below.

On Wed, Nov 16, 2011 at 11:53 PM, Mark Thomas  wrote:

> On 17/11/2011 01:29, Jeremy wrote:
> > OK, I know this seems crazy, but I've looked long and hard and cannot
> > explain this as other than a Tomcat bug.  I'm more than willing to dig up
> > extra information where I can, but this is a very rare problem seen in
> > production and not reproducible at will.  Also, because this is in
> > production, there is very little I can change, and even if I did change
> it,
> > it would take a month or more to verify that it is really a fix.  So what
> > I'm really looking for is a detailed understanding of the problem so we
> can
> > detect it when it happens and clean up the after-effects.
> >
> > What we are seeing is one POST request being handled twice by our web
> app.
> >  It looks as though somehow the request is duplicated by Tomcat.  Here's
> > our setup:
> >
> > Apache 2.2.14 using APR 1.3.8 and mod_jk 1.2.28 running on Ubuntu 10.04.3
> > as front end, load balancing.
> > Tomcat 6.0.29 using built-in AJP/1.3 connector running on Ubuntu 9.04
> with
> > 64-bit Sun Java 1.5.0_19.  Tomcat is not using APR.
> > Tomcat mod_jk connector has keepAliveTimeout=5000 and
> > connectionTimeout=6 (60 seconds).
> >
> > Here is a simplified timeline of what we see in various logs.  I'm going
> to
> > leave some fields out and change others to protect confidentiality and
> make
> > it easier to follow, but I'll try to keep from changing anything
> important.
> >  Timestamps are just minutes and seconds.
> >
> >
> > 51:48 POST request received by Apache over SSL.  We see the request
> logged
> > in Apache's access.log with this timestamp, but the log entry is actually
> > written at 52:08 (or seems to be, since that is the time stamp of the
> next
> > entry in the access.log, though the entries immediately before this one
> > have timestamps of 52:02).
> >
> > 51:48 log4j application logs show processing of the request by the
> > application in thread TP-Processor18.   Processing continues with updates
> > to database and external systems until:
> >
> > 51:55 last application log entry from TP-Processor18 shows normal
> > completion of processing, which should then return from handleRequest().
> >  The database and three external systems the app communicates with all
> show
> > normal processing.
> >
> > 51:55 log4j application log entries from TP-Processor6 show it beginning
> to
> > handle the request.  Processing continues normally, just as it did in
> > TP-Processor18 until:
> >
> > 52:08 last application log entry from TP-Processor6.  Again external
> > systems show normal processing.
> >
> > 52:08 Tomcat access log shows entry for post request stating it was
> handled
> > by TP-Processor6 in 7283ms
> >
> > 52:08 mod_jk.log: loadbalancer www.site.com 19.400742
> >
> > 53:06 Tomcat access log shows entry for post request stating it was
> handled
> > by TP-Processor18 in 77440ms
> >
> >
> > How is it possible that Tomcat has 2 threads handling the same request?
>
> Because I think there are two requests here, rather than one.
>
> My guess as to what is happening is as follows:
> - httpd sends request to Tomcat
> - Tomcat processes it
> - network glitch or similar means httpd never receives the response
>

I'm with you up to here.  However, if the response were sent, then I'd
expect to see  TP-Processor18 log its response in the Tomcat access log at
51:55.  Not some 71 seconds later,
Also we have mod_jk set up with a reply timeout of 0, which means a retry
would be triggered by something other than a timeout.  We do have mod_jk
retry set to 2 on he Apache servers; not sure how that is logged or what
protections are in place against creating this problem.

- httpd resends request to Tomcat
> - Tomcat processes it
> - httpd receives the response
> - httpd sends response to user
>
> Examining the mod_jk logs on httpd should show this although I don't
> know what level of logging you'll need to see it.
>

Me neither.  We're using cping/cpong to test the connectors before sending
requests.   Those do sometimes fail, but I think that keeps the request
from being sent in the first place.  The mod_jk log just shows the one
request taking 19 seconds, doesn't show anything about retrying.

I'm particularly curious about how TP-Processor6 manages to take over just
as TP-Processor18 should be sending its response and how TP-Processor18's
response ends up getting logged over a minute later.

>
> The only hitch with the above is that I'd expect httpd to take longer to
> realise that there is a problem before re-trying the request.
>
>
Yep, me too.


> > How can we detect this as opposed to the user legitimately sending the
> same
> > request twice?
>
> Use mod_unique_id with mod

Re: Single POST request being handled twice

2011-11-16 Thread Mark Thomas
On 17/11/2011 01:29, Jeremy wrote:
> OK, I know this seems crazy, but I've looked long and hard and cannot
> explain this as other than a Tomcat bug.  I'm more than willing to dig up
> extra information where I can, but this is a very rare problem seen in
> production and not reproducible at will.  Also, because this is in
> production, there is very little I can change, and even if I did change it,
> it would take a month or more to verify that it is really a fix.  So what
> I'm really looking for is a detailed understanding of the problem so we can
> detect it when it happens and clean up the after-effects.
> 
> What we are seeing is one POST request being handled twice by our web app.
>  It looks as though somehow the request is duplicated by Tomcat.  Here's
> our setup:
> 
> Apache 2.2.14 using APR 1.3.8 and mod_jk 1.2.28 running on Ubuntu 10.04.3
> as front end, load balancing.
> Tomcat 6.0.29 using built-in AJP/1.3 connector running on Ubuntu 9.04 with
> 64-bit Sun Java 1.5.0_19.  Tomcat is not using APR.
> Tomcat mod_jk connector has keepAliveTimeout=5000 and
> connectionTimeout=6 (60 seconds).
> 
> Here is a simplified timeline of what we see in various logs.  I'm going to
> leave some fields out and change others to protect confidentiality and make
> it easier to follow, but I'll try to keep from changing anything important.
>  Timestamps are just minutes and seconds.
> 
> 
> 51:48 POST request received by Apache over SSL.  We see the request logged
> in Apache's access.log with this timestamp, but the log entry is actually
> written at 52:08 (or seems to be, since that is the time stamp of the next
> entry in the access.log, though the entries immediately before this one
> have timestamps of 52:02).
> 
> 51:48 log4j application logs show processing of the request by the
> application in thread TP-Processor18.   Processing continues with updates
> to database and external systems until:
> 
> 51:55 last application log entry from TP-Processor18 shows normal
> completion of processing, which should then return from handleRequest().
>  The database and three external systems the app communicates with all show
> normal processing.
> 
> 51:55 log4j application log entries from TP-Processor6 show it beginning to
> handle the request.  Processing continues normally, just as it did in
> TP-Processor18 until:
> 
> 52:08 last application log entry from TP-Processor6.  Again external
> systems show normal processing.
> 
> 52:08 Tomcat access log shows entry for post request stating it was handled
> by TP-Processor6 in 7283ms
> 
> 52:08 mod_jk.log: loadbalancer www.site.com 19.400742
> 
> 53:06 Tomcat access log shows entry for post request stating it was handled
> by TP-Processor18 in 77440ms
> 
> 
> How is it possible that Tomcat has 2 threads handling the same request?

Because I think there are two requests here, rather than one.

My guess as to what is happening is as follows:
- httpd sends request to Tomcat
- Tomcat processes it
- network glitch or similar means httpd never receives the response
- httpd resends request to Tomcat
- Tomcat processes it
- httpd receives the response
- httpd sends response to user

Examining the mod_jk logs on httpd should show this although I don't
know what level of logging you'll need to see it.

The only hitch with the above is that I'd expect httpd to take longer to
realise that there is a problem before re-trying the request.

> How can we detect this as opposed to the user legitimately sending the same
> request twice?

Use mod_unique_id with mod_rewrite to add the unique ID as a url
parameter and then include this in Tomcat's access log and/or write a
filter that checks for duplicates.

> Thanks for your help!

The situation you describe could also be triggered by a Tomcat bug. You
might want to try protocol="org.apache.coyote.ajp.AjpProtocol" on the
connector to switch the the Coyote AJP BIO implementation.

Mark

> =Jeremy=
> 


-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: Single POST request being handled twice

2011-11-16 Thread Jeremy
Chuck,

Thank you for responding to my query.  I'd be more than happy to fix our
webapp if I understood what it was doing wrong, but unfortunately I don't
understand your answer.  Would you please give me an example of how I would
purposefully write a webapp to create the behavior I witnessed?  I do not
know of any way to get Tomcat to generate two separate response log lines
from two separate threads while reporting the timestamps and processing
time as I described.

The closest I can come up with is if I wrote a ServletFilter that saved a
request and later hijacked some other request thread, replacing that
incoming request with the saved request, but then the response would go to
the requester whose thread had been hijacked, which is not what we observe.
 We observe the user getting the second generated response (the one logged
first in the Tomcat log).  Also in that case the mod_jk log would show two
requests, not one.

If you can show me how I can intentionally generate the kind of results we
are seeing then I will have a clue as to what to look for to fix in the
app.  As it is now, my understanding is that it is fundamental to Tomcat's
architecture that one request is handled exclusively in one thread and what
the logs are showing is a violation of that contract. Which is why I think
it is a Tomcat (or Apache or mod_jk or AJP) bug.

=Jeremy=


On Wed, Nov 16, 2011 at 10:25 PM, Caldarale, Charles R <
chuck.caldar...@unisys.com> wrote:

> > From: Jeremy [mailto:asfbugzi...@nuru.net]
> > Subject: Single POST request being handled twice
>
> > How is it possible that Tomcat has 2 threads handling
> > the same request?
>
> This is usually the result of an application coding or design error:
> storing a reference to a request in an inappropriate scope such as the
> session or a servlet instance or static field.
>
> > How can we detect this as opposed to the user legitimately
> > sending the same request twice?
>
> A) Fix your webapp.
> [snip]


RE: Single POST request being handled twice

2011-11-16 Thread Caldarale, Charles R
> From: Jeremy [mailto:asfbugzi...@nuru.net] 
> Subject: Single POST request being handled twice

> How is it possible that Tomcat has 2 threads handling 
> the same request?

This is usually the result of an application coding or design error: storing a 
reference to a request in an inappropriate scope such as the session or a 
servlet instance or static field.

> How can we detect this as opposed to the user legitimately 
> sending the same request twice?

A) Fix your webapp.

B) Place a hidden token (e.g., a sequence number)on the web page that is 
transmitted with the request, and updated with a different one on each 
response.  If the same token is seen twice, it's a retransmission.

 - Chuck


THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY 
MATERIAL and is thus for use only by the intended recipient. If you received 
this in error, please contact the sender and delete the e-mail and its 
attachments from all computers.


-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: Single POST request being handled twice

2011-11-16 Thread Igor Cicimov
Bypass the apache and send the POST request to tomcat directly. Thant will
tell you where the problem is tomcat or apache.

On Thu, Nov 17, 2011 at 12:29 PM, Jeremy  wrote:

> OK, I know this seems crazy, but I've looked long and hard and cannot
> explain this as other than a Tomcat bug.  I'm more than willing to dig up
> extra information where I can, but this is a very rare problem seen in
> production and not reproducible at will.  Also, because this is in
> production, there is very little I can change, and even if I did change it,
> it would take a month or more to verify that it is really a fix.  So what
> I'm really looking for is a detailed understanding of the problem so we can
> detect it when it happens and clean up the after-effects.
>
> What we are seeing is one POST request being handled twice by our web app.
>  It looks as though somehow the request is duplicated by Tomcat.  Here's
> our setup:
>
> Apache 2.2.14 using APR 1.3.8 and mod_jk 1.2.28 running on Ubuntu 10.04.3
> as front end, load balancing.
> Tomcat 6.0.29 using built-in AJP/1.3 connector running on Ubuntu 9.04 with
> 64-bit Sun Java 1.5.0_19.  Tomcat is not using APR.
> Tomcat mod_jk connector has keepAliveTimeout=5000 and
> connectionTimeout=6 (60 seconds).
>
> Here is a simplified timeline of what we see in various logs.  I'm going to
> leave some fields out and change others to protect confidentiality and make
> it easier to follow, but I'll try to keep from changing anything important.
>  Timestamps are just minutes and seconds.
>
>
> 51:48 POST request received by Apache over SSL.  We see the request logged
> in Apache's access.log with this timestamp, but the log entry is actually
> written at 52:08 (or seems to be, since that is the time stamp of the next
> entry in the access.log, though the entries immediately before this one
> have timestamps of 52:02).
>
> 51:48 log4j application logs show processing of the request by the
> application in thread TP-Processor18.   Processing continues with updates
> to database and external systems until:
>
> 51:55 last application log entry from TP-Processor18 shows normal
> completion of processing, which should then return from handleRequest().
>  The database and three external systems the app communicates with all show
> normal processing.
>
> 51:55 log4j application log entries from TP-Processor6 show it beginning to
> handle the request.  Processing continues normally, just as it did in
> TP-Processor18 until:
>
> 52:08 last application log entry from TP-Processor6.  Again external
> systems show normal processing.
>
> 52:08 Tomcat access log shows entry for post request stating it was handled
> by TP-Processor6 in 7283ms
>
> 52:08 mod_jk.log: loadbalancer www.site.com 19.400742
>
> 53:06 Tomcat access log shows entry for post request stating it was handled
> by TP-Processor18 in 77440ms
>
>
> How is it possible that Tomcat has 2 threads handling the same request?
> How can we detect this as opposed to the user legitimately sending the same
> request twice?
>
> Thanks for your help!
> =Jeremy=
>


Single POST request being handled twice

2011-11-16 Thread Jeremy
OK, I know this seems crazy, but I've looked long and hard and cannot
explain this as other than a Tomcat bug.  I'm more than willing to dig up
extra information where I can, but this is a very rare problem seen in
production and not reproducible at will.  Also, because this is in
production, there is very little I can change, and even if I did change it,
it would take a month or more to verify that it is really a fix.  So what
I'm really looking for is a detailed understanding of the problem so we can
detect it when it happens and clean up the after-effects.

What we are seeing is one POST request being handled twice by our web app.
 It looks as though somehow the request is duplicated by Tomcat.  Here's
our setup:

Apache 2.2.14 using APR 1.3.8 and mod_jk 1.2.28 running on Ubuntu 10.04.3
as front end, load balancing.
Tomcat 6.0.29 using built-in AJP/1.3 connector running on Ubuntu 9.04 with
64-bit Sun Java 1.5.0_19.  Tomcat is not using APR.
Tomcat mod_jk connector has keepAliveTimeout=5000 and
connectionTimeout=6 (60 seconds).

Here is a simplified timeline of what we see in various logs.  I'm going to
leave some fields out and change others to protect confidentiality and make
it easier to follow, but I'll try to keep from changing anything important.
 Timestamps are just minutes and seconds.


51:48 POST request received by Apache over SSL.  We see the request logged
in Apache's access.log with this timestamp, but the log entry is actually
written at 52:08 (or seems to be, since that is the time stamp of the next
entry in the access.log, though the entries immediately before this one
have timestamps of 52:02).

51:48 log4j application logs show processing of the request by the
application in thread TP-Processor18.   Processing continues with updates
to database and external systems until:

51:55 last application log entry from TP-Processor18 shows normal
completion of processing, which should then return from handleRequest().
 The database and three external systems the app communicates with all show
normal processing.

51:55 log4j application log entries from TP-Processor6 show it beginning to
handle the request.  Processing continues normally, just as it did in
TP-Processor18 until:

52:08 last application log entry from TP-Processor6.  Again external
systems show normal processing.

52:08 Tomcat access log shows entry for post request stating it was handled
by TP-Processor6 in 7283ms

52:08 mod_jk.log: loadbalancer www.site.com 19.400742

53:06 Tomcat access log shows entry for post request stating it was handled
by TP-Processor18 in 77440ms


How is it possible that Tomcat has 2 threads handling the same request?
How can we detect this as opposed to the user legitimately sending the same
request twice?

Thanks for your help!
=Jeremy=