Anurag,
On 10/15/23 04:48, Anurag Kumar wrote:
> Hi, we are experiencing intermittent 404 errors with both GET and POST
> calls. These errors are quite rare and have proven difficult to
> reproduce in our testing environment. However, on our production system,
> we encounter 3-4 cases daily out of 20-30 million requests where a 404
> error appears in the Tomcat access logs, and the corresponding call
> fails to reach the mapped servlet. Interestingly, the same calls work
> perfectly just a few milliseconds before and after on the same node.
> This inconsistency is causing significant issues, especially when
> critical API calls fail and are not automatically retried.
> Is there any open issue related to this problem that we should be aware of?
None that I know of personally.
Can you post your exact Tomcat version, your <Connector> configuration
with any secrets removed and a little more background on the type of
traffic you are seeing (e.g. HTTP/1.1 vs. h2, TLS or not, etc.)? Are you
able to tell if these failed requests are part of any kind of pipelined
requests (HTTP Keep-Alive) or h2 single channels?
Understanding the network topology may be relevant, though it's unlikely
that any load balancer or reverse proxy (lb/rp) is doing this, since you
can see the 404s in the logs on the Tomcat node. But it may change the
way the requests are being handled based upon the type of connection
between the lb/rp and Tomcat.
Have you double-checked that the URIs are clean and don't contain
anything unexpected such as lookalike characters, etc.? I suspect this
is not an issue since you said "critical API calls fail" which leads me
to understand that you have legitimate customers reporting these
failures, instead of just investigating unexpected entries in your log
files.
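As an illustration of the URI check described above, here is a minimal Python sketch. It assumes the request URIs have already been copied out of the access log as strings. The function name suspicious_chars is made up for this example, and "suspicious" here simply means any control or non-ASCII character once percent-escapes are decoded.

```python
import unicodedata
from urllib.parse import unquote


def suspicious_chars(uri):
    """Return (position, character, Unicode name) for every character in the
    percent-decoded URI that is a control character or non-ASCII."""
    # Decode %XX escapes so an encoded lookalike is inspected too.
    decoded = unquote(uri, errors="replace")
    findings = []
    for pos, ch in enumerate(decoded):
        # Printable ASCII (space through tilde) is treated as clean.
        if not (0x20 <= ord(ch) < 0x7F):
            findings.append((pos, ch, unicodedata.name(ch, "UNNAMED")))
    return findings
```

Calling suspicious_chars on a failed URI and on the same URI from a successful request should give the same (ideally empty) result. Any difference, such as a Cyrillic letter standing in for a Latin one, would explain why the servlet mapping did not match.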
Is your testing environment reasonably similar to production? What would
happen if you were to replay a whole day's worth of production requests
through your testing environment?
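A rough sketch of such a replay, in Python, might look like the following. It makes several assumptions: the access log uses Tomcat's "common" pattern (request line in double quotes followed by the status code), only GET requests are replayed because the access log does not record POST bodies, and the base URL passed in points at the test environment. The function names are hypothetical.

```python
import re
import urllib.error
import urllib.request

# Matches the quoted request line and the status code that follows it.
REQUEST_RE = re.compile(r'"([A-Z]+) (\S+) HTTP/[\d.]+" (\d{3})\b')


def replayable_requests(log_lines, base_url):
    """Yield (url, original_status) for each GET request found in the log."""
    for line in log_lines:
        m = REQUEST_RE.search(line)
        if m and m.group(1) == "GET":
            yield base_url.rstrip("/") + m.group(2), int(m.group(3))


def replay(log_lines, base_url, timeout=10):
    """Re-issue each GET and return (url, original, replayed) for every
    request whose status code differs from the one in the log."""
    mismatches = []
    for url, original in replayable_requests(log_lines, base_url):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                status = resp.status
        except urllib.error.HTTPError as e:
            # 4xx/5xx responses are raised as exceptions by urllib.
            status = e.code
        if status != original:
            mismatches.append((url, original, status))
    return mismatches
```

This replays requests one at a time, follows redirects, and sends no original headers or cookies, so it will not reproduce production concurrency or authentication. If the 404s depend on timing, a concurrent load tool fed with the same URL list is more likely to trigger them.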
Is there any pattern whatsoever in the failed requests? If you look at
every failed request for all time, are they randomly distributed
throughout your URI space, or do you find that some URIs are
over-represented in your failure data? You may have so few failures that
you can't draw any conclusions.
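One way to look for such a pattern is to count 404s per path alongside the total number of requests for that path. The Python sketch below assumes the access log uses Tomcat's "common" pattern. The function name failure_rates is invented for this example, and query strings are stripped so that requests differing only in parameters are grouped together.

```python
import re
from collections import Counter

# Captures the request target and the status code from a "common" log line.
LINE_RE = re.compile(r'"[A-Z]+ (\S+) HTTP/[\d.]+" (\d{3})\b')


def failure_rates(log_lines, status="404"):
    """Return {path: (failed_count, total_count)} for every path that
    produced at least one response with the given status."""
    total, failed = Counter(), Counter()
    for line in log_lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip lines that do not look like request entries
        path = m.group(1).split("?", 1)[0]
        total[path] += 1
        if m.group(2) == status:
            failed[path] += 1
    return {path: (failed[path], total[path]) for path in failed}
```

With only 3-4 failures a day, a single day's log will not say much. Feeding in several weeks of logs and comparing each path's share of the failures with its share of total traffic gives a better chance of spotting an over-represented URI.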
-chris