Igal and Mark,

On 9/24/24 13:55, Igal Sapir wrote:
Mark,


On Mon, Sep 23, 2024 at 12:25 AM Mark Thomas <ma...@apache.org> wrote:

On 23/09/2024 04:28, Igal Sapir wrote:
Hello,

The current implementation of getRequestId() is optimized for speed and
generates IDs that are unique to a running instance of Tomcat.

But most server configurations nowadays require uniqueness across the
whole
system, and currently we do not offer that as:

1. Request IDs are only unique to a running Tomcat instance

2. Request IDs are reset to 0 each time Tomcat is restarted

3. Request IDs are sometimes generated by another system like a load
balancer or reverse proxy, and passed around via the HTTP header
"X-Request-Id"

I want to propose a patch that would:

1. Check for HTTP header "X-Request-Id" and if valid (e.g. does not
attempt
SQL or XSS injection etc.) returns it


That is behaviour we'd typically place in a Valve or Filter. Possibly an
extension to the RemoteIp[Valve|Filter] ?

Rather than us validate it, I'd make processing it optional and the
admins responsibility to ensure it is trusted if they opt to process it.


Yes, that makes sense.  I can add that part to the RemoteIp[Valve|Filter]
as long as we can add a Setter for requestId that can be called from the
filter/valve.

I could have sworn this feature already existed. Rainer added support for httpd-generated unique request identifiers to mod_jk a while back and maybe I just assumed that he also added the code in Tomcat to accept them and wire them into getRequestId.

2. Generates a URL-safe Base64-encoded UUID (22 CaSe sensitive
characters)

How expensive is that process compared to the existing mechanism?


I will run some benchmarks to find out.  While it would almost certainly be
more expensive than AtomicLong.incrementAndGet(), I would think that the
value that it adds can be acceptable up to some arbitrary threshold of
per-request overhead, e.g. 1ms is too much but 50us might be acceptable.

We could also make it lazy-init so that it is only processed when
getRequestId() is called the first time per Request (though I would want it
in the Logs so every request would trigger it in that case), and possibly
and opt-in and pluggable implementation according to the next point below.

We could also generate a "proper" UUID on startup, discard a portion of it, and then increment a counter on each request. OR maybe not even bother discarding anything. Just a huge integer value that we begin randomly and then increment from there.

I'd like to see the performance numbers, of course.

I like the idea of lazy work, here. Since this is a new feature in the spec, probably 0% of applications are using it right now. Why bother wasting CPU cycles generating request identifiers for applications that aren't even using them?

The value will be set to the requestId private variable to ensure
consistent return value for multiple calls on the same Request.

I have the code ready, but wanted to discuss the matter here first.

The Servlet spec requires only that the ID is unique for the lifetime of
the container.

How will this interact with ServletRequest.getRequestId() and the
associated methods?


My idea is that ServletRequest.getRequestId() would still delegate calls to
Coyote's getRequestId(), so the new implementation would be used for it.



Should we make the request ID generator a pluggable component? If so, of
what?


That would be great as it would allow us to keep the current behavior as
default, for example, make this enhancement opt-in, and allow for future
implementations e.g. UUIDv7 which allows for natural order sorting.

I like plugability, and then we can offer at least two implementations: the naive one we have now which guarantees uniqueness within a single JVM and another one which guarantees universal uniqueness[1] but likely has a bit of a performance penalty.

Which component is a great question.  On one hand I'm thinking that Coyote
is part of the Connectors, but on the other hand we might want to allow
admins to configure different behavior for different Hosts or Contexts?

I personally feel that running multiple Hosts or Contexts in a single
Tomcat deployment is something that was more valuable in the past when
compute resources were much more expensive than they are today.  Nowadays
it is easy to deploy Tomcat in a container like Docker and map the hosts
and ports as needed, and I rarely find myself needing to have different
configurations for different Hosts or Contexts.

Speaking of hosts... since an engine probably already has a jvmRoute defined on it, maybe we could use that to provide cross-cluster uniqueness. You could have jvmRoute + random seed + counter = request id and that should be unique across your cluster.

-chris

[1] Of course, UUIDs are not actually guaranteed to be unique. Just really really REALLY unlikely to experience colissions.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to