outlandishlizard commented on PR #619:
URL:
https://github.com/apache/httpcomponents-client/pull/619#issuecomment-2727681863
> > who may be several wrappers downstream, will have any idea that they
> > need to escape these values-- it's not a normal application security
> > consideration at all.
>
> This is a responsibility of people who wrap the library and enforce a
> security model or provide a UI, not that of the library. Again, do not pin it
> on us.

I have read what you said several times. The above quote seems to me to say
that you believe it is the responsibility of your users to escape the boundary
values, piercing the abstraction layer of "send a multipart message please" and
requiring a full scan of the request body in order to ensure it complies with
the specific quirks of multipart encoding this PR implements.
It seems to me that this is both onerous for the user and substantially less
performant than using a random boundary value from a secure random source
(there will certainly be a breakpoint between the constant-time and linear-time
costs: for small enough multipart requests the linear scan probably wins!).
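
To make the cost of the "user scans the body" approach concrete, here is a minimal sketch of the check a caller would have to run on every request body. The class and method names are hypothetical and not part of HttpClient; it conservatively searches for "--" followed by the boundary anywhere in the body, and its running time grows with the body length (times the boundary length in the worst case).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an HttpClient API: a naive scan of a request body
// for the multipart delimiter ("--" followed by the boundary value).
final class BoundaryScanSketch {

    /** Returns true if "--" + boundary occurs anywhere in body. */
    static boolean containsBoundary(byte[] body, String boundary) {
        byte[] needle = ("--" + boundary).getBytes(StandardCharsets.US_ASCII);
        outer:
        for (int i = 0; i + needle.length <= body.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (body[i + j] != needle[j]) {
                    continue outer; // mismatch, try the next start position
                }
            }
            return true; // every byte of the delimiter matched at offset i
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] body = "field=value\r\n--fixed-boundary\r\nmore".getBytes(StandardCharsets.US_ASCII);
        System.out.println(containsBoundary(body, "fixed-boundary"));
        System.out.println(containsBoundary(body, "other-boundary"));
    }
}
```

Every caller that builds a body from untrusted input would need a check like this (or an escaping step) before handing the body to the library.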
With regard to your comments about a "false sense of security":
The entire point of using a random value is to make the chance of a
collision of message content and boundary content negligible; the sense of
security provided is in no way false. The spec allows for up to 70 characters
of boundary; these characters are selected from a set of 75 potential options,
meaning that we have a total of:
75^70 =
179592599813797960985749775445106096943740867154224363601035145662298829808338299928818803698205019969691420556046068668365478515625
potential valid boundary delimiters, making the chance of a robust random
scheme producing a collision negligible, in the cryptographic sense of the word.
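
As a rough illustration of the arithmetic above and of what a random scheme could look like, here is a minimal sketch. It is not the code in this PR: the class name, method names, and the 64-character alphabet (letters, digits, "-" and "_", chosen here as a conservative subset of the allowed boundary characters) are all assumptions made for the example.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

// Hypothetical sketch, not the PR's implementation.
final class RandomBoundarySketch {

    // Assumption: a conservative 64-character subset of the characters
    // permitted in a boundary (digits, ASCII letters, '-' and '_').
    static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_";

    private static final SecureRandom RANDOM = new SecureRandom();

    /** Draws each boundary character uniformly from ALPHABET. */
    static String randomBoundary(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    /** Number of distinct boundaries: alphabetSize raised to length. */
    static BigInteger boundarySpace(int alphabetSize, int length) {
        return BigInteger.valueOf(alphabetSize).pow(length);
    }

    public static void main(String[] args) {
        System.out.println(randomBoundary(70));
        // 75^70, the figure quoted above, printed exactly.
        System.out.println(boundarySpace(75, 70));
        // The same count for the smaller alphabet assumed in this sketch.
        System.out.println(boundarySpace(ALPHABET.length(), 70));
    }
}
```

Even with the smaller alphabet the space is 64^70 = 2^420, so restricting the character set for safety costs very little.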
So, we are left with three options for providing users a robust path forward:
1. Add a requirement that users of this library perform a linear scan on
their message bodies for a library-specific boundary identifier, in exchange
for a potentially small performance boost for the library itself. Users who fail
to implement a linear scan will encounter some messages that are reliably
mangled by the library.
2. Use a random implementation, with a potentially very small performance hit.
Assuming a trillion requests per second are processed by the library around the
world, the sun will have exploded before a collision occurs.
3. Entirely abdicate the responsibility for boundary generation as a
library, and require the boundary be supplied by the user. This is logically
consistent with the stance I'm seeing from you that the user should be
responsible for boundary encoding, although I personally think it's not a very
nice user experience.
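
The "trillion requests per second" estimate in option 2 can be sanity-checked with a back-of-the-envelope calculation. The sketch below is only an estimate under stated assumptions: boundaries are drawn uniformly from the full 75^70 space, message content is independent of the boundary, and each request offers about one million positions (roughly a 1 MB body) at which a match could start. The class name, method name, and the 1 MB figure are invented for the example.

```java
// Hypothetical back-of-the-envelope estimate, not a rigorous analysis.
final class CollisionEstimateSketch {

    /**
     * Rough expected number of years until a randomly chosen boundary also
     * appears in a message body by chance, using a union bound: each start
     * position in each request matches with probability at most 1 / space.
     */
    static double expectedYearsToCollision(double space,
                                           double requestsPerSecond,
                                           double positionsPerRequest) {
        double collisionsPerSecond = requestsPerSecond * positionsPerRequest / space;
        double secondsPerYear = 365.25 * 24 * 3600;
        return 1.0 / collisionsPerSecond / secondsPerYear;
    }

    public static void main(String[] args) {
        double space = Math.pow(75, 70);   // about 1.8e131, fits in a double
        double years = expectedYearsToCollision(space, 1e12, 1e6);
        System.out.printf("expected years to first collision: %.2e%n", years);
    }
}
```

Under these assumptions the expected wait is well over 10^100 years, which is many orders of magnitude longer than the few billion years usually quoted for the remaining lifetime of the Sun.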