How are errors handled? For example, what happens if the format of php://input is unrecognized, i.e. neither valid multipart/form-data nor valid application/x-www-form-urlencoded? Does the function raise errors, throw exceptions, or do nothing?
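For comparison, here is a minimal sketch of the userland route the quoted message mentions for application/x-www-form-urlencoded bodies. The helper name parse_urlencoded_body() is hypothetical and not part of the proof-of-concept; only parse_str() is an existing PHP function. As far as I know, parse_str() returns nothing and does not throw on malformed input, so the caller gets no error signal, which is the behavior the question above is about.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (an assumption for illustration, not from the PoC):
 * parse an application/x-www-form-urlencoded body into an array, roughly
 * the way $_POST is populated for a POST request.
 */
function parse_urlencoded_body(string $body): array
{
    $data = [];
    // parse_str() returns void, so malformed input is not reported
    // to the caller through a return value.
    parse_str($body, $data);
    return $data;
}
```

In a PUT handler this could be called as parse_urlencoded_body(file_get_contents('php://input')) after checking $_SERVER['CONTENT_TYPE']. The multipart/form-data case has no equivalent built-in, which is what the proposal addresses.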
On Tue, 20 Jun 2023 at 11:26, Ilija Tovilo <tovilo.il...@gmail.com> wrote:
>
> Hi internals
>
> A while ago I encountered a limitation of how RFC1867 requests are
> handled in PHP. PHP populates the $_POST and $_FILES superglobals when
> the Content-Type is multipart/form-data or
> application/x-www-form-urlencoded, but only when the method is POST.
> For application/x-www-form-urlencoded PUT requests this is not a
> problem because the format is simple, usually limited in size and PHP
> offers functions to parse it, namely parse_str and parse_url. For
> RFC1867 it's a different story.
>
> The code handling the request will need to use streams because RFC1867
> is often used with files, the format is much more complicated, files
> should be cleaned up when the request ends if unused, etc. Handling
> this manually is non-trivial. This has been reported many years ago,
> and evidently caused a bit of frustration.
> https://bugs.php.net/bug.php?id=55815
>
> This is not limited to PUT either, multipart/form-data bodies are
> valid with other requests. Here's the approach I believe is best.
>
> Introduce a new function (currently named populate_post_data()) to
> read the input stream and populate the $_POST and $_FILES
> superglobals. The function works for any non-POST requests. It assumes
> that none of the input stream has been consumed, and that the
> Content-Type is set accordingly. A nice side-effect of this approach
> is that it may be used with the enable_post_data_reading ini setting
> to decide whether to parse the RFC1867 bodies dynamically. For
> example, a specific endpoint may accept bigger requests. The function
> may be implemented in a more generic way 1. by returning the
> data/files arrays instead of populating the superglobals and 2. by
> providing an input stream manually. I don't know if there's such a
> use-case and thus if this is worthwhile, as it would require bigger
> changes in the RFC1867 handling.
>
> Here's the proof-of-concept implementation:
> https://github.com/php/php-src/pull/11472
>
> For completeness, here are other options I considered.
>
> 1. Create a new $_PUT superglobal that is always populated. Two
> issues: The obvious one is that this is limited to PUT requests. While
> we could also introduce $_PATCH, this seems like a poor solution.
> While discouraged, other methods can also contain bodies. Another
> issue is that the code for processing RFC1867 consumes the input
> stream. This constitutes a BC break. Buffering the input is not
> feasible for large requests that would be expected here.
> 2. The same as option 1, but populate the existing $_POST global. This
> comes with the same BC break.
> 3. The same as options 1 or 2 with an additional ini setting to opt
> into the behavior. The issue with this approach is that both the old
> and new behavior might be desired in different parts of the same
> application. The ini option can't be changed at runtime because the
> populating of the superglobals happens before user code is being
> executed.
>
> Let me know what your thoughts are. If there is consensus in the
> feedback I'll update the implementation accordingly and post an update
> to the list. If there is no consensus, I will create an RFC.
>
> Ilija
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: https://www.php.net/unsub.php