Hi internals,

Since its version 6.62.0 [1], libcurl features a brand new URL API [2] that
can be used to parse and generate URLs, using libcurl’s own parser. One of
the goals of this API is to tighten a problematic vulnerable area for
applications where the URL parser library would believe one thing and
libcurl another. This could and has sometimes led to security problems [3].
The new API is handled base and composed of 5 new functions [4] (and one
more since 7.80.0 to get more verbose information upon error).

I started to work on an implementation [5] to expose this API to PHP
userland in the curl extension so that PHP users can benefit from it. My
first reflex was to stay consistent on how the extension is currently
working and do a one to one mapping of the new curl functions. I got
feedback from both Ilija and Christoph saying that the API is, I quote, "a
bit clunky" which I can't disagree with.

For a long time, ext/curl worked with handles as "curl resources" and
functions mapped to the libcurl functions, but since PHP8.0 "curl
resources" were converted to opaque classes with no methods. So my main
goal here is to come up with a solution for the new Curl URL API, but maybe
also to be consistent, make some changes to existing curl classes like
`CurlHandle` to add methods and improve the current state of the extension.

Here is the solution I would propose :
- First of course keep all the current ext/curl functions as is not to
break any compatibility (Seems obvious but just to be sure everybody is on
the same page).
- For consistency expose the new Curl URL API as functions mapped one to
one to libcurl functions :

function curl_url(?string $url = null): CurlUrl|false {}
function curl_url_set(CurlUrl $url, int $part, string $content, int $flags
= 0): bool {}
function curl_url_get(CurlUrl $url, int $part, int $flags = 0):
string|false {}
function curl_url_errno(CurlUrl $url): int {}
function curl_url_strerror(int $error_code): ?string {}

- Add methods to the CurlUrl object to make it less opaque and expose an
object oriented style API. I would keep it minimal and let userlanAd API
provide higher level APIs as guzzle for example. (You can see the current
implementation [5])

final class CurlUrl implements Stringable
{
    public function __construct(?string $url = null) {}
    public function get(int $part, int $flags = 0): string {}
    public function set(int $part, string $content, int $flags = 0): void {}
    public function getErrno(): int {}
    public function __toString(): string {}
    public function __clone() {}
}

- It would also be nice to add this object oriented API for existing
CurlHandle, CurlMultiHandle and CurlShareHandle classes. For example the
CurlHandle class would look like that (First implementation [6])

final class CurlHandle
{
    public function __construct(?string $url = null) {}
    public function setOpt(int $option, mixed $value): bool {}
    public function getInfo(?int $option = null): mixed {}
    public function exec(): string|bool {}
    public function escape(string $string): string|false {}
    public function unescape(string $string): string|false {}
    public function pause(int $flags): int {}
    public function getErrno(): int {}
    public function reset(): void;
    public function setOptArray(array $options): bool {}
    public function upkeep(): bool {}
    public function __clone() {}
}

As of right now I still have some unanswered questions like how should we
handle errors on the new CurlUrl API ?
- Throw `CurlUrlException` on both the procedural and object oriented style
API (that's how current implementation works [5])
- Throw `CurlUrlException` with the oriented style API, but return for
example boolean/null on errors when user uses the procedural API ?
- Always return null/booleans using both object oriented API and
procedural API ?

The same question applies if we add a new object oriented style API on
existing classes. Current functions MUST stay unchanged but should we throw
`CurlException` when the user uses the object oriented style API ? (That's
what I would do) Or should we return the same result from OO and procedural
API ? Should we make the new CurlApi consistent with that ?

What are your thoughts about all this ? Any feedback on any of those
subjects are more than welcome.

Pierrick

[1]
https://daniel.haxhttps://github.com/adoy/php-src/tree/curl-object-apix.se/blog/2018/10/31/curl-7-62-0-moar-stuff/
<https://daniel.haxx.se/blog/2018/10/31/curl-7-62-0-moar-stuff/>
[2] https://daniel.haxx.se/blog/2018/09/09/libcurl-gets-a-url-api/
[3]
https://www.blackhat.com/docs/us-17/thursday/us-17-Tsai-A-New-Era-Of-SSRF-Exploiting-URL-Parser-In-Trending-Programming-Languages.pdf
[4] https://github.com/curl/curl/blob/master/include/curl/urlapi.h
[5] https://github.com/php/php-src/pull/8770
[6] https://github.com/adoy/php-src/tree/curl-object-api

Reply via email to