Hi all,

On Fri, 2024-02-23 at 18:33 +0900, Tristan van Berkom wrote:
> Hi all,
> 
> This is a follow up, more elaborate extension of the proposal[0] to add
> SourceMirror plugins raised some months ago.
> 

I now have the following PRs up:

  Implement SourceMirror plugins:
  https://github.com/apache/buildstream/pull/1890

  DownloadableFileSource to support "auth-header-format"
  https://github.com/apache/buildstream/pull/1895

Here are my concerns with the current proposed patches, it would be
great to have some reviews to help me figure out whether we are ready
to go through with, whether it should be blocked, or whether some
improvements can be made post-hoc.


First...

In Jürg's previous reply on the last proposal, he asks:

    Could custom source mirror plugins also support URLs that are not
    aliased?

    https://lists.apache.org/thread/y2hoobkxdh38wytcfhpgcy4c93r2hd6w

Unfortunately not with the current implementation.

This is however a very good idea to support, especially considering
people may want to mirror sources from sub-projects which may not have
aliased all of their sources.

Further, given the addition of the `extra_data` parameter in #1890
above, I feel like it would be a good idea to even allow the project to
*replace* it's own default SourceMirror implementation with a custom
one.

This would allow for projects to use custom authorization in the
default case, fetching and tracking from the default URLs.

I feel like this could be extended in a subsequent addition, and does
not necessarily block these patches, but I defer to reviewers if they
have anything to add here.


Second... I'm not sure I'm completely satisfied with the currently
implemented API in #1895 (nor am I completely satisfied with the mock
http server I used to create the test case here).

In upstream python, we can observe how "Basic Authentication" is done
with the AbstractBasicAuthHandler (and HTTPBasicAuthHandler which is
the concrete class):

  https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L909

This "handler" is passed to "requests.build_opener()":

  https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L539

And that is what we currently use to perform "basic auth".

There may be a purpose to all this extra code - if we were to be more
stringent here, we would probably want to implement our own
"HTTPBearerAuthHandler" in the DownloadableFileSource code, and have it
run more stringent checks, rather than simply adding a header that
says:

  "Authentication: Bearer <token>"

If we wanted to go that route, then we should have
DownloadableFileSource implement a different API, e.g.:

  extra_data = {
    "auth-type" : "bearer-auth"
  }

And in the case that DownloadableFileSource finds this setting, it
could do it's whole fancy HTTPBearerAuthHandler thing more in line with
what upstream python is doing.

I feel like, if there is a compelling purpose for us to go that extra
mile and understand and implement HTTPBearerAuthHandler, then it
probably makes sense to block this from landing until we do so.

Thoughts ?

Cheers,
    -Tristan


Reply via email to