Hi all, On Fri, 2024-02-23 at 18:33 +0900, Tristan van Berkom wrote: > Hi all, > > This is a follow up, more elaborate extension of the proposal[0] to add > SourceMirror plugins raised some months ago. >
I now have the following PRs up: Implement SourceMirror plugins: https://github.com/apache/buildstream/pull/1890 DownloadableFileSource to support "auth-header-format" https://github.com/apache/buildstream/pull/1895 Here are my concerns with the current proposed patches, it would be great to have some reviews to help me figure out whether we are ready to go through with, whether it should be blocked, or whether some improvements can be made post-hoc. First... In Jürg's previous reply on the last proposal, he asks: Could custom source mirror plugins also support URLs that are not aliased? https://lists.apache.org/thread/y2hoobkxdh38wytcfhpgcy4c93r2hd6w Unfortunately not with the current implementation. This is however a very good idea to support, especially considering people may want to mirror sources from sub-projects which may not have aliased all of their sources. Further, given the addition of the `extra_data` parameter in #1890 above, I feel like it would be a good idea to even allow the project to *replace* it's own default SourceMirror implementation with a custom one. This would allow for projects to use custom authorization in the default case, fetching and tracking from the default URLs. I feel like this could be extended in a subsequent addition, and does not necessarily block these patches, but I defer to reviewers if they have anything to add here. Second... I'm not sure I'm completely satisfied with the currently implemented API in #1895 (nor am I completely satisfied with the mock http server I used to create the test case here). In upstream python, we can observe how "Basic Authentication" is done with the AbstractBasicAuthHandler (and HTTPBasicAuthHandler which is the concrete class): https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L909 This "handler" is passed to "requests.build_opener()": https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L539 And that is what we currently use to perform "basic auth". There may be a purpose to all this extra code - if we were to be more stringent here, we would probably want to implement our own "HTTPBearerAuthHandler" in the DownloadableFileSource code, and have it run more stringent checks, rather than simply adding a header that says: "Authentication: Bearer <token>" If we wanted to go that route, then we should have DownloadableFileSource implement a different API, e.g.: extra_data = { "auth-type" : "bearer-auth" } And in the case that DownloadableFileSource finds this setting, it could do it's whole fancy HTTPBearerAuthHandler thing more in line with what upstream python is doing. I feel like, if there is a compelling purpose for us to go that extra mile and understand and implement HTTPBearerAuthHandler, then it probably makes sense to block this from landing until we do so. Thoughts ? Cheers, -Tristan
