Hello Martín,

Thanks for the write-up. My comments are inline below.

On Thu, 21 Nov 2024 at 11:42, Martín Abente Lahaye
<[email protected]> wrote:
> Ultimately, what both of these are trying to determine is:
>
> * The full upstream URL to the source, e.g.,
> https://sourceware.org/git/glibc.git, to include it in the output
> manifest or the lorry configuration file.
> * The version of each source, e.g., 2.40, to include it in the output
> manifest.
> * Other information such as what the source is configured to track, to
> include it in the lorry configuration file.

I think this is pretty accurate. I would just add that they also try to
guess the version, either from the tarball name or, for a git
repository, from the git describe output (which they assume is the
format used by the ref).
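
To make the guessing concrete, here is a rough sketch of that kind of
heuristic. It is not the code from either tool: the function names and
both regular expressions are assumptions made up for illustration, and
the example URL is a placeholder.

```python
import re
from typing import Optional

# Hypothetical patterns, for illustration only (not taken from the plugins).
# "<name>-<version>.<archive extension>", e.g. "glibc-2.40.tar.xz"
_TARBALL_RE = re.compile(r"-(\d+(?:\.\d+)*)\.(?:tar\.\w+|tgz|zip)$")
# "git describe" style ref: "<tag>-<commits since tag>-g<abbreviated sha>"
_DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def version_from_tarball(url: str) -> Optional[str]:
    """Guess a version from the file name at the end of a tarball URL."""
    name = url.rsplit("/", 1)[-1]
    match = _TARBALL_RE.search(name)
    return match.group(1) if match else None


def version_from_describe(ref: str) -> Optional[str]:
    """Guess a version from a ref assumed to be in git describe format."""
    match = _DESCRIBE_RE.match(ref)
    if match is None:
        # Not in describe format (e.g. a bare commit sha): nothing to guess.
        return None
    version = re.search(r"\d+(?:\.\d+)*", match.group("tag"))
    return version.group(0) if version else None


print(version_from_tarball("https://example.org/releases/glibc-2.40.tar.xz"))
print(version_from_describe("glibc-2.40-12-g0123abc"))
```

Both calls print 2.40 here, but any name or ref that deviates from the
assumed patterns yields None, which is the unreliability being
discussed.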

> Therefore, instead of letting plugins and tools do all that unreliable
> guessing, we could provide what these ultimately need by adding new
> abstract methods that each Source can implement. For example, something
> like:
>
> * Source.get_urls(self) -> List[str]: Which would provide a list of full
> upstream URLs without any guessing or relying on accessing private
> attributes, for the caller.
> * Source.get_versions(self) -> List[str]: Similarly, for the versions,
> but the tricky piece with this would be the need for a regexp for each
> source, e.g., in case the version needs to be extracted from the Source
> URL.
> * Source.get_trackings(self) -> List[Optional[str]]: Similarly, for the
> tracking strings. Of course this would only make sense for sources that
> can actually be tracked.

As prior art, I'd add what was done in buildstream-plugins-community
(for the collect_manifest plugin, later extended for bst-to-lorry); see
[1].

Source.export_manifest(self) -> Dict[str, str]

This returns a more or less free-form dictionary, with keys defined by
the collect_manifest plugin for a few things:
* "type" for the type of the source (archive/git/patch)
* "url" or "path" for the provenance
* "commit" or "sha256" for the ref

As it's free-form, it can be extended to support other things, such as
the tracked branch, which is read by bst-to-lorry.
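
As a minimal sketch of what that looks like from the source side: the
two classes below are hypothetical stand-ins rather than the real
BuildStream Source classes, and the "track" key name is my assumption
for the tracked-branch extension. Only "type", "url", "commit" and
"sha256" are the keys listed above.

```python
from typing import Dict


class GitSourceSketch:
    """Hypothetical stand-in for a git source (not a real plugin class)."""

    def __init__(self, url: str, commit: str, track: str = "") -> None:
        self.url = url
        self.commit = commit
        self.track = track

    def export_manifest(self) -> Dict[str, str]:
        manifest = {"type": "git", "url": self.url, "commit": self.commit}
        if self.track:
            # Assumed key name for the free-form tracking extension.
            manifest["track"] = self.track
        return manifest


class ArchiveSourceSketch:
    """Hypothetical stand-in for a tarball source (not a real plugin class)."""

    def __init__(self, url: str, sha256: str) -> None:
        self.url = url
        self.sha256 = sha256

    def export_manifest(self) -> Dict[str, str]:
        return {"type": "archive", "url": self.url, "sha256": self.sha256}


# A consumer only needs the dictionaries, not the private attributes.
sources = [
    GitSourceSketch("https://sourceware.org/git/glibc.git", "0123abc", "master"),
    ArchiveSourceSketch("https://example.org/releases/foo-1.0.tar.xz", "deadbeef"),
]
for source in sources:
    print(source.export_manifest())
```

The commit, checksum and example.org values are placeholders. The point
is that the caller reads the provenance and the ref from a documented
dictionary instead of poking at each source's internals.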

> Or perhaps, something that better groups these tuples, e.g.,
> Source.get_actual_sources(self) -> List[Tuple[str, str, Optional[str]]],
> providing URL, version and tracking strings tuples or equivalent object.

This reminds me of the proposal to use the Remote Asset API [2] for
fetching the sources. If we end up going with an idea similar to this,
it should at least take into consideration a future use of Remote
Asset.

> An idea that was mentioned when discussing this topic with Abderrahim
> was to introduce to sources something similar to the elements public
> data, e.g., this way we could add that version matching regexp.

This would essentially be another way to have sources export free-form
data to other plugins. It's interesting in that we would reproduce the
same concept that we already have for elements, rather than inventing
something new.
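
To illustrate how a consumer might use such data: sources have no
public data today, so everything in this sketch is hypothetical,
including the "version-regex" key, the extract_version helper and the
example URL. It only shows a tool applying a regexp supplied by the
source instead of hard-coding its own guess.

```python
import re
from typing import Dict, Optional


def extract_version(url: str, public: Dict[str, str]) -> Optional[str]:
    """Apply a source-provided regexp (hypothetical "version-regex" key).

    The pattern is expected to contain exactly one capture group holding
    the version.
    """
    pattern = public.get("version-regex")
    if pattern is None:
        # The source did not declare how to find its version.
        return None
    match = re.search(pattern, url)
    return match.group(1) if match else None


# Hypothetical public data attached to a tarball source.
public_data = {"version-regex": r"glibc-(\d+\.\d+)"}
print(extract_version("https://example.org/releases/glibc-2.40.tar.xz", public_data))
```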

Cheers,

Abderrahim

[1] https://buildstream.gitlab.io/buildstream-plugins-community/elements/collect_manifest.html
[2] https://github.com/apache/buildstream/issues/1274
