Hello everyone,

Currently, some of our community plugins, such as collect_manifest [1], and tools, such as bst-to-lorry [2], rely on a combination of private Python APIs and assumptions based on the reported “kind” of the Source.

This can be problematic as both collect_manifest and bst-to-lorry query sources for their "kind" and assume that certain attributes and methods will be present (e.g., source.url). In fact, this has been discussed before at least once [3].

Although this seems to work, it’s unreliable because even if the “kind” string matches, there’s no guarantee that it is the expected Source, as it can be a different Source with the same “kind” string. Plus, even if it really is the expected Source, a future refactor could break these assumptions as these aren’t public APIs.
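To make the fragility concrete, here is a minimal sketch of the pattern. The classes are hypothetical stand-ins rather than real BuildStream plugins, and guess_url is an illustrative name, not code taken from collect_manifest or bst-to-lorry:

```python
class GitSource:
    """Hypothetical source that happens to keep its URL in `url`."""

    def __init__(self, url):
        self.url = url

    def get_kind(self):
        return "git"


class ThirdPartyGitSource:
    """Hypothetical, unrelated plugin that also reports kind "git"."""

    def __init__(self, repository):
        # Same kind string, but the URL lives under a different attribute.
        self.repository = repository

    def get_kind(self):
        return "git"


def guess_url(source):
    # Kind-based guess: assumes every "git" source exposes `url`.
    if source.get_kind() == "git":
        return source.url
    return None
```

The guess works for the first class and raises AttributeError for the second, even though both report the same “kind”.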

Ultimately, what both of these are trying to determine is:

* The full upstream URL to the source, e.g., https://sourceware.org/git/glibc.git, to include it in the output manifest or the lorry configuration file.
* The version of each source, e.g., 2.40, to include it in the output manifest.
* Other information such as what the source is configured to track, to include it in the lorry configuration file.

Therefore, instead of letting plugins and tools do all that unreliable guessing, we could provide what these ultimately need by adding new abstract methods that each Source can implement. For example, something like:

* Source.get_urls(self) -> List[str]: Would provide the caller with a list of full upstream URLs, without any guessing or access to private attributes.
* Source.get_versions(self) -> List[str]: Similarly, for the versions. The tricky piece here would be the need for a regexp for each source, e.g., in case the version needs to be extracted from the Source URL.
* Source.get_trackings(self) -> List[Optional[str]]: Similarly, for the tracking strings. Of course, this would only make sense for sources that can actually be tracked.
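As a rough sketch of how these three methods could look, assuming they are plain abstract methods: SourceInfo, ExampleGitSource, manifest_entries and the "master" tracking value below are hypothetical and only illustrate the shape, they are not existing BuildStream code.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple


class SourceInfo(ABC):
    """Hypothetical stand-in for the methods a Source would implement."""

    @abstractmethod
    def get_urls(self) -> List[str]:
        """Full upstream URLs."""

    @abstractmethod
    def get_versions(self) -> List[str]:
        """One version string per URL."""

    @abstractmethod
    def get_trackings(self) -> List[Optional[str]]:
        """One tracking string per URL, or None if not trackable."""


class ExampleGitSource(SourceInfo):
    """Hypothetical source wrapping a single repository."""

    def __init__(self, url: str, version: str, track: Optional[str]):
        self._url = url
        self._version = version
        self._track = track

    def get_urls(self) -> List[str]:
        return [self._url]

    def get_versions(self) -> List[str]:
        return [self._version]

    def get_trackings(self) -> List[Optional[str]]:
        return [self._track]


def manifest_entries(source: SourceInfo) -> List[Tuple[str, str, Optional[str]]]:
    # The caller only uses the public methods; no kind checks, no private attributes.
    return list(zip(source.get_urls(), source.get_versions(), source.get_trackings()))


src = ExampleGitSource("https://sourceware.org/git/glibc.git", "2.40", "master")
entries = manifest_entries(src)
```

Note that with three parallel lists the caller has to trust that they have the same length and ordering.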

Or perhaps something that groups these values together, e.g., Source.get_actual_sources(self) -> List[Tuple[str, str, Optional[str]]], providing tuples of URL, version and tracking string, or an equivalent object.
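A minimal sketch of the grouped variant, using a NamedTuple as the "equivalent object" so callers can still unpack it like a plain tuple. ActualSource, ExampleGitSource and the "master" tracking value are hypothetical names and values chosen for illustration:

```python
from typing import List, NamedTuple, Optional


class ActualSource(NamedTuple):
    """Hypothetical record grouping the three values for one upstream."""

    url: str
    version: str
    tracking: Optional[str]  # None for sources that cannot be tracked


class ExampleGitSource:
    """Hypothetical source exposing the grouped API."""

    def get_actual_sources(self) -> List[ActualSource]:
        return [
            ActualSource(
                url="https://sourceware.org/git/glibc.git",
                version="2.40",
                tracking="master",
            )
        ]


# A NamedTuple unpacks like a regular (url, version, tracking) tuple.
for url, version, tracking in ExampleGitSource().get_actual_sources():
    line = f"{url} {version} {tracking}"
```

Grouping the values per upstream avoids having to keep three separate lists aligned.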

A key question here is whether something like the above would still be too rigid or over-specified for that plugin and tool, and perhaps we should be thinking of a more free-form API to query for these.

An idea that was mentioned when discussing this topic with Abderrahim was to introduce something similar to the elements' public data for sources; that way we could, e.g., add the version-matching regexp there.
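For illustration, the version-matching regexp could be applied roughly as follows. This is only a sketch: extract_version, the tarball URL and the pattern are assumptions made for the example, and where the pattern would actually be configured is exactly the open question above.

```python
import re
from typing import Optional


def extract_version(url: str, pattern: str) -> Optional[str]:
    """Return the first capture group of `pattern` found in `url`, or None."""
    match = re.search(pattern, url)
    if match is None:
        return None
    return match.group(1)


# Illustrative values only; a real source would supply its own URL and pattern.
example_url = "https://ftp.gnu.org/gnu/glibc/glibc-2.40.tar.xz"
example_pattern = r"glibc-(\d+(?:\.\d+)*)\.tar"

version = extract_version(example_url, example_pattern)
```

With these example values, version ends up as "2.40"; a URL that does not match the pattern yields None.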

What do you all think?

Regards,
Martín.

Refs:
[1] https://gitlab.com/BuildStream/buildstream-plugins-community/-/blob/22023ad60e91ff3f635c556ed4c32ce4dfd7c2b5/src/buildstream_plugins_community/elements/collect_manifest.py
[2] https://gitlab.com/CodethinkLabs/lorry/bst-to-lorry/-/blob/d6d3782071502c56611ceffa574d2f81e2a1eedd/bst_to_lorry.py
[3] https://gitlab.com/BuildStream/buildstream-plugins-community/-/issues/2
