abderrahim commented on code in PR #1997:
URL: https://github.com/apache/buildstream/pull/1997#discussion_r2037693646
##########
src/buildstream/plugins/sources/local.py:
##########
@@ -110,6 +110,9 @@ def init_workspace_directory(self, directory):
#
self.__do_stage(directory)
+ def collect_source_info(self):
+ return [SourceInfo(self.path, SourceInfoMedium.LOCAL,
SourceVersionType.DIGEST, self.get_unique_key())]
Review Comment:
`get_unique_key()` only uses the hash of the digest, which isn't enough to
get the content from CAS.
Buildbox uses the format `{hash}/{size}`, which is what we're planning to
use in #1994 and #1991. We should use it here too. This would allow downloading
the source from CAS using e.g. `casdownload` from buildbox
##########
src/buildstream/downloadablefilesource.py:
##########
@@ -25,6 +25,46 @@
implementation.
+Built-in functionality
+----------------------
+The DownloadableFileSource class provides built in keys which can be set when
+intantiating any Source which derives from DownloadableFileSource
+
+* Guess pattern
+
+ This ``version-guess-pattern`` sets the regular expression which will be
used to attempt
+ to guess the version of a source when parsing the source's URI.
+
+ The DownloadableFileSource provides a default implementation of
+ :func:`Source.collect_source_info()
<buildstream.source.Source.collect_source_info>`,
+ which will use the ``version-guess-pattern`` to attempt to extract a human
readable
+ version string from the specified URI, in order to fill out the reported
+ :attr:`~buildstream.source.SourceInfo.version_guess`.
+
+ The URI will be *searched* using this regular expression, and is allowed to
+ yield a number of *groups*. For example the value ``(\\d+)_(\\d+)_(\\d+)``
would
+ report 3 *groups* if 3 numerical values separated by underscores were found
in
+ the URI.
+
+ The default value for ``version-guess-pattern`` is
``\\d+\\.\\d+(?:\\.\\d+)?``.
+
+ **Since: 2.5**.
+
+Built-in functionality
Review Comment:
I think this section doesn't belong here (or at least it should be merged
with the previous, similarly named one)
##########
src/buildstream/plugins/sources/workspace.py:
##########
@@ -108,6 +108,9 @@ def stage_directory(self, directory):
with self._cache_directory(digest=self.__digest) as cached_directory:
directory._import_files_internal(cached_directory,
collect_result=False)
+ def collect_source_info(self):
+ return [SourceInfo(self.path, SourceInfoMedium.WORKSPACE,
SourceVersionType.DIGEST, self.get_unique_key())]
Review Comment:
same comment as local
##########
src/buildstream/downloadablefilesource.py:
##########
@@ -270,6 +318,21 @@ def fetch(self): # pylint: disable=arguments-differ
"File downloaded from {} has sha256sum '{}', not
'{}'!".format(self.url, sha256, self.ref)
)
+ def collect_source_info(self):
+ version_match = self._guess_pattern.search(self.original_url)
+ if not version_match:
+ version_guess = None
+ elif self._guess_pattern.groups == 0:
+ version_guess = version_match.group(0)
+ else:
+ version_guess = ".".join(version_match.groups())
Review Comment:
Maybe we could have this version guessing logic in a function in utils?
##########
src/buildstream/source.py:
##########
@@ -262,10 +261,147 @@ def __init__(
@dataclass
class AliasSubstitution:
+ """AliasSubstitution()
+ An opaque data structure which may be passed through
+ :func:`SourceFetcher.fetch() <buildstream.source.SourceFetcher.fetch>` and
in such cases
+ must be provided to :func:`Source.translate_url()
<buildstream.source.Source.translate_url>`.
+ """
+
_effective_alias: str
_mirror: Union[SourceMirror, str]
+class SourceInfoMedium(FastEnum):
+ """
+ Indicates the medium in which the source is obtained
+
+ *Since: 2.5*
+ """
+
+ WORKSPACE = "workspace"
+ """
+ Files in an open workspace
+ """
+
+ LOCAL = "local"
+ """
+ Files stored locally in the project
+ """
+
+ REMOTE_FILE = "remote-file"
+ """
+ A remote file
+ """
+
+ GIT = "git"
+ """
+ A git repository
+ """
+
+
+class SourceVersionType(FastEnum):
+ """
+ Indicates the type of the version string
+
+ *Since: 2.5*
+ """
+
+ COMMIT = "commit"
+ """
+ A commit string which accurately represents a version in a source
+ code repository or VCS
+ """
+
+ SHA256 = "sha256"
+ """
+ An sha256 checksum
+ """
+
+ DIGEST = "digest"
+ """
+ A CAS digest representing the unique version of this source input
Review Comment:
We could be more explicit about the `{hash}/{size}` format?
##########
src/buildstream/source.py:
##########
@@ -684,6 +821,19 @@ def is_cached(self) -> bool:
"""
raise ImplError("Source plugin '{}' does not implement
is_cached()".format(self.get_kind()))
+ def collect_source_info(self) -> Iterable[SourceInfo]:
+ """Get the :class:`.SourceInfo` objects describing this source
+
+ This method is guaranteed to only be called whenever
+ :func:`Source.is_resolved() <buildstream.source.Source.is_resolved>`
+ returns `True`.
+
+ Returns: the :class:`.SourceInfo` objects describing this source
+
+ *Since: 2.5*
+ """
+ raise ImplError("Source plugin '{}' does not implement
collect_source_info()".format(self.get_kind()))
Review Comment:
Could we have this collect source info from source fetchers by default? Any
plugin that uses source fetchers will definitely want to implement source info
in source fetchers.
Of course, this is more for standardisation than anything, one can always
implement it like you did in
https://github.com/apache/buildstream-plugins/pull/87
##########
src/buildstream/_frontend/widget.py:
##########
@@ -437,6 +437,23 @@ def show_pipeline(self, dependencies, format_):
runtime_deps = [e._get_full_name() for e in
element._dependencies(_Scope.RUN, recurse=False)]
line = p.fmt_subst(line, "runtime-deps",
_yaml.roundtrip_dump_string(runtime_deps).rstrip("\n"))
+ # Source Information
+ if "%{source-info" in format_:
+
+ # Get all the SourceInfo objects
+ #
+ all_source_infos = []
+ for source in element.sources():
Review Comment:
A was playing with this a little, and came up with this:
```suggestion
source_element = element._get_source_element()
for source in source_element.sources():
```
given `Element._get_source_element()` gives the element that has the sources
for this element, it really made sense. It gives a nice solution to the issue
of actual sources of filter elements not being accessible.
I'm thinking we can push it further and move this whole loop into an
`Element.collect_source_info()` method. This way, something like the
`collect_manifest` plugin wouldn't need to call `Element.sources()` at all (and
we can deprecate it). It would be able to call `_get_source_element()` as
needed, and users won't have to worry about it.
Moving forward with this idea, we could even have the `compose` plugin
collect sources from all build dependencies that contribute files to the final
image. Though this might need some more thinking.
##########
src/buildstream/source.py:
##########
@@ -262,10 +261,147 @@ def __init__(
@dataclass
class AliasSubstitution:
+ """AliasSubstitution()
+ An opaque data structure which may be passed through
+ :func:`SourceFetcher.fetch() <buildstream.source.SourceFetcher.fetch>` and
in such cases
+ must be provided to :func:`Source.translate_url()
<buildstream.source.Source.translate_url>`.
+ """
+
_effective_alias: str
_mirror: Union[SourceMirror, str]
+class SourceInfoMedium(FastEnum):
+ """
+ Indicates the medium in which the source is obtained
+
+ *Since: 2.5*
+ """
+
+ WORKSPACE = "workspace"
+ """
+ Files in an open workspace
+ """
+
+ LOCAL = "local"
+ """
+ Files stored locally in the project
+ """
+
+ REMOTE_FILE = "remote-file"
+ """
+ A remote file
+ """
+
+ GIT = "git"
+ """
+ A git repository
+ """
+
+
+class SourceVersionType(FastEnum):
+ """
+ Indicates the type of the version string
+
+ *Since: 2.5*
+ """
+
+ COMMIT = "commit"
+ """
+ A commit string which accurately represents a version in a source
+ code repository or VCS
Review Comment:
Should we comment something about git specifically? In particular, our
plugins can use either the commit sha1, or a the long version of the git
describe.
I think the current description fits both, which I think is a good thing.
But maybe it's better to give an example.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]