gtristan commented on code in PR #1991:
URL: https://github.com/apache/buildstream/pull/1991#discussion_r2158310157
##########
src/buildstream/buildelement.py:
##########
@@ -285,10 +297,19 @@ def configure_sandbox(self, sandbox):
command_dir = build_root
sandbox.set_work_directory(command_dir)
+ def stage(self, sandbox):
# Setup environment
- sandbox.set_environment(self.get_environment())
+ env = self.get_environment()
- def stage(self, sandbox):
+ for digest_variable, element_list in self.__digest_environment.items():
+ dummy_sandbox = SandboxDummy(
+ self._get_context(), self._get_project(), plugin=self,
stdout=None, stderr=None, config={}
+ )
+ self.stage_dependency_artifacts(dummy_sandbox, element_list)
Review Comment:
@abderrahim writes:
> > In the case that we are staging elements in element_list which don't
depend on eachother, I don't believe that Element.stage_dependency_artifacts()
will do any sorting.
>
> I find that statement weird given that stage_dependency_artifact() is
advertised to do just that.
Right, this text could be clarified further, this function says:
> *"This is primarily a convenience wrapper around
`Element.stage_artifact()` which takes care of staging all the dependencies in
staging order and issueing the appropriate warnings."*
Which is not false, it does stage *all of the specified element* and it does
so *in staging order*, however if you look at what the code does, it does not
do any toplevel sorting of the input element sequence.
> I would also argue that if this is true, then it is orthogonal to the
changes in this pull request. There is no difference between:
>
> ```yaml
> filename:
> - a.bst
> - b.bst
> - c.bst
> config:
> location: /sysroot
> ```
> and
>
> ```yaml
> filename:
> - a.bst
> - b.bst
> - c.bst
> config:
> digest-environment: DEPENDENCIES_CAS_DIGEST
> ```
> in the current code, except for the fact that the former stages in a
subdirectory of the build sandbox and the latter stages in a dummy sandbox.
I think you are missing my meaning with your example.
So here is my hypothesis; given your `a`, `b`, `c` example, let us assume
that the toplevel element `e` in question depends on `b` and `c` which may also
depend on `a`, where `b` and `c` install the same file.
e.g.:
```
e
/ \
b c <--- both "b" and "c" install /etc/abc.conf
\ /
a
```
What I am saying, is that with this implementation, the value of
`DEPENDENCIES_CAS_DIGEST` may differ in the following two configurations.
### Configuration 1
Here we stage element `e` deterministically into the sandbox, and stage
`c`'s version of `/etc/abc.conf` over `b`'s version of `/etc/abc.conf` in an
orthogonal, extra stage operation which we use to determine the cas digest
`DEPENDENCIES_CAS_DIGEST`.
```yaml
filename:
- a.bst
- b.bst
config:
digest-environment: DEPENDENCIES_CAS_DIGEST
- c.bst
config:
digest-environment: DEPENDENCIES_CAS_DIGEST
```
### Configuration 2
Here we do the same as configuration 1, except that `b`'s version of
`/etc/abc.conf` is staged on top of `c`'s version of the same file, resulting
in a *different digest*.
```yaml
filename:
- a.bst
- c.bst
config:
digest-environment: DEPENDENCIES_CAS_DIGEST
- b.bst
config:
digest-environment: DEPENDENCIES_CAS_DIGEST
```
### A closer look
The body of `Element.stage_dependency_artifacts()` is doing the following:
```python
with self._overlap_collector.session(action, path):
for dep in self.dependencies(selection):
dep._stage_artifact(sandbox, path=path, include=include,
exclude=exclude, orphans=orphans, owner=self)
```
And, looking at `Element.dependencies()`, we can quickly see that when the
`selection` argument is given to select *"Subsets of the dependency graph"*,
the `self` element is not considered at all in the algorithm.
This means that the overall deterministic staging order *of element `e`* is
not considered when looping over the dependencies of the `selection` argument
given to `Element.stage_dependency_artifacts()`.
Arguably this algorithm in `Element.stage_dependency_artifacts()` needs to
be made more deterministic in this regard, although I wouldn't necessarily say
that the docs are lying here - they are yielded *in staging order* relative to
*eachother*, but they are not yielded in *the staging order of the given
element*.
And so I come back to your original argument:
> I would also argue that if this is true, then it is orthogonal to the
changes in this pull request.
I have a point of contention here.
I have not yet looked at whether the `location` dependency configuration
suffers from this issue as well, however, if we have this issue with the
`location` dependency configuration, it needs to be fixed, and I also do not
want to go as far as landing *this change* in the knowledge that this is indeed
an issue.
### Prove me wrong
There may be other ways that I am wrong about this bug, but I can think of
one possibility:
It is possible that in the `Element` loading/instantiation process, that the
dependency configurations are exposed to the plugin *in staging order* rather
than in the order in which they are listed in the element declaration (`.bst`
file)... if this is the case, `Element.dependencies()` may be sufficient for
both of these use cases.
It would however be good to add a test with this merge request, asserting
that we get the same CAS digest environment variable using both orderings of
element `e` in my overlapping example above.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]