abderrahim commented on PR #1991:
URL: https://github.com/apache/buildstream/pull/1991#issuecomment-2994281772
> I should point out that `SandboxDummy` is declared in
`sandboxing/_sandboxdummy.py`, and not exposed in the toplevel `__init__.py`,
and so it is not a public API, I'm not sure there is a sensible reason to make
this public.
>
> It does appear in `sandbox/__init__.py` which may strictly speaking be a
bug on our part, as it may mislead people into thinking it is public.
I understand it's not intended to be public, and I commented about it
earlier. However my main point is that the only public way to gain access to an
element's artifact is through `Element.stage_artifact()` (and
`Element.stage_dependency_artifacts()`) which depends on having a sandbox.
I feel that adding an API to expose a dummy sandbox is easier than trying to
figure out what part of `self.__artifact.get_files()` to expose. Do we expose
the `Artifact` class? or just the files virtual directory?
> > I think the difference we need to consider between the two approaches is
whether or not to consider the dependencies? i.e. if I depend on gcc, is your
expectation that this only gets the digest of the gcc artifact or should it be
gcc and its runtime dependencies?
>
> I think this definitely needs to consider dependencies, although I'm not
sure how that is related to using a `SandboxDummy`.
`Element.stage_dependency_artifacts()` is the easiest (and I believe the
recommended) way to get an element's artifact including its dependencies. And
that requires a sandbox. It doesn't have to be a dummy sandbox, but a dummy
sandbox is already used elsewhere as a fallback when a "real" sandbox is not
available.
> Frankly, the use of `SandboxDummy` is not a huge deal breaker for me if it
really saves you some lines of code, my concerns about using it are:
> * Bringing in unrelated lines of code, i.e. _the use of a sandbox object_
> * Is it possible that sharing code in this way may cause sandbox related
changes to break this feature ?
> * The possibility that those orthogonal lines of code may incur an
unnecessary runtime overhead
> * Does this increase I/O in the calculation of the `digest-environment` ?
I don't think we need to worry about any of these. When staging, the
`Sandbox` object is only used for `Sandbox.get_virtual_directory()`. The only
unrelated lines of code that get executed are the `Sandbox.__init__()` and
`SandboxDummy.__init__()`, which are only doing attribute assignment. I don't
think the runtime overhead is significant, and there is no extra I/O involved
compared to staging directly to a virtual (cas-based) directory.
In an ideal world, I'd prefer that `Element.stage_directory()` take a
`Directory` rather than a `Sandbox`. Then for this MR, I'd want a way for a
plugin to create a cas-based `Directory` instance. But we're stuck with
`Element.stage_directory()` taking a `Sandbox` for now.
I'm happy to do it differently if you can think of a better way of doing it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]