Re: RECC (or bazel) within BuildStream

Abderrahim Kitouni Fri, 20 Dec 2024 11:37:45 -0800

Hi all,

Updating this after a private conversation with Tristan and Jürg.


On Wed, 11 Dec 2024 at 10:27, Tristan van Berkom
<[email protected]> wrote:
> > I think the best course of action is to allow unlimited access to the
> > ActionCache, and document this caveat.
>
> I would like to hear a proposal of how a client can *reliably* use this
> in some recommendable way at least before moving forward with this.
>
> [...]
>
> How would we achieve this "quasi safe" approach to using RECC to speed
> up build performance without other BuildStream changes ?
>
> If it requires BuildStream changes to use RECC in an at least "quasi
> safe" way, what would those changes look like ?

The safe way to use RECC would be to use remote execution: buildbox
supports a platform property called chrootRootDigest. When this
property is set to the CAS digest of a directory, it is merged with
the uploaded input root. This allows recc to run using
buildbox-run-bubblewrap (which only stages the input root by default).
By setting the chrootRootDigest to a directory containing the
toolchain, and setting recc to upload the system headers, the
compilation runs in a sandbox and is indeed repeatable and safe.

Based on this, the quasi-safe approach would be to do the same thing
but only use "local execution" (i.e. build in the element sandbox and
upload the result to the cache). By setting the chrootRootDigest
platform property to the digest of the toolchain, it will be part of
the cache key, and any change to the toolchain will cause the file to
be rebuilt. It is only quasi-safe because the element designated as
the toolchain could be incomplete or incorrect.
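
To illustrate why putting the toolchain digest in the platform
properties forces a rebuild, here is a minimal sketch in Python. It is
not how REAPI or recc actually serialize an Action (they hash protobuf
messages); action_key() and all the digests below are hypothetical
stand-ins, and the only point is that the key covers the platform
properties.

```python
import hashlib
import json


def action_key(command, input_root_digest, platform_properties):
    """Return a hex key covering the command, input root and platform properties.

    Simplified stand-in for an REAPI action digest (assumption: JSON
    instead of the real protobuf encoding).
    """
    payload = {
        "command": list(command),
        "input_root": input_root_digest,
        # Sorted so that property order does not affect the key.
        "platform": sorted(platform_properties.items()),
    }
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Hypothetical digests, for illustration only.
old_toolchain = {"chrootRootDigest": "aaaa/100"}
new_toolchain = {"chrootRootDigest": "bbbb/100"}
cmd = ["gcc", "-c", "hello.c", "-o", "hello.o"]

key_old = action_key(cmd, "cccc/42", old_toolchain)
key_new = action_key(cmd, "cccc/42", new_toolchain)
```

With the same command and input root, key_old and key_new differ, so
a cached result built with the old toolchain would not be reused.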

To be able to do this, BuildStream needs to be able to set an
environment variable in the element sandbox corresponding to the
digest of an element's artifact designated as "the toolchain". We need
to be able to choose the name of the environment variable: in the case
of RECC, this would be RECC_REMOTE_PLATFORM_chrootRootDigest. The
usual way we would do this is to have this value in a buildstream
variable (like the current element-name or max-jobs) and let the user
use it to set the appropriate environment variable. However, this
can't be done in this case as the CAS digest of the toolchain can only
be resolved at staging time.
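
As a rough sketch of what the end result would look like from the
sandbox's point of view (in Python, and not using any existing
BuildStream API): once the toolchain digest is known, it is written
into an environment variable whose name the user chose. The helper
with_toolchain_digest() is hypothetical, the digest values are made
up, and the "hash/size" string format is an assumption on my part.

```python
def with_toolchain_digest(env, variable_name, digest_hash, size_bytes):
    """Return a copy of env with the toolchain digest added.

    Assumption: the digest is rendered as "<hash>/<size_bytes>".
    """
    new_env = dict(env)  # leave the caller's environment untouched
    new_env[variable_name] = f"{digest_hash}/{size_bytes}"
    return new_env


# Hypothetical values; the real digest is only known at staging time.
base_env = {"PATH": "/usr/bin:/bin"}
sandbox_env = with_toolchain_digest(
    base_env,
    "RECC_REMOTE_PLATFORM_chrootRootDigest",
    "aaaa",
    100,
)
```

The variable name is a parameter here because it has to be
configurable: RECC wants RECC_REMOTE_PLATFORM_chrootRootDigest, other
tools may want something else.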

For the actual implementation, I think we can have a new dependency
configuration for setting this. BuildElement would implement it, and
it can be used by all the build elements. There are some issues to
consider for the implementation, but nothing too hard. For instance,
currently the only way to get the virtual Directory of an Element's
artifact is to stage it (which requires a sandbox). Another one is
that setting the environment variables happens at configure_sandbox()
time, which is currently before staging.

I think those implementation issues can be resolved in due time, but
the proposal is now more solid. I'll comment on remote execution in a
separate post; we might want to include it in the proposal after all.

Please let me know what you think.

Abderrahim
