On Thu, 14 Sept 2023 at 14:56, Richard Purdie
<richard.pur...@linuxfoundation.org> wrote:
> For the task signatures, we need to think about some questions. If I
> make a change locally, can I query how much will rebuild and how much
> will be reused? There is bitbake --dry-run, but perhaps it is time for
> an option (or a dedicated separate command?) to give some statistics
> about what bitbake would do? How much sstate would be reused?
>
> That then logically leads into the questions, can we tell what has
> changed? Why isn't my sstate being reused? For that we perhaps should
> define some existing scenarios where it is currently very difficult to
> work this out and then work out how we can report that information to
> the user. These could become test cases?

So I think there are two questions here that the tools should answer:

1. If I were to run a build, what would be missing in the cache and
need to be built? The missing cache objects form a dependency
hierarchy, so only those missing objects with no dependencies on other
missing objects would be printed. That should be comparatively easy to
add, as bitbake already performs those checks all the time. Is there
something else that's easily done and useful to print?
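To make the "only the frontier" idea concrete, here's a minimal
sketch of that filtering step, assuming we already have the task
dependency graph and a cache presence check (both made-up names here;
in bitbake this information lives in the runqueue/sstate machinery):

def missing_frontier(depgraph, in_cache):
    """depgraph: {taskid: [dep taskids]}, in_cache: taskid -> bool."""
    missing = {tid for tid in depgraph if not in_cache(tid)}
    # A missing task is on the frontier if none of its dependencies
    # are themselves missing - everything below it is reusable.
    return sorted(tid for tid in missing
                  if not any(dep in missing for dep in depgraph[tid]))

# Hypothetical toy graph: only curl:do_compile would be printed,
# since curl:do_package depends on another missing object.
depgraph = {
    "curl:do_fetch": [],
    "curl:do_compile": ["curl:do_fetch"],
    "curl:do_package": ["curl:do_compile"],
}
present = {"curl:do_fetch"}
for tid in missing_frontier(depgraph, lambda t: t in present):
    print("would need to build:", tid)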

2. Then there's the question of *why* they are missing, which is
harder to answer. If, say, curl:do_package is not in the cache, then
the tool would have to walk the cache tree (an I/O-heavy operation, as
there is no index), make a list of all the curl:do_package objects
that are there, and run a recursive bitbake-diffsigs (going up the
task tree) on each of them vs. the one we want, then print the results
starting with the newest. Something like:
Something like:

Existing cache objects are not suitable because:
<object id 1> was built on <date> and has a mismatching SRCREV
<object id 2> was built on <earlier date> and has a different do_compile()
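
A rough sketch of that walk, shelling out to bitbake-diffsigs for the
actual comparison. The glob pattern is an assumption (real sstate file
names encode more fields and the layout has changed between releases),
and explain_misses is a made-up name:

import glob, os, subprocess
from datetime import datetime

def explain_misses(sstate_dir, recipe, task, wanted_siginfo):
    # Assumed naming scheme for per-task signature data in the cache.
    pattern = os.path.join(sstate_dir, "**",
                           "*%s*_%s.*.siginfo" % (recipe, task))
    candidates = glob.glob(pattern, recursive=True)
    candidates.sort(key=os.path.getmtime, reverse=True)  # newest first
    print("Existing cache objects are not suitable because:")
    for cand in candidates:
        built = datetime.fromtimestamp(os.path.getmtime(cand))
        print("%s (built %s):" % (cand, built))
        # bitbake-diffsigs prints which variables/tasks differ
        subprocess.run(["bitbake-diffsigs", cand, wanted_siginfo])

# Hypothetical invocation; the siginfo path depends on the local setup.
# explain_misses("sstate-cache", "curl", "do_package",
#                "wanted_do_package.siginfo")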

> One of the big problems in the past was that we lost much of the hash
> information after parsing completed. This meant that if the hashes then
> didn't match, we couldn't tell why as the original computation was
> lost. I did some work on allowing us to retain more of the information
> so that we didn't have to recompute it every time to be able to do
> processing with it. I have to admit I've totally lost track of where I
> got to with that.

Here's an idea I can't get out of my head. Right now, the cache is
simply an amorphous mass of objects, with no information about how
they were created. How about storing complete build configurations
into the same directory as well? There would be a dedicated, separate
area for each configuration that placed objects into the cache,
containing:
- list of layers and revisions
- config template used
- complete content of build/conf
- bitbake invocation (e.g. targets and prefixed variables like MACHINE etc.)
- complete list of sstate objects that were produced as a result, so
they can be checked for existence

This would be written into the cache dir at the very end of the build
when everything else is already there.
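
Just to make it concrete, here's a sketch of what such a stored
configuration record could look like. The schema and the writer
function are entirely hypothetical (in practice this might hang off
bb.event.BuildCompleted), but everything in the record is information
bitbake has at the end of a build:

import json, os, time

def write_build_config(cache_dir, layers, template, conf_files,
                       invocation, produced_objects):
    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
    record = {
        "timestamp": stamp,
        "layers": layers,          # [{"path": ..., "revision": ...}]
        "config_template": template,
        "conf": conf_files,        # {"local.conf": "<content>", ...}
        "invocation": invocation,  # e.g. "MACHINE=qemux86 bitbake core-image-minimal"
        "sstate_objects": produced_objects,
    }
    confdir = os.path.join(cache_dir, "build-configs")
    os.makedirs(confdir, exist_ok=True)
    with open(os.path.join(confdir, stamp + ".json"), "w") as f:
        json.dump(record, f, indent=2)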

Right now, everyone sets up their own build first, then points
local.conf or site.conf at the cache and hopes for the best regarding
hit rates. Having stored build configs would allow inverting the
workflow: you first ask the cache what it can provide (e.g. it can
provide a mickledore or kirkstone core-image-minimal for qemux86, and
that's exactly what you want as a starting point), then you use the
build config stored in the cache to set up a build and run it. That
would guarantee complete sstate reuse and get you to a functional
image as quickly as possible. Kind of like a binary distro, but
implemented with sstate.
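
The query side could then be as simple as this sketch, building on
the hypothetical manifest format above (none of these names exist as
a tool today):

import glob, json, os

def list_offers(cache_dir):
    # Show what configurations this cache was populated from.
    pattern = os.path.join(cache_dir, "build-configs", "*.json")
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            rec = json.load(f)
        print(path, "->", rec["invocation"])

def materialize(record_path, build_dir):
    # Recreate build/conf exactly as stored in the chosen record.
    with open(record_path) as f:
        rec = json.load(f)
    confdir = os.path.join(build_dir, "conf")
    os.makedirs(confdir, exist_ok=True)
    for name, content in rec["conf"].items():
        with open(os.path.join(confdir, name), "w") as f:
            f.write(content)
    # Checking out the layers at rec["layers"] revisions is left to
    # the caller; with conf and layers matching, the build should be
    # a full sstate hit.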

Alex