> On Mon, 2023-08-07 at 11:30 -0400, Maksim Chichikalov wrote:
> > > On Mon, Aug 7, 2023 at 8:41 AM Maksim Chichikalov
> > > <max.chichika...@gmail.com> wrote:
> > > >
> > > > On Mon, Aug 7, 2023 at 4:01 AM Richard Purdie
> > > > <richard.pur...@linuxfoundation.org> wrote:
> > > > >
> > > > > On Sun, 2023-08-06 at 13:15 -0600, Joshua Watt wrote:
> > > > > > On Sat, Aug 5, 2023 at 7:54 PM Maksim Chichikalov
> > > > > > <max.chichika...@gmail.com> wrote:
> > > > > > > My name is Max; nice to e-meet you. I need your help with
> > > > > > > Equivalence Server :)
> > > > > > > I watched your presentation on Youtube - it 100% helped me
> > > > > > > to
> > > > > > > understand the logic better and debug the issue we are
> > > > > > > facing.
> > > > > > > Thank you for all your input into OE development.
> > > > > > >
> > > > > > > I found in OE repo that you introduced "def
> > > > > > > OEOuthashBasic()", so I
> > > > > > > decided to write to you first before opening the topic on
> > > > > > > an email
> > > > > > > list.
> > > > > > >
> > > > > > > The problem:
> > > > > > >
> > > > > > > Input data are propagated to the output hash generation
> > > > > > > function,
> > > > > > > which generates a different out-hash.
> > > > > > >
> > > > > > > Description:
> > > > > > >
> > > > > > > All "_git.bb" recipes have to append SRCPV to PV. As a
> > > > > > > result, PV
> > > > > > > is different on each commit.
> > > > > > >
> > > > > > > The OEOuthashBasic function includes SSTATE_PKGSPEC to
> > > > > > > generate
> > > > > > > hash, which contains PV (PV contains git hash). As a
> > > > > > > result, there
> > > > > > > is no way to generate the same out-hash even if changes
> > > > > > > introduced
> > > > > > > within a commit were trivial.
> > > > > > >
> > > > > >
> > > > > > Right, this is sort of on purpose, because the hash
> > > > > > equivalence is
> > > > > > basically trying to say that an sstate object can be used in
> > > > > > place of
> > > > > > another one, even when the task hashes aren't the same (but
> > > > > > the
> > > > > > output hashes are). However, the sstate code itself will only
> > > > > > look
> > > > > > for sstate object with a certain name (which include PV);
> > > > > > hash
> > > > > > equivalency does have _some_ control over the file name
> > > > > > sstate looks
> > > > > > for, since it will replace the taskhash portion of the name
> > > > > > with the
> > > > > > unified hash, but it doesn't have complete control.
> > > > > >
> > > > > > >
> > > > > > > In our codebase, our components have API part, which is
> > > > > > > managed by
> > > > > > > an independent recipe per component. The described above
> > > > > > > problem
> > > > > > > caused the recompilation of all components dependent on
> > > > > > > API, even
> > > > > > > in cases when API was not changed. CI for pull requests
> > > > > > > recompiles
> > > > > > > mostly the entire code base, I need to do something with
> > > > > > > it.
> > > > > > > (sorry, quite hard for me to explain it in a nutshell, let
> > > > > > > me know
> > > > > > > if you like to know slightly more details)
> > > > > > >
> > > > > >
> > > > > > Ya, sounds like a typical mono-repo design?
> > > > > >
> > > > > > >
> > > > > > > I see a couple of options for us:
> > > > > > > * Add a custom implementation of out-hash generated
> > > > > > > function and
> > > > > > > overwrite SSTATE_HASHEQUIV_METHOD.
> > > > > > > * Better understand why it's mandatory to append SRCPV to
> > > > > > > PV, and
> > > > > > > maybe it's flexible in our cases to do it.
> > > > > > >
> > > > > >
> > > > > > This might be the best option, at least for your recipes, but
> > > > > > I've
> > > > > > CC'd the list for additional feedback
> > > > > >
> > > > > > > * Propose a patch to fix OEOuthashBasic().
> > > > > > >
> > > > > > > In my humble opinion, the commit's hash shouldn't be
> > > > > > > included in
> > > > > > > out-hash generation, it doesn't make sense. Unless I'm
> > > > > > > missing
> > > > > > > something important - What are your thoughts?
> > > > > > >
> > > > > >
> > > > > > Yes and no. It's not intentional, but a side effect of hash
> > > > > > equivalency trying to make sure that the things it's marking
> > > > > > as
> > > > > > equivalent can actually be found in sstate (basically,
> > > > > > because sstate
> > > > > > include the commit hash, hash equivalency kinda has to
> > > > > > include it).
> > > > >
> > > > > This all sounds a bit unfortunate.
> > > > >
> > > > > sstate only works as long as the filenames are predictable.
> > > > > Some
> > > > > elements of the sstate filenames are essential to operation,
> > > > > e.g. the
> > > > > package architecture since one input would result in multiple
> > > > > files
> > > > > with the same hash in the filename of the output otherwise. The
> > > > > recipe
> > > > > name and version are there mainly for debugging to allow
> > > > > someone to
> > > > > more easily know where an sstate object came from and what it
> > > > > represents. This is summarised by the comment in sstate.bbclass
> > > > > "Fields
> > > > > 0,5,6 are mandatory, 1 is most useful, 2,3,4 are just for
> > > > > information"
> > > > > in generate_sstatefn().
> > > > >
> > > > > When we added hash equivalence, we added the ability to equate
> > > > > the
> > > > > hashes but we'd not considered that the version string mismatch
> > > > > may
> > > > > stop significant artefact reuse. I suspect at the time we
> > > > > reasoned that
> > > > > if the version changes, the output probably does too.
> > > > >
> > > > > Sadly fixing this isn't simple. Changing the hash algorithm
> > > > > isn't
> > > > > enough, we need to stop the SRCREV part of PV being used in the
> > > > > sstate
> > > > > filename. If we stop that happening, the output hash algorithm
> > > > > may well
> > > > > "just work" at that point, I'm not sure if it directly encodes
> > > > > PV or
> > > > > just the sstate filename, hopefully the latter.
> > > >
> > > > It's including SSTATE_PKGSPEC in the output hash calculation,
> > > > specifically so that a change in sstate filename changes the
> > > > outhash
> > > > (that way, hash equivalence will never map two difference sstate
> > > > files
> > > > as equivalent).
> > > >
> > > > >
> > > > > The hard part is how to do this generically without adding a
> > > > > lot of
> > > > > complex and potentially fragile code. The datastore context in
> > > > > that
> > > > > function is the core configuration, not the recipe's datastore.
> > > > > The
> > > > > options I can think of offhand:
> > > > >
> > > > > a) Live with the issue
> > > > >
> > > > > b) Put a hack into generate_sstatefn() which changed the PV
> > > > > element if
> > > > > it matched a pattern but we'd have to do this for each SCM as
> > > > > needed.
> > > > > Ugly but lowest overhead.
> > > > >
> > > > > c) Indirect PV to SSTATEPV (or similar) and then alter SSTATEPV
> > > > > to drop
> > > > > SRCPV to a fixed string. Nicer code in some ways but horrible
> > > > > parsing/performance overhead since every recipe, even non-SCM
> > > > > ones
> > > > > would be affected.
> > > > >
> > > > > d) A partial version of c) where recipes can set SSTATEPV to a
> > > > > function
> > > > > if they need it. Solves your specific case with overhead
> > > > > without
> > > > > affecting everyone else. Would not solve the issue generally
> > > > > without
> > > > > manual user intervention.
> > > > >
> > > > > I'm not sure which of these will end up making the most sense.
> > > > > These
> > > > > assume the output hash code uses the sstate filename and not
> > > > > PV. If it
> > > > > uses PV there would be more work needed.
> > > >
> > > > I'm not seeing where SRCPV or SRCREV are being explicitly added
> > > > to
> > > > SSTATE_PKGSPEC; PV is added, but SRCPV or SRCREV would have to be
> > > > manually added to PV in the recipe itself for that to happen
> > > > correct?
> > > > Maybe it's as simple as not doing that in these particular
> > > > recipes?
> > > >
> > > >
> > > > What are pitfalls of not adding SRCPV to PV for packages with
> > > > SRCREV:${PN} = ${AUTOREV}? Is it recommended just for improving
> > > > traceability and debugging?
> > > >
> > > > When I remove, bitbake is not happy - for example: if I remove
> > > > explicit addition of SRCPV, the bitbake throws the following
> > > > exception (need to dig deeper tp figure out why it happens only
> > > > for certain packages)
> > > >
> > > > raise bb.fetch2.FetchError("Recipe uses a floating tag/branch
> > > > without a fixed SRCREV yet doesn't call bb.fetch2.get_srcrev()
> > > > (use SRCPV in PV for OE).")
> > >
> > > Ah, I didn't realize you were using AUTOREV; I would recommend not
> > > doing that and instead manually specifying the SRCREV you want to
> > > build from instead; aside from giving a more reproducible history,
> > > I
> > > think it means you can remove SRCPV from PV and hash equivalency
> > > will
> > > work
> > >
> > > At my $DAYJOB, we don't use AUTOREV. Instead, we have a (rather
> > > dumb)
> > > script that makes a commit to update the SRCREV to match the
> > > current
> > > head revision of the BRANCH specified in a recipe. We run this
> > > every
> > > night for recipes where we want AUTOREV-like behavior. Upstream
> > > also
> > > has the Auto-Upgrade-Helper (AUH) that does something similar if
> > > you
> > > want to look at that too.
> >
> > TY Joshua!
> > AUTOREV is in use for pull request builds, when a package(s -> multi
> > PR builds) should be built from the latest commit of the specific
> > branch. I don't think we can stop using it, unless we add some
> > mechanism of setting SRCREV to hash of the latest commit on each
> > build invocation.
> >
> > Richard, what are your thoughts about:
> > * What are pitfalls of not adding SRCPV to PV for packages with
> > SRCREV:${PN} = ${AUTOREV}?
>
> If you don't add SRCREV (via SRCPV) to PV, PV doesn't change when the
> source control revision changes.
>
> If the version doesn't change, how does bitbake know to rebuild? How do
> packages get the right version in them so upgrades work?
>
> I guess in modern times with hashes in task stamps, SRCREV can be
> included directly into the task hashes so that part of this
> functionality is partly obsolete. Reworking things to adapt to that
> would be interesting but a fair amount of change/work.
>
> > * Is it recommended just for improving traceability and debugging,
> > or it has some bitabke/OE business logic implication?
>
> As it stands today, it is relied upon to trigger rebuilds with a number
> of follow on expectations.
>
> I'm not against changing this as when I step back and think about it,
> the reasoning for some of what we do is obsolete and we have better
> mechanisms for it now. It will be a lot of work to change though.
>
> Cheers,
> Richard

Richard, Joshua, thank you so much for being responsive and helping to 
understand the situation better. I need time to grasp it, consult with my 
colleagues, and decide how to overcome it.
JFYI, I observed it, and was under the impression that bitbake rebuilds even if 
SRCPV is not part of PV for packages using AUTORV. Because of it, I decided to 
try it out, and yes, bitbake fetches updated code (new commits) and recompiling 
package - not sure how it works and what information it's using :) Based on 
details you shared above, it shouldn't. Maybe it's some sort of side effect of 
something else in my environment.
Decided to share it with you.

I have a question about the discussion above - Do you guys want me to create a 
bug in the Bugzila?
BR,
Max

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#185662): 
https://lists.openembedded.org/g/openembedded-core/message/185662
Mute This Topic: https://lists.openembedded.org/mt/100586415/21656
Group Owner: openembedded-core+ow...@lists.openembedded.org
Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to