It doesn't seem that long ago (kernel 3.x era) that the kernel repo was less than 1G in size, and while it wasn't actively shared, it was kind of one of those "we'll deal with that later" type things.
But today, where more people care about CI/CD - if you make use of both linux-yocto and linux-yocto-dev - well, you are looking at over 3G and 2.2G respectively. So that is over 5G - and we still don't have any effective sharing between them, or any formalized method by which another derived kernel in another layer can share and avoid duplicating downloaded data. So, add in a 3rd party kernel, such as a SoC variant (e.g. r-pi layer) and you've now got another 3G of kernel, and are up to 8G. What I'm going to show is that we can do all that in less than 2G.

OK, so what are we doing at a relatively high level? We are treating external repositories (mainline, stable, preempt-rt) as core building blocks, and exposing them as "jumping off" reference points for our two core kernel repositories, and for any other 3rd party kernel derivatives, so that we never download the same common git objects twice. In addition, we selectively download/fetch from the stable and -rt repos so we only pull a few megs and skip the dead end leaf nodes based on 2.x and 3.x and 4.x kernel versions we simply don't care about anymore. We use the same selective approach on our linux-yocto and linux-yocto-dev repos as well.

This is all achieved right at the fetch level, using what are essentially "fetch-only" meta-packages that establish the core components to be shared/referenced. These fetch-only recipes are in turn triggered by the parent recipe having a fetch dependency on their desired reference.

Right now, if you do:

    bitbake -c cleanall linux-yocto-dev ; bitbake -c fetch linux-yocto-dev

what will happen is that you will wget 920M of a pre-mirror tarball of a 2014 copy of a 3.17 yocto-dev kernel and all its useless branches. It gets untarred (and deleted) and then a fetch chews and grinds on it doing the mega uplift from 3.17 --> 5.12 for a final 2.2G repo spread across five git packs (4 from 2014 and one new one). This all takes 10-15 minutes even on a well connected fast machine. Whee!
With this series, the above fetch will take ~30s because we never throw away the mainline/stable/rt content, and hence only fetch less than 20MB of yocto-dev specific git objects from the yocto server/repo.

The kernel as components:
-------------------------

Let's look at what we really need in order to build any BSP on v5.10 from linux-yocto, and similarly for v5.12 on linux-yocto-dev; showing the sizes of repacked git bare repos, and references symbolically, with the core mainline changes up to and including v5.10 as shared:

    1.5G   git.kernel.org.torvalds.linux-5.10
     35M     |--> git.kernel.org.stable.linux-5.10.y
    7.1M       |--> git.kernel.org.preempt-rt.linux-5.10.y
     38M         |--> git.yoctoproject.org.linux-yocto.git

           linux-5.10 (shared base)
    148M     |--> git.kernel.org.torvalds.linux-master (v5.10 --> latest)
    2.9M       |--> git.kernel.org.stable.linux-5.12.y
    5.6M         |--> git.kernel.org.preempt-rt.linux-5.12.y
     18M           |--> git.yoctoproject.org.linux-yocto-dev.git

1516+35+7+38+148+3+6+18 = 1771. So that is 1.7G. Add in ~100MB more for stable-5.4 and v5.4-rt and yocto v5.4 BSPs if you want them too.

That would be pretty sweet, right? A heck of a lot less than 5 gigs! But can we do that? Actually, yes we can. And we can open the door for other sharing as well, and even split that big 1.5G chunk into smaller chunks.

A bit more on that sharing, since it is important, and easy to gloss over. Consider the r-pi kernel repo, for no other reason than everyone is familiar with it. As of today, if you clone it w/o making any effort to share/reference, you'll get just shy of 3G. Instead, if you selectively fetch the rpi-5.10 (and reference stable-5.10 above) you'll get 6M and another 4M for rpi-5.12 (referencing linux-master above). So you'd be using 10M of that 3G (about 0.33%). But since we've not made sharing easy, and many custom SoC/SDK kernels are in a similar position, you end up cloning a whole bunch of stuff that has 99% shared DNA with what you already have.
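As a quick sanity check on the numbers above, here is a small Python sketch that sums the component sizes from the diagram and works out the r-pi fraction. The sizes are the rounded figures quoted in this mail (1.5G taken as 1516M, as in the sum above, and "just shy of 3G" taken as 3000M, which is an assumption); the variable names are only for illustration.

```python
# Rounded sizes in MB, copied from the diagram above (1.5G taken as 1516M).
COMPONENTS_MB = {
    "torvalds.linux-5.10": 1516,
    "stable.linux-5.10.y": 35,
    "preempt-rt.linux-5.10.y": 7,
    "linux-yocto": 38,
    "torvalds.linux-master": 148,
    "stable.linux-5.12.y": 3,
    "preempt-rt.linux-5.12.y": 6,
    "linux-yocto-dev": 18,
}

total_mb = sum(COMPONENTS_MB.values())

# r-pi: ~6M for rpi-5.10 plus ~4M for rpi-5.12, versus a full clone
# of roughly 3G (assumed here to be 3000M).
rpi_shared_mb = 6 + 4
rpi_full_mb = 3000
rpi_percent = 100.0 * rpi_shared_mb / rpi_full_mb

print(f"shared layout total: {total_mb} MB")
print(f"r-pi objects actually needed: {rpi_percent:.2f}% of a full clone")
```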
Similar sizes/factors are in play when we consider the linux-stable and preempt-rt repositories - multiple gigs as a whole, but as we see above, the selected v5.10 components are only 35M and 7M respectively.

Objectives:
-----------

There were some key goals at play, even if I sort of back-declare them now with some level of revisionist reflection:

 -enable sharing off of key universally public reference repos for everyone
 -compartmentalize tech blocks to eliminate overlapping downloads of git objects
 -reduce download size further where possible (exclusion of content/gc/repack)
 -be ready to absorb further kernel growth and also EOL dead-end leaf nodes
 -make only minimal specific objective based changes to the git fetcher
 -remain compatible with the generally accepted Yocto workflows/design.

With that in mind, it makes sense to now consider the above requirements in a bit more detail.

Consider the 1st and last together - share + compatible. Adding a one line patch to the git fetcher to support "--reference" would enable a level of sharing but we'd have stuff spilling outside the normal download paths, and absolute paths inside SRC_URIs and non-portable (pre)mirror tarballs - failing miserably on the compatible goal. So we start by only allowing repos to reference others which are peers in the download dir - no /home/bob/my-super-kernel type stuff. This goes a long way towards keeping a portable download dir and SRC_URIs clean of absolute paths, and remaining compatible with the (pre)mirror type Yocto work flow and use cases. As will be seen, we don't even really expose the "--reference" flag use outside of the internal fetcher code.

Technology blocks -- Looking back ~10 years, we did a lot more "kernel patching" (via git am) than just using (and merging) pre-applied commits in a branch of a technology feature - such as "stable" or "preempt-rt". But now the Yocto tree builds on the -rt tree which builds on stable which builds on mainline.
We can see how they are chained together in the ascii nesting diagram above. If we ignored this internal ordering, we could end up with the stable content (git objects) duplicated inside the preempt-rt repo and even duplicated in the linux-yocto repos themselves. By exposing that building-block-on-building-block structure, we also get the universally recognized share points (mainline, stable-5.x, rt-5.x) that can be used by anyone to complete their full git object history.

Reduce/exclude - as time has moved on, more and more repos are of such size and varying content that using "--mirror" and "refs/*:refs/*" as a "grab everything" approach simply doesn't scale. So instead we allow a selective clone/fetch of just what we need and exclude everything else.

Reduce/garbage-collect - there are opportunities for "low hanging fruit" in terms of getting rid of unused references/tags in linux-yocto[-dev] but since I can't directly control what is in those repos (or tar mirrors of them) we'll have to pursue that outside of this changeset.

Reduce/repack - most people don't realize it, but the size of the pack you get from a server depends on how generous the server is with its CPU time, and multiple packs can be significantly larger than one single pack. While we can't control individual servers, we can (and should) consider aggressively repacking any repo we put into service for a (pre)mirror. That includes any repo content we encourage compartmentalizing in this changeset.

Growth - The epoch --> v5.10 block of mainline commits is static and need not ever change from the group of objects we have in its pack today. v5.10 is currently the effective merge-base between linux-yocto and -dev, and as such makes a sensible line in the sand for sharing.
We currently have v5.10+ up to mainline/master coverage in the next repo in the chain, but it is trivial to create/insert a new block of static content covering v5.10 to v5.13 between v5.10 and master in the future to absorb new growth in manageable chunk sizes. We already add/use the same technology here to optionally "split" the v5.10 block for those who need sub-gigabyte downloads (and repositories) for infrastructural reasons.

EOL - While the repo sizes for the stable and -rt chunks above may not seem significant compared to the v5.10 baseline size, they are specific to a particular baseline as leaf-node content. As such we can simply unlink them from our SRC_URI driven reference chains, after which they won't appear (or download) for new users/builds but they will remain in people's old download dirs and build workspaces.

Fetcher - we add support for sharing in a "--reference" like manner, and enforce our relative path requirements there. We also see a clear need for selective clone/fetch as per above, so we allow clone args to replace "--mirror" with "--single-branch", and allow an override of the fetch args and their current default of everything via "refs/*:refs/*". Finally, if we want to have the stable-5.4, the stable-5.10 and the stable-5.12 as separate content for independent introduction and EOL, then we also have to allow them to be in separate repos in the download dir, even though they all were sourced from the same server/path/repo. This is achieved by allowing an optional recipe specified download name.

I won't go into more detail here, since the fetcher commits all have proper commit logs and make sense in their own right, independent of the larger overall goals described above. Similarly, all the kernel recipe changes provide the working example/context of how all the fetcher changes are used. So even though the two groups are separate repos, I've chosen to present it all together against the poky repo, at least initially.
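To make the "peers in the download dir" rule a bit more concrete, here is a minimal Python sketch of the idea. It is not the code from the fetcher commits; the function name and the validation rules are hypothetical. The only thing it leans on is standard git behaviour: a repo can borrow objects from another via lines in its objects/info/alternates file, and relative entries there are resolved against the borrowing repo's objects/ directory.

```python
import posixpath


def peer_alternates_line(repo_name: str, reference_name: str) -> str:
    """Return the alternates entry that lets repo_name borrow from a peer.

    Both names are hypothetical directory names of bare repos sitting side
    by side in the same download dir. Anything that is not a plain sibling
    name (absolute paths, sub-paths, "..") is rejected.
    """
    for name in (repo_name, reference_name):
        if not name or name in (".", "..") or "/" in name or "\\" in name:
            raise ValueError(f"not a plain peer name in the download dir: {name!r}")
    if repo_name == reference_name:
        raise ValueError("a repo cannot reference itself")
    # Entries in objects/info/alternates are resolved relative to the
    # borrowing repo's objects/ directory: one ".." reaches the repo,
    # a second reaches the download dir, then we descend into the peer.
    return posixpath.join("..", "..", reference_name, "objects")


line = peer_alternates_line(
    "git.yoctoproject.org.linux-yocto-dev.git",
    "git.kernel.org.torvalds.linux-master",
)
print(line)
```

Since the resulting entry is relative, a download dir laid out this way should be movable (or usable as a mirror) without any paths needing to be rewritten, which is the point of refusing things like /home/bob/my-super-kernel.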

Next Steps:
-----------

With this being a functional implementation, it seems like a good time to get other people looking at it. Ideally step #1 will be getting general agreement that this is something we need, something that is overdue, and that the implementation as shown here makes sense in the absence of any similar effort from anyone that does the same but in a better way.

From there, we'll want more people not just looking at it, but testing it as well. I know I want to write a commit (script?) that will avoid any "transition tax" by prepopulating new repos with "old" already downloaded git objects where we can. And to add/do tests with my own populated mirror and NO_NETWORK, and also try to ensure nothing in BB_SHALLOW gets upset, but I wasn't going to hold up starting a review of this any longer.

I suspect I can get some co-workers using/testing it too, but Yocto gets used in a bunch of different ways by different groups, so we'll no doubt have to do some additional fixups to ensure everybody gets the benefits of this sharing. But I'm hopeful that when people see the benefits above, they'll pitch in to help take this the final mile by ensuring it works for their use case as well.

I'm not too worried about pontificating out beyond that until we get past the acceptance/testing hurdles outlined above. So, please do have a read of the commits, kick the tires, put on your bikeshedding clothes and grab a brush, and let's see where it goes from here...

https://github.com/paulgortmaker/poky/compare/reference-RC1

---

Paul Gortmaker (21):
  bitbake: fetch2/git: allow override of clone args with GITCLONEARGS
  bitbake: fetch2/git: allow limiting upstream fetch refs to a subset
  bitbake: fetch2/git: allow optional git download name overrride
  bitbake: fetch2/git: allow specifying repos as static/unchanging
  bitbake: fetch2/git: ensure static repos have at least one refs/heads
  bitbake: fetch2/git: allow alt references within download dir
  bitbake: fetch2/git: append new altref line if/when SRC_URI changed value
  bitbake: fetch2/git: allow pack references within download dir
  bitbake: fetch2/git: use constant names for packs in static repos
  kernel: add basic boilerplate for fetch-only recipes
  kernel: add a fetch-only recipe for mainline v5.10 source
  kernel: allow splitting mainline v5.10 source download in two
  kernel: allow splitting mainline v5.10 source download in three
  kernel: allow splitting mainline v5.10 source download in four
  kernel: add recipe for linux-master (mainline latest)
  kernel: add stable fetch recipes for v5.4.x, v5.10.x and v5.12.x
  kernel: add preempt-rt fetch recipes for v5.4.x, v5.10.x and 5.12.x
  kernel: make v5.4.x Yocto recipes use shared source
  kernel: make v5.10.x Yocto recipes use shared source
  kernel: make linux-yocto-dev recipe use shared source
  kernel: disable (pre)mirror for linux-yocto and linux-yocto-dev

 .../bitbake-user-manual-fetching.rst          |  24 ++++
 bitbake/lib/bb/fetch2/git.py                  | 135 +++++++++++++++++-
 documentation/ref-manual/variables.rst        |  22 +++
 meta/recipes-kernel/linux/fetch-linux.inc     |  20 +++
 meta/recipes-kernel/linux/fetch-only.inc      |  20 +++
 meta/recipes-kernel/linux/fetch-rt.inc        |  25 ++++
 meta/recipes-kernel/linux/fetch-stable.inc    |  24 ++++
 meta/recipes-kernel/linux/linux-3.3.bb        |   9 ++
 meta/recipes-kernel/linux/linux-3.8.bb        |   9 ++
 meta/recipes-kernel/linux/linux-4.0.bb        |   9 ++
 meta/recipes-kernel/linux/linux-4.12.bb       |  10 ++
 meta/recipes-kernel/linux/linux-4.18.bb       |  10 ++
 meta/recipes-kernel/linux/linux-4.3.bb        |  10 ++
 meta/recipes-kernel/linux/linux-5.10.bb       |  38 +++++
 meta/recipes-kernel/linux/linux-master.bb     |  13 ++
 meta/recipes-kernel/linux/linux-rt-5.10.bb    |   9 ++
 meta/recipes-kernel/linux/linux-rt-5.12.bb    |  12 ++
 meta/recipes-kernel/linux/linux-rt-5.4.bb     |   9 ++
 meta/recipes-kernel/linux/linux-yocto-dev.bb  |  11 +-
 .../linux/linux-yocto-rt_5.10.bb              |   7 +-
 .../linux/linux-yocto-rt_5.4.bb               |   7 +-
 .../linux/linux-yocto-tiny_5.10.bb            |   7 +-
 .../linux/linux-yocto-tiny_5.4.bb             |   7 +-
 meta/recipes-kernel/linux/linux-yocto.inc     |  10 ++
 meta/recipes-kernel/linux/linux-yocto_5.10.bb |   7 +-
 meta/recipes-kernel/linux/linux-yocto_5.4.bb  |   7 +-
 meta/recipes-kernel/linux/stable-5.10.bb      |  10 ++
 meta/recipes-kernel/linux/stable-5.12.bb      |  16 +++
 meta/recipes-kernel/linux/stable-5.4.bb       |  11 ++
 29 files changed, 498 insertions(+), 10 deletions(-)
 create mode 100644 meta/recipes-kernel/linux/fetch-linux.inc
 create mode 100644 meta/recipes-kernel/linux/fetch-only.inc
 create mode 100644 meta/recipes-kernel/linux/fetch-rt.inc
 create mode 100644 meta/recipes-kernel/linux/fetch-stable.inc
 create mode 100644 meta/recipes-kernel/linux/linux-3.3.bb
 create mode 100644 meta/recipes-kernel/linux/linux-3.8.bb
 create mode 100644 meta/recipes-kernel/linux/linux-4.0.bb
 create mode 100644 meta/recipes-kernel/linux/linux-4.12.bb
 create mode 100644 meta/recipes-kernel/linux/linux-4.18.bb
 create mode 100644 meta/recipes-kernel/linux/linux-4.3.bb
 create mode 100644 meta/recipes-kernel/linux/linux-5.10.bb
 create mode 100644 meta/recipes-kernel/linux/linux-master.bb
 create mode 100644 meta/recipes-kernel/linux/linux-rt-5.10.bb
 create mode 100644 meta/recipes-kernel/linux/linux-rt-5.12.bb
 create mode 100644 meta/recipes-kernel/linux/linux-rt-5.4.bb
 create mode 100644 meta/recipes-kernel/linux/stable-5.10.bb
 create mode 100644 meta/recipes-kernel/linux/stable-5.12.bb
 create mode 100644 meta/recipes-kernel/linux/stable-5.4.bb

--
2.25.1