Sorry, I should have mentioned that I am broadly aware of git sparse-checkout <https://github.blog/open-source/git/bring-your-monorepo-down-to-size-with-sparse-checkout/> (the closest thing to a git-native approach for this), but I have not looked at it in detail or evaluated whether it could make sense in the context of something like GoCD, or whether it is really only something an end-user could use effectively.
This feature was added to git after most of the rework I did on the GoCD git-path plugin, and I haven't looked at whether other build automation tools support using it server-side.

-Chad

On Sun, Dec 29, 2024 at 2:32 PM Chad Wilson <[email protected]> wrote:

> Hiya Jason
>
> On Sun, Dec 29, 2024 at 8:46 AM Jason Smyth <[email protected]> wrote:
>
>> Hi everyone,
>>
>> I am starting with the git-path plugin and I am having trouble understanding how it should be configured to ensure the files end up where I expect them when the agent fetches them.
>>
>> I am working with a pipeline that uses the following materials (whittled down to what I understand to be the relevant bits):
>>
>> materials:
>>   App.Trunk:
>>     plugin_configuration:
>>       id: "git-path"
>>     options:
>>       url: https://dev.azure.com/Org/Project/_git/App
>>       path: Trunk
>>     destination: source/Project/App/Trunk
>>   App.Documents.Spec:
>>     plugin_configuration:
>>       id: "git-path"
>>     options:
>>       url: https://dev.azure.com/Org/Project/_git/App
>>       path: Documents/Spec
>>     destination: source/Project/App/Documents/Spec
>>
>> The intention was that the contents of App$/Trunk should be placed in source/Project/App/Trunk and the contents of App$/Documents/Spec should be placed in source/Project/App/Documents/Spec. Instead, the plugin seems to be fetching the entire repo into each of the destinations. Is this the expected behaviour?
>
> Yes, it is the expected behaviour. The plugin clones the entire repo; everything outside the configured `path` is left in place, at whatever version it has as of the commit detected for that `path` (aka a git pathspec).
> To my knowledge there is no native git way to get part of a file system tree like you'd suggest (as a git ref represents the state of the *entire repo* at a given commit, independent of file system knowledge), so the only other alternative for the plugin's implementation would likely be some file-system-level hijinks to remove paths not fetched. From a practical perspective that would likely mean removing the ability to use all possible git pathspecs (documented here <https://github.com/TWChennai/gocd-git-path-material-plugin?tab=readme-ov-file#constructing-path-expressions>) and instead allowing only simple path prefixes (which would be difficult to validate in its own right without diving into a git pathspec parser).
>
> Basically the git-path plugin allows you to mitigate excessive triggering and reinterpret up-to-dateness for a subset of a repo (as opposed to the allowlist/denylist approach, which has other problems), but it doesn't introduce some fuller concept of only fetching part of the git repo. The resulting clone is still a fully functional git working directory and repository at a given commit, which it would cease to be if doing some non-git-native hijinks afterwards. This is somewhat discussed at https://github.com/TWChennai/gocd-git-path-material-plugin?tab=readme-ov-file#stale-data-in-agent-repository-clones and is why the language/examples focus on how it monitors for changes.
>
> The other problem with this approach here is that if a commit was made that changes contents of both "Trunk" and "Documents/Spec", the independent materials could detect this single commit at different times due to the way material polling works. A triggered build may kick off with only the changes for "Trunk" and the previous ref for "Documents/Spec" (or vice versa). If these paths are not sufficiently independent, modelling them as separate materials is likely to hurt rather than help.
>
>> If so, are there any guidelines for how to deal with multiple git-path materials that need to poll different paths in a single repo, while ensuring that the relative paths remain intact on the agent at job run time?
>>
>> Things that I need to consider:
>>
>> - App.Trunk and App.Documents.Spec are likely to be reused across various pipelines, though not necessarily always together.
>> - We probably do not want to configure a custom git-path material for every existing combination of paths.
>> - There is a significant amount of cross-repository code, so relative paths both inside and across repositories can be relevant. (I.e., for any given file, the right version of that file needs to be downloaded to "./Project/Repo/path/to/file".)
>
> The only sensible option IMHO is to use a single material off the same "wider" repo for both paths (violating your second requirement):
>
>     path: "Trunk, Documents/Spec"
>
> If you used something non-YAML to generate your config repo contents, you could conceptually generate this programmatically.
>
> I'm not sure I understand what "cross repository code" means, but you could perhaps consider shifting some of that responsibility to git submodules - although I don't really like them personally due to complexity and the way they change developer flow. I also don't know how effectively submodules work with the git-path plugin specifically.
>
>> I'm thinking I will need to pull the git-path materials into a separate location, then copy the relevant files to the expected location in the first (few) task(s) of the job. (E.g., fetch them into ./git-path/<materialName>, then copy "./git-path/App.Trunk/Trunk" to "./source/Project/App/Trunk".) This feels incredibly hacky, though. Are there any cleaner options?
>
> Personally I don't think what you are trying to do is compatible with the conceptual design goals of either git or GoCD.
> If these various paths are really independent of one another and cannot be looked at within the scope of the wider repo or organised into a "simple" monorepo structure, should they possibly be independent repositories?
>
> Otherwise you are losing all the guarantees of GoCD that the materials are at consistent ref versions with one another, etc., and that a git repo at a given ref is a complete representation of the repo at that ref/sha. The git-path plugin already moves slightly away from the GoCD integrity guarantee to "allow" for different subsets of a repo to be considered as independent materials and push more "risk" into the hands of the user - but there's probably a limit to how far you should consider pushing that compromise.
>
> But to answer your question: no, there are no cleaner options if trying to slice-and-dice a repository at various repository versions/refs and assemble it back together. Personally I would combine paths together (and have done so historically) if I still felt the plugin was useful enough to use in its current form.
>
> -Chad

--
You received this message because you are subscribed to the Google Groups "go-cd" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/go-cd/CAA1RwH9y-B62v6Q7kKB0TXavh3yLc7qVSm%2BA72pYdOUrUCRagA%40mail.gmail.com.
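
For anyone wanting to try the "generate the config programmatically" idea mentioned in the thread, here is a rough Python sketch that builds one combined git-path material from a list of paths and prints it in the same shape as the YAML quoted above. The function names, the material name `App.TrunkAndSpec` and the tiny YAML emitter are all hypothetical, made up for illustration. Pointing `destination` at the repo root (`source/Project/App`) is an assumption based on the statement in the thread that the plugin clones the entire repo, so `Trunk` and `Documents/Spec` would keep their relative positions underneath it.

```python
from typing import Iterable


def combined_git_path_material(url: str, paths: Iterable[str], destination: str) -> dict:
    """Build ONE git-path material that watches several paths in one repo.

    Hypothetical helper: the keys mirror the YAML quoted in the thread.
    """
    # Trim whitespace and drop duplicates while keeping the original order.
    unique_paths = list(dict.fromkeys(p.strip() for p in paths if p.strip()))
    if not unique_paths:
        raise ValueError("at least one path is required")
    return {
        "plugin_configuration": {"id": "git-path"},
        # Paths are joined with ", " as in the `path: "Trunk, Documents/Spec"` example.
        "options": {"url": url, "path": ", ".join(unique_paths)},
        "destination": destination,
    }


def to_yaml(node: dict, indent: int = 0) -> str:
    """Minimal emitter for nested dicts of plain strings.

    Not a general YAML writer: values containing quotes or backslashes
    would need proper escaping (or a real YAML library).
    """
    pad = " " * indent
    lines = []
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(value, indent + 2))
        else:
            lines.append(f'{pad}{key}: "{value}"')
    return "\n".join(lines)


material = combined_git_path_material(
    url="https://dev.azure.com/Org/Project/_git/App",
    paths=["Trunk", "Documents/Spec"],
    destination="source/Project/App",  # assumption: repo root, see note above
)
config = {"materials": {"App.TrunkAndSpec": material}}  # hypothetical material name
print(to_yaml(config))
```

A generator along these lines would only take away the tedium of hand-writing one material per combination of paths. Each pipeline still gets a single material per repo, so the consistency trade-offs discussed in the thread are unchanged.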
