Sorry, I should have mentioned that I am broadly aware of git
sparse-checkout
<https://github.blog/open-source/git/bring-your-monorepo-down-to-size-with-sparse-checkout/>
(the closest thing to a git-native approach for this), but I have not gone
into it in detail or evaluated whether it could make sense in the context of
something like GoCD, or whether it is really only something that can be used
effectively by an end-user.

This feature was added to git after most of the rework I did on the GoCD
git-path plugin, and I haven't looked at whether other build automation
tools support using it server-side.

-Chad

On Sun, Dec 29, 2024 at 2:32 PM Chad Wilson <[email protected]> wrote:

> Hiya Jason
>
> On Sun, Dec 29, 2024 at 8:46 AM Jason Smyth <[email protected]> wrote:
>
>> Hi everyone,
>>
>> I am starting with the git-path plugin and I am having trouble
>> understanding how it should be configured to ensure the files end up where
>> I expect them when the agent fetches them.
>>
>> I am working with a pipeline that uses the following materials (whittled
>> down to what I understand to be the relevant bits):
>>
>>     materials:
>>       App.Trunk:
>>         plugin_configuration:
>>           id: "git-path"
>>         options:
>>           url: https://dev.azure.com/Org/Project/_git/App
>>           path: Trunk
>>         destination: source/Project/App/Trunk
>>       App.Documents.Spec:
>>         plugin_configuration:
>>           id: "git-path"
>>         options:
>>           url: https://dev.azure.com/Org/Project/_git/App
>>           path: Documents/Spec
>>         destination: source/Project/App/Documents/Spec
>>
>> The intention was that the contents of App$/Trunk should be placed in
>> source/Project/App/Trunk and the contents of App$/Documents/Spec should be
>> placed in source/Project/App/Documents/Spec. Instead, the plugin seems to
>> be fetching the entire repo into each of the destinations. Is this the
>> expected behaviour?
>>
>
> Yes, it is the expected behaviour. It clones the entire repo and leaves
> everything else behind in other paths, at whatever versions they were at
> as of the commit detected for the specific `path` (aka git pathspec). To
> my knowledge there is no native git way to get part of a file system tree
> like you suggest (as a git ref represents the state of the *entire repo*
> at a given commit, independent of file system knowledge), so the only
> other alternative for the plugin's implementation would likely be to do
> some file system level hijinx to remove paths not fetched. From a
> practical perspective that would likely mean removing the ability to use
> all possible git pathspecs (documented here
> <https://github.com/TWChennai/gocd-git-path-material-plugin?tab=readme-ov-file#constructing-path-expressions>)
> and instead allowing only simple path prefixes (which would be difficult to
> validate in its own right without diving into a git pathspec parser).
>
> Basically the git path plugin allows you to mitigate excessive triggering
> and reinterpret up-to-dateness for a subset of a repo (as opposed to the
> allowlist/denylist approach, which has other problems) - but it doesn't
> introduce some fuller concept of only fetching part of the git repo. The
> remaining clone is still a fully functional git working directory and
> repository off a given commit, which it would cease to be if doing some
> non-git-native hijinx afterwards. This is somewhat discussed at
> https://github.com/TWChennai/gocd-git-path-material-plugin?tab=readme-ov-file#stale-data-in-agent-repository-clones
> and is why the language/examples focus on how it monitors for changes.
>
> The other problem with this approach here is that if a commit was made
> that changes contents of both "Trunk" and "Documents/Spec", the independent
> materials could detect this single commit at different times due to the way
> material polling works. A triggered build may kick off with only the
> changes for "Trunk" and the previous ref for "Documents/Spec" (or vice
> versa). If these paths are not sufficiently independent, modelling as
> separate materials is likely to hurt rather than help.
>
>
>>
>> If so, are there any guidelines for how to deal with multiple git-path
>> materials that need to poll different paths in a single repo, while
>> ensuring that the relative paths remain intact on the agent at job run time?
>>
>
>
>>
>> Things that I need to consider:
>>
>>    - App.Trunk and App.Documents.Spec are likely to be reused across
>>    various pipelines, though not necessarily always together.
>>    - We probably do not want to configure a custom git-path material for
>>    every existing combination of paths.
>>    - There is a significant amount of cross-repository code, so relative
>>    paths both inside and across repositories can be relevant. (I.E., for any
>>    given file, the right version of that file needs to be downloaded to
>>    "./Project/Repo/path/to/file".)
>>
>>
> The only sensible option IMHO is to use a single material off the same
> "wider" repo for both paths (violating your second requirement):
>
> path: "Trunk, Documents/Spec"
>
> If you used something non-yaml to generate your config repo contents you
> could, conceptually, generate this programmatically.
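>
> As a rough sketch against the materials in your question (the material
> name `App` and the `destination` value are my assumptions for
> illustration, and I have not tested this exact config), the combined
> material might look something like:
>
>     materials:
>       App:
>         plugin_configuration:
>           id: "git-path"
>         options:
>           url: https://dev.azure.com/Org/Project/_git/App
>           path: "Trunk, Documents/Spec"
>         destination: source/Project/App
>
> With a single material the repo is cloned once under `destination`, so
> `Trunk` and `Documents/Spec` keep their relative paths and are always at
> the same commit as one another.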
>
> I'm not sure I understand what "cross repository code" means, but you
> could perhaps consider shifting some of that responsibility to git
> submodules - although I don't really like them personally due to complexity
> and the way they change developer flow. I also don't know how effectively
> submodules work with the git-path plugin specifically.
>
>
>>
>> I'm thinking I will need to pull the git-path materials into a separate
>> location, then copy the relevant files to the expected location in the
>> first (few) task(s) of the job. (E.G., fetch them into
>> ./git-path/<materialName>, then copy "./git-path/App.Trunk/Trunk" to
>> "./source/Project/App/Trunk" ) This feels incredibly hacky, though. Are
>> there any cleaner options?
>>
>
> Personally I don't think what you are trying to do is compatible with the
> conceptual design goals of either git or GoCD. If these various paths are
> really independent of one another and cannot be looked at within the scope
> of the wider repo or organised into a "simple" mono repo structure, should
> they possibly be independent repositories?
>
> Otherwise you are losing all the guarantees of GoCD that the materials are
> at consistent ref versions with one another, etc, and that a git repo at a
> given ref is a complete representation of the repo at that ref/sha. The
> git-path plugin already slightly moves away from the GoCD integrity
> guarantee to "allow" for different subsets of a repo to be considered as
> independent materials and push more "risk" into the hands of the user - but
> there's probably a limit to how far you should consider pushing that
> compromise.
>
> But to answer your question: no, there are no cleaner options if trying to
> slice-and-dice a repository at various repository versions/refs and
> assemble it back together. Personally I would combine paths together (and
> have done so historically) if I still felt the plugin was useful enough to
> use in its current form.
>
> -Chad
>
>
