BTW: wrt fork (clone) vs subtree (I'll have to read https://medium.com/@porteneuve/mastering-git-subtrees-943d29a798ec again) - the way I understand it is:
One can do push/pull/merge with subtree that are equivalent to push/pull/merge with a clone. i.e; with a local clone one does: [branch is implicit here so lets stick with default master branch] git clone repo-URL cd repo <edit> git commit git pull git push with subtree one does: cd petsc git subtree add --prefix=location repo-url [branch] <edit> git commit git subtree pull --prefix=location repo-url [branch] git subtree push --prefix=location repo-url [branch] i.e the complexity of branches/merge issues don't go away - it just avoids the extra repo [and avoids making 2 commits in sync in 2 different repos] Notes: - wrt petsc4py we don't have a 'subtree pull' in the workflow - and 'subtree push' is currently broken - so petsc4py workflow is currently broken. Also note: with the previous petsc4py workflow we had: - make MR to petsc4py - use this commit with --download-petsc4py in petsc [in MR branch] - do not merge petsc-MR until petsc4py-MR is merged [this way - any revisions in petsc4py-MR - that will change the hash can be updated in petsc-MR] With the current discussion - the workflow for examples is more complicated - i.e the changes petsc-developers do and that get accepted upstream can be different [perhaps temporarily.] - so there is extra complexity here where local changes (fork or subrepo) are not in sync with upstream changes. One way to sync is: - make local changes (and use them with petsc CI) - make separate MR to upstream (revise and update the MR - as needed for acceptance). Note: I'm not sure if there is a workflow to test these revisions in petsc cI - once merged to upstream example repo - sync [subtree, or fork] - either a merge - or revert local changes, pull upstream changes. So a simpler model is [as Ed referred to]: use upstream only, and the CI use (allow_failure:true) - and then Ed can fix the issues as he sees fit [and there will be CI errors until the fix is in]. And the petsc-MR is not held back (even with fundamental failures) Satish On Sun, 1 Nov 2020, Satish Balay wrote: > I should add - we should: > > - ignore git subtree or a fork > - assume one (or more) petsc developers have write access to create branches > in upstream repo > - figure out workflow for the requirements [and current constraints wrt book]. > > - a fork primarily helps with write access [wrt branches, when no write > access is available for upstream repo] - there is a mirror sync cost > - a subtree [assuming its not buggy] - helps with a single commit workflow > [vs multiple commits in different repos - and keeping them in sync] > - [a default is neither - using upstream repo directly] > > And in all cases there is similar synchronization cost between upstream repo > and fork/subtree [this depends on the requirements - and current constranits] > > Satish > > On Sun, 1 Nov 2020, Satish Balay via petsc-dev wrote: > > > On Sun, 1 Nov 2020, Barry Smith wrote: > > > > > > > > > > > > On Nov 1, 2020, at 10:30 AM, Satish Balay <ba...@mcs.anl.gov> wrote: > > > > > > > > Just a note: subtree workflow is currently broken with petsc4py > > > > > > > > We've imported petsc4py using subtree - but now we are unable to export > > > > back changes in petsc repo to the standalone petsc4py repo due to git > > > > subtree errors. > > > > > > > > I spent some time on workarounds - but didn't go anywhere. > > > > > > > > [likely this is a non-issue with this examples repo workflow - as there > > > > is no direct push back to upstream repo] > > > > > > > > And with subtree - not all developers need to know about it. Only the > > > > person who is syncing the 2 repos.. [if they know how to sync/resolve > > > > conflicts] > > > > > > Since each person is reponsable for their branch getting merged, > > > doesn't this mean anyone who changes something in PETSc that requires a > > > change in p4pdes has to fix it via gittree process ( that is have have to > > > sync the 2 repose?) and hence now how to sync 2 repos via gittree. > > > > > > For one its not clear what problem is being addressed here [as mentioned in > > my other e-mail.] > > > > >From Ed's e-mail - my understanding is - whatever changes petsc developers > > >might have - might not be acceptable. > > > > So I think first we need to figure out what is acceptable and not - and > > then figure out the workflow. > > > > " Since each person is reponsable for their branch getting merged, " > > > > So its not clear to me if this fits with what Ed mentioned. [wrt his goals > > of keeping the examples in sync with the book] > > > > BTW I should note: the current process [which I don't understand > > completely] looks more complicated than what we had with petsc4py - which > > you were against at - and raised this at every opportunity - until > > petsc4py was imported into petsc (using subtree) > > > > Satish > > > > > > > > Until I see the entire gittree workflow I will not understand. > > > > > > Barry > > > > > > > > > > > Satish > > > > > > > > On Sun, 1 Nov 2020, Jed Brown wrote: > > > > > > > >> Barry Smith <bsm...@petsc.dev> writes: > > > >> > > > >>>>> git subtree does require use of new commands every time you mess > > > >>>>> with it (say every three months) that we do not know and since each > > > >>>>> of > > > >>>>> us will do this infrequently it is likely we will not remember them > > > >>>>> (I won't) while my approach does not require remembering new > > > >>>>> commands. > > > >>>>> My approach only requires branching, push, and pulling on the fork > > > >>>>> and updating EdsRepo.py which is things all the developers know > > > >>>>> about and > > > >>>>> do regularly since we update other external packages. > > > >>> > > > >>>>> Circular dependencies (PETSc CI depends on Ed's p4pdes depends on > > > >>>>> PETSc) are significantly more labor-intensive and it would need to > > > >>>>> be done on each change, versus once per release cycle. > > > >>> > > > >>> I think we are not communicating on exactly the same wave length. > > > >>> > > > >>> I am not advocating we test on Ed's I am advocating we test on our > > > >>> fork of Ed's and how often our fork gets passed to Ed with MR is a > > > >>> choice all of us make together, now we could do it once at release > > > >>> time with all the fixes we made, every month or each time. Totally up > > > >>> to Ed how often he wants to get them, and he is free to ignore them > > > >>> for weeks (up to the next release if he likes) since our testing will > > > >>> continue because we use our fork. How often we pass them on to Ed is > > > >>> not related to how we maintain our CI. > > > >>> > > > >>> There is no circular dependency on my approach with Ed's p4pdes. > > > >> > > > >> The circular dependency is on our fork of p4pdes. > > > >> > > > >>> I vote to fix things in our fork or gittree thing continuously > > > >>> since it makes it easier to fix things rather than wait to the > > > >>> release when we try to find and fix everything and it also helps tell > > > >>> us if we introduced a real bug into PETSc and fix PETSc immediately > > > >>> instead of waiting up to 6 months, just like we now we test > > > >>> immediately with Petsc4py and we should do with SLEPc. How often we > > > >>> give the updates to Ed is a completely different issue. > > > >>> > > > >>> So again back to my original statement ,it comes down to if the > > > >>> subtree or the fork approach is easier for all the PETSc developers > > > >>> who do not currently know gittree and would need to learn it with > > > >>> your approach. I don't know which is easier learning to use gittree > > > >>> which has its own gotcha's or using mine which we all know but may > > > >>> require an extra step (not involving Ed, just updating the p4pdes.py > > > >>> commit each time we change something in the fork.) > > > >>> > > > >>> I think using --download-p4pdes on a couple of systems in the CI is > > > >>> enough, I don't think we need to put it all CI pipelines (I would > > > >>> like slepc in all pipelines).but we could put in all pipelines if we > > > >>> want. > > > >>> > > > >>> For completeness I show the exact the work flow for my suggestion > > > >>> > > > >>> pipelines --download-p4pdes and runs its tests > > > >>> if it breaks the developer uses --download-p4pdes on their system > > > >>> they fix the problem either by fixing PETSc or what is downloaded > > > >>> from the p4pde fork > > > >>> if the fix is in the p4pdes fork they make a branch in the > > > >>> p4pdes fork, which they already have since they used > > > >>> --download-p4pdes and thus > > > >>> have the fork on their system > > > >>> they put the fix in new branch in the p4pdes fork and push > > > >>> it > > > >>> they edit p4pdes.py and put a new commit in it pointing to > > > >>> their branch in the fork > > > >>> run the pipeline again > > > >>> if fails with p4pdes they do the above again > > > >>> else the PETSc branch gets accepted and merged to master > > > >>> depending on Ed's choice we make an MR for p4pdes depending > > > >>> on the agreed upon cycle. If Ed puts the fix into his master then we > > > >>> just update > > > >>> our fork with his latest master with a simple merge of > > > >>> his. This will then become the new one we test against. If Ed > > > >>> doesn't respond to the MR all is fine we > > > >>> just continue on our fork. If he puts other things in > > > >>> his branch but not our MR we just merge that into our fork and so are > > > >>> still testing with his latest master. > > > >> > > > >> As you've enumerated here, this requires three MRs per change: > > > >> > > > >> 1. in PETSc with the actual change and to point at a p4pdes commit > > > >> 2. in p4pdes (Ed's or our fork) to implement needed changes (note: > > > >> this can't merge until #1 merges) > > > >> 3. in PETSc to point at the merge commit of p4pdes (after #2 merges) > > > >> > > > >> With subtree, there is only one MR and it's in the PETSc repository. > > > >> > > > >> Recall that we exchanged hundreds of emails to subtree in BuildSystem > > > >> long ago, then quite a lot more to subtree in petsc4py recently, and > > > >> now we're doing it again. > > > >> > > > >> The arguments against subtreeing p4pdes are > > > >> > > > >> 1. We want an extra hurdle so developers think twice before changing > > > >> those interfaces > > > >> 2. We want to drive traffic to Ed's repo > > > >> 3. The repository is so big we don't want every `git clone > > > >> gitlab:petsc/petsc` to include it > > > >> > > > >> This repo is small so I'm most sympathetic to #2, but nobody has made > > > >> these arguments yet. > > > >> > > > >>> This happens to be the exact same thing I do now with slepc and any > > > >>> other git based package that I use now (except for some I need to > > > >>> make my own fork of the external package because we don't have one > > > >>> always available eventually likely we will likely put more forks into > > > >>> gitlab/petsc so each person doesn't constantly need to make fresh > > > >>> forks of external packages) > > > >>> > > > >>> If subtree requires fewer steps than this and has no new oddball git > > > >>> subtree commands then we should definitely use subtree, if it > > > >>> requires other steps involving subtree we need see those commands > > > >>> written in a workflow like I have written above and decide if it is > > > >>> still simpler or not. > > > >>> > > > >>> I am not rejecting subtree, we just need to explicitly see its > > > >>> complete work flow and hence the differences to decide, that is what > > > >>> I asking for. > > > >>> > > > >>> Barry > > > >>> > > > >>> It seems to me that with subtree we also need to maintain a fork of > > > >>> Ed's stuff or else that will have a circular dependency. But perhaps > > > >>> I do not understand it. > > > >> > > > >> Such a fork would only be used once per release to submit PRs to Ed. > > > >> > > > > > > > > > > >