I should add - we should:

- ignore git subtree or a fork
- assume one (or more) petsc developers have write access to create
  branches in upstream repo
- figure out workflow for the requirements [and current constraints wrt
  book].

- a fork primarily helps with write access [wrt branches, when no write
  access is available for upstream repo] - there is a mirror sync cost
- a subtree [assuming it's not buggy] - helps with a single commit
  workflow [vs multiple commits in different repos - and keeping them in
  sync]
- [a default is neither - using upstream repo directly]

And in all cases there is similar synchronization cost between upstream
repo and fork/subtree [this depends on the requirements - and current
constraints]

Satish

On Sun, 1 Nov 2020, Satish Balay via petsc-dev wrote:

> On Sun, 1 Nov 2020, Barry Smith wrote:
>
> >
> >
> > > On Nov 1, 2020, at 10:30 AM, Satish Balay <ba...@mcs.anl.gov> wrote:
> > >
> > > Just a note: subtree workflow is currently broken with petsc4py
> > >
> > > We've imported petsc4py using subtree - but now we are unable to export
> > > back changes in petsc repo to the standalone petsc4py repo due to git
> > > subtree errors.
> > >
> > > I spent some time on workarounds - but didn't go anywhere.
> > >
> > > [likely this is a non-issue with this examples repo workflow - as there
> > > is no direct push back to upstream repo]
> > >
> > > And with subtree - not all developers need to know about it. Only the
> > > person who is syncing the 2 repos.. [if they know how to sync/resolve
> > > conflicts]
> >
> > Since each person is responsible for their branch getting merged, doesn't
> > this mean anyone who changes something in PETSc that requires a change in
> > p4pdes has to fix it via gittree process (that is, they have to sync the 2
> > repos?) and hence know how to sync 2 repos via gittree.
>
>
> For one it's not clear what problem is being addressed here [as mentioned in
> my other e-mail.]
>
> From Ed's e-mail - my understanding is - whatever changes petsc developers
> might have - might not be acceptable.
>
> So I think first we need to figure out what is acceptable and not - and then
> figure out the workflow.
>
> " Since each person is responsible for their branch getting merged, "
>
> So it's not clear to me if this fits with what Ed mentioned. [wrt his goals of
> keeping the examples in sync with the book]
>
> BTW I should note: the current process [which I don't understand completely]
> looks more complicated than what we had with petsc4py - which you were
> against - and raised this at every opportunity - until petsc4py was
> imported into petsc (using subtree)
>
> Satish
>
>
> > Until I see the entire gittree workflow I will not understand.
> >
> > Barry
> >
> > >
> > > Satish
> > >
> > > On Sun, 1 Nov 2020, Jed Brown wrote:
> > >
> > >> Barry Smith <bsm...@petsc.dev> writes:
> > >>
> > >>>>> git subtree does require use of new commands every time you mess with
> > >>>>> it (say every three months) that we do not know and since each of
> > >>>>> us will do this infrequently it is likely we will not remember them
> > >>>>> (I won't) while my approach does not require remembering new
> > >>>>> commands.
> > >>>>> My approach only requires branching, push, and pulling on the fork
> > >>>>> and updating EdsRepo.py which is things all the developers know about
> > >>>>> and do regularly since we update other external packages.
> > >>>
> > >>>>> Circular dependencies (PETSc CI depends on Ed's p4pdes depends on
> > >>>>> PETSc) are significantly more labor-intensive and it would need to be
> > >>>>> done on each change, versus once per release cycle.
> > >>>
> > >>> I think we are not communicating on exactly the same wave length.
> > >>>
> > >>> I am not advocating we test on Ed's I am advocating we test on our
> > >>> fork of Ed's and how often our fork gets passed to Ed with MR is a
> > >>> choice all of us make together, now we could do it once at release time
> > >>> with all the fixes we made, every month or each time.
> > >>> Totally up to Ed how often he wants to get them, and he is free to
> > >>> ignore them for weeks (up to the next release if he likes) since our
> > >>> testing will continue because we use our fork. How often we pass them
> > >>> on to Ed is not related to how we maintain our CI.
> > >>>
> > >>> There is no circular dependency on my approach with Ed's p4pdes.
> > >>
> > >> The circular dependency is on our fork of p4pdes.
> > >>
> > >>> I vote to fix things in our fork or gittree thing continuously since
> > >>> it makes it easier to fix things rather than wait to the release when
> > >>> we try to find and fix everything and it also helps tell us if we
> > >>> introduced a real bug into PETSc and fix PETSc immediately instead of
> > >>> waiting up to 6 months, just like we now test immediately with
> > >>> Petsc4py and we should do with SLEPc. How often we give the updates to
> > >>> Ed is a completely different issue.
> > >>>
> > >>> So again back to my original statement, it comes down to if the
> > >>> subtree or the fork approach is easier for all the PETSc developers who
> > >>> do not currently know gittree and would need to learn it with your
> > >>> approach. I don't know which is easier: learning to use gittree which
> > >>> has its own gotchas or using mine which we all know but may require an
> > >>> extra step (not involving Ed, just updating the p4pdes.py commit each
> > >>> time we change something in the fork.)
> > >>>
> > >>> I think using --download-p4pdes on a couple of systems in the CI is
> > >>> enough, I don't think we need to put it in all CI pipelines (I would like
> > >>> slepc in all pipelines), but we could put it in all pipelines if we want.
> > >>>
> > >>> For completeness I show the exact work flow for my suggestion
> > >>>
> > >>>   pipelines use --download-p4pdes and run its tests
> > >>>   if it breaks the developer uses --download-p4pdes on their system
> > >>>     they fix the problem either by fixing PETSc or what is downloaded
> > >>>     from the p4pdes fork
> > >>>     if the fix is in the p4pdes fork they make a branch in the
> > >>>     p4pdes fork, which they already have since they used
> > >>>     --download-p4pdes and thus have the fork on their system
> > >>>       they put the fix in a new branch in the p4pdes fork and push it
> > >>>       they edit p4pdes.py and put a new commit in it pointing to
> > >>>       their branch in the fork
> > >>>     run the pipeline again
> > >>>     if it fails with p4pdes they do the above again
> > >>>     else the PETSc branch gets accepted and merged to master
> > >>>   depending on Ed's choice we make an MR for p4pdes depending on
> > >>>   the agreed upon cycle. If Ed puts the fix into his master then we
> > >>>   just update our fork with his latest master with a simple merge of
> > >>>   his. This will then become the new one we test against. If Ed doesn't
> > >>>   respond to the MR all is fine, we just continue on our fork. If he
> > >>>   puts other things in his branch but not our MR we just merge that
> > >>>   into our fork and so are still testing with his latest master.
> > >>
> > >> As you've enumerated here, this requires three MRs per change:
> > >>
> > >> 1. in PETSc with the actual change and to point at a p4pdes commit
> > >> 2. in p4pdes (Ed's or our fork) to implement needed changes (note: this
> > >>    can't merge until #1 merges)
> > >> 3. in PETSc to point at the merge commit of p4pdes (after #2 merges)
> > >>
> > >> With subtree, there is only one MR and it's in the PETSc repository.
> > >>
> > >> Recall that we exchanged hundreds of emails to subtree in BuildSystem
> > >> long ago, then quite a lot more to subtree in petsc4py recently, and now
> > >> we're doing it again.
> > >>
> > >> The arguments against subtreeing p4pdes are
> > >>
> > >> 1. We want an extra hurdle so developers think twice before changing
> > >>    those interfaces
> > >> 2. We want to drive traffic to Ed's repo
> > >> 3. The repository is so big we don't want every
> > >>    `git clone gitlab:petsc/petsc` to include it
> > >>
> > >> This repo is small so I'm most sympathetic to #2, but nobody has made
> > >> these arguments yet.
> > >>
> > >>> This happens to be the exact same thing I do now with slepc and any
> > >>> other git based package that I use now (except for some I need to make
> > >>> my own fork of the external package because we don't have one always
> > >>> available; eventually we will likely put more forks into
> > >>> gitlab/petsc so each person doesn't constantly need to make fresh forks
> > >>> of external packages)
> > >>>
> > >>> If subtree requires fewer steps than this and has no new oddball git
> > >>> subtree commands then we should definitely use subtree, if it requires
> > >>> other steps involving subtree we need to see those commands written in a
> > >>> workflow like I have written above and decide if it is still simpler or
> > >>> not.
> > >>>
> > >>> I am not rejecting subtree, we just need to explicitly see its
> > >>> complete work flow and hence the differences to decide, that is what I
> > >>> am asking for.
> > >>>
> > >>> Barry
> > >>>
> > >>> It seems to me that with subtree we also need to maintain a fork of
> > >>> Ed's stuff or else that will have a circular dependency. But perhaps I
> > >>> do not understand it.
> > >>
> > >> Such a fork would only be used once per release to submit PRs to Ed.
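
For reference, the fork-based steps quoted in this thread can be sketched with plain git commands. This is only a rough illustration under stated assumptions, not the actual PETSc setup: it uses throwaway local repositories, and the names `upstream`, `fork.git`, `dev`, `fix-for-petsc-change` and the pin file `p4pdes.commit` are hypothetical stand-ins (the pin file plays the role of the commit recorded in p4pdes.py).

```shell
#!/bin/sh
# Sketch of the fork workflow with throwaway local repositories.
# All repository, branch, and file names below are hypothetical stand-ins.
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

# "upstream": stand-in for Ed's p4pdes repository.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
echo "example v1" > example.c
git add example.c
git commit -q -m "initial example"

# "fork.git": stand-in for the PETSc-owned fork of p4pdes.
git clone -q --bare "$work/upstream" "$work/fork.git"

# Developer checkout of the fork (what a download step would leave on disk).
git clone -q "$work/fork.git" "$work/dev"
cd "$work/dev"

# Put the fix on a new branch in the fork and push it.
git checkout -q -b fix-for-petsc-change
echo "example v2 (adapted to a PETSc change)" > example.c
git commit -q -am "adapt example to PETSc change"
git push -q origin fix-for-petsc-change
fix_commit=$(git rev-parse HEAD)

# PETSc side: record the commit to test against
# (stand-in for updating the commit in p4pdes.py).
mkdir "$work/petsc"
echo "$fix_commit" > "$work/petsc/p4pdes.commit"

# Later: upstream moves on independently.
cd "$work/upstream"
echo "unrelated change" > notes.txt
git add notes.txt
git commit -q -m "upstream progress"

# Merge upstream's latest master into the fork, keep the fix, re-pin.
cd "$work/dev"
git checkout -q master
git pull -q --ff-only "$work/upstream" master
git merge -q --no-edit fix-for-petsc-change
git push -q origin master
merged_commit=$(git rev-parse HEAD)
echo "$merged_commit" > "$work/petsc/p4pdes.commit"
echo "pinned: $merged_commit"
```

Only everyday commands appear here (branch, commit, push, pull, merge); the extra cost relative to a single-repository change is the re-pinning step on the PETSc side, which is the step the three-MR count above refers to.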