I should add - we should:

- ignore git subtree or a fork
- assume one (or more) petsc developers have write access to create branches in 
upstream repo
- figure out workflow for the requirements [and current constraints wrt book].

- a fork primarily helps with write access [wrt branches, when no write access 
is available for upstream repo] - there is a mirror sync cost
- a subtree [assuming its not buggy] - helps with a single commit workflow [vs 
multiple commits in different repos - and keeping them in sync]
- [a default is neither - using upstream repo directly]

And in all cases there is similar synchronization cost between upstream repo 
and fork/subtree [this depends on the requirements - and current constranits]

Satish

On Sun, 1 Nov 2020, Satish Balay via petsc-dev wrote:

> On Sun, 1 Nov 2020, Barry Smith wrote:
> 
> > 
> > 
> > > On Nov 1, 2020, at 10:30 AM, Satish Balay <ba...@mcs.anl.gov> wrote:
> > > 
> > > Just a note: subtree workflow is currently broken with petsc4py
> > > 
> > > We've imported petsc4py using subtree - but now we are unable to export 
> > > back changes in petsc repo to the standalone petsc4py repo due to git 
> > > subtree errors.
> > > 
> > > I spent some time on workarounds - but didn't go anywhere.
> > > 
> > > [likely this is a non-issue with this examples repo workflow - as there 
> > > is no direct push back to upstream repo]
> > > 
> > > And with subtree - not all developers need to know about it. Only the 
> > > person who is syncing the 2 repos.. [if they know how to sync/resolve 
> > > conflicts]
> > 
> >   Since each person is reponsable for their branch getting merged, doesn't 
> > this mean anyone who changes something in PETSc that requires a change in 
> > p4pdes has to fix it via gittree process ( that is have have to sync the 2 
> > repose?) and hence now how to sync 2 repos via gittree.
> 
> 
> For one its not clear what problem is being addressed here [as mentioned in 
> my other e-mail.]
> 
> >From Ed's e-mail - my understanding is - whatever changes petsc developers 
> >might have - might not be acceptable.
> 
> So I think first we need to figure out what is acceptable and not - and then 
> figure out the workflow.
> 
> " Since each person is reponsable for their branch getting merged, "
> 
> So its not clear to me if this fits with what Ed mentioned. [wrt his goals of 
> keeping the examples in sync with the book]
> 
> BTW I should note: the current process [which I don't understand completely] 
> looks more complicated than what we had with petsc4py - which you were 
> against at - and raised this at  every opportunity - until petsc4py was 
> imported into petsc (using subtree)
> 
> Satish
> 
> > 
> >   Until I see the entire gittree workflow I will not understand.
> > 
> >   Barry
> > 
> > > 
> > > Satish
> > > 
> > > On Sun, 1 Nov 2020, Jed Brown wrote:
> > > 
> > >> Barry Smith <bsm...@petsc.dev> writes:
> > >> 
> > >>>>> git subtree does require use of new commands every time you mess with 
> > >>>>> it (say every three months) that we do not know and since each of 
> > >>>>> us will do this infrequently it is likely we will not remember them 
> > >>>>> (I won't)  while my approach does not require remembering new 
> > >>>>> commands. 
> > >>>>> My approach only requires branching, push, and pulling on the fork 
> > >>>>> and updating EdsRepo.py which is things all the developers know about 
> > >>>>> and 
> > >>>>> do regularly since we update other external packages.
> > >>> 
> > >>>>> Circular dependencies (PETSc CI depends on Ed's p4pdes depends on 
> > >>>>> PETSc) are significantly more labor-intensive and it would need to be 
> > >>>>> done on each change, versus once per release cycle.
> > >>> 
> > >>>    I think we are not communicating on exactly the  same wave length. 
> > >>> 
> > >>>    I am not advocating we test on Ed's I am advocating we test on our 
> > >>> fork of Ed's and how often our fork gets passed to Ed with MR is a 
> > >>> choice all of us make together, now we could do it once at release time 
> > >>> with all the fixes we made, every month or each time. Totally up to Ed 
> > >>> how often he wants to get them, and he is free to ignore them for weeks 
> > >>> (up to the next release if he likes) since our testing will continue 
> > >>> because we use our fork. How often we pass them on to Ed is not related 
> > >>> to how we maintain our CI.
> > >>> 
> > >>>    There is no circular dependency on my approach with Ed's p4pdes.
> > >> 
> > >> The circular dependency is on our fork of p4pdes.
> > >> 
> > >>>    I vote to fix things in our fork or gittree thing continuously since 
> > >>> it makes it easier to fix things rather than wait to the release when 
> > >>> we try to find and fix everything and it also helps tell us if we 
> > >>> introduced a real bug into PETSc and fix PETSc immediately instead of 
> > >>> waiting up to 6 months, just like we now we test immediately with 
> > >>> Petsc4py and we should do with SLEPc.  How often we give the updates to 
> > >>> Ed is a completely different issue. 
> > >>> 
> > >>>   So again back to my original statement ,it comes down to if the 
> > >>> subtree or the fork approach is easier for all the PETSc developers who 
> > >>> do not currently know gittree and would need to learn it with your 
> > >>> approach. I don't know which is easier learning to use gittree which 
> > >>> has its own gotcha's or using mine which we all know but may require an 
> > >>> extra step (not involving Ed, just updating the p4pdes.py commit each 
> > >>> time we change something in the fork.)
> > >>> 
> > >>>  I think using --download-p4pdes on a couple of systems in the CI is 
> > >>> enough, I don't think we need to put it all CI pipelines (I would like 
> > >>> slepc in all pipelines).but we could put in all pipelines if we want.
> > >>> 
> > >>>  For completeness I show the exact the work flow for my suggestion
> > >>> 
> > >>>   pipelines --download-p4pdes and runs its tests
> > >>>   if it breaks the developer uses --download-p4pdes  on their system 
> > >>>     they fix the problem either by fixing PETSc or what is downloaded 
> > >>> from the p4pde fork
> > >>>        if the fix is in the p4pdes fork they make a branch in the 
> > >>> p4pdes fork, which they already have since they used --download-p4pdes 
> > >>> and thus 
> > >>>                              have the fork on their system
> > >>>            they put the fix in new branch in the p4pdes fork and push it
> > >>>            they edit p4pdes.py and put a new commit in it pointing to 
> > >>> their branch in the fork
> > >>>   run the pipeline again
> > >>>       if fails with p4pdes they do the above again
> > >>>       else the PETSc branch gets accepted and merged to master
> > >>>          depending on Ed's choice we make an MR for p4pdes depending on 
> > >>> the agreed upon cycle. If Ed puts the fix into his master then we just 
> > >>> update
> > >>>               our fork with his latest master with a simple merge of 
> > >>> his.  This will then become the new one we test against. If Ed doesn't 
> > >>> respond to the MR all is fine we 
> > >>>              just continue on our fork. If he puts other things in his 
> > >>> branch but not our MR we just merge that into our fork and so are still 
> > >>> testing with his latest master.
> > >> 
> > >> As you've enumerated here, this requires three MRs per change:
> > >> 
> > >>  1. in PETSc with the actual change and to point at a p4pdes commit
> > >>  2. in p4pdes (Ed's or our fork) to implement needed changes (note: this 
> > >> can't merge until #1 merges)
> > >>  3. in PETSc to point at the merge commit of p4pdes (after #2 merges)
> > >> 
> > >> With subtree, there is only one MR and it's in the PETSc repository.
> > >> 
> > >> Recall that we exchanged hundreds of emails to subtree in BuildSystem 
> > >> long ago, then quite a lot more to subtree in petsc4py recently, and now 
> > >> we're doing it again.
> > >> 
> > >> The arguments against subtreeing p4pdes are
> > >> 
> > >>  1. We want an extra hurdle so developers think twice before changing 
> > >> those interfaces
> > >>  2. We want to drive traffic to Ed's repo
> > >>  3. The repository is so big we don't want every `git clone 
> > >> gitlab:petsc/petsc` to include it
> > >> 
> > >> This repo is small so I'm most sympathetic to #2, but nobody has made 
> > >> these arguments yet.
> > >> 
> > >>>  This happens to be the exact same thing I do now with slepc and any 
> > >>> other git based package that I use now (except for some I need to make 
> > >>> my own fork of the external package because we don't have one always 
> > >>> available eventually likely we will likely put more forks into 
> > >>> gitlab/petsc so each person doesn't constantly need to make fresh forks 
> > >>> of external packages)
> > >>> 
> > >>>  If subtree requires fewer steps than this and has no new oddball git 
> > >>> subtree commands then we should definitely use subtree, if it requires 
> > >>> other steps involving subtree we need see those commands written in a 
> > >>> workflow like I have written above and decide if it is still simpler or 
> > >>> not.
> > >>> 
> > >>>  I am not rejecting subtree, we just need to explicitly see its 
> > >>> complete work flow and hence the differences to decide, that is what I 
> > >>> asking for.
> > >>> 
> > >>>  Barry
> > >>> 
> > >>>  It seems to me that with subtree we also need to maintain a fork of 
> > >>> Ed's stuff or else that will have a circular dependency. But perhaps I 
> > >>> do not understand it.
> > >> 
> > >> Such a fork would only be used once per release to submit PRs to Ed.
> > >> 
> > > 
> > 
> 

Reply via email to