Re: [petsc-dev] [petsc-users] new book introducing PETSc for PDEs

Satish Balay via petsc-dev Sun, 01 Nov 2020 11:04:01 -0800

BTW: wrt fork (clone) vs subtree  (I'll have to read 
https://medium.com/@porteneuve/mastering-git-subtrees-943d29a798ec again) - the 
way I understand it is:


One can do push/pull/merge with subtree that are equivalent to push/pull/merge 
with a clone.

i.e., with a local clone one does [the branch is implicit here, so let's stick 
with the default master branch]:

git clone repo-URL
cd repo
<edit>
git commit
git pull
git push

with subtree one does:

cd petsc
git subtree add --prefix=location repo-url [branch]
<edit>
git commit
git subtree pull --prefix=location repo-url [branch]
git subtree push --prefix=location repo-url [branch]

i.e. the complexity of branch/merge issues doesn't go away - subtree just avoids 
the extra repo [and avoids keeping 2 commits in sync in 2 different repos]
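
A minimal, runnable sketch of the subtree sequence above, assuming git with the
subtree command installed. The repo names (upstream.git, seed, host), the
prefix 'examples' and the file ex1.c are hypothetical stand-ins, not the real
petsc or p4pdes layout. It only shows the happy path on a toy history.

```shell
#!/bin/sh
# Sketch only: all repo names, the prefix, and the file are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway "upstream" examples repo (bare), seeded with one commit on master.
git init -q --bare upstream.git
git -C upstream.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo 'v1' > seed/ex1.c
git -C seed add ex1.c
git -C seed commit -q -m 'upstream: add ex1.c'
git -C seed push -q "$work/upstream.git" master

# Throwaway "host" repo standing in for petsc.
git init -q host
git -C host symbolic-ref HEAD refs/heads/master
cd host
echo 'host' > README
git add README
git commit -q -m 'host: initial commit'

# Import upstream under examples/ (run from the top level of the host repo).
git subtree add --prefix=examples "$work/upstream.git" master

# <edit> + commit: an ordinary commit in the host repo.
echo 'v2' > examples/ex1.c
git commit -q -am 'examples: update ex1.c'

# Sync in both directions, the counterpart of pull/push in a clone.
# Upstream has nothing new here, so the pull is a no-op.
git subtree pull --prefix=examples "$work/upstream.git" master
git subtree push --prefix=examples "$work/upstream.git" master
```

After the push, upstream's master carries the edit as a commit that touches
only ex1.c at the root of that repo, while the host repo needed just the one
commit.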

Notes:
- wrt petsc4py we don't have a 'subtree pull' in the workflow
- and 'subtree push' is currently broken - so petsc4py workflow is currently 
broken.

Also note: with the previous petsc4py workflow we had:
 - make MR to petsc4py
 - use this commit with --download-petsc4py in petsc [in MR branch]
 - do not merge petsc-MR until petsc4py-MR is merged [this way - any revisions 
in petsc4py-MR - that will change the hash can be updated in petsc-MR]

With the current discussion - the workflow for examples is more complicated - 
i.e. the changes petsc developers make and the changes that get accepted 
upstream can be different [perhaps temporarily] - so there is extra complexity 
here where local changes (fork or subrepo) are not in sync with upstream 
changes. One way to sync is:
- make local changes (and use them with petsc CI)
- make a separate MR to upstream (revise and update the MR - as needed for 
acceptance). Note: I'm not sure if there is a workflow to test these revisions 
in petsc CI
- once merged to the upstream example repo - sync [subtree, or fork] - either 
with a merge - or by reverting the local changes and pulling the upstream 
changes.
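
A minimal sketch of the last step for the fork case, using the "revert local
changes, pull upstream changes" option. The repos (upstream, fork) and the
file ex1.c are hypothetical throwaway stand-ins. It assumes upstream accepted
a revised version of the local fix rather than the local commit itself.

```shell
#!/bin/sh
# Sketch only: 'upstream', 'fork', and ex1.c are hypothetical stand-ins.
set -e
demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Upstream examples repo with one commit on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo 'original' > upstream/ex1.c
git -C upstream add ex1.c
git -C upstream commit -q -m 'add ex1.c'

# Our fork: a local fix that petsc CI would test against.
git clone -q upstream fork
echo 'local fix' > fork/ex1.c
git -C fork commit -q -am 'local fix for petsc CI'

# Upstream accepts a revised version of the fix (after MR review).
echo 'revised fix' > upstream/ex1.c
git -C upstream commit -q -am 'revised fix'

# Sync: revert the local change first, then pull upstream.
cd fork
git revert --no-edit HEAD
git pull -q --no-rebase --no-edit origin master
cat ex1.c
```

In this toy history a plain merge without the revert would conflict, since
both sides changed the same line. Reverting first lets the pull merge cleanly
and leaves the fork with upstream's content.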

So a simpler model is [as Ed referred to]: use upstream only, and have the CI 
job use (allow_failure: true) - and then Ed can fix the issues as he sees fit 
[and there will be CI errors until the fix is in]. And the petsc-MR is not held 
back (even with fundamental failures)

Satish

On Sun, 1 Nov 2020, Satish Balay wrote:

> I should add - we should:
> 
> - ignore git subtree or a fork
> - assume one (or more) petsc developers have write access to create branches 
> in upstream repo
> - figure out workflow for the requirements [and current constraints wrt book].
> 
> - a fork primarily helps with write access [wrt branches, when no write 
> access is available for upstream repo] - there is a mirror sync cost
> - a subtree [assuming it's not buggy] - helps with a single commit workflow 
> [vs multiple commits in different repos - and keeping them in sync]
> - [a default is neither - using upstream repo directly]
> 
> And in all cases there is a similar synchronization cost between upstream repo 
> and fork/subtree [this depends on the requirements - and current constraints]
> 
> Satish
> 
> On Sun, 1 Nov 2020, Satish Balay via petsc-dev wrote:
> 
> > On Sun, 1 Nov 2020, Barry Smith wrote:
> > 
> > > 
> > > 
> > > > On Nov 1, 2020, at 10:30 AM, Satish Balay <ba...@mcs.anl.gov> wrote:
> > > > 
> > > > Just a note: subtree workflow is currently broken with petsc4py
> > > > 
> > > > We've imported petsc4py using subtree - but now we are unable to export 
> > > > back changes in petsc repo to the standalone petsc4py repo due to git 
> > > > subtree errors.
> > > > 
> > > > I spent some time on workarounds - but didn't go anywhere.
> > > > 
> > > > [likely this is a non-issue with this examples repo workflow - as there 
> > > > is no direct push back to upstream repo]
> > > > 
> > > > And with subtree - not all developers need to know about it. Only the 
> > > > person who is syncing the 2 repos.. [if they know how to sync/resolve 
> > > > conflicts]
> > > 
> > >   Since each person is responsible for their branch getting merged, 
> > > doesn't this mean anyone who changes something in PETSc that requires a 
> > > change in p4pdes has to fix it via the gittree process (that is, they have 
> > > to sync the 2 repos) and hence know how to sync 2 repos via gittree?
> > 
> > 
> > For one, it's not clear what problem is being addressed here [as mentioned 
> > in my other e-mail.]
> > 
> > From Ed's e-mail - my understanding is - whatever changes petsc developers 
> > might have - might not be acceptable.
> > 
> > So I think first we need to figure out what is acceptable and not - and 
> > then figure out the workflow.
> > 
> > " Since each person is responsible for their branch getting merged, "
> > 
> > So it's not clear to me if this fits with what Ed mentioned. [wrt his goals 
> > of keeping the examples in sync with the book]
> > 
> > BTW I should note: the current process [which I don't understand 
> > completely] looks more complicated than what we had with petsc4py - which 
> > you were against - and raised at every opportunity - until 
> > petsc4py was imported into petsc (using subtree)
> > 
> > Satish
> > 
> > > 
> > >   Until I see the entire gittree workflow I will not understand.
> > > 
> > >   Barry
> > > 
> > > > 
> > > > Satish
> > > > 
> > > > On Sun, 1 Nov 2020, Jed Brown wrote:
> > > > 
> > > >> Barry Smith <bsm...@petsc.dev> writes:
> > > >> 
> > > > >>>>> git subtree does require use of new commands every time you mess 
> > > > >>>>> with it (say every three months) that we do not know and since each 
> > > > >>>>> of us will do this infrequently it is likely we will not remember 
> > > > >>>>> them (I won't) while my approach does not require remembering new 
> > > > >>>>> commands. My approach only requires branching, push, and pulling on 
> > > > >>>>> the fork and updating EdsRepo.py which is things all the developers 
> > > > >>>>> know about and do regularly since we update other external packages.
> > > >>> 
> > > >>>>> Circular dependencies (PETSc CI depends on Ed's p4pdes depends on 
> > > >>>>> PETSc) are significantly more labor-intensive and it would need to 
> > > >>>>> be done on each change, versus once per release cycle.
> > > >>> 
> > > > >>>    I think we are not communicating on exactly the same wavelength. 
> > > > >>> 
> > > > >>>    I am not advocating we test on Ed's; I am advocating we test on our 
> > > > >>> fork of Ed's, and how often our fork gets passed to Ed with an MR is a 
> > > > >>> choice all of us make together. We could do it once at release 
> > > > >>> time with all the fixes we made, every month, or each time. Totally up 
> > > > >>> to Ed how often he wants to get them, and he is free to ignore them 
> > > > >>> for weeks (up to the next release if he likes) since our testing will 
> > > > >>> continue because we use our fork. How often we pass them on to Ed is 
> > > > >>> not related to how we maintain our CI.
> > > >>> 
> > > >>>    There is no circular dependency on my approach with Ed's p4pdes.
> > > >> 
> > > >> The circular dependency is on our fork of p4pdes.
> > > >> 
> > > > >>>    I vote to fix things in our fork or gittree thing continuously, 
> > > > >>> since that is easier than waiting until the release to find and fix 
> > > > >>> everything, and it also helps tell us if we introduced a real bug 
> > > > >>> into PETSc so we can fix PETSc immediately instead of waiting up to 
> > > > >>> 6 months, just like we now test immediately with petsc4py and 
> > > > >>> should do with SLEPc.  How often we give the updates to Ed is a 
> > > > >>> completely different issue. 
> > > >>> 
> > > > >>>   So again, back to my original statement: it comes down to whether 
> > > > >>> the subtree or the fork approach is easier for all the PETSc developers 
> > > > >>> who do not currently know gittree and would need to learn it with 
> > > > >>> your approach. I don't know which is easier: learning to use gittree, 
> > > > >>> which has its own gotchas, or using mine, which we all know but may 
> > > > >>> require an extra step (not involving Ed, just updating the p4pdes.py 
> > > > >>> commit each time we change something in the fork).
> > > >>> 
> > > > >>>  I think using --download-p4pdes on a couple of systems in the CI is 
> > > > >>> enough; I don't think we need to put it in all CI pipelines (I would 
> > > > >>> like slepc in all pipelines), but we could put it in all pipelines if 
> > > > >>> we want.
> > > >>> 
> > > > >>>  For completeness I show the exact workflow for my suggestion
> > > > >>> 
> > > > >>>   pipeline uses --download-p4pdes and runs its tests
> > > > >>>   if it breaks the developer uses --download-p4pdes on their system
> > > > >>>     they fix the problem either by fixing PETSc or what is downloaded
> > > > >>>     from the p4pdes fork
> > > > >>>        if the fix is in the p4pdes fork they make a branch in the
> > > > >>>        p4pdes fork, which they already have on their system since
> > > > >>>        they used --download-p4pdes
> > > > >>>            they put the fix in a new branch in the p4pdes fork and
> > > > >>>            push it
> > > > >>>            they edit p4pdes.py and put a new commit in it pointing to
> > > > >>>            their branch in the fork
> > > > >>>   run the pipeline again
> > > > >>>       if it fails with p4pdes they do the above again
> > > > >>>       else the PETSc branch gets accepted and merged to master
> > > > >>>          depending on Ed's choice we make an MR for p4pdes on the
> > > > >>>          agreed upon cycle. If Ed puts the fix into his master then
> > > > >>>          we just update our fork with his latest master with a simple
> > > > >>>          merge of his. This will then become the new one we test
> > > > >>>          against. If Ed doesn't respond to the MR all is fine, we just
> > > > >>>          continue on our fork. If he puts other things in his branch
> > > > >>>          but not our MR we just merge that into our fork and so are
> > > > >>>          still testing with his latest master.
> > > >> 
> > > >> As you've enumerated here, this requires three MRs per change:
> > > >> 
> > > >>  1. in PETSc with the actual change and to point at a p4pdes commit
> > > >>  2. in p4pdes (Ed's or our fork) to implement needed changes (note: 
> > > >> this can't merge until #1 merges)
> > > >>  3. in PETSc to point at the merge commit of p4pdes (after #2 merges)
> > > >> 
> > > >> With subtree, there is only one MR and it's in the PETSc repository.
> > > >> 
> > > >> Recall that we exchanged hundreds of emails to subtree in BuildSystem 
> > > >> long ago, then quite a lot more to subtree in petsc4py recently, and 
> > > >> now we're doing it again.
> > > >> 
> > > >> The arguments against subtreeing p4pdes are
> > > >> 
> > > >>  1. We want an extra hurdle so developers think twice before changing 
> > > >> those interfaces
> > > >>  2. We want to drive traffic to Ed's repo
> > > >>  3. The repository is so big we don't want every `git clone 
> > > >> gitlab:petsc/petsc` to include it
> > > >> 
> > > >> This repo is small so I'm most sympathetic to #2, but nobody has made 
> > > >> these arguments yet.
> > > >> 
> > > > >>>  This happens to be the exact same thing I do now with slepc and any 
> > > > >>> other git based package that I use (except that for some I need to 
> > > > >>> make my own fork of the external package because we don't have one 
> > > > >>> always available; eventually we will likely put more forks into 
> > > > >>> gitlab/petsc so each person doesn't constantly need to make fresh 
> > > > >>> forks of external packages).
> > > >>> 
> > > > >>>  If subtree requires fewer steps than this and has no new oddball git 
> > > > >>> subtree commands then we should definitely use subtree; if it 
> > > > >>> requires other steps involving subtree we need to see those commands 
> > > > >>> written in a workflow like I have written above and decide if it is 
> > > > >>> still simpler or not.
> > > > >>> 
> > > > >>>  I am not rejecting subtree, we just need to explicitly see its 
> > > > >>> complete workflow and hence the differences to decide; that is what 
> > > > >>> I am asking for.
> > > >>> 
> > > >>>  Barry
> > > >>> 
> > > >>>  It seems to me that with subtree we also need to maintain a fork of 
> > > >>> Ed's stuff or else that will have a circular dependency. But perhaps 
> > > >>> I do not understand it.
> > > >> 
> > > >> Such a fork would only be used once per release to submit PRs to Ed.
> > > >> 
> > > > 
> > > 
> > 
> 
>
