We should come to a decision soon on git-flow. Here is a small piece of a conversation between Nikolay and me regarding the CUDA branch and workflow:

nsakharnykh [12:07 PM]
so the plan is to squash CUDA branch and integrate into master? After we check all items in the epic?

apalumbo [5:18 PM]
^^ actually there is still an unresolved question of using git-flow. If we use that then yes, we'll squash and commit all of the current `CUDA` branch to `develop` rather than master.

We could also just squash and commit our current, initial branch to `CUDA`, then to `develop` as MAHOUT-200X, delete the CUDA branch, and continue on from there, with new branches for each issue (e.g. MAHOUT-2003), deleting each branch upon committing. This way our work will make it into the nightly SNAPSHOTS.

If we're going to use this, then a 0.13.1-release branch should be created, so that we can check our work into develop as we go.

--andy

________________________________
From: Andrew Palumbo <ap....@outlook.com>
Sent: Saturday, July 15, 2017 7:20:50 PM
To: dev@mahout.apache.org
Subject: Re: Proposal for changing Mahout's Git branching rules

Bumping this thread now that we have JIRA off this list; also showing an empirical (what I find to be positive) use case for Mahout git-flow:

We'd decided to use a git-flow branch management strategy a couple of months back, just before Nikolay and I started working on CUDA integration. At first I was undecided whether git-flow made sense for a project of our size. It can take a minute to get used to, and with our multiple repos and forks, I wasn't sure if it really mattered either way.

I created a `git-wip-us.apache.org/mahout/CUDA` feature branch (should have been MAHOUT-2002). Around that time we found an urgent need for Scala 2.11 and Spark > 1.6 capabilities. I was unable to work on the release at the time, and Trevor and Andrew have the release almost down to a few scripts (working from master after, I believe, decisions swung back to *not* using git-flow).

Trevor is currently running the release, with a code freeze in place, while Nikolay and I can continue to commit to the feature branch, which is not going into the current release. When CUDA integration is finished, we can merge `git-wip-us.apache.org/mahout/CUDA` into master (develop).

Long story short, a release doesn't have to stop the whole team from working on new features (slated for later releases).

The couple of minor issues that I see are:

1) If we decide to go with a real git-flow strategy, we should be merging our feature branches into `develop`, which will be at the least a minor Jenkins headache (we'll need to publish our nightlies off of `develop`), and probably something else Jenkins/Hudson related. But Travis has really come a long way in the last year or so, and Trevor recently set it up to test the Spark module in pseudo-cluster mode.

2) As Dmitriy pointed out, I believe, unless we can configure ASF Github to open PRs by default against the `develop` branch, I can almost guarantee that we will have new contributors opening PRs against `master`.

Other than that, I'm not sure of any huge issues or learning curve that we'd need to overcome.

my .02

--andy

________________________________
From: Pat Ferrel <p...@occamsmachete.com>
Sent: Friday, June 23, 2017 11:23:08 AM
To: dev@mahout.apache.org
Subject: Re: Proposal for changing Mahout's Git branching rules

I don’t know where to start here. Git flow does not address the merge conflict problems you talk about. They have nothing to do with the process and are made no easier or harder by following it. The only thing I can comment on is that PredictionIO sets “develop” as the default branch so PRs are always against that, making absolutely no difference in convenience to contributors. And since we should soon be able to use the shiny green merge button on github, the process will be quite smooth and far less dangerous since master is not affected. Note that this is from experience, not hypotheticals.
PIO has a mess of dependency combinations, even worse than Mahout, and we’ve found that following this makes a hard job at least contained. Merging will always be hard but that’s why we get the big bucks ;-) Contributors voted to use the process on PIO just like committers, and something like 6 have since graduated to committer status over the last 6 months. I’d be happy to put anyone in touch with them if you want to see what they think.

On Jun 22, 2017, at 4:21 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

and contributors' convenience should be golden IMO. I remember experiencing a mild irritation when I was asked to resolve the conflicts on Spark PRs because I felt they arose solely because the committer was taking too long to review my PR and OK it. But if it were resulting from the project not following the simple KISS github PR workflow, it probably would be a bigger turn-off.

And then imagine the overhead of explaining to every newcomer that they should, and why they should, be PRing not against master but something else when every other ASF project accepts PRs against master... I dunno... when working on github, any deviation from commonly accepted github PR flows would IMO be a fatal wound to the process.

On Thu, Jun 22, 2017 at 4:13 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> should read
>
> And then you will face the dilemma whether to ask people to resolve merge
> issues w.r.t. *dev* and resubmit against *dev*, which will result in high
> contributors' attrition, or resolve them yourself without deep knowledge of
> the author's intent, which will result in delays and plain errors.
>
> On Thu, Jun 22, 2017 at 2:48 PM, Dmitriy Lyubimov <dlie...@gmail.com>
> wrote:
>
>> On Wed, Jun 21, 2017 at 3:00 PM, Pat Ferrel <p...@occamsmachete.com>
>> wrote:
>>
>>> Which is an optional part of git flow but maybe take a look at a better
>>> explanation than mine:
>>> http://nvie.com/posts/a-successful-git-branching-model/
>>>
>>> I still don’t see how this complicates resolving conflicts. It just
>>> removes the resolution from being a blocker. If some conflict is pushed to
>>> master the project is dead until it is resolved (how often have we seen
>>> this?)
>>
>> This is completely detached from github reality.
>>
>> In this model, all contributors actually work on the same branch. In
>> github, every contributor will fork off their own dev branch.
>>
>> In this model, people start with a fork off the dev branch and push to
>> dev branch. In github, a contributor will fork off the master branch and
>> will PR against master branch. This is default behavior and my gut feeling
>> is that no amount of forewarning is going to change that w.r.t.
>> contributors. And if one starts off his/her work with one branch with
>> intent to commit to another, then conflict is guaranteed every time he or
>> she changes a file that has been changed on the branch to be merged to.
>>
>> For example:
>> Master is at A
>> Dev branch is at A - B - C ... F.
>>
>> If I start working at master (A) then I will generate conflicts if I have
>> changed the same files (lines) as in B, C, ... or F.
>>
>> If I start working at dev (F) then I will not have a chance to generate
>> conflicts with B, C, ... F but only with commits that happened after I had
>> started.
>>
>> Also, if I start working at master (A) then the github flow will suggest
>> that I merge into master during the PR. I guarantee 100% of first-time PRs
>> will trip on that in github, even if you put "start your work off dev not
>> master" 20 times into the project readme.
>>
>> And then you will face the dilemma whether to ask people to resolve merge
>> issues w.r.t. master and resubmit, which will result in high contributors'
>> attrition, or resolve them yourself without deep knowledge of the author's
>> intent, which will result in delays and plain errors.
>>
>> -d
>>
>
>
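
Dmitriy's "Master is at A, Dev branch is at A - B - C ... F" example in the quoted message can be sketched as a small model. This is only an illustration under stated assumptions: the `Commit` type, the `potential_conflicts` helper, and every file name below are hypothetical and do not come from the Mahout repository, and two commits are treated as potentially conflicting whenever they touch the same file. Real git conflicts depend on overlapping lines, so the model over-approximates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical stand-in for a commit: a name plus the files it touches."""
    name: str
    files: frozenset


def potential_conflicts(target_history, base_name, changed_files):
    """Return the names of commits on the target branch, made after the
    base commit, that touch any file the contribution also changes.

    File-level overlap is an assumption of this sketch; git itself only
    reports a conflict when the same lines were changed.
    """
    names = [c.name for c in target_history]
    base_index = names.index(base_name)  # ValueError if base is not on target
    changed = frozenset(changed_files)
    return [c.name for c in target_history[base_index + 1:] if c.files & changed]


# Hypothetical history: master is at A, develop is at A - B - C - D - E - F.
develop = [
    Commit("A", frozenset({"pom.xml"})),
    Commit("B", frozenset({"math/Matrix.scala"})),
    Commit("C", frozenset({"spark/Driver.scala"})),
    Commit("D", frozenset({"README.md"})),
    Commit("E", frozenset({"math/Matrix.scala", "pom.xml"})),
    Commit("F", frozenset({"cuda/Kernel.scala"})),
]

# A contributor edits one file and opens a PR that must land on develop.
my_changes = {"math/Matrix.scala"}

from_master = potential_conflicts(develop, "A", my_changes)   # forked at A
from_develop = potential_conflicts(develop, "F", my_changes)  # forked at F

print(from_master)   # ['B', 'E']
print(from_develop)  # []
```

In this toy history, work forked from master (A) can collide with B and E, while work forked from the tip of develop (F) can only collide with commits that land after it started, which is the asymmetry the example describes.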