Commends inlined. On Fri, Jul 10, 2015 at 2:10 PM, Ismael Juma <ism...@juma.me.uk> wrote:
> Hi Guozhang, > > Comments inline. > > On Fri, Jul 10, 2015 at 8:47 PM, Guozhang Wang <wangg...@gmail.com> wrote: > > > > I have a couple of comments on the wiki pages / merge scripts: > > > > Thanks, it's important to discuss these things as the current version is > based on what the Spark project does and may not match what we want to do > exactly. > > 1. In the wiki page it mentions "If the change is new, then it usually > > needs a new JIRA. However, trivial changes, where "what should change" is > > virtually the same as "how it should change" do not require a JIRA. > > Example: "Fix typos in Foo scaladoc"." > > > > So it sounds we are going to allow pull requests without a JIRA ticket, > but > > later we are also mentioning: > > > > "The PR title should be of the form [KAFKA-xxxx].." which is a bit > > conflicting with the previous statement. Could you clarify it? Today our > > commits are mostly with JIRAs except some trivial ones that are only done > > by committers, I can either extend this to non-committers or not, just > that > > we need to make it clear. > > > > I agree that it's not very clear and it needs to be fixed. What do we want > to do though? It's a trade-off between consistency (always have a JIRA > ticket) and avoiding redundancy (skip the JIRA where it doesn't add > anything). The former is the more conservative option and I am happy to > update the documentation if that's the preferred option. > > Personally I think it is better to not enforcing a JIRA ticket for minor / hotfix commits, for example, we can format the title with [MINOR] [HOTFIX] etc as in Spark: https://github.com/apache/spark/commits/master But I think it we need some broader discussion for this, maybe we could start another email thread regarding this? > 2. The git commit message is a little verbose to me, for example: > > > > ------ > > > > commit ee88dbb67f19b787e12ccef37982c9459b78c7b6 > > > > Author: Geoff Anderson <ge...@confluent.io> > > > > Date: Thu Jul 9 14:58:01 2015 -0700 > > > > KAFKA-2327; broker doesn't start if config defines advertised.host but > > not advertised.port > > > > > > > > Added unit tests as well. These fail without the fix, but pass with > the > > fix. > > > > > > > > Author: Geoff Anderson <ge...@confluent.io> > > > > > > > > Closes #73 from granders/KAFKA-2327 and squashes the following > commits: > > > > > > > > 52a2085 [Geoff Anderson] Cleaned up unecessary toString calls > > > > 23b3340 [Geoff Anderson] Fixes KAFKA-2327 > > > > ------ > > > > > > The "Author" field is redundant, and we could probably also omit the > github > > commits. What do you think? > > > > Is it harmful to have detailed information? None of it is actually > redundant as far as I can see (but I could be missing something). There is > the squashed commit author, the pull request message, the pull request > author and the information about the individual commits. Even though Geoff > worked on this PR by himself, multiple people can collaborate on a feature > and then it's useful to credit correctly. > > The fact that we are keeping all the information encourages a style of > development where a few small commits (each commit should must make sense > on its own and pass the tests) are used in the pull request, which helps a > lot during code review and when trying to understand changes after the > fact. This is in contrast with the style where there is a single commit per > feature even when the feature is quite large (we have a few of those > ongoing at the moment). I don't know if you noticed, but GitHub actually > links to the original commits, which is quite handy: > > > https://github.com/apache/kafka/commit/ee88dbb67f19b787e12ccef37982c9459b78c7b6 > > This approach is a bit of a workaround, but it's easier as the script > handles everything (credit to the Spark project once again, it's their > code). The ideal scenario would be to keep the individual commits by > merging the branch into trunk after a rebase (no fast-forward, so that a > merge commit is maintained). This has a number of nice characteristics > (history is linear, it's clear what commits were part of the merged branch, > easy to bisect, easy to revert, etc.), but it requires a bit more work, > more Git knowledge and ideally a PR builder that builds and tests every > commit in the branch (Typesafe wrote a tool that does this for the Scala > project, for what is worth). In other words, a step too far for us right > now. :) > > What do you think? > > I think I'm sold on your comments, makes sense :) > Best, > Ismael > -- -- Guozhang