On Fri, 28 Jul 2017, Alfred M. Szmidt wrote:

> 1. The package has a public version control system.
>
> (Rationale: this ensures people can see what changed, just as with
> ChangeLogs, but can see *exactly* what changed rather than just the
> brief descriptions.)
>
> I think that rationale is incorrect, just because you have a public
> version control system does not mean that you can see what actually
> changed. Going through multiple megabytes of diffs is not feasible,
> and searching for when something was renamed, added, removed, etc is
> something no tool is capable of providing.

That's a function of a busy project and is the same whether you look at
commit logs, diffs or ChangeLog messages.

> 2. The version control uses a distributed version control system.
>
> (Rationale: this ensures people can get a complete copy of the
> history of what changed, as they can with ChangeLog files in
> releases.)
>
> How would the information that is normally available in a ChangeLog
> file be populated if all that information is in the VCS? That would
> still be needed for normal tarballs and the like when VCS is out the
> window.

I wouldn't object to shipping the version control history in tarballs,
if necessary to stop having to write in the ChangeLog format (or having
tarballs with and without the version control history). Or straight
copies of the version control logs, as long as no-one actually has to
manually write the list of files, named entities within those files and
what is changed in each named entity (instead just having human-written
logs describing what changed at the logical level). But I believe that
people wanting to look at the history are going to check out the
repository rather than attempting to get it from tarballs.

> 3. Commits are made for each logical change, not batched into a
> commit per release or per day or other such batching.
>
> (Rationale: this ensures as much separation of logically separate
> changes as there would be in a ChangeLog file.)
>
> This is I think a good idea, the bunching of ChangeLog entries always
> felt a bit weird.

I'd actually like points 1 and 3 (public VCS with logical commits) to be
required for all GNU packages, but that's independent of my present
point.

> 5. Commit messages describe the logical "what" changed (but don't
> necessarily describe the physical "what" at the level of changes to
> individual files and functions).
>
> (Rationale: the logical "what" is useful information at the human
> level for understanding the change.
> Listing individual changed files and functions both duplicates the
> information available from the version control system, and is at the
> wrong level for understanding the change for most purposes.
>
> I am not sure I understand why it is wrong, to be able to understand
> how something came to be one needs to look at how things changed --
> and the only way to do that is with a ChangeLog entry.

The only way to do that reliably is with the version control history.
Which is what people expect to use to look at how something came to be
as it is - they expect to check out a repository, not to look at a
tarball for that information.

> It's normal in glibc, for example, for a change to affect many
> separate files and named entities in those files, in ways that are
> repetitive but not repetitive enough to use e.g. "All callers
> changed", and which the ChangeLog format does not provide a good
> fit to or result in useful information about the changes not
> available from version control.)
>
> Knowing that all callers have been changed is I think useful
> information, why do you think the opposite?

The point is that the changes are mechanical, but not in a way that
corresponds to "all callers changed", and that listing all the named
entities changed and how they changed is error-prone, time-consuming
(possibly taking longer than writing the patch itself) and results in a
ChangeLog entry that is completely useless for people wanting to
understand the change (who will want the description at the logical
level, and if they want the exact details for each named entity, will
find the version control history more useful).

Here's a representative example of a ChangeLog entry I wrote recently.
I think that given the logical description (summary line plus two
paragraphs) in the commit message, and given the commit history for
anyone interested in the exact details of how particular files or
entities were changed, writing this ChangeLog entry was a complete waste
of time and it provides nothing useful for anyone using or developing
glibc. And this sort of mostly-mechanical change, with many files and
entities therein changed in similar but not identical ways, is very
common when working on glibc; I've written a great many such ChangeLog
entries, some much longer than this one with hundreds of named entities
enumerated as changed. And, similarly, for GCC changes. Spending the
time to write several paragraphs at the human level about the content
and purpose of a change is worthwhile. Spending the time to duplicate,
badly, the information in the diff itself about changed files and
entities therein is just an extra unnecessary hoop to jump through when
making a change.

2017-06-01  Joseph Myers  <[email protected]>

	[BZ #21457]
	* sysdeps/arm/sys/ucontext.h (NGREG): Rename to __NGREG and define
	NGREG to __NGREG if [__USE_MISC].
	(gregset_t): Define using __NGREG.
	(__ctx): New macro.
	(mcontext_t): Use __ctx in defining fields.
	* sysdeps/i386/sys/ucontext.h (NGREG): Rename to __NGREG and define
	NGREG to __NGREG if [__USE_MISC].
	(gregset_t): Define using __NGREG.
	(__ctx): New macro.
	(__ctxt): Likewise.
	(fpregset_t): Use __ctx and __ctxt in defining fields.
	(mcontext_t): Likewise.
	* sysdeps/m68k/sys/ucontext.h (NGREG): Rename to __NGREG and define
	NGREG to __NGREG if [__USE_MISC].
	(gregset_t): Define using __NGREG.
	(__ctx): New macro.
	(mcontext_t): Use __ctx in defining fields.
	* sysdeps/mips/sys/ucontext.h (NGREG): Rename to __NGREG and define
	NGREG to __NGREG if [__USE_MISC].
	(gregset_t): Define using __NGREG.
	(__ctx): New macro.
	(fpregset_t): Use __ctx in defining fields.
	(mcontext_t): Likewise.
	* sysdeps/unix/sysv/linux/alpha/sys/ucontext.h (NGREG): Rename to
	__NGREG and define NGREG to __NGREG if [__USE_MISC].
	(gregset_t): Define using __NGREG.
	(NFPREG): Rename to __NFPREG and define NFPREG to __NFPREG if
	[__USE_MISC].
	(fpregset_t): Define using __NFPREG.
	* sysdeps/unix/sysv/linux/m68k/sys/ucontext.h (NGREG): Rename to
	__NGREG and define NGREG to __NGREG if [__USE_MISC].
	(gregset_t): Define using __NGREG.
	(__ctx): New macro.
	(fpregset_t): Use __ctx in defining fields.
	(mcontext_t): Likewise.
	* sysdeps/unix/sysv/linux/mips/sys/ucontext.h (NGREG): Rename to
	__NGREG and define NGREG to __NGREG if [__USE_MISC].
	(NFPREG): Rename to __NFPREG and define NFPREG to __NFPREG if
	[__USE_MISC].
	(gregset_t): Define using __NGREG.
	(__ctx): New macro.
	(fpregset_t): Use __ctx in defining fields.
	(mcontext_t): Likewise.
	* sysdeps/unix/sysv/linux/nios2/sys/ucontext.h (__ctx): New macro.
	(mcontext_t): Use __ctx in defining fields.
	* sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h (__ctx): New macro.
	[__WORDSIZE == 32] (NGREG): Rename to __NGREG and define NGREG to
	__NGREG if [__USE_MISC].
	[__WORDSIZE == 32] (gregset_t): Define using __NGREG.
	[__WORDSIZE == 32] (fpregset_t): Use __ctx in defining fields.
	(mcontext_t): Likewise.
	[__WORDSIZE != 32] (NGREG): Rename to __NGREG and define NGREG to
	__NGREG if [__USE_MISC].
	[__WORDSIZE != 32] (NFPREG): Rename to __NFPREG and define NFPREG
	to __NFPREG if [__USE_MISC].
	[__WORDSIZE != 32] (NVRREG): Rename to __NVRREG and define NVRREG
	to __NVRREG if [__USE_MISC].
	[__WORDSIZE != 32] (gregset_t): Define using __NGREG.
	[__WORDSIZE != 32] (fpregset_t): Define using __NFPREG.
	[__WORDSIZE != 32] (vscr_t): Use __ctx in defining fields.
	[__WORDSIZE != 32] (vrregset_t): Likewise.
	[__WORDSIZE != 32] (mcontext_t): Likewise.
	* sysdeps/unix/sysv/linux/s390/sys/ucontext.h (__ctx): New macro.
	(__psw_t): Use __ctx in defining fields.
	(NGREG): Rename to __NGREG and define NGREG to __NGREG if
	[__USE_MISC].
	(gregset_t): Define using __NGREG.
	(fpreg_t): Use __ctx in defining fields.
	(fpregset_t): Likewise.
	(mcontext_t): Likewise.
	* sysdeps/unix/sysv/linux/sh/sys/ucontext.h (NGREG): Rename to
	__NGREG and define NGREG to __NGREG if [__USE_MISC].
	(gregset_t): Define using __NGREG.
	(NFPREG): Rename to __NFPREG and define NFPREG to __NFPREG if
	[__USE_MISC].
	(fpregset_t): Define using __NFPREG.
	(__ctx): New macro.
	(mcontext_t): Use __ctx in defining fields.
	* sysdeps/unix/sysv/linux/x86/sys/ucontext.h (__ctx): New macro.
	[__x86_64__] (NGREG): Rename to __NGREG and define NGREG to
	__NGREG if [__USE_MISC].
	[__x86_64__] (gregset_t): Define using __NGREG.
	[__x86_64__] (struct _libc_fpxreg): Use __ctx in defining fields.
	[__x86_64__] (struct _libc_fpstate): Likewise.
	[__x86_64__] (mcontext_t): Likewise.
	[!__x86_64__] (NGREG): Rename to __NGREG and define NGREG to
	__NGREG if [__USE_MISC].
	[!__x86_64__] (gregset_t): Define using __NGREG.
	[!__x86_64__] (struct _libc_fpreg): Use __ctx in defining fields.
	[!__x86_64__] (struct _libc_fpstate): Likewise.
	[!__x86_64__] (mcontext_t): Likewise.

> Being able to generate the ChangeLog file is I think important for
> posterity, tarball releases lack any kind of history. History has a
> really bad memory, just because one uses a VCS today doesn't mean that
> this will be available in 10, 20, 30 years in any usable format, or it
> might vanish completely.

Well, you could add a requirement not to switch away from a distributed
VCS or to switch to a different VCS without converting history. And
indeed one to have the repository present or mirrored on GNU servers, if
desired (or to have release tarball versions that include the VCS
history, etc.).

> Keep a change log to describe all the changes made to program
> source files. The purpose of this is so that people
> investigating bugs in the future will know about the changes that
> might have introduced the bug. Often a new bug can be found by
> looking at what was recently changed.
> More importantly, change logs can help you eliminate conceptual
> inconsistencies between different parts of a program, by giving you
> a history of how the conflicting concepts arose and who they came
> from.
>
> All this information is available in version control.
>
> If you put ChangeLog entries in the commit message, then yes this
> information will be available. But if you discard ChangeLog entries
> completely, I do not see how it can be available. "annotate", "diff"
> don't provide a human readable and searchable means to go through
> history. The information also becomes totally lost as soon as you
> discard the VCS (i.e. when doing releases).

You know about the changes much more reliably from the VCS than from
ChangeLog entries, given that people may forget to write the ChangeLog
entry, or may miss out a file or function's changes from it, or may
commit with a ChangeLog from a previous version of the patch that
doesn't correspond accurately to the committed patch version (given the
make-work nature of writing most ChangeLog entries, and given they are
something not generally used outside the GNU project, updating them is
often something people don't think of doing - again, watching for badly
updated ChangeLog entries in patch review is both necessary at present,
and essentially a waste of time). You can use e.g. "git log -p --stat"
and search for file or function names (function names mentioned
automatically on the @@ line of diff context are going to be at least as
accurate as those in ChangeLog entries, given they are probably what
people use when writing their ChangeLog entries to identify the
functions changed). The precise details may differ, but you have much
more flexibility when looking at the actual history than something
written at a very specific level (too high to actually undo the changes,
too low to readily get an overall understanding of a complicated change)
for ChangeLogs.

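As a rough illustration of the point about the @@ lines, here is a minimal sketch that reads the per-file list of touched entities mechanically from a unified diff, such as the output of "git log -p". It is only a sketch under stated assumptions: the function name entities_by_file, the sample diff and the file demo.c are all made up for this example, and nothing here is part of git or glibc.

```python
import re

# A hunk header looks like "@@ -10,3 +10,3 @@ add_one (int i)"; the text
# after the second "@@" is the enclosing-function context git found.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@ ?(.*)$")


def entities_by_file(diff_text):
    """Map each changed file to the context strings on its hunk headers.

    Simplifications (assumptions of this sketch): deleted files, shown
    as "+++ /dev/null", are skipped, and an added line whose content
    itself starts with "++ " would be mistaken for a file header.
    """
    result = {}
    current = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].split("\t")[0]
            if path == "/dev/null":
                current = None
                continue
            # Strip the "b/" prefix that git puts on the new file name.
            current = path[2:] if path.startswith("b/") else path
            result.setdefault(current, [])
        elif current is not None:
            match = HUNK_RE.match(line)
            if match and match.group(1) and match.group(1) not in result[current]:
                result[current].append(match.group(1))
    return result


# Hypothetical diff text, invented for illustration only.
SAMPLE = """\
diff --git a/demo.c b/demo.c
--- a/demo.c
+++ b/demo.c
@@ -10,3 +10,3 @@ add_one (int i)
 {
-  return i + 2;
+  return i + 1;
 }
@@ -30,2 +30,3 @@ report (void)
 {
+  puts ("done");
 }
"""

if __name__ == "__main__":
    for path, names in entities_by_file(SAMPLE).items():
        print(path, names)
```

Feeding it real "git log -p" output should work the same way, with the caveat that the text after @@ is whatever line git's function-name heuristic picked, so the result is only as accurate as that heuristic.
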
> Because the problem with ChangeLogs, as seen in glibc and
> elsewhere, is with needing to write descriptions in a particular
> format, at a level that is not useful for human understanding of
> the changes while not being as detailed as the exact changes
> themselves in version control, being able to generate ChangeLogs
> from version control using suitably-formatted log messages does not
> address the issue.
>
> Are we talking about the entries, or the actual ChangeLog file? Many
> projects have abandoned keeping actual ChangeLog files, and extracting
> this information when making a tarball release since they cause the
> typical merge conflicts and whatnot. If you are referring to the
> ChangeLog entries, I am not sure what problems you are referring to.

The problem that enumerating individual named entities changed consumes
the time of contributors, confuses and puts off people used to non-GNU
free software which invariably does not use this particular pre-VCS form
of describing changes, and results in long unhelpful descriptions which
don't allow you to see the wood for the trees because of the focus on a
particular low-level repetition of what the change itself is for each
individual entity, as can be seen in the VCS, rather than what the
change is as a logical whole.

> This form of description is exactly what's the problem. In the
> presence of ubiquitous distributed version control, writing this
> style of description is the equivalent of:
>
>   /* Add 1 to i. */
>   i++;
>
> (that is, just repeating the immediately obvious meaning of the
> history that everyone can see, and so effectively serving to hide
> what's actually interesting about the history at a human level and
> *should* be described).
>
> I don't think the comparison is fair, the point of the ChangeLog files
> is to be able to undo changes. The comment above doesn't actually

The point of the VCS is to be able to undo changes.
ChangeLog files, and the form of change description therein, are in no
way a substitute for the VCS, and are essentially obsoleted by it.

> provide anything, a more apt comparison would have been:
>
>   /* Change #1 was: Add pi to i. */
>   /* Change #2 was: Add 1 to i. */
>   i += 2;

No, my assertion is that "Add 1 to i." is to "i++;" as the above long
ChangeLog entry is to the actual commit involved - a repetition of what
everyone can plainly see from looking at the thing described (a C
statement in the first case, a commit in the git history of glibc in the
second case), and so completely useless. Instead of "Add 1 to i." you
should describe logical blocks of code at the logical level with things
that aren't immediately repeating the obvious semantics of the code.
And, likewise, the actual commit message

Fix more namespace issues in sys/ucontext.h (bug 21457).

Continuing the fixes for namespace issues in sys/ucontext.h, this patch
moves various symbols into the implementation namespace in the absence
of __USE_MISC. As with previous changes, it is nonexhaustive, just
covering more straightforward cases.

Structure fields are generally changed to have a prefix __ in the
absence of __USE_MISC, via a macro __ctx (used without a space before
the open parenthesis, since the result is a single identifier). Various
macros such as NGREG also have leading __ added. No changes are made to
structure tags (and thus to C++ name mangling), except that in the
(unused) file sysdeps/i386/sys/ucontext.h, structures defined inside
other structures as the type for a field have their tags removed in the
non-__USE_MISC case (those structure tags would not in any case have
been visible in C++, because in C++ the scope of such a tag is limited
to the containing structure). No changes are made to the contents of
bits/sigcontext.h, or to whether it is included. Because of remaining
namespace issues, this patch does not yet fix the bug or allow any
XFAILs to be removed.

describes the logical nature of the change (including what is *not*
changed, where relevant, which ChangeLog files would never mention), at
the appropriate level for people to understand it. I think people should
be writing commit logs at that level rather than spending time
duplicating the VCS information on exactly which symbols were changed in
which files.

> That information is very useful when digging for bugs, and
> understanding how a code base was changed. Just because one uses VCS
> doesn't mean that history is automatically available to everyone,
> someone still needs to write a commit message of some sort (i.e. the
> ChangeLog entry)

I.e. the sort of message above that you use to justify and explain the
change at the logical level rather than enumerating files and symbols
therein. I'm all for proper detailed commit messages explaining both the
content and the purpose of the change at the logical level, as used by
the Linux kernel and by git itself. It's the descriptions at the
per-file, per-function level in the ChangeLog format that I consider
unhelpful when they duplicate VCS information, badly.

I think GNU should be encouraging the sort of commit messages used by
the Linux kernel and git, i.e. the sort of patch description you'd put
in a mailing list message proposing and explaining the patch, while
leaving the VCS to show what files and bits of files were changed, how,
for those interested in that information.

> Sifting through multi-megabyte diffs isn't very fun when trying to get
> a bird's-eye view of what actually happened in a code base, and this is
> where ChangeLog entries are super useful and I'd argue totally
> necessary for any code base.

I don't think so.
If someone wants to understand what changed between glibc 2.25 and 2.26
in more detail than the NEWS file gives, they might look at the above
sort of description in the commit log; it will be much more helpful to
them, and give much more insight into glibc development, than over 10000
lines of ChangeLog entries enumerating files and symbols. If they want
to see the files changed, git log --stat. If they want to see deeper
into particular changes, git log -p --stat and look at whichever changes
are of interest.

-- 
Joseph S. Myers
[email protected]
