Re: Pretty format specifier for commit count?

2015-01-22 Thread Jeff King
On Thu, Jan 22, 2015 at 11:10:42AM +0100, Michael J Gruber wrote:

 We do have a linear history when we walk with --first-parent :)

Yes, but I do not think it is robust to adding new commits on top. E.g.,
given:

  A--B--C---F
      \    /
       D--E

a --first-parent walk from F will show F-C-B-A. Now imagine the branch
advances to I:

          G--H---I
         /  /
  A--B--C---F--J
      \    /
       D--E

A walk from I will show I-H-G-C-B-A. F is no longer mentioned at all,
and A, B, and C are now at different positions.

This might be OK in Josh's case. I have an intuition that commits can
only be _removed_ in this case. Which means position from the _top_
might change, but the position from the root will always be the same
(and that is what he wants to be stable).  But I did not think hard
enough to convince myself that this is always the case.

 So, for the changelog for commits on a branch, where "on a branch" is
 not the git concept but is defined by git rev-list --first-parent (more
 like hg branches), the count from root would be deterministic and the
 right concept given the appropriate branch workflow.

Certainly the distance to root is deterministic. But I think we are
really counting the number of commits to be output between the root and
this commit. I guess if:

  1. You only ever start from one traversal point.

  2. You are picking only one parent for each merge.

then we know that our queue of commits to examine only ever has 0 or 1
items in it. And therefore a commit is either shown in the same
position from the end, or not shown at all. Because once we get there,
it is deterministic which commits we will show.
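
To make the argument concrete, here is a small Python sketch (not git
code). The PARENTS table is an assumed reading of the example history
earlier in this message (the second parents of F and H in particular are
guesses), and both function names are invented for illustration. It
follows only first parents from a single tip and numbers the resulting
chain from the root:

```python
# Assumed toy history: commit -> parents, first parent listed first.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"],
    "D": ["B"], "E": ["D"],
    "F": ["C", "E"],   # merge; first parent is C
    "G": ["C"],
    "H": ["G", "F"],   # merge; first parent is G (second parent assumed)
    "I": ["H"],
}


def first_parent_walk(parents, tip):
    """Follow only the first parent from tip back to the root."""
    chain = []
    commit = tip
    while commit is not None:
        chain.append(commit)
        commit = parents[commit][0] if parents[commit] else None
    return chain


def count_from_root(parents, tip):
    """Number the first-parent chain of tip, starting at 1 for the root."""
    chain = first_parent_walk(parents, tip)
    return {c: n for n, c in enumerate(reversed(chain), start=1)}


from_f = count_from_root(PARENTS, "F")
from_i = count_from_root(PARENTS, "I")
print(from_f)  # {'A': 1, 'B': 2, 'C': 3, 'F': 4}
print(from_i)  # {'A': 1, 'B': 2, 'C': 3, 'G': 4, 'H': 5, 'I': 6}
```

The commits shared by both walks (A, B, C) keep their numbers counted
from the root; F simply drops out of the second walk.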

 Generation numbers are monotonic but may increase by steps greater than
 1 on that branch if I remember them correctly. I.e., merge commits are
 weighted by the number of commits that get merged in.

Sort of. It is the longest distance to (any) root from the commit. So it
is the max() of the generations of the parents, plus one. So for a
simple branch/merge between two lines of development, the increase is
the number of commits that are merged. But a branch that has its own
branches will not increase the generation count by the total number of
commits, but rather by the longest individual sub-branch.
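
As a rough illustration of that definition (a sketch only, not how git
stores or computes anything), the following Python computes generation
numbers for two made-up histories. The commit names, the SIMPLE and
NESTED tables, and the function name are all assumptions for this
example:

```python
def generations(parents):
    """Return {commit: generation}: 1 for a root, otherwise
    1 + max(generation of each parent)."""
    gen = {}
    for start in parents:
        stack = [start]
        while stack:
            commit = stack[-1]
            if commit in gen:
                stack.pop()
                continue
            pending = [p for p in parents[commit] if p not in gen]
            if pending:
                stack.extend(pending)  # resolve the parents first
            else:
                gen[commit] = 1 + max(
                    (gen[p] for p in parents[commit]), default=0
                )
                stack.pop()
    return gen


# Side branch S1--S2 forked from A and merged into B's line at M.
SIMPLE = {"A": [], "B": ["A"], "S1": ["A"], "S2": ["S1"], "M": ["B", "S2"]}

# Side branch that itself forks (S2a vs. S2b--S3b) and rejoins at S4.
NESTED = {
    "A": [], "B": ["A"],
    "S1": ["A"],
    "S2a": ["S1"],
    "S2b": ["S1"], "S3b": ["S2b"],
    "S4": ["S2a", "S3b"],
    "M": ["B", "S4"],
}

simple = generations(SIMPLE)
nested = generations(NESTED)
print(simple["M"] - simple["B"])  # 2: both merged commits count
print(nested["M"] - nested["B"])  # 4: five commits merged, longest path is 4
```

In the nested case five commits come in through the merge, but the
generation only jumps by four, the length of the longest sub-branch.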

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Pretty format specifier for commit count?

2015-01-22 Thread Michael J Gruber
j...@joshtriplett.org wrote on 21.01.2015 at 00:11:
 On Tue, Jan 20, 2015 at 04:49:53PM -0500, Jeff King wrote:
 On Mon, Jan 19, 2015 at 05:17:25PM -0800, Josh Triplett wrote:

 Can you be a bit more specific about the type of count that you are after?
 git describe counts commits since the most recent tag (possibly within
 a specific subset of all tags). Is that your desired format?

 That might work, since the repository in question has no tags; I'd
 actually like commits since root commit.

 That's basically a generation number. But I'm not sure if that's really
 what you want; in a non-linear history it's not unique (two children of
 commit X are both X+1).
 
 That would actually be perfectly fine.  If I need to distinguish
 branches, I can either use branch/tag names, or append a commit hash.  I
 don't mind the following:
 
  /-B-\
 A     D
  \-C-/
 
 A=1
 B=C=2
 D=3
 
 I could (and probably should) append +hash to the version number for
 uniqueness, and if I care what order B and C sort in, I can use tags,
 branches, or some other more clever mechanism.
 
 It sounds like you really just want commits
 counting up from the root, and with side branches to have their own
 unique numbers. So something like:

        C
       /
   A--B--D

   A=1
   B=2
   C=3
   D=4

 except the last two are assigned arbitrarily. You need some rules for
 linearizing the commits.
 
 I don't care about the numbers assigned to anything not reachable from
 the committish I start from.
 
 But that's not deterministic as you add more starting points (either new
 ref tips, or just new merges we have to cross). For example, imagine
 this:

          G--H
         /    \
        C--E   \
       /    \   \
   A--B--D---F---I

 If we start at I, then we might visit H and G first, meaning we learn
 about C much earlier than we otherwise would. Then we hit F, and get to
 C from there. But now it may be in a different position with respect
 to D!
 
 Right, the numbers need to always stay the same as you add more commits
 over time.  If walking a given graph assigns a given set of generation
 numbers, walking any subgraph should assign all the same generation
 numbers to the common nodes.
 
 I suspect your problem statement may simply assume a linear history,
 which makes this all much simpler. But we are not likely to add a
 feature to git that will break badly once you have a non-linear history. :)
 
 Not assuming a linear history, but assuming a linear changelog file. :)
 
 I think in the linear case that a generation number _would_ be correct,
 and it is a useful concept by itself. So that may be the best thing to
 add.
 
 Sounds good to me.
 
 - Josh Triplett

We do have a linear history when we walk with --first-parent :)

So, for the changelog for commits on a branch, where "on a branch" is
not the git concept but is defined by git rev-list --first-parent (more
like hg branches), the count from root would be deterministic and the
right concept given the appropriate branch workflow.

Generation numbers are monotonic but may increase by steps greater than
1 on that branch if I remember them correctly. I.e., merge commits are
weighted by the number of commits that get merged in.

Michael


Re: Pretty format specifier for commit count?

2015-01-20 Thread josh
On Tue, Jan 20, 2015 at 04:49:53PM -0500, Jeff King wrote:
 On Mon, Jan 19, 2015 at 05:17:25PM -0800, Josh Triplett wrote:
 
   Can you be a bit more specific about the type of count that you are after?
   git describe counts commits since the most recent tag (possibly within
   a specific subset of all tags). Is that your desired format?
  
  That might work, since the repository in question has no tags; I'd
  actually like commits since root commit.
 
 That's basically a generation number. But I'm not sure if that's really
 what you want; in a non-linear history it's not unique (two children of
 commit X are both X+1).

That would actually be perfectly fine.  If I need to distinguish
branches, I can either use branch/tag names, or append a commit hash.  I
don't mind the following:

 /-B-\
A     D
 \-C-/

A=1
B=C=2
D=3

I could (and probably should) append +hash to the version number for
uniqueness, and if I care what order B and C sort in, I can use tags,
branches, or some other more clever mechanism.

 It sounds like you really just want commits
 counting up from the root, and with side branches to have their own
 unique numbers. So something like:
 
        C
       /
   A--B--D
 
   A=1
   B=2
   C=3
   D=4
 
 except the last two are assigned arbitrarily. You need some rules for
 linearizing the commits.

I don't care about the numbers assigned to anything not reachable from
the committish I start from.

 But that's not deterministic as you add more starting points (either new
 ref tips, or just new merges we have to cross). For example, imagine
 this:
 
          G--H
         /    \
        C--E   \
       /    \   \
   A--B--D---F---I
 
 If we start at I, then we might visit H and G first, meaning we learn
 about C much earlier than we otherwise would. Then we hit F, and get to
 C from there. But now it may be in a different position with respect
 to D!

Right, the numbers need to always stay the same as you add more commits
over time.  If walking a given graph assigns a given set of generation
numbers, walking any subgraph should assign all the same generation
numbers to the common nodes.

 I suspect your problem statement may simply assume a linear history,
 which makes this all much simpler. But we are not likely to add a
 feature to git that will break badly once you have a non-linear history. :)

Not assuming a linear history, but assuming a linear changelog file. :)

 I think in the linear case that a generation number _would_ be correct,
 and it is a useful concept by itself. So that may be the best thing to
 add.

Sounds good to me.

- Josh Triplett


Re: Pretty format specifier for commit count?

2015-01-20 Thread Jeff King
On Mon, Jan 19, 2015 at 05:17:25PM -0800, Josh Triplett wrote:

  Can you be a bit more specific about the type of count that you are after?
  git describe counts commits since the most recent tag (possibly within
  a specific subset of all tags). Is that your desired format?
 
 That might work, since the repository in question has no tags; I'd
 actually like commits since root commit.

That's basically a generation number. But I'm not sure if that's really
what you want; in a non-linear history it's not unique (two children of
commit X are both X+1). It sounds like you really just want commits
counting up from the root, and with side branches to have their own
unique numbers. So something like:

       C
      /
  A--B--D

  A=1
  B=2
  C=3
  D=4

except the last two are assigned arbitrarily. You need some rules for
linearizing the commits.

Git's default output order is deterministic when walking backwards
through history from a specific set of starting points. We keep a queue
of commits to visit, sorted by timestamp, with ties in timestamps broken
by whichever was added first (so two parents of a merge get the first
parent added first, then the second). E.g. (and remember we're walking
backwards from the tip here, but you could do the backwards walk and
then reverse it, and start numbering from the other end):

       C--E
      /    \
  A--B--D---F

If we start at F, we might visit F, E, D, C, B, A. Or maybe C before D,
but only if its commit timestamp is newer (and if they tie, we
definitely visit D first, because it will have been queued first).

But that's not deterministic as you add more starting points (either new
ref tips, or just new merges we have to cross). For example, imagine
this:

         G--H
        /    \
       C--E   \
      /    \   \
  A--B--D---F---I

If we start at I, then we might visit H and G first, meaning we learn
about C much earlier than we otherwise would. Then we hit F, and get to
C from there. But now it may be in a different position with respect
to D!
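
The queue described above, and the way an extra merge can reorder it,
can be modelled in a few lines of Python. This is a toy model, not git's
implementation; the graph encoding, the timestamps (with C and D
deliberately tied), and the function name are all assumptions made for
illustration:

```python
import heapq
from itertools import count


def date_order_walk(parents, timestamps, tips):
    """Walk back from tips, always emitting the queued commit with the
    newest timestamp; ties go to whichever commit was queued first."""
    seen = set()
    queue = []
    ticket = count()  # insertion counter, used as the tie-breaker

    def enqueue(commit):
        if commit not in seen:
            seen.add(commit)
            heapq.heappush(queue, (-timestamps[commit], next(ticket), commit))

    for tip in tips:
        enqueue(tip)
    order = []
    while queue:
        _, _, commit = heapq.heappop(queue)
        order.append(commit)
        for parent in parents[commit]:  # first parent is queued first
            enqueue(parent)
    return order


# commit -> parents, first parent listed first.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["C"],
    "F": ["D", "E"], "G": ["C"], "H": ["G"], "I": ["F", "H"],
}
# Assumed commit timestamps; C and D tie on purpose.
TIMESTAMPS = {"A": 1, "B": 2, "C": 3, "D": 3, "E": 4,
              "F": 5, "G": 6, "H": 7, "I": 8}

print(date_order_walk(PARENTS, TIMESTAMPS, ["F"]))
# ['F', 'E', 'D', 'C', 'B', 'A']
print(date_order_walk(PARENTS, TIMESTAMPS, ["I"]))
# ['I', 'H', 'G', 'F', 'E', 'C', 'D', 'B', 'A']
```

Starting from F, D is queued before C and wins the tie. Starting from I,
C is queued via G before F is even visited, so C and D swap places.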

I suspect your problem statement may simply assume a linear history,
which makes this all much simpler. But we are not likely to add a
feature to git that will break badly once you have a non-linear history. :)

I think in the linear case that a generation number _would_ be correct,
and it is a useful concept by itself. So that may be the best thing to
add.

-Peff


Re: Pretty format specifier for commit count?

2015-01-19 Thread Josh Triplett
On Mon, Jan 19, 2015 at 02:54:13PM +0100, Michael J Gruber wrote:
 Josh Triplett wrote on 19.01.2015 at 02:29:
  I'd like to use git-log to generate a Debian changelog file (with one
  entry per commit), which has entries like this:
  
  package-name (version-number) unstable; urgency=low
  
   * ...
  
   -- Example Person per...@example.org  RFC822-date
  
  Since I'm intentionally generating one entry per commit, I can generate
  *almost* all of this with git log:
  
  git log --pretty='format:packagename (FIXME) unstable; urgency=low%n%n  * %s%n%w(0,4,4)%+b%w(0,0,0)%n -- %an <%ae>  %aD%n'
  
  This produces entries like this:
  
  packagename (FIXME) unstable; urgency=low
  
    * Example change
  
      Long description of example change.
  
   -- Josh Triplett j...@joshtriplett.org  Thu, 8 Jan 2015 16:36:52 -0800
  
  packagename (FIXME) unstable; urgency=low
  
    * Initial version
  
   -- Josh Triplett j...@joshtriplett.org  Thu, 8 Jan 2015 16:36:51 -0800
  
  Would it be possible to add a format specifier producing a commit count,
  similar to that provided by git-describe?  Such a specifier would allow
  filling in the version number in the format above (replacing the FIXME).
  (Note that the version numbers need to monotonically increase; otherwise
  I would just use the commit hash as the version number.)
  
  - Josh Triplett
  
 
 Can you be a bit more specific about the type of count that you are after?
 git describe counts commits since the most recent tag (possibly within
 a specific subset of all tags). Is that your desired format?

That might work, since the repository in question has no tags; I'd
actually like commits since root commit.

I could imagine scenarios in which both most recent tag and commits
since most recent tag would be useful format specifiers; however, for
this use case, I'm looking for commits since root commit.

 (I won't suggest scripting around rev-list, describe and log -1 because
 you know that already...)

Right.  Though as far as I can tell, git describe doesn't actually do
what I'm looking for.  rev-list --count $commit does (though that'd be
N**2), as would something like rev-list --reverse HEAD | nl | while read
count hash ; do ..., but I'd like to do better than that.
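
For reference, a single-pass version of the rev-list --reverse idea
could look roughly like this in Python. It is only a sketch: the
function names are made up, and the numbers agree with rev-list --count
only when the history is linear.

```python
import subprocess


def number_commits(rev_list_output):
    """Map each hash to its 1-based position, given the output of
    `git rev-list --reverse` (one hash per line, oldest first)."""
    hashes = rev_list_output.split()
    return {h: n for n, h in enumerate(hashes, start=1)}


def commit_numbers(tip="HEAD"):
    """Run git once and number every commit reachable from tip."""
    out = subprocess.run(
        ["git", "rev-list", "--reverse", tip],
        check=True, capture_output=True, text=True,
    ).stdout
    return number_commits(out)
```

This walks the history once instead of running a separate count per
commit, which is where the N**2 cost comes from.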

- Josh Triplett


Re: Pretty format specifier for commit count?

2015-01-19 Thread Michael J Gruber
Josh Triplett wrote on 19.01.2015 at 02:29:
 I'd like to use git-log to generate a Debian changelog file (with one
 entry per commit), which has entries like this:
 
 package-name (version-number) unstable; urgency=low
 
  * ...
 
  -- Example Person per...@example.org  RFC822-date
 
 Since I'm intentionally generating one entry per commit, I can generate
 *almost* all of this with git log:
 
 git log --pretty='format:packagename (FIXME) unstable; urgency=low%n%n  * %s%n%w(0,4,4)%+b%w(0,0,0)%n -- %an <%ae>  %aD%n'
 
 This produces entries like this:
 
 packagename (FIXME) unstable; urgency=low
 
   * Example change
 
     Long description of example change.
 
  -- Josh Triplett j...@joshtriplett.org  Thu, 8 Jan 2015 16:36:52 -0800
 
 packagename (FIXME) unstable; urgency=low
 
   * Initial version
 
  -- Josh Triplett j...@joshtriplett.org  Thu, 8 Jan 2015 16:36:51 -0800
 
 Would it be possible to add a format specifier producing a commit count,
 similar to that provided by git-describe?  Such a specifier would allow
 filling in the version number in the format above (replacing the FIXME).
 (Note that the version numbers need to monotonically increase; otherwise
 I would just use the commit hash as the version number.)
 
 - Josh Triplett
 

Can you be a bit more specific about the type of count that you are after?
git describe counts commits since the most recent tag (possibly within
a specific subset of all tags). Is that your desired format?

(I won't suggest scripting around rev-list, describe and log -1 because
you know that already...)

Michael