On 4/27/2018 12:03 PM, SZEDER Gábor wrote:
Szia Feri,

I'm moving the IRC discussion here, because this might be a bug report
in the end.  So, kindly try these steps (103 MB free space required):

$ git clone https://github.com/ClusterLabs/pacemaker.git && cd pacemaker
[...]
$ git branch --contains Pacemaker-0.6.1
* master
$ git tag --merged master | fgrep Pacemaker-0.6
Pacemaker-0.6.0
Pacemaker-0.6.2
Pacemaker-0.6.3
Pacemaker-0.6.4
Pacemaker-0.6.5
Pacemaker-0.6.6

Notice that Pacemaker-0.6.1 is missing from the output.  Kind people on
IRC didn't find a quick explanation, and we all had to go eventually.
Is this expected behavior?  Reproduced with git 2.11.0 and 2.17.0.
The commit pointed to by the tag Pacemaker-0.6.1 and its parent have a
serious clock skew, i.e. they are a few months older than their parents:

$ git log --format='%h %ad %cd%d%n    %s' --date=short 
Pacemaker-0.6.1^..47a8ef4c
47a8ef4ce 2008-02-15 2008-02-15
     Low: TE: Logging - display the op's magic field for unexpected and foreign 
events
b9cfcd6b4 2007-12-10 2007-12-10 (tag: Pacemaker-0.6.2)
     haresources2cib.py: set default-action-timeout to the default (20s)
52e7793e0 2007-12-10 2007-12-10
     haresources2cib.py: update ra parameters lists
dea277271 2008-02-14 2008-02-14
     Medium: Build: Turn on snmp support in rpm packages (patch from MATSUDA, 
Daiki)
f418742fe 2008-02-14 2008-02-14
     Low: Build: Update the .spec file with the one used by build service
ccfa716a5 2008-02-14 2008-02-14
     Medium: SNMP: Allow the snmp subagent to be built (patch from MATSUDA, 
Daiki)
50f0ade2d 2008-02-14 2008-02-14
     Low: Build: Update last release number
90f11667f 2008-02-14 2008-02-14
     Medium: Tools: Make sure the autoconf variables in haresources2cib are 
expanded
9d2383c46 2008-02-11 2008-02-11 (tag: Pacemaker-0.6.1)
     High: cib: Ensure the archived file hits the disk before returning

(branch|tag|describe|...) (--contains|--merged) use the commit timestamp
information as a heuristic to avoid traversing parts of the history,
which makes these operations, especially on big histories, an order of
magnitude or two faster.  Yeah, commit timestamps can't always be
trusted, but skewed commits are rare, and skewed commits with this much
skew are even rarer.

I'm not sure how (or if it's at all possible) to turn off this
timestamp-based optimisation.

This is actually a bit more complicated. The "--merged" check in 'git tag' uses a different mechanism to detect which tags are reachable. It uses a revision walk starting at the "merge commit" (master in your case) and all tags with the "limited" option (to ensure the walk happens during prepare_revision_walk()) but marks the merge commit as UNINTERESTING. The limit_list() method stops when all commits are marked UNINTERESTING - minus some "slop" related to the commits that start the walk.

One important note: the set of tags is important here. If you add a new tag to the root commit (git tag MyTag a2d71961f) then the walk succeeds by ensuring it walks until MyTag. This gets around the clock skew issue. There may be other more-recent tags with a clock-skew issue, but since Pacemaker-0.6.0 is the oldest tag, that requires the walk to continue until at least that date.

The commit-walk machinery in revision.c is rather complicated, and is used for a lot of different reasons, such as "git log" and this application in "git tag". It is on my list to refactor this code to use the commit-graph and generation numbers, but as we can see by this example, it is not easy to tease out what is happening in the code.

In a world where generation numbers are expected to be available, we could rewrite do_merge_filter() in ref-filter.c to call into paint_down_to_common() in commit.c using the new "min_generation" marker. By assigning the tags to be in the "twos" list and the merge commit in the "one" commit, we can check if the tags have the PARENT1 flag after the walk in paint_down_to_common(). Due to the static nature of paint_down_to_common(), we will likely want to abstract this into an external method in commit.c, say can_reach_many(struct commit *from, struct commit_list *to).

FWIW, much work is being done on a cached commit graph including commit
generation numbers, which will solve this issue both correctly and more
efficiently.  Perhaps it will already be included in the next release.

The work in ds/generation-numbers is focused on the "git tag --contains" method, which does return correctly here (it is the reverse of the --merged condition):

Which tags can reach Pacemaker-0.6.1?

$ git tag --contains Pacemaker-0.6.1
(returns a big list)

This is the actual reverse lookup (which branches contain this tag?)

$ git branch --contains Pacemaker-0.6.1 | grep master
* master

These commands work despite clock skew. The commit-graph feature makes them faster.

Thanks,
-Stolee

Reply via email to