Re: [PATCH] Correct compile errors when DEBUG_BISECT=1 after supporting other hash algorithms
Any news about this patch? 2017-03-21 22:24 GMT+01:00 Alex Hoffman : > Hi, Brian, > > We definitely prefer the wrapper function oid_to_hex() to > sha1_to_hex(). Thanks for feedback. > Below is the updated patch: > > --- > bisect.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/bisect.c b/bisect.c > index 30808cadf..7b65acbcd 100644 > --- a/bisect.c > +++ b/bisect.c > @@ -131,7 +131,7 @@ static void show_list(const char *debug, int > counted, int nr, > unsigned flags = commit->object.flags; > enum object_type type; > unsigned long size; > - char *buf = read_sha1_file(commit->object.sha1, &type, &size); > + char *buf = read_sha1_file(commit->object.oid.hash, > &type, &size); > const char *subject_start; > int subject_len; > > @@ -143,10 +143,10 @@ static void show_list(const char *debug, int > counted, int nr, > fprintf(stderr, "%3d", weight(p)); > else > fprintf(stderr, "---"); > - fprintf(stderr, " %.*s", 8, sha1_to_hex(commit->object.sha1)); > + fprintf(stderr, " %.*s", 8, oid_to_hex(&commit->object.oid)); > for (pp = commit->parents; pp; pp = pp->next) > fprintf(stderr, " %.*s", 8, > - sha1_to_hex(pp->item->object.sha1)); > + oid_to_hex(&pp->item->object.oid)); > > subject_len = find_commit_subject(buf, &subject_start); > if (subject_len) > -- > 2.12.0.400.g54ad2d445.dirty
Re: [PATCH] Correct compile errors when DEBUG_BISECT=1 after supporting other hash algorithms
Hi, Brian, We definitely prefer the wrapper function oid_to_hex() to sha1_to_hex(). Thanks for feedback. Below is the updated patch: --- bisect.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/bisect.c b/bisect.c index 30808cadf..7b65acbcd 100644 --- a/bisect.c +++ b/bisect.c @@ -131,7 +131,7 @@ static void show_list(const char *debug, int counted, int nr, unsigned flags = commit->object.flags; enum object_type type; unsigned long size; - char *buf = read_sha1_file(commit->object.sha1, &type, &size); + char *buf = read_sha1_file(commit->object.oid.hash, &type, &size); const char *subject_start; int subject_len; @@ -143,10 +143,10 @@ static void show_list(const char *debug, int counted, int nr, fprintf(stderr, "%3d", weight(p)); else fprintf(stderr, "---"); - fprintf(stderr, " %.*s", 8, sha1_to_hex(commit->object.sha1)); + fprintf(stderr, " %.*s", 8, oid_to_hex(&commit->object.oid)); for (pp = commit->parents; pp; pp = pp->next) fprintf(stderr, " %.*s", 8, - sha1_to_hex(pp->item->object.sha1)); + oid_to_hex(&pp->item->object.oid)); subject_len = find_commit_subject(buf, &subject_start); if (subject_len) -- 2.12.0.400.g54ad2d445.dirty
[PATCH] Correct compile errors when DEBUG_BISECT=1 after supporting other hash algorithms
--- bisect.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/bisect.c b/bisect.c index 30808cadf..6feed8533 100644 --- a/bisect.c +++ b/bisect.c @@ -131,7 +131,7 @@ static void show_list(const char *debug, int counted, int nr, unsigned flags = commit->object.flags; enum object_type type; unsigned long size; - char *buf = read_sha1_file(commit->object.sha1, &type, &size); + char *buf = read_sha1_file(commit->object.oid.hash, &type, &size); const char *subject_start; int subject_len; @@ -143,10 +143,10 @@ static void show_list(const char *debug, int counted, int nr, fprintf(stderr, "%3d", weight(p)); else fprintf(stderr, "---"); - fprintf(stderr, " %.*s", 8, sha1_to_hex(commit->object.sha1)); + fprintf(stderr, " %.*s", 8, sha1_to_hex(commit->object.oid.hash)); for (pp = commit->parents; pp; pp = pp->next) fprintf(stderr, " %.*s", 8, - sha1_to_hex(pp->item->object.sha1)); + sha1_to_hex(pp->item->object.oid.hash)); subject_len = find_commit_subject(buf, &subject_start); if (subject_len) -- 2.12.0.399.g9d77b0405.dirty
Re: Git bisect does not find commit introducing the bug
> isn't that spelt `--ancestry-path` ? > (--ancestry-path has it's own issues such as needing an --first-parent-show > option, but that's possibly a by the by) Indeed it is spelled `--ancestry-path`. And interestingly enough you may use it multiple times with the wanted effect in our case (e.g when the user has multiple good commit and a single bad commit before running the bisect itself). Also it is `--first-parent` (not `--first-parent-show`), but I do not understand why do we need this option? What kind of issues does `--ancestry-path` have? Best regards, VG
Re: Git bisect does not find commit introducing the bug
> If `git bisect` is/would be affected by `git log` history-related options > then this is what `--strict-ancestor` option gives/would give. Exactly my thoughts. All that needs to be changed in the 2nd problem is the graph where to search. But first we must agree about the usefulness of the 2nd problem. Any thoughts/comments/votes for/against?
Re: Git bisect does not find commit introducing the bug
I see two different problems each with a different assumption (see the definition of "bisectable" in the email of Junio C Hamano): 1. (Current) Assume the entire history graph is bisectable. DO: Search where in the entire graph the first 'trait'/transition occurs. 2. (New) Assume only the graph between one good commit and one bad commit is bisectable. DO: Search where the first transition occurs in the graph between these two commits. It seems that the real world needs a solution also for the second problem (example if the good commit is the FIRST good commit of a feature or if the good commit is not the first good commit, but you definitely know, that it broke first somewhere in between the good and bad commit). I find the way to go as Oleg proposed is gittish enough (with a new parameter --strategy). Beside I would underline that also the second problem is a bisect problem, just for another graph, thus it makes perfect sense to extend 'git bisect' for this. Does that look reasonably?
Re: Git bisect does not find commit introducing the bug
> Then you must adjust your definition of "good": All commits that do not have > the feature, yet, are "good": since they do not have the feature in the > first place, they cannot have the breakage that you found in the feature. > > That is exactly the situation in your original example! But you constructed > the condition of goodness in such a simplistic way (depending on the > presence of a string), that it was impossible to distinguish between "does > not have the feature at all" and "has the feature, but it is broken". Johannes, thank you for correctly identifying the error in my logic. Indeed I was using the term 'bad' also for the commit without the feature. In order to find the commit introducing the bug in my example a new state is needed, which would make 'git bisect' a bit more complicated than the user 'most of the time' probably needs. Or do you think, it would make sense to ask the user for this state (if e.g 'git bisect' would be started with a new parameter)?
Re: Git bisect does not find commit introducing the bug
Below is a correction of the first proposed algorithm: >--o1--o2--o3--G --X1 >\\ > x1--x2--x3--x4--X2--B-- > \ / > y1--y2--y3 > Step 1a. (Unchanged) keep only the commits that: a) are ancestor of the "bad" commit (including the "bad" commit itself), b) are not ancestor of a "good" commit (excluding the "good" commits). The following graph results: x1--x2--x3--x4--X2--B-- \ / y1--y2--y3 Step 1b. (New) Mark all commits of the resulting graph as unconfirmed (unconfirmed=node without good ancestors). Mark as confirmed all descendants of commits that user marked explicitly as good (e.g if user already marked its parent/grand parent/... as good right before starting bisect run). Step 1c. From all unconfirmed root commits build a set SET_GOOD of those with any good parents in the original graph (root commit = commit without parents in the resulting graph) and one SET_BAD for commits with only BAD parents. To build a set means to ask explicitly the user (or the command passed in git bisect run) whether any of its parents is good. In the example above find out whether x1 has any good parents or no parent at all and if so add it to SET_GOOD, otherwise to SET_BAD. Mark as confirmed each commit in SET_GOOD and all its descendants. For every commit in SET_BAD delete all paths from it until to the next confirmed commit. In the example above if x1 is in SET_BAD delete the path x1-x3 and x1-y3. If any new root commits result (commit x4), redo step 1c with them. Step2. Continue the existing algorithm. If this improvement works (i.e you do not find any bugs in it and it is feasible to implement, which seems to me) the following would be its advantages: 1. An assumption less, as we explicitly check the assumption. 2. It might be quicker, because we delete parts of graph that cannot contain transitions. 3. It returns more exact results. VG
Re: Git bisect does not find commit introducing the bug
> At the end of the git-bisect man page (in the SEE ALSO section) there > is a link to > https://github.com/git/git/blob/master/Documentation/git-bisect-lk2009.txt > which has a lot of details about how bisect works. > Thanks for pointing out the SEE ALSO section. I think it makes sense to include/describe the entire algorithm in the man page itself, although I am not sure whether the graphs would be always correctly visually represented in the man page format. > The goal is to find the first bad commit, which is a commit that has > only good parents. OK, bisect's mission is more exact than I thought, which is good. M > As o1 is an ancestor of G, then o1 is considered good by the bisect algorithm. > If it was bad, it would means that there is a transition from bad to > good between o1 and G. > But when a good commit is an ancestor of the bad commit, git bisect > makes the assumption that there is no transition from bad to good in > the graph. The assumption that there is no transition from bad to good in the graph did not hold in my example and it does not hold when a feature was recently introduced and gets broken relative shortly afterwards. But I consider it is easy to change the algorithm not to assume, but rather to check it. > git bisect makes some assumptions that are true most of the time, so > in practice it works well most of the time. Whatever the definition of "most of the time" everyone might have, I think there is room for improvement. Below I am trying to make a small change to the current algorithm in order to deal with the assumption that sometimes does not hold (e.g in my example), by explicitly validating the check. > --o1--o2--o3--G--X1 > \\ > x1--x2--x3--x4--X2--B-- > \ / >y1--y2--y3 Step 1a. (Unchanged) keep only the commits that: a) are ancestor of the "bad" commit (including the "bad" commit itself), b) are not ancestor of a "good" commit (excluding the "good" commits). The following graph results: x1--x2--x3--x4--X2--B-- \ / y1--y2--y3 Step 1b. (New) Mark all root commits of the resulting graph (i.e commits without parents) as unconfirmed (unconfirmed=node that has only bad parents). Remove all root commits that user already confirmed (e.g if user already marked its parent as good right before starting bisect run). For every unconfirmed root commit check if it has any good parents. In the example above check whether x1 has good parents. If the current root element has any parents and none of them is good, we can delete all paths from it until to the next commit that has a parent in the ancestors of GOOD. In the example above to delete the path x1-x3 and x1-y3. Also add new resulting root commits to the list of unconfirmed commits (commit x4). Otherwise mark it as confirmed. Step2. Continue the existing algorithm. If this improvement works (i.e you do not find any bugs in it and it is feasible to implement, which seems to me) the following would be its advantages: 1. An assumption less, as we explicitly check the assumption. 2. It might be quicker, because we delete parts of graph that cannot contain transitions. 3. It returns more exact results. VG
Re: Git bisect does not find commit introducing the bug
> But this is not how Git works. Git computes graph differences, i.e., it > subtracts from the commits reachable from v.bad those that are reachable > from v.good. This leaves more than just those on some path from v.good to > v.bad. And it should work this way. Consider this history: > > --o--o--o--G--X >\ \ > x--x--x--x--X--B-- > > When you find B bad and G good, than any one of the x or X may have > introduced the breakage, not just one of the X. > Thank you for clarifying how git bisect works. How did you find that out? Did you check the source code? If that is not documented in the man page may be it worth documenting it in order to avoid future confusion for other users. Let's consider your example with distinct names for nodes: --o1--o2--o3--G--X1 \\ x1--x2--x3--x4--X1--B-- It makes sense that git bisect is expecting a single transition, as this is a precondition for a binary search to work. My definition of "the transition" is a commit with at least one of its parents as a good version, but the commit itself a bad version. I hope we agree that git bisect's mission is to search for this transition (as I suppose that most of people need such a functionality from git, if not exactly from git bisect). How could be x1 or x3 be the transition, if chances are that o1 is not a good version? Of course it would make sense to me if bisect would check o1 whether good and only then to check also x1-x3, but this is not what git makes (at least not in my initial example). If you consider that git bisect's mission is different from finding the transition, could you please explain what exact commit git bisect is supposed to return (ideally with terms from the graph theory) and when it makes sense to return that? Because I do not see any sense in looking in the path x1-x3 without knowing that those commits may be a transition. > Oh, IMO git bisect was well thought through. If it considered just paths > from good to bad, it would not given the correct answer. See the example > history above. Bisect authors would not have deemed that sufficiently good You definitely convinced me that git MUST search more than only in the paths between good and bad commits, as the good commit G does not have to be the first good commit (thank you for that). My problem/confusion is that it returns something that does not make sense to me, because it does not make sure it returns a transition. VG PS: thank you for continuing this discussion.
Re: Git bisect does not find commit introducing the bug
>> First of all this is confusing, as this commit cannot be reached >> starting from "v.good". > Hm, IMHO it shows that your example is pretty artificial (although you > might have come across it in a real-world scenario): you introduced a > new feature in f4154e9 (and it worked) and you broke that feature by > making the merge 671cec2. However, the feature (that broke in 671cec2) > did not even exist in 04c6f4b; so a test on the feature would not fail > (leading to "bisect bad" as in the example), it would not exist (leading > to "bisect skip"). No one commented the fact, that I find this very confusing. Don't you find this confusing? I will underline, that 'git bisect good v.good' will fail if the commit 'v.good' is not a parent of the bad commit, meaning there MUST be at least a path between 'v.good' and 'v.bad', thus I would expect it looks on this path ONLY. Beside that, this is what I understand by 'binary search' (to search on this commit path). You might find this example artificial, but I doubt git is/was intentionally designed to work with 'natural' examples only (no matter how you define 'natural' and 'artificial'). >> In other words: bisect assumes that your repo is usually in a good state >> and you have a commit that changes it to a bad state. In your case you >> have a repo that is in a bad state and you have a commit that switches >> it to a good state and later you merge a bad-state branch and you have a >> bad state again. It is not made for that use-case, I think. > Correct. The assumption of bisection is that there is only one transition > between GOOD and BAD. By violating that assumption, anything can happen. I did not find that in the manpage or did I miss it? Why would someone assume that the commit graph looks in a certain way? I assume, that 'git bisect' was not thought through and that it considers the first directed path between v.good and v.bad, instead of all paths (in my example graph there are two such paths). I will also underline that git bisect was designed to work with multiple good commits and one bad commit (also multiple paths), but probably NOT with multiple paths between the same pair of good and bad commits. VG 2017-02-18 10:12 GMT+01:00 Johannes Sixt : > Am 18.02.2017 um 00:21 schrieb Stephan Beyer: >> >> On 02/17/2017 11:29 PM, Alex Hoffman wrote: >> * 7a9e952 (bisect bad) >> |\ >> | * 671cec2 <--- expected >> | |\ >> | * | 04c6f4b <--- found >> * | | 3915157 >> |\ \ \ >> | | |/ >> | |/| >> | * | f4154e9 (bisect good) >> | * | 85855bf >> | |/ >> * | f1a36f5 >> |/ >> * 1b7fb88 >> >> The and markers are set by your definition of what good and >> what bad commits are. >> >> [...] >> In other words: bisect assumes that your repo is usually in a good state >> and you have a commit that changes it to a bad state. In your case you >> have a repo that is in a bad state and you have a commit that switches >> it to a good state and later you merge a bad-state branch and you have a >> bad state again. It is not made for that use-case, I think. > > > Correct. The assumption of bisection is that there is only one transition > between GOOD and BAD. By violating that assumption, anything can happen. > > -- Hannes >
Git bisect does not find commit introducing the bug
Hi there, According to the documentation "git bisect" is designed "to find the commit that introduced a bug" . I have found a situation in which it does not returns the commit I expected. In order to reproduce the problem: 1. mkdir test; cd test; git clone https://github.com/entbugger/git-bisect-issue.git cd git-bisect-issue The tag "v.bad" is one bad version and tag "v.good" is the first good version. A good version is one having the line "FEATURE2" in file1.txt, which was introduced in "v.good". 2. Copy test scripts to another folder to make sure they are not overridden by 'git bisect' cp *.sh ../ cd .. ./search-bug-git.sh 3. Run ./search-bug-git.sh to search for the commit introducing the bug. It finds that the commit 04c6f4b9ed with the message "Feature 1" is the first one introducing the bug. First of all this is confusing, as this commit cannot be reached starting from "v.good". Then I expected the commit with the message "Introduce bug" to be returned by 'git bisect', as it is the first commit between "v.good" and "v.bad" that does not contain the line "FEATURE2" in file1.txt, which is what I understand by the commit "that introduced a bug" (cited from the manpage of git bisect). Thanks for looking to this, Best regards, VG