Re: [PATCH 8/4] match-trees: drop "x = x" initializations
René Scharfe writes: > Am 24.03.2013 05:55, schrieb Junio C Hamano: >> So I like your change for readability, but for GCC 4.4.5 we still >> need the unnecessary initialization. > > Hrm, perhaps we can make it even simpler for the compiler. And the result is even simpler for human readers, I'd have to say. > I'm a bit uneasy about this one because we lack proper tests for > this code and I don't know how to write ones off the bat. This looks pretty much a straight-forward equivalent rewrite from your earlier one, which was also an obvious equivalent to the original, at least to me. The first four lines in the original were made into two tree_entry() calls (what a useful helper we haven't been using!) and that allows us to lose explicit update_tree_entry() calls. > match-trees.c | 68 > --- > 1 file changed, 28 insertions(+), 40 deletions(-) > > diff --git a/match-trees.c b/match-trees.c > index 26f7ed1..2bb734d 100644 > --- a/match-trees.c > +++ b/match-trees.c > @@ -47,6 +47,13 @@ static int score_matches(unsigned mode1, unsigned mode2, > const char *path) > return score; > } > > +static int base_name_entries_compare(const struct name_entry *a, > + const struct name_entry *b) > +{ > + return base_name_compare(a->path, tree_entry_len(a), a->mode, > + b->path, tree_entry_len(b), b->mode); > +} > + > /* > * Inspect two trees, and give a score that tells how similar they are. > */ > @@ -71,54 +78,35 @@ static int score_trees(const unsigned char *hash1, const > unsigned char *hash2) > if (type != OBJ_TREE) > die("%s is not a tree", sha1_to_hex(hash2)); > init_tree_desc(&two, two_buf, size); > - while (one.size | two.size) { > - const unsigned char *elem1 = elem1; > - const unsigned char *elem2 = elem2; > - const char *path1 = path1; > - const char *path2 = path2; > - unsigned mode1 = mode1; > - unsigned mode2 = mode2; > + for (;;) { > + struct name_entry e1, e2; > + int got_entry_from_one = tree_entry(&one, &e1); > + int got_entry_from_two = tree_entry(&two, &e2); > int cmp; > > - if (one.size) > - elem1 = tree_entry_extract(&one, &path1, &mode1); > - if (two.size) > - elem2 = tree_entry_extract(&two, &path2, &mode2); > - > - if (!one.size) { > - /* two has more entries */ > - score += score_missing(mode2, path2); > - update_tree_entry(&two); > - continue; > - } > - if (!two.size) { > + if (got_entry_from_one && got_entry_from_two) > + cmp = base_name_entries_compare(&e1, &e2); > + else if (got_entry_from_one) > /* two lacks this entry */ > - score += score_missing(mode1, path1); > - update_tree_entry(&one); > - continue; > - } > - cmp = base_name_compare(path1, strlen(path1), mode1, > - path2, strlen(path2), mode2); > - if (cmp < 0) { > + cmp = -1; > + else if (got_entry_from_two) > + /* two has more entries */ > + cmp = 1; > + else > + break; > + > + if (cmp < 0) > /* path1 does not appear in two */ > - score += score_missing(mode1, path1); > - update_tree_entry(&one); > - continue; > - } > - else if (cmp > 0) { > + score += score_missing(e1.mode, e1.path); > + else if (cmp > 0) > /* path2 does not appear in one */ > - score += score_missing(mode2, path2); > - update_tree_entry(&two); > - continue; > - } > - else if (hashcmp(elem1, elem2)) > + score += score_missing(e2.mode, e2.path); > + else if (hashcmp(e1.sha1, e2.sha1)) > /* they are different */ > - score += score_differs(mode1, mode2, path1); > + score += score_differs(e1.mode, e2.mode, e1.path); > else > /* same subtree or blob */ > - score += score_matches(mode1, mode2, path1); > - update_tree_entry(&one); > - update_tree_entry(&two); > + score += score_matches(e1.mode, e2.mode, e1.path); > } > free(one_buf); > free(two_buf); -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to
Re: [PATCH 8/4] match-trees: drop "x = x" initializations
Am 24.03.2013 05:55, schrieb Junio C Hamano: > However, the same compiler still thinks {elem,path,mode}1 can be > used uninitialized (but not {elem,path,mode}2). The craziness I > reported in the previous message is also the same. With this patch > on top to swap the side we inspect first, the compiler thinks > {elem,path,mode}2 can be used uninitialized but not the other three > variables X-<. > > So I like your change for readability, but for GCC 4.4.5 we still > need the unnecessary initialization. Hrm, perhaps we can make it even simpler for the compiler. -- >8 -- Subject: match-trees: simplify score_trees() using tree_entry() Convert the loop in score_trees() to tree_entry(). The code becomes shorter and simpler because the calls to update_tree_entry() are not needed any more. Another benefit is that we need less variables to track the current tree entries; as a side-effect of that the compiler has an easier job figuring out the control flow and thus can avoid false warnings about uninitialized variables. Using struct name_entry also allows the use of tree_entry_len() for finding the path length instead of strlen(), which may be slightly more efficient. Also unify the handling of missing entries in one of the two trees (i.e. added or removed files): Just set cmp appropriately first, no matter if we ran off the end of a tree or if we actually have two entries to compare, and check its value a bit later without duplicating the handler code. Signed-off-by: Rene Scharfe --- I'm a bit uneasy about this one because we lack proper tests for this code and I don't know how to write ones off the bat. match-trees.c | 68 --- 1 file changed, 28 insertions(+), 40 deletions(-) diff --git a/match-trees.c b/match-trees.c index 26f7ed1..2bb734d 100644 --- a/match-trees.c +++ b/match-trees.c @@ -47,6 +47,13 @@ static int score_matches(unsigned mode1, unsigned mode2, const char *path) return score; } +static int base_name_entries_compare(const struct name_entry *a, +const struct name_entry *b) +{ + return base_name_compare(a->path, tree_entry_len(a), a->mode, +b->path, tree_entry_len(b), b->mode); +} + /* * Inspect two trees, and give a score that tells how similar they are. */ @@ -71,54 +78,35 @@ static int score_trees(const unsigned char *hash1, const unsigned char *hash2) if (type != OBJ_TREE) die("%s is not a tree", sha1_to_hex(hash2)); init_tree_desc(&two, two_buf, size); - while (one.size | two.size) { - const unsigned char *elem1 = elem1; - const unsigned char *elem2 = elem2; - const char *path1 = path1; - const char *path2 = path2; - unsigned mode1 = mode1; - unsigned mode2 = mode2; + for (;;) { + struct name_entry e1, e2; + int got_entry_from_one = tree_entry(&one, &e1); + int got_entry_from_two = tree_entry(&two, &e2); int cmp; - if (one.size) - elem1 = tree_entry_extract(&one, &path1, &mode1); - if (two.size) - elem2 = tree_entry_extract(&two, &path2, &mode2); - - if (!one.size) { - /* two has more entries */ - score += score_missing(mode2, path2); - update_tree_entry(&two); - continue; - } - if (!two.size) { + if (got_entry_from_one && got_entry_from_two) + cmp = base_name_entries_compare(&e1, &e2); + else if (got_entry_from_one) /* two lacks this entry */ - score += score_missing(mode1, path1); - update_tree_entry(&one); - continue; - } - cmp = base_name_compare(path1, strlen(path1), mode1, - path2, strlen(path2), mode2); - if (cmp < 0) { + cmp = -1; + else if (got_entry_from_two) + /* two has more entries */ + cmp = 1; + else + break; + + if (cmp < 0) /* path1 does not appear in two */ - score += score_missing(mode1, path1); - update_tree_entry(&one); - continue; - } - else if (cmp > 0) { + score += score_missing(e1.mode, e1.path); + else if (cmp > 0) /* path2 does not appear in one */ - score += score_missing(mode2, path2); - update_tree_entry(&two); - continue; - }
Re: [PATCH 8/4] match-trees: drop "x = x" initializations
On Sat, Mar 23, 2013 at 09:55:53PM -0700, Junio C Hamano wrote: > René Scharfe writes: > > > Hmm, let's see if we can help the compiler follow the code without > > making it harder for people to understand. The patch looks a bit > > jumbled, but the resulting code is OK in my biased opinion. > > I actually think the result is much better than a mere "OK"; the > duplicated "at this point we know path1 (or path2) is missing from > the other side" has been bothering me and I was about to suggest a > similar rewrite before I read your message ;-) > > However, the same compiler still thinks {elem,path,mode}1 can be > used uninitialized (but not {elem,path,mode}2). The craziness I > reported in the previous message is also the same. With this patch > on top to swap the side we inspect first, the compiler thinks > {elem,path,mode}2 can be used uninitialized but not the other three > variables X-<. Yeah, I'd agree that the result is more readable, but it does not address the original problem. Junio, do you want to drop my patch, squash the initialization of mode into René's version, and just add a note to the commit message that we still have to deal with the gcc warning? -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/4] match-trees: drop "x = x" initializations
René Scharfe writes: > Hmm, let's see if we can help the compiler follow the code without > making it harder for people to understand. The patch looks a bit > jumbled, but the resulting code is OK in my biased opinion. I actually think the result is much better than a mere "OK"; the duplicated "at this point we know path1 (or path2) is missing from the other side" has been bothering me and I was about to suggest a similar rewrite before I read your message ;-) However, the same compiler still thinks {elem,path,mode}1 can be used uninitialized (but not {elem,path,mode}2). The craziness I reported in the previous message is also the same. With this patch on top to swap the side we inspect first, the compiler thinks {elem,path,mode}2 can be used uninitialized but not the other three variables X-<. So I like your change for readability, but for GCC 4.4.5 we still need the unnecessary initialization. match-trees.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/match-trees.c b/match-trees.c index c0c66bb..9ea2c80 100644 --- a/match-trees.c +++ b/match-trees.c @@ -77,16 +77,16 @@ static int score_trees(const unsigned char *hash1, const unsigned char *hash2) unsigned mode1, mode2; int cmp = 0; - if (one.size) - elem1 = tree_entry_extract(&one, &path1, &mode1); - else - /* two has more entries */ - cmp = 1; if (two.size) elem2 = tree_entry_extract(&two, &path2, &mode2); else /* two lacks this entry */ cmp = -1; + if (one.size) + elem1 = tree_entry_extract(&one, &path1, &mode1); + else + /* two has more entries */ + cmp = 1; if (!cmp) cmp = base_name_compare(path1, strlen(path1), mode1, -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/4] match-trees: drop "x = x" initializations
Am 22.03.2013 17:21, schrieb Jeff King: > Of the 8 patches, this is the one I find the least satisfying, if only > because I do not think gcc's failure is because of complicated control > flow, and rearranging the code would only hurt readability. Hmm, let's see if we can help the compiler follow the code without making it harder for people to understand. The patch looks a bit jumbled, but the resulting code is OK in my biased opinion. -- >8 -- There are two ways we can spot missing entries, i.e. added or removed files: By reaching the end of one of the trees while the other still has entries, or in the middle of the two lists with base_name_compare(). Missing files are handled the same in either case, but the code is duplicated. Unify the handling by just setting cmp appropriately when running off a tree instead of handling the case on the spot. If both trees contain entries, call base_name_compare() as usual. This make the code slightly shorter, and also helps gcc 4.6 to understand that none of the variables in the loop are used without initialization. Therefore we can remove the trick to initialize them using themselves, which was used to squelch false warnings. [Stolen from Jeff King:] While we're in the area, let's also update the loop condition to use logical-OR rather than bitwise-OR. They should be equivalent in this case, and the use of the latter was probably a typo. Signed-off-by: Rene Scharfe --- match-trees.c | 36 ++-- 1 file changed, 14 insertions(+), 22 deletions(-) diff --git a/match-trees.c b/match-trees.c index 26f7ed1..c0c66bb 100644 --- a/match-trees.c +++ b/match-trees.c @@ -71,34 +71,26 @@ static int score_trees(const unsigned char *hash1, const unsigned char *hash2) if (type != OBJ_TREE) die("%s is not a tree", sha1_to_hex(hash2)); init_tree_desc(&two, two_buf, size); - while (one.size | two.size) { - const unsigned char *elem1 = elem1; - const unsigned char *elem2 = elem2; - const char *path1 = path1; - const char *path2 = path2; - unsigned mode1 = mode1; - unsigned mode2 = mode2; - int cmp; + while (one.size || two.size) { + const unsigned char *elem1, *elem2; + const char *path1, *path2; + unsigned mode1, mode2; + int cmp = 0; if (one.size) elem1 = tree_entry_extract(&one, &path1, &mode1); + else + /* two has more entries */ + cmp = 1; if (two.size) elem2 = tree_entry_extract(&two, &path2, &mode2); - - if (!one.size) { - /* two has more entries */ - score += score_missing(mode2, path2); - update_tree_entry(&two); - continue; - } - if (!two.size) { + else /* two lacks this entry */ - score += score_missing(mode1, path1); - update_tree_entry(&one); - continue; - } - cmp = base_name_compare(path1, strlen(path1), mode1, - path2, strlen(path2), mode2); + cmp = -1; + + if (!cmp) + cmp = base_name_compare(path1, strlen(path1), mode1, + path2, strlen(path2), mode2); if (cmp < 0) { /* path1 does not appear in two */ score += score_missing(mode1, path1); -- 1.8.2 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/4] match-trees: drop "x = x" initializations
On Fri, Mar 22, 2013 at 02:33:16PM -0700, Junio C Hamano wrote: > Junio C Hamano writes: > > > These three are all updated by the same tree_entry_extract() call, > > and whenever we use mode[12] we use path[12], so if it decides path1 > > is used or assigned, it should be able to tell mode1 is, too. > > > > Unsatisfactory, it surely is... > > And immediately after I wrote the above, I am greeted by this: > > gcc (Debian 4.4.5-8) 4.4.5 > match-trees.c:75: error: 'elem1' may be used uninitialized in this > function > match-trees.c:77: error: 'path1' may be used uninitialized in this > function > > and this crazy one on top squelches it. Ugh, yeah, I should have tried with more compilers. 4.6 complains, but 4.7 doesn't (although I still find it really weird that 4.7 gets it _half_ right). > I'll initialize all of them to nonsense values for now. I think that's sensible. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/4] match-trees: drop "x = x" initializations
Junio C Hamano writes: > These three are all updated by the same tree_entry_extract() call, > and whenever we use mode[12] we use path[12], so if it decides path1 > is used or assigned, it should be able to tell mode1 is, too. > > Unsatisfactory, it surely is... And immediately after I wrote the above, I am greeted by this: gcc (Debian 4.4.5-8) 4.4.5 match-trees.c:75: error: 'elem1' may be used uninitialized in this function match-trees.c:77: error: 'path1' may be used uninitialized in this function and this crazy one on top squelches it. If you flip the order of four lines that extracts only when size is non-zero to extract first from two into elem2, then the warning is given for elem2/path2 but not for elem1/path1. I'll initialize all of them to nonsense values for now. diff --git a/match-trees.c b/match-trees.c index 4360f10..88981e8 100644 --- a/match-trees.c +++ b/match-trees.c @@ -72,9 +72,9 @@ static int score_trees(const unsigned char *hash1, const unsigned char *hash2) die("%s is not a tree", sha1_to_hex(hash2)); init_tree_desc(&two, two_buf, size); while (one.size || two.size) { - const unsigned char *elem1; + const unsigned char *elem1 = NULL; const unsigned char *elem2; - const char *path1; + const char *path1 = NULL; const char *path2; unsigned mode1 = 0; /* make gcc happy */ unsigned mode2 = 0; /* make gcc happy */ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/4] match-trees: drop "x = x" initializations
Jeff King writes: > Of the 8 patches, this is the one I find the least satisfying, if only > because I do not think gcc's failure is because of complicated control > flow, and rearranging the code would only hurt readability. And I'm > quite curious why it complains about "mode", but not about the other > variables, which are set in the exact same place (and why it would not > be able to handle such a simple control flow at all). > > It makes me wonder if I am missing something, or there is some subtle > bug. But I can't see it. Other eyes appreciated. I obviously am not qualified as "other eyes" to catch bugs in this code as this is entirely mine, but I do not see any obvious reason that would make the compiler to think mode[12] less initialized than elem[12] or path[12] either. These three are all updated by the same tree_entry_extract() call, and whenever we use mode[12] we use path[12], so if it decides path1 is used or assigned, it should be able to tell mode1 is, too. Unsatisfactory, it surely is... > match-trees.c | 14 +++--- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/match-trees.c b/match-trees.c > index 26f7ed1..4360f10 100644 > --- a/match-trees.c > +++ b/match-trees.c > @@ -71,13 +71,13 @@ static int score_trees(const unsigned char *hash1, const > unsigned char *hash2) > if (type != OBJ_TREE) > die("%s is not a tree", sha1_to_hex(hash2)); > init_tree_desc(&two, two_buf, size); > - while (one.size | two.size) { > - const unsigned char *elem1 = elem1; > - const unsigned char *elem2 = elem2; > - const char *path1 = path1; > - const char *path2 = path2; > - unsigned mode1 = mode1; > - unsigned mode2 = mode2; > + while (one.size || two.size) { > + const unsigned char *elem1; > + const unsigned char *elem2; > + const char *path1; > + const char *path2; > + unsigned mode1 = 0; /* make gcc happy */ > + unsigned mode2 = 0; /* make gcc happy */ > int cmp; > > if (one.size) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 8/4] match-trees: drop "x = x" initializations
These nonsense assignments are meant to squelch gcc warnings that the variables might be used uninitialized. However, gcc gets it mostly right, realizing that we will either extract tree entries from both sides, or we will hit a "continue" statement and go to the top of the loop. However, while getting this right for the "elem" and "path" variables, it does not do so for the "mode" variables. Let's drop the nonsense initialization where modern gcc does not need them, and just set the modes to "0", along with a comment. These values should never be used, but it makes both gcc, as well as any compiler which does not like the "x = x" initializations, happy. While we're in the area, let's also update the loop condition to use logical-OR rather than bitwise-OR. They should be equivalent in this case, and the use of the latter was probably a typo. Signed-off-by: Jeff King --- Of the 8 patches, this is the one I find the least satisfying, if only because I do not think gcc's failure is because of complicated control flow, and rearranging the code would only hurt readability. And I'm quite curious why it complains about "mode", but not about the other variables, which are set in the exact same place (and why it would not be able to handle such a simple control flow at all). It makes me wonder if I am missing something, or there is some subtle bug. But I can't see it. Other eyes appreciated. match-trees.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/match-trees.c b/match-trees.c index 26f7ed1..4360f10 100644 --- a/match-trees.c +++ b/match-trees.c @@ -71,13 +71,13 @@ static int score_trees(const unsigned char *hash1, const unsigned char *hash2) if (type != OBJ_TREE) die("%s is not a tree", sha1_to_hex(hash2)); init_tree_desc(&two, two_buf, size); - while (one.size | two.size) { - const unsigned char *elem1 = elem1; - const unsigned char *elem2 = elem2; - const char *path1 = path1; - const char *path2 = path2; - unsigned mode1 = mode1; - unsigned mode2 = mode2; + while (one.size || two.size) { + const unsigned char *elem1; + const unsigned char *elem2; + const char *path1; + const char *path2; + unsigned mode1 = 0; /* make gcc happy */ + unsigned mode2 = 0; /* make gcc happy */ int cmp; if (one.size) -- 1.8.2.13.g0f18d3c -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html