Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog
On Wed, 2015-12-30 at 13:41 -0800, Junio C Hamano wrote:
> Dennis Kaarsemaker writes:
>
> > On Wed, 2015-12-30 at 13:20 -0800, Junio C Hamano wrote:
> > > Dennis Kaarsemaker writes:
> > >
> > > > diff --git a/reflog-walk.c b/reflog-walk.c
> > > > index 85b8a54..b85c8e8 100644
> > > > --- a/reflog-walk.c
> > > > +++ b/reflog-walk.c
> > > > @@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
> > > >  	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
> > > >  	info->last_commit_reflog = commit_reflog;
> > > >  	commit_reflog->recno--;
> > > > -	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
> > > > -	if (!commit_info->commit) {
> > > > +	commit_info->commit = lookup_commit(reflog->osha1);
> > > > +	if (!commit_info->commit || parse_commit(commit_info->commit)) {
> > > >  		commit->parents = NULL;
> > > >  		return;
> > >
> > > This looks somewhat roundabout and illogical. The original was bad
> > > because it blindly assumed reflog->osha1 refers to a commit without
> > > making sure that assumption holds. Calling lookup_commit() blindly
> > > is not much better, even though you are helped that the function
> > > happens not to barf if the given object is not a commit.
> > >
> > > Also this changes semantics, no? Trace the original flow and think
> > > what happens, when we see a commit object that cannot be parsed in
> > > parse_commit_buffer(). parse_object() calls parse_object_buffer()
> > > which in turn calls parse_commit_buffer() and the entire callchain
> > > returns NULL. commit_info->commit will become NULL in such a case.
> > >
> > > With your code, lookup_commit() will store a non-NULL in
> > > commit_info->commit, and parse_commit() calls parse_commit_buffer()
> > > and that would fail, so you clear commit->parents to NULL but fail
> > > to set commit_info->commit to NULL.
> > >
> > > Why not keep the parse_object() as-is and make sure we error out
> > > unless the result is a commit with a more explicit check, perhaps
> > > like this, instead?
> >
> > lookup_commit actually returns NULL (via object_as_type) for objects
> > that are not commits, so I don't think the above is true.
>
> I think you did not read what you are responding to. I was talking
> about the error case where the object _is_ a commit (hence lookup
> returns it), but parse_commit_buffer() does not like its contents.

I read it, but misunderstood it. Thanks for clarifying.

> > The code below also loses the diagnostic message about the object
> > not being a commit.
>
> Giving such a diagnostic message is a BUG.
>
> A ref can legitimately point at any type of object (only refs under
> refs/heads/, aka "branches", must point at commits), so you MUST NOT
> complain about seeing a non-commit in a reflog in general.

Yeah, that makes sense, didn't think of that.

-- 
Dennis Kaarsemaker
www.kaarsemaker.net

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog
Dennis Kaarsemaker writes:

> On Wed, 2015-12-30 at 13:20 -0800, Junio C Hamano wrote:
>> Dennis Kaarsemaker writes:
>>
>> > diff --git a/reflog-walk.c b/reflog-walk.c
>> > index 85b8a54..b85c8e8 100644
>> > --- a/reflog-walk.c
>> > +++ b/reflog-walk.c
>> > @@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
>> >  	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
>> >  	info->last_commit_reflog = commit_reflog;
>> >  	commit_reflog->recno--;
>> > -	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
>> > -	if (!commit_info->commit) {
>> > +	commit_info->commit = lookup_commit(reflog->osha1);
>> > +	if (!commit_info->commit || parse_commit(commit_info->commit)) {
>> >  		commit->parents = NULL;
>> >  		return;
>>
>> This looks somewhat roundabout and illogical. The original was bad
>> because it blindly assumed reflog->osha1 refers to a commit without
>> making sure that assumption holds. Calling lookup_commit() blindly
>> is not much better, even though you are helped that the function
>> happens not to barf if the given object is not a commit.
>>
>> Also this changes semantics, no? Trace the original flow and think
>> what happens, when we see a commit object that cannot be parsed in
>> parse_commit_buffer(). parse_object() calls parse_object_buffer()
>> which in turn calls parse_commit_buffer() and the entire callchain
>> returns NULL. commit_info->commit will become NULL in such a case.
>>
>> With your code, lookup_commit() will store a non-NULL in
>> commit_info->commit, and parse_commit() calls parse_commit_buffer()
>> and that would fail, so you clear commit->parents to NULL but fail
>> to set commit_info->commit to NULL.
>>
>> Why not keep the parse_object() as-is and make sure we error out
>> unless the result is a commit with a more explicit check, perhaps
>> like this, instead?
>
> lookup_commit actually returns NULL (via object_as_type) for objects
> that are not commits, so I don't think the above is true.

I think you did not read what you are responding to. I was talking
about the error case where the object _is_ a commit (hence lookup
returns it), but parse_commit_buffer() does not like its contents.

> The code below also loses the diagnostic message about the object
> not being a commit.

Giving such a diagnostic message is a BUG.

A ref can legitimately point at any type of object (only refs under
refs/heads/, aka "branches", must point at commits), so you MUST NOT
complain about seeing a non-commit in a reflog in general.
Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog
On Wed, 2015-12-30 at 13:20 -0800, Junio C Hamano wrote:
> Dennis Kaarsemaker writes:
>
> > diff --git a/reflog-walk.c b/reflog-walk.c
> > index 85b8a54..b85c8e8 100644
> > --- a/reflog-walk.c
> > +++ b/reflog-walk.c
> > @@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
> >  	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
> >  	info->last_commit_reflog = commit_reflog;
> >  	commit_reflog->recno--;
> > -	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
> > -	if (!commit_info->commit) {
> > +	commit_info->commit = lookup_commit(reflog->osha1);
> > +	if (!commit_info->commit || parse_commit(commit_info->commit)) {
> >  		commit->parents = NULL;
> >  		return;
>
> This looks somewhat roundabout and illogical. The original was bad
> because it blindly assumed reflog->osha1 refers to a commit without
> making sure that assumption holds. Calling lookup_commit() blindly
> is not much better, even though you are helped that the function
> happens not to barf if the given object is not a commit.
>
> Also this changes semantics, no? Trace the original flow and think
> what happens, when we see a commit object that cannot be parsed in
> parse_commit_buffer(). parse_object() calls parse_object_buffer()
> which in turn calls parse_commit_buffer() and the entire callchain
> returns NULL. commit_info->commit will become NULL in such a case.
>
> With your code, lookup_commit() will store a non-NULL in
> commit_info->commit, and parse_commit() calls parse_commit_buffer()
> and that would fail, so you clear commit->parents to NULL but fail
> to set commit_info->commit to NULL.
>
> Why not keep the parse_object() as-is and make sure we error out
> unless the result is a commit with a more explicit check, perhaps
> like this, instead?

lookup_commit actually returns NULL (via object_as_type) for objects
that are not commits, so I don't think the above is true.

The code below also loses the diagnostic message about the object not
being a commit.

>  reflog-walk.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/reflog-walk.c b/reflog-walk.c
> index 85b8a54..861d7c4 100644
> --- a/reflog-walk.c
> +++ b/reflog-walk.c
> @@ -221,6 +221,7 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
>  	struct commit_info *commit_info =
>  		get_commit_info(commit, &info->reflogs, 0);
>  	struct commit_reflog *commit_reflog;
> +	struct object *logobj;
>  	struct reflog_info *reflog;
>  
>  	info->last_commit_reflog = NULL;
> @@ -236,11 +237,13 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
>  	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
>  	info->last_commit_reflog = commit_reflog;
>  	commit_reflog->recno--;
> -	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
> -	if (!commit_info->commit) {
> +	logobj = parse_object(reflog->osha1);
> +	if (!logobj || logobj->type != OBJ_COMMIT) {
> +		commit_info->commit = NULL;
>  		commit->parents = NULL;
>  		return;
>  	}
> +	commit_info->commit = (struct commit *)logobj;
>  
>  	commit->parents = xcalloc(1, sizeof(struct commit_list));
>  	commit->parents->item = commit_info->commit;
>
> > +test_expect_success 'reflog containing non-commit sha1s' '
> > +	git checkout -b broken-reflog &&
> > +	echo "$(git rev-parse HEAD^{tree}) $(git rev-parse HEAD) abc 01 +" >> .git/logs/refs/heads/broken-reflog &&
> > +	git reflog broken-reflog
> > +'
> > +
>
> This will negatively affect the ongoing effort to abstract out the
> on-disk implementation of the reflog. In some future installation
> of Git, the reflog may not even be in .git/logs/refs/whatever file.

I was following the style of the test above it, will fix.

> Use a non-branch ref, so that you can store any valid object not
> just commits, and use a Git command (e.g. "git update-ref" or "git
> tag") instead of the raw filesystem access to update it, perhaps
> like this?
>
> 	git tag --create-reflog test-logs HEAD^ &&
> 	git tag -f test-logs HEAD^{tree} &&
> 	git tag -f test-logs HEAD &&
> 	git reflog test-logs

-- 
Dennis Kaarsemaker
www.kaarsemaker.net
Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog
Dennis Kaarsemaker writes:

> diff --git a/reflog-walk.c b/reflog-walk.c
> index 85b8a54..b85c8e8 100644
> --- a/reflog-walk.c
> +++ b/reflog-walk.c
> @@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
>  	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
>  	info->last_commit_reflog = commit_reflog;
>  	commit_reflog->recno--;
> -	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
> -	if (!commit_info->commit) {
> +	commit_info->commit = lookup_commit(reflog->osha1);
> +	if (!commit_info->commit || parse_commit(commit_info->commit)) {
>  		commit->parents = NULL;
>  		return;

This looks somewhat roundabout and illogical. The original was bad
because it blindly assumed reflog->osha1 refers to a commit without
making sure that assumption holds. Calling lookup_commit() blindly
is not much better, even though you are helped that the function
happens not to barf if the given object is not a commit.

Also this changes semantics, no? Trace the original flow and think
what happens, when we see a commit object that cannot be parsed in
parse_commit_buffer(). parse_object() calls parse_object_buffer()
which in turn calls parse_commit_buffer() and the entire callchain
returns NULL. commit_info->commit will become NULL in such a case.

With your code, lookup_commit() will store a non-NULL in
commit_info->commit, and parse_commit() calls parse_commit_buffer()
and that would fail, so you clear commit->parents to NULL but fail
to set commit_info->commit to NULL.

Why not keep the parse_object() as-is and make sure we error out
unless the result is a commit with a more explicit check, perhaps
like this, instead?

 reflog-walk.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 85b8a54..861d7c4 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -221,6 +221,7 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	struct commit_info *commit_info =
 		get_commit_info(commit, &info->reflogs, 0);
 	struct commit_reflog *commit_reflog;
+	struct object *logobj;
 	struct reflog_info *reflog;
 
 	info->last_commit_reflog = NULL;
@@ -236,11 +237,13 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
 	commit_reflog->recno--;
-	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
-	if (!commit_info->commit) {
+	logobj = parse_object(reflog->osha1);
+	if (!logobj || logobj->type != OBJ_COMMIT) {
+		commit_info->commit = NULL;
 		commit->parents = NULL;
 		return;
 	}
+	commit_info->commit = (struct commit *)logobj;
 
 	commit->parents = xcalloc(1, sizeof(struct commit_list));
 	commit->parents->item = commit_info->commit;

> +test_expect_success 'reflog containing non-commit sha1s' '
> +	git checkout -b broken-reflog &&
> +	echo "$(git rev-parse HEAD^{tree}) $(git rev-parse HEAD) abc 01 +" >> .git/logs/refs/heads/broken-reflog &&
> +	git reflog broken-reflog
> +'
> +

This will negatively affect the ongoing effort to abstract out the
on-disk implementation of the reflog. In some future installation
of Git, the reflog may not even be in .git/logs/refs/whatever file.

Use a non-branch ref, so that you can store any valid object not
just commits, and use a Git command (e.g. "git update-ref" or "git
tag") instead of the raw filesystem access to update it, perhaps
like this?

	git tag --create-reflog test-logs HEAD^ &&
	git tag -f test-logs HEAD^{tree} &&
	git tag -f test-logs HEAD &&
	git reflog test-logs
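
The change in semantics described in this message can be illustrated with a small standalone C sketch. Everything in it is a toy model: the `toy_` types and functions are hypothetical stand-ins, not Git's real `lookup_commit()`, `parse_commit()` or `parse_object()`, and the `corrupt` flag merely simulates a commit whose contents fail to parse.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object model (assumption: not Git's real definitions). */
enum toy_type { TOY_COMMIT, TOY_TREE };

struct toy_object {
	enum toy_type type;
	int corrupt;	/* nonzero simulates unparsable contents */
};

/* Models lookup_commit(): checks the type only, never the contents. */
struct toy_object *toy_lookup_commit(struct toy_object *o)
{
	return (o && o->type == TOY_COMMIT) ? o : NULL;
}

/* Models parse_commit(): 0 on success, -1 when the contents are bad. */
int toy_parse_commit(struct toy_object *o)
{
	return o->corrupt ? -1 : 0;
}

/* Models parse_object(): NULL when the contents are bad, for any type. */
struct toy_object *toy_parse_object(struct toy_object *o)
{
	return (o && !o->corrupt) ? o : NULL;
}

/* Shape of the posted patch: on parse failure *slot stays non-NULL. */
int toy_patched(struct toy_object *o, struct toy_object **slot)
{
	*slot = toy_lookup_commit(o);
	if (!*slot || toy_parse_commit(*slot))
		return -1;
	return 0;
}

/* Shape of the suggested fix: every failure path resets *slot. */
int toy_suggested(struct toy_object *o, struct toy_object **slot)
{
	struct toy_object *logobj = toy_parse_object(o);

	if (!logobj || logobj->type != TOY_COMMIT) {
		*slot = NULL;
		return -1;
	}
	*slot = logobj;
	return 0;
}

/* Exercises the one case where the two shapes disagree. */
void toy_self_check(void)
{
	struct toy_object bad = { TOY_COMMIT, 1 };
	struct toy_object *slot;

	assert(toy_patched(&bad, &slot) == -1 && slot == &bad);
	assert(toy_suggested(&bad, &slot) == -1 && slot == NULL);
}
```

Both variants reject a tree and accept a healthy commit; they differ only for a commit with unparsable contents, where the first leaves a stale pointer in the slot and the second clears it.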
[PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog
Use lookup_commit instead of parse_object to look up commits mentioned
in the reflog. This avoids a segfault in save_parents if somehow a sha1
for something other than a commit ends up in the reflog.

Signed-off-by: Dennis Kaarsemaker
Helped-by: Nguyễn Thái Ngọc Duy
---
Duy Nguyen wrote:
> I would go with something like this. The typecasting to "struct commit
> *" is the bug because parse_object() can return any object type.

Yeah, that's much better. Here it is as a patch with a test.

 reflog-walk.c     | 4 ++--
 t/t1410-reflog.sh | 6 ++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 85b8a54..b85c8e8 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
 	commit_reflog->recno--;
-	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
-	if (!commit_info->commit) {
+	commit_info->commit = lookup_commit(reflog->osha1);
+	if (!commit_info->commit || parse_commit(commit_info->commit)) {
 		commit->parents = NULL;
 		return;
 	}
diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index b79049f..76ccbe5 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -325,4 +325,10 @@ test_expect_success 'parsing reverse reflogs at BUFSIZ boundaries' '
 	test_cmp expect actual
 '
 
+test_expect_success 'reflog containing non-commit sha1s' '
+	git checkout -b broken-reflog &&
+	echo "$(git rev-parse HEAD^{tree}) $(git rev-parse HEAD) abc 01 +" >> .git/logs/refs/heads/broken-reflog &&
+	git reflog broken-reflog
+'
+
 test_done
-- 
2.7.0-rc1-207-ga35084c

-- 
Dennis Kaarsemaker
http://twitter.com/seveas
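
As a rough illustration of why the unchecked cast is the bug, the standalone C sketch below shows a checked downcast from a generic object header. The `demo_` names are hypothetical and do not come from Git's source; the sketch assumes only what the discussion implies, namely that every object starts with a header recording its type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object kinds; Git's real enum differs. */
enum demo_type { DEMO_COMMIT, DEMO_TREE };

/* Generic header shared by every object kind. */
struct demo_object {
	enum demo_type type;
};

/* The header comes first, so a pointer to it also points at the commit. */
struct demo_commit {
	struct demo_object object;
	const char *message;
};

struct demo_tree {
	struct demo_object object;
	int nr_entries;
};

/*
 * Checked downcast: hand back a commit only when the header says so.
 * A missing object or any other type yields NULL, so the caller never
 * reads commit fields out of a tree.
 */
struct demo_commit *demo_as_commit(struct demo_object *obj)
{
	if (!obj || obj->type != DEMO_COMMIT)
		return NULL;
	return (struct demo_commit *)obj;
}

/* Small self-check covering a commit, a tree and a missing object. */
void demo_self_check(void)
{
	struct demo_commit c = { { DEMO_COMMIT }, "initial" };
	struct demo_tree t = { { DEMO_TREE }, 3 };

	assert(demo_as_commit(&c.object) == &c);
	assert(demo_as_commit(&t.object) == NULL);
	assert(demo_as_commit(NULL) == NULL);
}
```

A plain `(struct demo_commit *)obj` without the type test would compile just as well, which is what makes this class of bug easy to miss until a non-commit turns up at run time.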