Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog

2015-12-30 Thread Dennis Kaarsemaker
On Wed, 2015-12-30 at 13:41 -0800, Junio C Hamano wrote:
> Dennis Kaarsemaker  writes:
> 
> > On Wed, 2015-12-30 at 13:20 -0800, Junio C Hamano wrote:
> > > Dennis Kaarsemaker  writes:
> > > 
> > > > diff --git a/reflog-walk.c b/reflog-walk.c
> > > > index 85b8a54..b85c8e8 100644
> > > > --- a/reflog-walk.c
> > > > +++ b/reflog-walk.c
> > > > @@ -236,8 +236,8 @@ void fake_reflog_parent(struct
> > > > reflog_walk_info
> > > > *info, struct commit *commit)
> > > > reflog = &commit_reflog->reflogs->items[commit_reflog
> > > > ->recno];
> > > > info->last_commit_reflog = commit_reflog;
> > > > commit_reflog->recno--;
> > > > -   commit_info->commit = (struct commit
> > > > *)parse_object(reflog
> > > > ->osha1);
> > > > -   if (!commit_info->commit) {
> > > > +   commit_info->commit = lookup_commit(reflog->osha1);
> > > > +   if (!commit_info->commit || parse_commit(commit_info
> > > > ->commit)) {
> > > > commit->parents = NULL;
> > > > return;
> > > 
> > > This looks somewhat roundabout and illogical.  The original was
> > > bad
> > > because it blindly assumed reflog->osha1 refers to a commit
> > > without
> > > making sure that assumption holds.  Calling lookup_commit()
> > > blindly
> > > is not much better, even though you are helped that the function
> > > happens not to barf if the given object is not a commit.
> > > 
> > > Also this changes semantics, no?  Trace the original flow and
> > > think
> > > what happens, when we see a commit object that cannot be parsed
> > > in
> > > parse_commit_buffer().  parse_object() calls
> > > parse_object_buffer()
> > > which in turn calls parse_commit_buffer() and the entire
> > > callchain
> > > returns NULL.  commit_info->commit will become NULL in such a
> > > case.
> > > 
> > > With your code, lookup_commit() will store a non NULL in
> > > commit_info->commit, and parse_commit() calls
> > > parse_commit_buffer()
> > > and that would fail, so you clear commit->parents to NULL but
> > > fail
> > > to set commit_info->commit to NULL.
> > > 
> > > Why not keep the parse_object() as-is and make sure we error out
> > > unless the result is a commit with a more explicit check, perhaps
> > > like this, instead?
> > 
> > lookup_commit actually returns NULL (via object_as_type) for
> > objects
> > that are not commits, so I don't think the above is true.
> 
> I think you did not read what you are responding to.  I was talking
> about the error case where the object _is_ a commit (hence lookup
> returns it), but parse_commit_buffer() does not like its contents.

I read it, but misunderstood it. Thanks for clarifying.

> > The code below also loses the diagnostic message about the object
> > not being a commit.
> 
> Giving such a diagnostic message is a BUG.
> 
> A ref can legitimately point at any type of object (only refs under
> refs/heads/, aka "branches", must point at commits), so you MUST NOT
> complain about seeing a non-commit in a reflog in general.

Yeah, that makes sense, didn't think of that.
-- 
Dennis Kaarsemaker
www.kaarsemaker.net


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog

2015-12-30 Thread Junio C Hamano
Dennis Kaarsemaker  writes:

> On Wed, 2015-12-30 at 13:20 -0800, Junio C Hamano wrote:
>> Dennis Kaarsemaker  writes:
>> 
>> > diff --git a/reflog-walk.c b/reflog-walk.c
>> > index 85b8a54..b85c8e8 100644
>> > --- a/reflog-walk.c
>> > +++ b/reflog-walk.c
>> > @@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info
>> > *info, struct commit *commit)
>> >reflog = &commit_reflog->reflogs->items[commit_reflog
>> > ->recno];
>> >info->last_commit_reflog = commit_reflog;
>> >commit_reflog->recno--;
>> > -  commit_info->commit = (struct commit *)parse_object(reflog
>> > ->osha1);
>> > -  if (!commit_info->commit) {
>> > +  commit_info->commit = lookup_commit(reflog->osha1);
>> > +  if (!commit_info->commit || parse_commit(commit_info
>> > ->commit)) {
>> >commit->parents = NULL;
>> >return;
>> 
>> This looks somewhat roundabout and illogical.  The original was bad
>> because it blindly assumed reflog->osha1 refers to a commit without
>> making sure that assumption holds.  Calling lookup_commit() blindly
>> is not much better, even though you are helped that the function
>> happens not to barf if the given object is not a commit.
>> 
>> Also this changes semantics, no?  Trace the original flow and think
>> what happens, when we see a commit object that cannot be parsed in
>> parse_commit_buffer().  parse_object() calls parse_object_buffer()
>> which in turn calls parse_commit_buffer() and the entire callchain
>> returns NULL.  commit_info->commit will become NULL in such a case.
>> 
>> With your code, lookup_commit() will store a non NULL in
>> commit_info->commit, and parse_commit() calls parse_commit_buffer()
>> and that would fail, so you clear commit->parents to NULL but fail
>> to set commit_info->commit to NULL.
>>
>> Why not keep the parse_object() as-is and make sure we error out
>> unless the result is a commit with a more explicit check, perhaps
>> like this, instead?
>
> lookup_commit actually returns NULL (via object_as_type) for objects
> that are not commits, so I don't think the above is true.

I think you did not read what you are responding to.  I was talking
about the error case where the object _is_ a commit (hence lookup
returns it), but parse_commit_buffer() does not like its contents.

> The code below also loses the diagnostic message about the object
> not being a commit.

Giving such a diagnostic message is a BUG.

A ref can legitimately point at any type of object (only refs under
refs/heads/, aka "branches", must point at commits), so you MUST NOT
complain about seeing a non-commit in a reflog in general.


Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog

2015-12-30 Thread Dennis Kaarsemaker
On Wed, 2015-12-30 at 13:20 -0800, Junio C Hamano wrote:
> Dennis Kaarsemaker  writes:
> 
> > diff --git a/reflog-walk.c b/reflog-walk.c
> > index 85b8a54..b85c8e8 100644
> > --- a/reflog-walk.c
> > +++ b/reflog-walk.c
> > @@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info
> > *info, struct commit *commit)
> > reflog = &commit_reflog->reflogs->items[commit_reflog
> > ->recno];
> > info->last_commit_reflog = commit_reflog;
> > commit_reflog->recno--;
> > -   commit_info->commit = (struct commit *)parse_object(reflog
> > ->osha1);
> > -   if (!commit_info->commit) {
> > +   commit_info->commit = lookup_commit(reflog->osha1);
> > +   if (!commit_info->commit || parse_commit(commit_info
> > ->commit)) {
> > commit->parents = NULL;
> > return;
> 
> This looks somewhat roundabout and illogical.  The original was bad
> because it blindly assumed reflog->osha1 refers to a commit without
> making sure that assumption holds.  Calling lookup_commit() blindly
> is not much better, even though you are helped that the function
> happens not to barf if the given object is not a commit.
> 
> Also this changes semantics, no?  Trace the original flow and think
> what happens, when we see a commit object that cannot be parsed in
> parse_commit_buffer().  parse_object() calls parse_object_buffer()
> which in turn calls parse_commit_buffer() and the entire callchain
> returns NULL.  commit_info->commit will become NULL in such a case.
> 
> With your code, lookup_commit() will store a non NULL in
> commit_info->commit, and parse_commit() calls parse_commit_buffer()
> and that would fail, so you clear commit->parents to NULL but fail
> to set commit_info->commit to NULL.
>
> Why not keep the parse_object() as-is and make sure we error out
> unless the result is a commit with a more explicit check, perhaps
> like this, instead?

lookup_commit actually returns NULL (via object_as_type) for objects
that are not commits, so I don't think the above is true. The code
below also loses the diagnostic message about the object not being a
commit.

>  reflog-walk.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/reflog-walk.c b/reflog-walk.c
> index 85b8a54..861d7c4 100644
> --- a/reflog-walk.c
> +++ b/reflog-walk.c
> @@ -221,6 +221,7 @@ void fake_reflog_parent(struct reflog_walk_info
> *info, struct commit *commit)
>   struct commit_info *commit_info =
>   get_commit_info(commit, &info->reflogs, 0);
>   struct commit_reflog *commit_reflog;
> + struct object *logobj;
>   struct reflog_info *reflog;
>  
>   info->last_commit_reflog = NULL;
> @@ -236,11 +237,13 @@ void fake_reflog_parent(struct reflog_walk_info
> *info, struct commit *commit)
>   reflog = &commit_reflog->reflogs->items[commit_reflog
> ->recno];
>   info->last_commit_reflog = commit_reflog;
>   commit_reflog->recno--;
> - commit_info->commit = (struct commit *)parse_object(reflog
> ->osha1);
> - if (!commit_info->commit) {
> + logobj = parse_object(reflog->osha1);
> + if (!logobj || logobj->type != OBJ_COMMIT) {
> + commit_info->commit = NULL;
>   commit->parents = NULL;
>   return;
>   }
> + commit_info->commit = (struct commit *)logobj;
>  
>   commit->parents = xcalloc(1, sizeof(struct commit_list));
>   commit->parents->item = commit_info->commit;
> 
> 
> > +test_expect_success 'reflog containing non-commit sha1s' '
> > +   git checkout -b broken-reflog &&
> > +   echo "$(git rev-parse HEAD^{tree}) $(git rev-parse HEAD)
> > abc  01 +" >> .git/logs/refs/heads/broken-reflog
> > &&
> > +   git reflog broken-reflog
> > +'
> > +
> 
> This will negatively affect the ongoing effort to abstract out the
> on-disk implementation of the reflog.  In some future installation
> of Git, the reflog may not even be in .git/logs/refs/whatever file.

I was following the style of the test above it, will fix.

> Use a non-branch ref, so that you can store any valid object not
> just commits, and use a Git command (e.g. "git update-ref" or "git
> tag") instead of the raw filesystem access to update it, perhaps
> like this?
> 
>   git tag --create-reflog test-logs HEAD^ &&
>   git tag -f test-logs HEAD^{tree} &&
>   git tag -f test-logs HEAD &&
>   git reflog test-logs

-- 
Dennis Kaarsemaker
www.kaarsemaker.net




Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog

2015-12-30 Thread Junio C Hamano
Dennis Kaarsemaker  writes:

> diff --git a/reflog-walk.c b/reflog-walk.c
> index 85b8a54..b85c8e8 100644
> --- a/reflog-walk.c
> +++ b/reflog-walk.c
> @@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
>   reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
>   info->last_commit_reflog = commit_reflog;
>   commit_reflog->recno--;
> - commit_info->commit = (struct commit *)parse_object(reflog->osha1);
> - if (!commit_info->commit) {
> + commit_info->commit = lookup_commit(reflog->osha1);
> + if (!commit_info->commit || parse_commit(commit_info->commit)) {
>   commit->parents = NULL;
>   return;

This looks somewhat roundabout and illogical.  The original was bad
because it blindly assumed reflog->osha1 refers to a commit without
making sure that assumption holds.  Calling lookup_commit() blindly
is not much better, even though you are helped that the function
happens not to barf if the given object is not a commit.

Also this changes semantics, no?  Trace the original flow and think
what happens, when we see a commit object that cannot be parsed in
parse_commit_buffer().  parse_object() calls parse_object_buffer()
which in turn calls parse_commit_buffer() and the entire callchain
returns NULL.  commit_info->commit will become NULL in such a case.

With your code, lookup_commit() will store a non NULL in
commit_info->commit, and parse_commit() calls parse_commit_buffer()
and that would fail, so you clear commit->parents to NULL but fail
to set commit_info->commit to NULL.
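
The difference between the two flows can be sketched in isolation. This is
only a minimal model, not Git's code: `struct object`, `struct walk_state`
and the `mock_*` helpers are hypothetical stand-ins that mimic nothing more
than the return-value behaviour described above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for Git's types; not the real definitions. */
enum obj_type { OBJ_COMMIT = 1, OBJ_TREE = 2 };

struct object {
	enum obj_type type;
	int corrupt;		/* nonzero: contents cannot be parsed */
};

struct walk_state {
	struct object *commit;	/* models commit_info->commit */
	int has_parents;	/* models commit->parents != NULL */
};

/* Models parse_object(): any object type, NULL if parsing fails. */
static struct object *mock_parse_object(struct object *o)
{
	return (o && !o->corrupt) ? o : NULL;
}

/* Models lookup_commit(): NULL for non-commits, contents not parsed. */
static struct object *mock_lookup_commit(struct object *o)
{
	return (o && o->type == OBJ_COMMIT) ? o : NULL;
}

/* Models parse_commit(): nonzero on failure. */
static int mock_parse_commit(struct object *o)
{
	return o->corrupt ? -1 : 0;
}

/* Original flow: the parse result is stored without a type check. */
static void original_flow(struct walk_state *st, struct object *o)
{
	st->commit = mock_parse_object(o);
	st->has_parents = st->commit != NULL;
}

/* Flow from the patch under review. */
static void patched_flow(struct walk_state *st, struct object *o)
{
	st->commit = mock_lookup_commit(o);
	if (!st->commit || mock_parse_commit(st->commit)) {
		st->has_parents = 0;
		return;		/* st->commit stays set if only parsing failed */
	}
	st->has_parents = 1;
}
```

Feeding both flows a commit whose `corrupt` flag is set leaves the modelled
`commit_info->commit` NULL in the original flow but non-NULL in the patched
one, while a tree object shows the opposite problem in the original flow.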

Why not keep the parse_object() as-is and make sure we error out
unless the result is a commit with a more explicit check, perhaps
like this, instead?

 reflog-walk.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 85b8a54..861d7c4 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -221,6 +221,7 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	struct commit_info *commit_info =
 		get_commit_info(commit, &info->reflogs, 0);
 	struct commit_reflog *commit_reflog;
+	struct object *logobj;
 	struct reflog_info *reflog;
 
 	info->last_commit_reflog = NULL;
@@ -236,11 +237,13 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
 	commit_reflog->recno--;
-	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
-	if (!commit_info->commit) {
+	logobj = parse_object(reflog->osha1);
+	if (!logobj || logobj->type != OBJ_COMMIT) {
+		commit_info->commit = NULL;
 		commit->parents = NULL;
 		return;
 	}
+	commit_info->commit = (struct commit *)logobj;
 
 	commit->parents = xcalloc(1, sizeof(struct commit_list));
 	commit->parents->item = commit_info->commit;
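
The check-then-cast pattern used in the patch above can be tried outside
Git with a small model. Everything here is an assumption made for
illustration: the `mock_*` types and `checked_commit()` are hypothetical
names, and the only property borrowed from Git is that each concrete object
type embeds a common object header as its first member.

```c
#include <stddef.h>

/* Hypothetical type tags; not Git's real enum. */
enum mock_type { MOCK_OBJ_COMMIT = 1, MOCK_OBJ_TREE = 2 };

/* Common header shared by all object kinds. */
struct mock_object {
	enum mock_type type;
};

/* The header comes first, so a pointer to it can be cast back safely. */
struct mock_commit {
	struct mock_object object;
	int generation;		/* placeholder payload */
};

struct mock_tree {
	struct mock_object object;
};

/*
 * Return the object as a commit only after checking its type tag;
 * NULL for a missing object or for any other object type.
 */
static struct mock_commit *checked_commit(struct mock_object *logobj)
{
	if (!logobj || logobj->type != MOCK_OBJ_COMMIT)
		return NULL;
	return (struct mock_commit *)logobj;
}
```

A caller would treat a NULL result the same way for "no such object" and
"not a commit", which matches the point that no diagnostic is wanted for a
non-commit entry.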


> +test_expect_success 'reflog containing non-commit sha1s' '
> + git checkout -b broken-reflog &&
> + echo "$(git rev-parse HEAD^{tree}) $(git rev-parse HEAD) abc  
> 01 +" >> .git/logs/refs/heads/broken-reflog &&
> + git reflog broken-reflog
> +'
> +

This will negatively affect the ongoing effort to abstract out the
on-disk implementation of the reflog.  In some future installation
of Git, the reflog may not even be in .git/logs/refs/whatever file.

Use a non-branch ref, so that you can store any valid object not
just commits, and use a Git command (e.g. "git update-ref" or "git
tag") instead of the raw filesystem access to update it, perhaps
like this?

git tag --create-reflog test-logs HEAD^ &&
git tag -f test-logs HEAD^{tree} &&
git tag -f test-logs HEAD &&
git reflog test-logs


[PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog

2015-12-30 Thread Dennis Kaarsemaker
Use lookup_commit instead of parse_object to look up commits mentioned
in the reflog. This avoids a segfault in save_parents if somehow a sha1
for something other than a commit ends up in the reflog.

Signed-off-by: Dennis Kaarsemaker 
Helped-by: Nguyễn Thái Ngọc Duy 
---
Duy Nguyen wrote:

> I would go with something like this. The typecasting to "struct commit
> *" is the bug because parse_object() can return any object type.

Yeah, that's much better. Here it is as a patch with a test. 

 reflog-walk.c | 4 ++--
 t/t1410-reflog.sh | 6 ++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 85b8a54..b85c8e8 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
 	commit_reflog->recno--;
-	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
-	if (!commit_info->commit) {
+	commit_info->commit = lookup_commit(reflog->osha1);
+	if (!commit_info->commit || parse_commit(commit_info->commit)) {
 		commit->parents = NULL;
 		return;
 	}
diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index b79049f..76ccbe5 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -325,4 +325,10 @@ test_expect_success 'parsing reverse reflogs at BUFSIZ boundaries' '
 	test_cmp expect actual
 '
 
+test_expect_success 'reflog containing non-commit sha1s' '
+   git checkout -b broken-reflog &&
+   echo "$(git rev-parse HEAD^{tree}) $(git rev-parse HEAD) abc  
01 +" >> .git/logs/refs/heads/broken-reflog &&
+   git reflog broken-reflog
+'
+
 test_done
-- 
2.7.0-rc1-207-ga35084c


-- 
Dennis Kaarsemaker 
http://twitter.com/seveas