On Sat, Sep 22, 2018 at 04:11:45PM +0200, SZEDER Gábor wrote:
> The command 'git ls-remote --sort=authordate <remote>' segfaults when
> run outside of a repository, ever since the introduction of its
> '--sort' option in 1fb20dfd8e (ls-remote: create '--sort' option,
> 2018-04-09).
>
> While in general the 'git ls-remote' command can be run outside of a
> repository just fine, its '--sort=<key>' option with certain keys does
> require access to the referenced objects. This sorting is implemented
> using the generic ref-filter sorting facility, which already handles
> missing objects gracefully with the appropriate 'missing object
> deadbeef for HEAD' message. However, being generic means that it
> checks replace refs while trying to retrieve an object, and while
> doing so it accesses the 'git_replace_ref_base' variable, which has
> not been initialized and is still a NULL pointer when outside of a
> repository, thus causing the segfault.
>
> Make ref-filter more careful and only attempt to retrieve an object
> when we are in a repository. Also add a test to ensure that 'git
> ls-remote --sort' fails gracefully when executed outside of a
> repository.
This all makes sense, and I think your fix is going in the right
direction.

But...
> I'm not quite sure that this is the best place to add this check...
> but hey, it's a Saturday afternoon after all ;)
I also wonder about this. For refs, we already catch these cases at a
low-level and BUG(). That's better than a segfault, and I suspect we
should be doing the same here in oid_object_info_extended(). But that
just shifts the segfault to a BUG().

For the refs code, we've generally tried to catch things at a high-level
and report a more human-friendly error explaining the situation. So
doing the same thing here would mean adding code to ls-remote. But I
think the plumbing gets pretty tricky, since it has no way to ask
ref-filter "hey, are we going to need to look at objects?".

That's a thing that I think ref-filter _should_ support (it knows it
after having parsed the format string). But it probably ought to come
along with other refactoring, and shouldn't hold up this fix.

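To make the "are we going to need to look at objects?" question a bit
more concrete, here is a rough standalone sketch of what such a query
could look like. Everything in it is hypothetical: the function names
sort_key_needs_object() and any_sort_key_needs_object() and the short
key list are made up for illustration and are not ref-filter's API. The
real answer would come from ref-filter's own atom table after parsing.

```c
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: keys assumed to be answerable from the ref itself,
 * without opening the object it points to.  The real ref-filter atom
 * table is the authority here, not this list.
 */
static const char *const keys_without_objects[] = {
	"refname", "symref",
};

/* Hypothetical helper: return 1 if this sort key needs the object. */
static int sort_key_needs_object(const char *key)
{
	size_t i;
	size_t nr = sizeof(keys_without_objects) / sizeof(keys_without_objects[0]);

	if (*key == '-')	/* a leading '-' only reverses the sort order */
		key++;
	for (i = 0; i < nr; i++)
		if (!strcmp(key, keys_without_objects[i]))
			return 0;
	return 1;	/* conservative: anything unknown is assumed to need it */
}

/* Hypothetical helper: return 1 if any of the nr sort keys needs objects. */
static int any_sort_key_needs_object(const char *const *keys, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (sort_key_needs_object(keys[i]))
			return 1;
	return 0;
}
```

With something along these lines, a caller like ls-remote could check the
answer together with have_git_dir() before sorting, and report a
friendlier error up front instead of relying on the low-level check.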
So this probably _is_ a reasonable place to check it. However...
> diff --git a/ref-filter.c b/ref-filter.c
> index e1bcb4ca8a..3555bc29e7 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1473,7 +1473,8 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
>  		oi->info.sizep = &oi->size;
>  		oi->info.typep = &oi->type;
>  	}
> -	if (oid_object_info_extended(the_repository, &oi->oid, &oi->info,
> +	if (!have_git_dir() ||
> +	    oid_object_info_extended(the_repository, &oi->oid, &oi->info,
>  				     OBJECT_INFO_LOOKUP_REPLACE))
>  		return strbuf_addf_ret(err, -1, _("missing object %s for %s"),
>  				       oid_to_hex(&oi->oid), ref->refname);
Would we perhaps want to give the user a hint that the object is not
really missing, but rather that we're not in a repository? E.g.,
something like:

  if (!have_git_dir())
	return strbuf_addf_ret(err, -1, "format specifier requires a repository");
  if (oid_object_info_extended(...))
	return ...;

?
-Peff