> Teach list-objects the "only:commits" filter which allows for filtering
> out all non-commit and non-annotated tag objects (unless other objects
> are explicitly specified by the user). The purpose of this patch is to
> allow smaller partial clones.
> 
> The name of this filter - only:commits - is a bit inaccurate because it
> still allows annotated tags to pass through. I chose it because it was
> the only concise name I could think of that was pretty descriptive. I
> considered and decided against "tree:none" because the code and
> documentation for filters seems to lack the concept of "you're filtering
> this, so we'll implicitly filter all referents of this." So "tree:none"
> is vague, since some may think it filters blobs too, while some may not.
> "only:commits" is specific and makes it easier to match it to a
> potential use case.

I'll do a fuller review tomorrow, but here are my initial thoughts.

I'm undecided about whether "only:commits" or "tree:none" is better -
one argument in favor of the latter is that blobs are not of much use
without any trees referring to them, so it makes sense that omitting
trees means omitting blobs. But that requires some thought and is not
immediately obvious.

>  /*
> - * A filter for list-objects to omit ALL blobs from the traversal.
> - * And to OPTIONALLY collect a list of the omitted OIDs.
> + * A filter for list-objects to omit ALL blobs from the traversal, and 
> possibly
> + * trees as well.
> + * Can OPTIONALLY collect a list of the omitted OIDs.
>   */
> -struct filter_blobs_none_data {
> +struct filter_none_of_type_data {
> +     unsigned omit_trees : 1;
>       struct oidset *omits;
>  };

I know that it's documented above that blobs are always omitted, but
maybe it's worth it to add a comment /* blobs are always omitted */.

> -     case LOFS_BEGIN_TREE:
> -             assert(obj->type == OBJ_TREE);
> -             /* always include all tree objects */
> -             return LOFR_MARK_SEEN | LOFR_DO_SHOW;
> -
>       case LOFS_END_TREE:
>               assert(obj->type == OBJ_TREE);
>               return LOFR_ZERO;
>  
> +     case LOFS_BEGIN_TREE:
> +             assert(obj->type == OBJ_TREE);
> +             if (!filter_data->omit_trees)
> +                     return LOFR_MARK_SEEN | LOFR_DO_SHOW;
> +
>       case LOFS_BLOB:
> -             assert(obj->type == OBJ_BLOB);
>               assert((obj->flags & SEEN) == 0);

Moving the case LOFS_BEGIN_TREE and removing the assert is unnecessary,
I think.

Also, there's fallthrough. If that's on purpose, add /* fallthrough */,
although I think that it complicates the code unnecessarily here.

> +test_expect_success 'verify only:commits packfile has no blobs or trees' '
> +     git -C r1 pack-objects --rev --stdout --filter=only:commits 
> >commitsonly.pack <<-EOF &&
> +     HEAD
> +     EOF
> +     git -C r1 index-pack ../commitsonly.pack &&
> +     git -C r1 verify-pack -v ../commitsonly.pack \
> +             | grep -E "tree|blob" \
> +             | sort >observed &&
> +     test_line_count = 0 observed
> +'

Bash pipes conceal return codes. Here it's OK, but it might be better to
write the verify-pack on its own line and then '! grep -E "tree|blob"' -
you don't need to sort or test_line_count.

> +test_expect_success 'grab tree directly when using only:commits' '
> +     # We should get the tree specified directly but not its blobs or 
> subtrees.
> +     git -C r1 pack-objects --rev --stdout --filter=only:commits 
> >commitsonly.pack <<-EOF &&
> +     HEAD:
> +     EOF
> +     git -C r1 index-pack ../commitsonly.pack &&
> +     git -C r1 verify-pack -v ../commitsonly.pack \
> +             | grep -E "tree|blob" \
> +             | sort >observed &&
> +     test_line_count = 1 observed
> +'

Similar comment as above, except you can redirect the output of grep to
a file, then test_line_count on that file. No need for sort.

Reply via email to