Re: [PATCH 0/6] receive-pack: quarantine pushed objects

2016-10-03 Thread Christian Couder
On Sun, Oct 2, 2016 at 3:02 PM, Jeff King wrote:
> On Sun, Oct 02, 2016 at 11:20:59AM +0200, Christian Couder wrote:
>
>> I wonder if the patch you sent in:
>>
>> https://public-inbox.org/git/20160816144642.5ikkta4l5hyx6...@sigill.intra.peff.net/
>>
>> is still useful or not.
>
> It is potentially still useful for other code paths besides
> receive-pack. But if the main concern is pushes, then yeah, I think it
> is not really doing anything.
>
>> I guess if we fail the receive-pack because the pack is bigger than
>> receive.maxInputSize, then the "quarantine" directory will also be
>> removed, so the part of the pack that we received before failing the
>> receive-pack will be deleted.
>
> Correct. _Any_ failure up to the tmp_objdir_migrate() call will drop the
> objects. So that includes index-pack failing for any reason.

Great, thanks for explaining!

>> > These two patches set that up by letting index-pack and pre-receive
>> > know that quarantine path and use it to store arbitrary files that
>> > _don't_ get migrated to the main object database (i.e., the log file
>> > mentioned above).
>>
>> It would be nice to have a diffstat for the whole series.
>
> You mean in the cover letter? I do not mind including it if people find
> them useful, but I personally have always just found them to be clutter
> at that level.

I think it can help to quickly get an idea about what the series
impacts, and it would have made it easier for me to see that the
changes in the patch you sent previously
(https://public-inbox.org/git/20160816144642.5ikkta4l5hyx6...@sigill.intra.peff.net/)
are not part of this series.

Thanks anyway,
Christian.


Re: [PATCH 0/6] receive-pack: quarantine pushed objects

2016-10-02 Thread Jeff King
On Sun, Oct 02, 2016 at 11:20:59AM +0200, Christian Couder wrote:

> On Fri, Sep 30, 2016 at 9:35 PM, Jeff King wrote:
> > I've mentioned before on the list that GitHub "quarantines" objects
> > while the pre-receive hook runs. Here are the patches to implement
> > that.
> 
> Great! Thanks for upstreaming these patches!
> 
> I wonder if the patch you sent in:
> 
> https://public-inbox.org/git/20160816144642.5ikkta4l5hyx6...@sigill.intra.peff.net/
> 
> is still useful or not.

It is potentially still useful for other code paths besides
receive-pack. But if the main concern is pushes, then yeah, I think it
is not really doing anything.

> I guess if we fail the receive-pack because the pack is bigger than
> receive.maxInputSize, then the "quarantine" directory will also be
> removed, so the part of the pack that we received before failing the
> receive-pack will be deleted.

Correct. _Any_ failure up to the tmp_objdir_migrate() call will drop the
objects. So that includes index-pack failing for any reason.
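
To make that behavior concrete, here is a rough Python sketch of the
quarantine idea. It is not the C code from the series: the function name
receive_with_quarantine, the check callback, and the "incoming-" prefix
are all assumptions made for illustration. The check callback stands in
for index-pack, check_connected(), and the pre-receive hook, and the
move loop stands in for tmp_objdir_migrate().

```python
import os
import shutil
import tempfile


def receive_with_quarantine(main_dir, incoming, check):
    """Store `incoming` in a quarantine dir; migrate only if `check` passes.

    Hypothetical helper for illustration only, not git's implementation.
    `incoming` maps object names to bytes; `check` takes the quarantine path.
    """
    quarantine = tempfile.mkdtemp(prefix="incoming-", dir=main_dir)
    try:
        # Objects land in the quarantine directory, not the main store.
        for name, data in incoming.items():
            with open(os.path.join(quarantine, name), "wb") as f:
                f.write(data)

        # Stand-in for index-pack, check_connected() and pre-receive.
        if not check(quarantine):
            return False

        # Stand-in for tmp_objdir_migrate(): move objects into the main store.
        for name in os.listdir(quarantine):
            os.replace(os.path.join(quarantine, name),
                       os.path.join(main_dir, name))
        return True
    finally:
        # Runs on success, rejection, or exception. Anything not yet
        # migrated is dropped together with the quarantine directory.
        shutil.rmtree(quarantine, ignore_errors=True)
```

In this sketch a rejected push leaves main_dir untouched, because the
only copy of the received data lived under the quarantine directory.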

> > These two patches set that up by letting index-pack and pre-receive
> > know that quarantine path and use it to store arbitrary files that
> > _don't_ get migrated to the main object database (i.e., the log file
> > mentioned above).
> 
> It would be nice to have a diffstat for the whole series.

You mean in the cover letter? I do not mind including it if people find
them useful, but I personally have always just found them to be clutter
at that level.

-Peff


Re: [PATCH 0/6] receive-pack: quarantine pushed objects

2016-10-02 Thread Christian Couder
On Fri, Sep 30, 2016 at 9:35 PM, Jeff King wrote:
> I've mentioned before on the list that GitHub "quarantines" objects
> while the pre-receive hook runs. Here are the patches to implement
> that.

Great! Thanks for upstreaming these patches!

I wonder if the patch you sent in:

https://public-inbox.org/git/20160816144642.5ikkta4l5hyx6...@sigill.intra.peff.net/

is still useful or not.

> The basic problem is that as-is, index-pack admits pushed objects into
> the main object database immediately, before the pre-receive hook runs.
> It _has_ to, since the hook needs to be able to actually look at the
> objects. However, this means that if the pre-receive hook rejects the
> push, we still end up with the objects in the repository. We can't just
> delete them as temporary files, because we don't know what other
> processes might have started referencing them.
>
> The solution here is to push into a "quarantine" directory that is
> accessible only to pre-receive, check_connected(), etc, and only
> move the objects into the main object database after we've finished
> those basic checks.

I guess if we fail the receive-pack because the pack is bigger than
receive.maxInputSize, then the "quarantine" directory will also be
removed, so the part of the pack that we received before failing the
receive-pack will be deleted.

[...]

> These two patches set that up by letting index-pack and pre-receive
> know that quarantine path and use it to store arbitrary files that
> _don't_ get migrated to the main object database (i.e., the log file
> mentioned above).

It would be nice to have a diffstat for the whole series.

Thanks,
Christian.