Re: [PATCH 0/5] not making corruption worse

2015-03-18 Thread Jeff King
On Tue, Mar 17, 2015 at 03:54:02PM -0700, Junio C Hamano wrote:

> Jeff King writes:
> 
> > But it strikes me as weird that we consider the _tips_ of history to be
> > special for ignoring breakage. If the tip of "bar" is broken, we omit
> > it. But if the tip is fine, and there's breakage three commits down in
> > the history, then doing a clone is going to fail horribly, as
> > pack-objects realizes it can't generate the pack. So in practice, I'm
> > not sure how much you're buying with the "don't mention broken refs"
> > code.
> 
> I think this is a trade-off between strictness and convenience.  Is
> it preferable that every time you try to clone a repository you get
> reminded that one of its refs points at a bogus object and you
> instead have to do "git fetch $there" with a refspec that excludes
> the broken one, or is it OK to allow clones and fetches to silently
> succeed as if nothing is broken?

I think the real issue is that we do not know on the server side what
the client wants. Is it "tell me the refs, so I can grab just the one I
need, and I don't care about the broken ones"? Or is it "I want
everything you have, and tell me if you can't serve it"?  You want
strictness in the latter case, but not in the former. But if we were to
err on the side of strictness, you could not do the former at all
(because upload-pack would barf before the client even has a chance to
say anything).
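
As a rough illustration of the first case ("tell me the refs, so I can
grab just the one I need"), here is a minimal sketch. The temporary
paths, the branch name "foo", and the identity passed with -c are all
made up for the example; nothing here is part of the series itself.

```shell
#!/bin/sh
# Sketch only: every path and name below is hypothetical.
set -e
tmp=$(mktemp -d)

# A tiny "server" repository with a single branch, foo.
git init -q "$tmp/there"
git -C "$tmp/there" symbolic-ref HEAD refs/heads/foo
git -C "$tmp/there" -c user.name=example -c user.email=example@example.com \
	commit -q --allow-empty -m "initial"

# The client first asks what is advertised...
git init -q "$tmp/here"
git -C "$tmp/here" ls-remote "$tmp/there"

# ...and then fetches only the one ref it cares about.
git -C "$tmp/here" fetch -q "$tmp/there" \
	refs/heads/foo:refs/remotes/origin/foo
```

A client working this way never asks for any other ref, so a broken
"bar" elsewhere in the repository only matters if the advertisement
itself fails.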

I'm not sure if anyone will actually find GIT_REF_PARANOIA useful for
something like that or not. As an environment variable, it may impact a
filesystem-local clone, but it would not travel across a TCP connection.
And doing so is tough, because the ref advertisement happens before the
client speaks.

If we ever have a client-speaks-first protocol, one extension could
allow the client to flip the paranoia switch on the server. But my main
goal here was really just making "prune" safer, so I'm happy enough with
what this series does, for now.

> In some parts of the system there is a movement to make this
> trade-off tweakable (hint: what happened to the knobs to fsck that allow
> certain kinds of broken objects in the object store?  did the topic
> go anywhere?). This one so far lacked such a knob to tweak, and I
> view your paranoia bit as such a knob.

I think I promised several times to review that topic and never got
around to it. Which makes me a bad person. It is still on my todo list.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/5] not making corruption worse

2015-03-17 Thread Junio C Hamano
Jeff King writes:

> But it strikes me as weird that we consider the _tips_ of history to be
> special for ignoring breakage. If the tip of "bar" is broken, we omit
> it. But if the tip is fine, and there's breakage three commits down in
> the history, then doing a clone is going to fail horribly, as
> pack-objects realizes it can't generate the pack. So in practice, I'm
> not sure how much you're buying with the "don't mention broken refs"
> code.

I think this is a trade-off between strictness and convenience.  Is
it preferable that every time you try to clone a repository you get
reminded that one of its refs points at a bogus object and you
instead have to do "git fetch $there" with a refspec that excludes
the broken one, or is it OK to allow clones and fetches to silently
succeed as if nothing is broken?

If the breakage in the reachability chain is somewhere that affects
a branch that is actively in use by the project, with or without
hiding a broken tip, you will be hit by object transfer failure and
you need to really go in there and fix things anyway.  If it is just
a no-longer-used experimental branch that lost necessary objects,
it may be more convenient if the system automatically ignored it.

In some parts of the system there is a movement to make this
trade-off tweakable (hint: what happened to the knobs to fsck that allow
certain kinds of broken objects in the object store?  did the topic
go anywhere?). This one so far lacked such a knob to tweak, and I
view your paranoia bit as such a knob.


Re: [PATCH 0/5] not making corruption worse

2015-03-17 Thread Jeff King
On Tue, Mar 17, 2015 at 03:27:50AM -0400, Jeff King wrote:

> The general strategy for these is to use for_each_rawref traversals in
> these situations. That doesn't cover _every_ possible scenario. For
> example, you could do:
> 
>   git clone --no-local repo.git backup.git &&
>   rm -rf repo.git
> 
> and you might be disappointed if "backup.git" omitted some broken refs
> (upload-pack will simply skip the broken refs in its advertisement).  We
> could tighten this, but then it becomes hard to access slightly broken
> repositories (e.g., you might prefer to clone what you can, and not have
> git die() when it tries to serve the breakage). Patch 2 provides a
> tweakable safety valve for this.

One thing I thought about while working on this was whether we should
just make _all_ ref iterations for_each_rawref. The benefit to not doing
so in the hypothetical above is that you might be able to clone "foo"
even if "bar" is broken.

But it strikes me as weird that we consider the _tips_ of history to be
special for ignoring breakage. If the tip of "bar" is broken, we omit
it. But if the tip is fine, and there's breakage three commits down in
the history, then doing a clone is going to fail horribly, as
pack-objects realizes it can't generate the pack. So in practice, I'm
not sure how much you're buying with the "don't mention broken refs"
code.

OTOH, there are probably _some_ situations that can be recovered with
the current code that could not otherwise. For example, in the current
code, I can still fetch "foo" even if "bar" is broken 3 commits down.
Whereas if the tip is broken, there's a reasonable chance that
"upload-pack" would just barf and I could fetch nothing.

So I stuck to the status quo in most cases, and only turned on the more
aggressive behavior for destructive operations (and people who want to
go wild can set GIT_REF_PARANOIA=1 for their everyday operations if
they want to).
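
For a single command, opting in could look like the sketch below. The
temporary paths are invented for illustration, and the repository is
empty, so this only shows how the variable is passed, not how a broken
ref would be reported.

```shell
#!/bin/sh
# Sketch only: the paths are hypothetical and the repository is empty.
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/repo.git"

# Set the variable for this one invocation rather than exporting it.
# --no-local forces the regular transport, so upload-pack runs as a
# child process and inherits the environment.
GIT_REF_PARANOIA=1 git clone -q --no-local "$tmp/repo.git" "$tmp/backup.git"
```

Prefixing the command keeps the stricter behavior scoped to that one
invocation; exporting the variable from a shell profile would apply it
to every git command run from that shell.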

-Peff