On 10/26/2017 08:46 AM, Junio C Hamano wrote: > Michael Haggerty <mhag...@alum.mit.edu> writes: > >> Even when we are deleting references, we needn't overwrite the >> `packed-refs` file if the references that we are deleting are not >> present in that file. Implement this optimization as follows: > > This goal I can understand. files-transaction-prepare may see a > deletion and in order to avoid a deletion of a ref from the > file-backend to expose a stale entry in the packed-refs file, it > prepares a transaction to remove the same ref that might exist in > it. If it turns out that there is no such ref in the packed-refs > file, then we can leave the packed-refs file as-is without > rewriting. > >> * Add a function `is_packed_transaction_noop()`, which checks whether >> a given packed-refs transaction doesn't actually have to do >> anything. This function must be called while holding the >> `packed-refs` lock to avoid races. > > This checks three things only to cover the most trivial case (I am > perfectly happy that it punts on more complex cases). If an update > > - checks the old value, > > - sets a new value, or > > - deletes a ref that is not on the filesystem, > > it is not a trivial case. I can understand the latter two (i.e. We > are special casing the deletion, and an update with a new value is > not. If removing a file from the filesystem is not sufficient to > delete, we may have to rewrite the packed-refs), but not the first > one. Is it because currently we do not say "delete this ref only > when its current value is X"?
It wouldn't be too hard to allow updates with REF_HAVE_OLD to be optimized away too. I didn't do it because 1. We currently only create updates for that packed_refs backend with REF_HAVE_OLD turned off, so such could would be unreachable (and untestable). 2. I wanted to keep the patch as simple as possible in case you decide to merge it into 2.15. There would also be a little bit of a leveling violation (or maybe the function name is not ideal). `is_packed_transaction_noop()` could check that `old_oid` values are correct, and if they all are, declare the transaction a NOOP. (It wasn't *really* a NOOP, but its OP, namely checking old values, has already been carried out.) But what if it finds that an `old_oid` was incorrect? Right now `is_packed_transaction_noop()` has no way to report something like a TRANSACTION_GENERIC_ERROR. It could be that long-term it makes more sense for this optimization to happen in `packed_transaction_prepare()`. Except that function is sometimes called for its side-effects, like sorting an existing file, in which case overwriting the `packed-refs` file shouldn't be optimized away. So overall it seemed easier to punt on this optimization at this point. > Also it is not immediately obvious how the "is this noop" helper is > checking the absence of the same-named ref in the current > packed-refs file, which is what matters for the correctness of the > optimization. The check is in the second loop in `is_packed_transaction_noop()`: if (!refs_read_raw_ref(ref_store, update->refname, oid.hash, &referent, &type) || errno != ENOENT) { /* * We might have to actually delete that * reference -> not a NOOP. */ ret = 0; break; } If the refname doesn't exist, then `refs_read_raw_ref()` returns -1 and sets errno to ENOENT, and the loop continues. Otherwise we exit with a value of 0, meaning that the transaction is not a NOOP. There are a lot of double-negatives here. It might be easier to follow the logic if the sense of the function were inverted: `is_packed_transaction_needed()`. Let me know if you have a strong feeling about it. Michael