On Thu, 15 Jun 2017 10:50:46 -0700 Junio C Hamano <gits...@pobox.com> wrote:
> Jonathan Tan <jonathanta...@google.com> writes: > > > Looking at the 3 primary functions (sha1_object_info_extended, > > read_object, has_sha1_file_with_flags), they independently implement > > mechanisms such as object replacement, retrying the packed store after > > failing to find the object in the packed store then the loose store, and > > being able to mark a packed object as bad and then retrying the whole > > process. Consolidating these mechanisms would be a great help to > > maintainability. > > > > Therefore, consolidate all 3 functions by extending > > sha1_object_info_extended() to support the functionality needed by all 3 > > functions, and then modifying the other 2 to use > > sha1_object_info_extended(). > > This is a rather "ugly" looking patch ;-) but I followed what > has_sha1_file_with_flags() and read_object() do before and after > this change, and I think this patch is a no-op wrt their behaviour > (which is a good thing). > > But I have a very mixed feelings on one aspect of the resulting > sha1_object_info_extended(). > > > int sha1_object_info_extended(const unsigned char *sha1, struct > > object_info *oi, unsigned flags) > > { > > ... > > if (!find_pack_entry(real, &e)) { > > /* Most likely it's a loose object. */ > > - if (!sha1_loose_object_info(real, oi, flags)) { > > + if (oi && !sha1_loose_object_info(real, oi, flags)) { > > oi->whence = OI_LOOSE; > > return 0; > > } > > + if (!oi && has_loose_object(real)) > > + return 0; > > This conversion is not incorrect per-se. > > We can see that has_sha1_file_with_flags() after this patch still > calls has_loose_object(). But it bothers me that there is no hint > to future developers to warn that a rewrite of the above like this > is incorrect: > > if (!find_pack_entry(read, &e)) { > /* Most likely it's a loose object. */ > + struct object_info dummy_oi; > + if (!oi) { > + memset(&dummy_oi, 0, sizeof(dummy_oi); > + oi = &dummy_oi; > + } > - if (oi && !sha1_loose_object_info(real, oi, flags)) { > + if (!sha1_loose_object_info(real, oi, flags)) { > oi->whence = OI_LOOSE; > return 0; > } > - if (!oi && has_loose_object(real)) > - return 0; > > It used to be very easy to see that has_sha1_file_with_flags() will > call has_loose_object() when it does not find the object in a pack, > which will result in the loose object file freshened. In the new > code, it is very subtle to see that---it will happen when the caller > passes oi == NULL, and has_sha1_file_with_flags() is such a caller, > but it is unclear if there are other callers of this "consolidated" > sha1_object_info_extended() that passes oi == NULL, and if they do > also want to freshen the loose object file when they do so. Good point - sorry, I didn't pay much attention to the freshening behavior. After some thought, I now think it might be better to avoid modifying has_sha1_file_with_flags(). As it is, sha1_object_info_extended() already needs special handling (special flags and handling the possibility of "oi" being NULL) to handle the functionality required by has_sha1_file_with_flags(); adding yet another thing to handle (freshen or not) would make it much too complicated. This means that subsequent patches that modify the handling of storage-agnostic objects would still need to modify 2 functions, but at least that is fewer than the current 3. I'll reroll with these changes so that you (and others) can see what it looks like. > > @@ -3480,18 +3491,12 @@ int has_sha1_pack(const unsigned char *sha1) > > > > int has_sha1_file_with_flags(const unsigned char *sha1, int flags) > > { > > - struct pack_entry e; > > + int f = OBJECT_INFO_SKIP_CACHED | > > + ((flags & HAS_SHA1_QUICK) ? OBJECT_INFO_QUICK : 0); > > > > if (!startup_info->have_repository) > > return 0; > > - if (find_pack_entry(sha1, &e)) > > - return 1; > > - if (has_loose_object(sha1)) > > - return 1; > > - if (flags & HAS_SHA1_QUICK) > > - return 0; > > - reprepare_packed_git(); > > - return find_pack_entry(sha1, &e); > > + return !sha1_object_info_extended(sha1, NULL, f); > > } > > I would have preferred to see the new variable not to be called 'f', > as that makes it unclear what it is (is that a callback function > pointer?). In this case, uyou are forcing the flag bits passed > down, so perhaps you can reuse the same variable? > > If you allocated a separate variable because > has_sha1_file_with_flags() and sha1_object_info_extended() take flag > bits from two separate vocabularies, that is a valid reasoning, but > if that is the case, then I would have named 'f' to reflect that > fact that this is different from parameter 'flag' that is defined in > the has_sha1_file_with_flags() world, but a different thing that is > defined in sha1_object_info_extended() world, e.g. "soie_flag" or > something like that. > > Thanks. This makes sense. If I don't end up reverting has_sha1_file_with_flags(), I'll change the name to "soie_flag".