Jameson Miller <jam...@microsoft.com> writes:

> Add an API around managing the lifetime of cache_entry
> structs. Abstracting memory management details behind an API will
> allow for alternative memory management strategies without affecting
> all the call sites.  This commit does not change how memory is
> allocated / freed. A later commit in this series will allocate cache
> entries from memory pools as appropriate.
>
> Motivation:
> It has been observed that the time spent loading an index with a large
> number of entries is partly dominated by malloc() calls. This change
> is in preparation for using memory pools to reduce the number of
> malloc() calls made when loading an index.
>
> This API makes a distinction between cache entries that are intended
> for use with a particular index and cache entries that are not. This
> enables us to use the knowledge about how a cache entry will be used
> to make informed decisions about how to handle the corresponding
> memory.

Yuck.  make_index_cache_entry()?

Generally we use "cache" when working on the_index without passing
istate, and otherwise "index", which means that readers can assume
that distim_cache_entry(...)" is a shorter and more limited way to
say "distim_index_entry(&the_index, ...)".  Having both index and
cache in the same name smells crazy.

If most of the alocations are for permanent kind, give it a shorter
name call it make_cache_entry(&the_index, ...), and call the other
non-permanent one with a longer and more cumbersome name, perhaps
make_transient_cache_entry(...).  Avoid saying "index" in the former
name, as the design decision this series is making to allocate
memory for a cache-entry from a pool associated to an index_state is
already seen by what its first parameter is.

> diff --git a/cache.h b/cache.h
> index f0a407602c..204f788438 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -339,6 +339,29 @@ extern void remove_name_hash(struct index_state *istate, 
> struct cache_entry *ce)
>  extern void free_name_hash(struct index_state *istate);
>  
>  
> +/* Cache entry creation and freeing */
> +
> +/*
> + * Create cache_entry intended for use in the specified index. Caller
> + * is responsible for discarding the cache_entry with
> + * `discard_cache_entry`.
> + */
> +extern struct cache_entry *make_index_cache_entry(struct index_state 
> *istate, unsigned int mode, const unsigned char *sha1, const char *path, int 
> stage, unsigned int refresh_options);
> +extern struct cache_entry *make_empty_index_cache_entry(struct index_state 
> *istate, size_t name_len);
> +
> +/*
> + * Create a cache_entry that is not intended to be added to an index.
> + * Caller is responsible for discarding the cache_entry
> + * with `discard_cache_entry`.
> + */
> +extern struct cache_entry *make_transient_cache_entry(unsigned int mode, 
> const unsigned char *sha1, const char *path, int stage);
> +extern struct cache_entry *make_empty_transient_cache_entry(size_t name_len);
> +
> +/*
> + * Discard cache entry.
> + */
> +void discard_cache_entry(struct cache_entry *ce);

I am not yet convinced that it is a good idea to require each istate
to hold a separate pool.  Anything that uses unpack_trees() can do
"starting from this src_index, perform various mergy operations and
deposit the result in dst_index".  Sometimes the two logical indices
point at the same istate, sometimes different.  When src and dst are
different istates, the code that used to simply add another pointer
to the same ce to the dst index now needs to duplicate it out of the
pool associated with dst?

In any case, perhaps it will become clearer why it is a good idea as
we read on, so let's do so.

Reply via email to