I used ccache at my previous employer, and was very convinced of its value. Now that I have started a new job, I am in the process of trying to bring the new shop on board with ccache, so I have been doing lots of test runs and looking at things. Here is one thing I am thinking could add some value.

Looking through the cache, I find many pairs of files which have different names (different hashes), but exactly identical content. This actually makes sense, as each file would have an index hash and a preprocessed hash, and since ccache needs to be able to find a match on either, both need to be in the cache. (Actually, thinking about it, I'm a little surprised that there are any files in the cache that DON'T appear twice - shouldn't EVERY compilation have 2 hashes?)

But it seems to me that it would make a lot of sense to store the data of these 2 files only once, by hard-linking the 2 names to the same inode. (For filesystems that support hard links, of course!) Every time ccache does an actual compilation and stores a file in the cache, it should store it under hard links for BOTH hashes - the index hash and the preprocessed hash. And if it gets a miss on the index hash but a hit on the preprocessed hash, then it should add the missed index hash as a hard link to the file found. So a given file (inode) in the cache could actually be referenced by MANY directory entries: one preprocessed hash, and multiple index hashes for the various combinations of source files and header files which end up producing the same output when passed through the preprocessor.
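
To make the idea concrete, here is a minimal sketch of the two operations described above, written in Python for brevity. The function names (store_result, add_alias) and the flat one-file-per-hash layout are my own assumptions for illustration; this is not ccache's actual code or on-disk format.

```python
import os

def store_result(cache_dir, data, hashes):
    """Write the data once, then hard-link it under every other hash name.

    `hashes` is a list of names, e.g. [index_hash, preprocessed_hash].
    (Hypothetical helper, not part of ccache.)
    """
    first, *rest = hashes
    primary = os.path.join(cache_dir, first)
    with open(primary, "wb") as f:
        f.write(data)
    for h in rest:
        alias = os.path.join(cache_dir, h)
        if not os.path.exists(alias):
            os.link(primary, alias)  # new directory entry, same inode
    return primary

def add_alias(cache_dir, found_hash, missed_hash):
    """After an index-hash miss and a preprocessed-hash hit, record the
    missed index hash as one more name for the file that was found."""
    src = os.path.join(cache_dir, found_hash)
    dst = os.path.join(cache_dir, missed_hash)
    if not os.path.exists(dst):
        os.link(src, dst)
```

With this scheme, removing any one name leaves the data reachable through the others, since the filesystem frees the inode only when its last link is gone.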

This could increase the storage efficiency of the cache: each such pair of identical files would take up the space of one file instead of two.
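
To get a feel for how much space is at stake, a small script along these lines could scan a cache directory and add up the bytes held by duplicate copies. This is only a rough sketch: the function name reclaimable_bytes is made up, SHA-256 is just a convenient way to compare content, and each file is read fully into memory.

```python
import hashlib
import os
from collections import defaultdict

def reclaimable_bytes(cache_dir):
    """Return the bytes that could be saved if files with identical
    content shared one inode. (Hypothetical helper, not part of ccache.)"""
    # content digest -> {(device, inode): size}; names that are already
    # hard-linked together collapse into a single (device, inode) key.
    by_content = defaultdict(dict)
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            st = os.stat(path)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            by_content[digest][(st.st_dev, st.st_ino)] = st.st_size
    saved = 0
    for inodes in by_content.values():
        sizes = list(inodes.values())
        saved += sum(sizes[1:])  # keep one copy, count the rest as savings
    return saved
```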

Of course, not every filesystem supports hard links, so the simplest solution was presumably just to keep multiple file copies. I guess adding code to do this would therefore require some way to determine whether the filesystem the cache is on can in fact support hard links.
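
One simple way to find out would be to just try it: create a scratch file in the cache directory, attempt to hard-link it, and fall back to plain copies if that fails. Below is a minimal sketch of such a probe in Python. The name supports_hard_links and the ".linkprobe-" prefix are invented for illustration, and a real implementation would probably cache the answer rather than probe on every run.

```python
import os
import tempfile

def supports_hard_links(directory):
    """Return True if a hard link can be created inside `directory`.
    (Hypothetical helper, not part of ccache.)"""
    fd, src = tempfile.mkstemp(dir=directory, prefix=".linkprobe-")
    os.close(fd)
    dst = src + ".lnk"
    try:
        os.link(src, dst)
        # Both names should now refer to the same inode.
        return os.path.samefile(src, dst)
    except (OSError, NotImplementedError):
        # The filesystem or platform refused the link.
        return False
    finally:
        # Remove the scratch files whether or not the link succeeded.
        for p in (dst, src):
            try:
                os.unlink(p)
            except FileNotFoundError:
                pass
```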

If you think this sounds like a good idea, but don't have bandwidth to do it, I would be willing to give it a try. Any hints on where to start would of course be welcome.

Thanks,
Frank Klotz
_______________________________________________
ccache mailing list
ccache@lists.samba.org
https://lists.samba.org/mailman/listinfo/ccache
