>> I looked at the metadata that gets created for every email message and it’s
>> small - less than 100 bytes. So I ran a simple test of appending 20,000
>> unique 100 bytes ascii messages. I would have expected the repository size
>> to be on the order of a few megabytes, instead it was 4.7G. This is roughly
>> 234K overhead per 100 bytes message, which would be quite impractical for
>> the email storage with the metadata essentially exceeding the message
>> storage.
>
> Did you start from an empty repository? Would be interested to run your code
> locally to check what happens.

Did you add the messages sequentially or did you use a view?

You have 20k commits (each at least 4k), but you also have 20k different directories (containing 1, 2, 3, ... 20k files). What's the size of your filenames? They are serialized in the directory metadata, so if you have 10-byte filenames, you should indeed expect the last directory metadata to be around 20k*10 bytes (maybe a bit less, as it is compressed).

Anyway, the Git format is not very flexible (as we want to keep compatibility with Git), but it is useful to understand what is not optimal so that we can improve it in the custom backend.

Thomas
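
PS: a rough back-of-the-envelope sketch of the directory metadata growth described above, in Python. It assumes one new file per commit, all in a single directory, with every directory snapshot stored in full (no packing or delta compression), and it uses the Git tree entry layout of "<mode> <name>\0<20-byte SHA-1>". The 10-byte filename length and the function names are hypothetical, chosen only for illustration.

```python
# Rough estimate only. Assumes one new file per commit, all in one directory,
# and that each directory snapshot (Git tree) is stored in full.
# Git tree entry layout: "<mode> <name>\0<20-byte SHA-1>".

def tree_size(num_entries, name_len, mode_len=6, hash_len=20):
    """Uncompressed payload size in bytes of a tree holding num_entries files."""
    # mode + space + name + NUL + raw hash
    per_entry = mode_len + 1 + name_len + 1 + hash_len
    return num_entries * per_entry

def total_tree_bytes(num_commits, name_len):
    """Sum of tree sizes when the trees hold 1, 2, ..., num_commits entries."""
    return sum(tree_size(i, name_len) for i in range(1, num_commits + 1))

if __name__ == "__main__":
    n, name_len = 20_000, 10  # hypothetical filename length
    print(f"last tree: {tree_size(n, name_len):,} bytes")
    print(f"all trees: {total_tree_bytes(n, name_len):,} bytes (before compression)")
```

Under these assumptions the last tree is about 760 KB and the trees together add up to roughly 7.6 GB before compression, so the total grows quadratically with the number of messages. That is the same order of magnitude as the 4.7G reported, though the real figure depends on the actual filename lengths and on how well the objects compress.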
