Hello, I'm not quite sure how to phrase the subject as a search query, so if this has already been answered, please forgive the redundancy and quietly point me to where it is addressed.
We are using svn at work to hold customer 'vault' data [various bits of information for each customer]. It has been a huge success -- to the point where we have over 1,000 customers using vaults. The checkins are automated, and we have amassed over 100,000 revisions thus far.

User directories are created as /Ab/username [where "Ab" is a 2-character hash computed by a known, balanced algorithm to make locating a username's files more machine-efficient]. So we have about 1,200 of these directories, with some hashes obviously being reused; no big deal.

The problem is that, even on minuscule changes, we are finding the db/revs/<shard>/<revno> files to be disproportionately large: an addition or change of a file that is about 1k-4k produces a rev file of roughly 100K. At lower revision numbers the rev files were about 4k, but they have been growing with each shard that gets added, usually to the tune of 1k per shard. With so many revisions being checked in at a rapid rate, we found ourselves having to take production offline for a couple of minutes while we migrated the repository in question to a larger filesystem, under threat of the filesystem filling up.

The upshot of this is: why does a minimal delta create such a large rev file? 100K for a small change? What's going on, and how can we mitigate it?

-- --*greywolf;
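P.S. For concreteness, the directory layout we use can be sketched like this. The md5-based hash here is purely a hypothetical stand-in (our real algorithm is different, though similarly balanced); the point is only the /Ab/username shape:

```python
import hashlib

def shard_path(username: str) -> str:
    """Map a username to its /Ab/username vault path.

    md5 is a hypothetical stand-in for our actual balanced hash;
    only the first two hex characters are used as the shard name.
    """
    digest = hashlib.md5(username.encode("utf-8")).hexdigest()
    return "/{}/{}".format(digest[:2], username)
```

Each username lands deterministically in one of a small set of 2-character shard directories, which is what keeps lookups machine-efficient.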