* Paul Jackson <[EMAIL PROTECTED]> wrote: > Hmmm ... I have this strong sense that I am about 2 hours away from > smacking my forehead and groaning "Duh - so that's what Ingo meant!" > > However, one must play out one's destiny. > > Could you provide an example scenario, which results in the creation > of a combo-blob? > > The best I can come up with is the following. > > Let's say Nick changes one line in the middle of kernel/sched.c (yeah > - I know - unlikely scenario - he usually changes more than that - > nevermind that detail.) > > In the days Before Combo Blobs (BCB), git would have been told that > kernel/sched.c was to be picked up, and would have wrapped it up in a > zlib'd blob, sha1summed it, seen it was a new sum, and added that blob > to its objects (or something like this -- I'm still a little fuzzy on > these git details.) > > But Nick just downloaded the latest git 1.5.11.1 which has added > support for combo blobs, so now, guessing here, instead of wrapping up > the new sched.c, git instead unwraps the old one, diff's with the new, > notices a couple of long sequences that are unchanged, wraps up both > of those sequences as a couple of relatively large blobs, and wraps up > the new lines that Nick just coded in the middle as a small blob, and > puts all three in the object store, along with another small > combo-blob, tying them all together.
actually, git would just include by reference the previous blob. lets say we had the previous version of sched.c in a blob, ID cc4ee6107d19f89898a8c89d45810f01710f2ff4. We have the new edit (which is small, lets say 20 bytes) in blob e010fab710092b19be6e26de1721e249dff2d141. We'd create the combo-blob representing the new version of sched.c, the following way: include cc4ee6107d19f89898a8c89d45810f01710f2ff4 0 54010 include e010fab710092b19be6e26de1721e249dff2d141 0 20 include cc4ee6107d19f89898a8c89d45810f01710f2ff4 54030 73061 so we'd include (by reference) most of the previous version, with a small blob for the extras. Since sched.c compresses down to 36K, we saved ~32K of bandwidth, and somewhere on the order of 20K of storage. to construct the combo blob later on, we do have to unpack sched.c (and if it's already a combo-blob that is not cached then we'd have to unpack all parents until we arrive at some full blob). > So far, not too bad. Haven't gained anything, and required the > unpacking of a zlib blog we didn't require before, and the running and > analyzing of a diff we didn't require before, but the end result is > only moderately worse - four object blobs instead of one, but of total > size not much larger (well, total size typically 3 disk blocks worse, > due to a slight increase in fragmentation from using 4 blocks to store > what used to be in one.) we'd have 2 new objects (the 'delta' and the 'combo' blob). (if # of objects is an issue then we could include new data in the combo blob itself too, but that's getting too complex i think.) Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/