Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'
> However you're right that the original structure proposed by Linus is
> too flat.

That was the only point I *meant* to defend. The rest was error.

-t
Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'
> Tom, please stop this ext* filesystem bashing ;-)

For one thing... yes, I'm totally embarrassed on this issue. I made a
late-night math error in a spec. I *hopefully* would have noticed it on
my own as I coded to that spec, but y'all have been wonderful at
pointing out my mistake to me even though I initially defended it.

As for ext* bashing: it's not bashing, exactly. I/O bandwidth gets a
little better, disks get a bunch cheaper --- then ext* doesn't look bad
at all in this respect. And we're awfully close to that point.
Meanwhile, for times measured in years, I've gotten complaints from
ext* users about software that is just fine on other filesystems, over
issues like the allocation size of small files. So ext*, from my
perspective, was a little too far ahead of its time and, yes, my
complaints about it are just about reaching their expiration date.

Anyway, thanks for all the sanity about directory layout. Really, it
was just an "I'm too sleepy to be doing this right now" error.

-t
Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'
> [your 0:3/4:7 directory hierarchy is horked]

Absolutely. I made a dumb mistake the night I wrote that spec and am
embarrassed that I initially defended it. I had an arithmetic error.
Thanks, this time, for your persistence in pointing it out.

-t
Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'
> Yes, it really doesn't make much sense to have such big keys on the
> directories.

It's official... I'm blushing wildly. Thank you for the various replies
that pointed out my thinko. That part of my spec hasn't been coded yet
--- I just wrote text. It really was a silly late-night error of the
sort: "hmm... let's see, 4 hex digits plus 4 hex digits, that's 16 bits
-- sounds about right." Really, I'll fix it.

-t
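For concreteness, the arithmetic behind the thinko above (a small
sketch added for illustration; nothing like it appears in the original
message): each hex digit encodes 4 bits, so a 4-digit subkey on its own
already spans 16 bits, and two 4-digit subkeys span 32 bits, not 16.

# Fanout implied by a hex-digit key prefix of length k: 16**k.
for k in (1, 2, 3, 4):
    print("%d hex digits = %2d bits -> %5d possible dirs" % (k, 4 * k, 16 ** k))
# 2 hex digits give 256 directories; 4 hex digits give 65536.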
Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'
> Using your suggested indexing method that uses [0:4] as the 1st level
> key and

[0:3]

> [4:8] as the 2nd level key, I obtain an indexed archive that occupies
> 159M, where the top level contains 18665 1st level keys, the largest
> first level dir contains 5 entries, and all 2nd level dirs contain
> exactly 1 entry.

That's just a mistake in the spec. The format should probably be
multi-level but, yes, the fanout I suggested is quite bogus. When I
write that part of the code (today or tomorrow) I'll fix it. A few
people pointed that out. Thanks.

-t
Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'
Tomas Mraz <[EMAIL PROTECTED]> writes:

>> Btw, if, as you indicate above, you do believe that a 1 level
>> indexing should use [0:2], then it doesn't make much sense to me to
>> also suggest that a 2 level indexing should use [0:1] as primary
>> subkey :-)
>
> Why do you think so? IMHO we should always target a similar number of
> files/subdirectories in the directories of the blob archive. So if I
> suppose that the archive would contain at most 16 million files, then
> the possible indexing schemes are either 1 level with key length 3
> (each directory would contain ~4096 files) or 2 levels with key
> length 2 (each directory would contain ~256 files). Which one is
> better could of course be filesystem and hardware dependent.

First off, I have been using python slice notation, so when I write
[0:2] I mean a key of length 2 (the second index is not included). I
now realize that when you wrote the same you meant to include the
second index.

I believe that our disagreement comes from the fact that we are asking
different questions. You consider the question of how to best index a
fixed database, and I consider the question of how to best index an
ever increasing database.

Now consider why we even want multiple indexing levels: presumably this
is because certain operations become too costly when the size of a
directory becomes too large. If that's not the case, then we might as
well just have one big flat directory - perhaps that's even a viable
option for some filesystems.[1]

[1] there is the additional consideration that a hierarchical system
implements a form of key compression by sharing key prefixes. I don't
know at what point such an effect becomes beneficial, if ever.

Now suppose we need at least one level of indexing. Under an assumption
of uniform distribution of bits in keys, as more objects are added to
the database, the lower levels are going to fill up uniformly.
Therefore at those levels we are again faced with exactly the same
indexing problem and thus should come up with exactly the same answer.

This is why I believe that the scheme I proposed is best: when a bottom
level directory fills up past a certain size, introduce under it an
additional level, and reindex the keys. Since the "certain size" is
fixed, this is a constant time operation.

One could also entertain the idea of reindexing not just a bottom level
directory but an entire subtree of the database (this would be closer
to your idea of finding an optimal reindexing of just this part of the
database). However, this has the disadvantage that the operation's cost
grows exponentially with the depth of the tree.

Cheers,

--Denys
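A minimal sketch of the adaptive scheme described above, under
assumptions the message leaves open (2-character hex subkeys, a fixed
split threshold of 256 entries, the full key as the leaf filename, and
empty placeholder files as in the experiments elsewhere in this
thread):

import os
import shutil

THRESHOLD = 256   # the fixed "certain size" at which a leaf splits
SUBKEY = 2        # assumed subkey length in hex chars per level

def store(root, key):
    # Descend: a directory that contains subdirectories has already
    # been split, so keep going down, creating levels as needed.
    if not os.path.isdir(root):
        os.makedirs(root)
    path, depth = root, 0
    while any(os.path.isdir(os.path.join(path, e))
              for e in os.listdir(path)):
        sub = os.path.join(path, key[depth * SUBKEY:(depth + 1) * SUBKEY])
        if not os.path.isdir(sub):
            os.mkdir(sub)
        path, depth = sub, depth + 1
    entries = os.listdir(path)
    if len(entries) >= THRESHOLD:
        # Split: introduce an additional level under this directory and
        # reindex its entries. Constant time, since a leaf holds at
        # most THRESHOLD entries.
        for e in entries:
            sub = os.path.join(path, e[depth * SUBKEY:(depth + 1) * SUBKEY])
            if not os.path.isdir(sub):
                os.mkdir(sub)
            shutil.move(os.path.join(path, e), os.path.join(sub, e))
        path = os.path.join(path, key[depth * SUBKEY:(depth + 1) * SUBKEY])
        if not os.path.isdir(path):
            os.mkdir(path)
    # Store an empty placeholder named by the full key.
    open(os.path.join(path, key), "wb").close()

Lookups would descend the same way: follow subkey slices while
subdirectories exist, then open the file named by the full key.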
Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'
On Thu, 2005-04-21 at 11:09 +0200, Denys Duchier wrote:
> Tomas Mraz <[EMAIL PROTECTED]> writes:
>
> > If we suppose the maximum number of stored blobs in the order of
> > millions, probably the optimal indexing would be 1 level [0:2]
> > indexing or 2 levels [0:1] [2:3]. However it would be necessary to
> > do some benchmarking first before setting this in stone.
>
> As I have suggested in a previous message, it is trivial to implement
> adaptive indexing: there is no need to hardwire a specific indexing
> scheme. Furthermore, I suspect that the optimal size of subkeys may
> well depend on the filesystem. My experiments seem to indicate that
> subkeys of length 2 achieve an excellent compromise between
> discriminatory power and disk footprint on ext2.
>
> Btw, if, as you indicate above, you do believe that a 1 level indexing
> should use [0:2], then it doesn't make much sense to me to also
> suggest that a 2 level indexing should use [0:1] as primary subkey :-)

Why do you think so? IMHO we should always target a similar number of
files/subdirectories in the directories of the blob archive. So if I
suppose that the archive would contain at most 16 million files, then
the possible indexing schemes are either 1 level with key length 3
(each directory would contain ~4096 files) or 2 levels with key length
2 (each directory would contain ~256 files). Which one is better could
of course be filesystem and hardware dependent.

Of course it might be best to allow adaptive indexing, but I think that
some benchmarking should be done first, and it's possible that some
fixed scheme could be chosen as optimal.

--
Tomas Mraz <[EMAIL PROTECTED]>
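Spelled out, the arithmetic behind those two options (a trivial check
added for illustration; the 16-million figure is Tomas's assumption of
uniformly distributed SHA1 keys):

# At most ~16 million (2**24) blobs, uniformly hashed hex keys.
TOTAL = 16 * 1024 * 1024

# 1 level, key length 3: 16**3 = 4096 top-level directories.
print(TOTAL // 16 ** 3)              # ~4096 files per directory

# 2 levels, key length 2: 256 directories x 256 subdirectories.
print(TOTAL // (16 ** 2 * 16 ** 2))  # ~256 files per leaf directory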
Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'
Tomas Mraz <[EMAIL PROTECTED]> writes:

> If we suppose the maximum number of stored blobs in the order of
> millions, probably the optimal indexing would be 1 level [0:2]
> indexing or 2 levels [0:1] [2:3]. However it would be necessary to do
> some benchmarking first before setting this in stone.

As I have suggested in a previous message, it is trivial to implement
adaptive indexing: there is no need to hardwire a specific indexing
scheme. Furthermore, I suspect that the optimal size of subkeys may
well depend on the filesystem. My experiments seem to indicate that
subkeys of length 2 achieve an excellent compromise between
discriminatory power and disk footprint on ext2.

Btw, if, as you indicate above, you do believe that a 1 level indexing
should use [0:2], then it doesn't make much sense to me to also suggest
that a 2 level indexing should use [0:1] as primary subkey :-)

Cheers,

--
Dr. Denys Duchier - IRI & LIFL - CNRS, Lille, France
AIM: duchierdenys
Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'
On Wed, 2005-04-20 at 16:04 -0700, Tom Lord wrote:
> I think that to a large extent you are seeing artifacts
> of the questionable trade-offs that (reports tell me) the
> ext* filesystems make. With a different filesystem, the
> results would be very different.

Tom, please stop this ext* filesystem bashing ;-)

Even with a filesystem which compresses the tails of files into one
filesystem block, it wouldn't make a difference that there are
potentially (and very probably, even with blob numbers in the order of
10^6) 65536 directories on the first level. This doesn't help much in
fast reading of the first level.

> I'm imagining a blob database containing many revisions of the linux
> kernel. It will contain millions of blobs.
>
> It's fine that some filesystems and some blob operations work fine
> on a directory with millions of files but what about other operations
> on the database? I pity the poor program that has to `readdir' through
> millions of files.

Even with millions of files this key structure doesn't make much sense
- the keys on the first and second levels are too long. However you're
right that the original structure proposed by Linus is too flat.

--
Tomas Mraz <[EMAIL PROTECTED]>
Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'
Tom Lord <[EMAIL PROTECTED]> writes:

> Thank you for your experiment.

you are welcome.

> I think that to a large extent you are seeing artifacts
> of the questionable trade-offs that (reports tell me) the
> ext* filesystems make. With a different filesystem, the
> results would be very different.

No, this is not the only thing that we observe. For example, here are
the reports for the following two experiments:

Indexing method = [2]
Max keys at level  0:     256
Max keys at level  1:     108
Total number of dirs:     257
Total number of keys:   21662
Disk footprint      :    1.8M

Indexing method = [4 4]
Max keys at level  0:   18474
Max keys at level  1:       5
Max keys at level  2:       1
Total number of dirs:   40137
Total number of keys:   21662
Disk footprint      :    157M

Notice the huge number of directories in the second experiment; they
don't help at all in performing discrimination.

> I'm imagining a blob database containing many revisions of the linux
> kernel. It will contain millions of blobs.

It is very easy to write code that uses an adaptive discrimination
method (i.e. when a directory becomes too full, introduce an additional
level of discrimination and rehash). In fact I have code that does that
(rehashing if the size of a leaf directory exceeds 256), but the [2]
method above doesn't even need it, even though it has 21662 keys.

Just in case there is some interest, I attach below the python scripts
which I used for my experiments.

To create an indexed archive:

    python build.py SRC DST N1 ... Nk

where SRC is the root directory of the tree to be indexed, and DST
names the root directory of the indexed archive to be created. N1
through Nk are integers that each indicate how many chars to chop off
the key to create the next level indexing key.

    python info.py DST

collects and then prints out statistics about an indexed archive. For
example, the invocation that relates to your original proposal would
be:

    python build.py /usr/src/linux store 4 4
    python info.py store

build.py:

import os, os.path, stat, sha

tree = None
archive = None
slices = []            # (start, end) pairs: one directory level per slice
lastslice = (0, None)  # remainder of the key becomes the leaf filename

def recurse(path):
    s = os.stat(path)
    if stat.S_ISDIR(s.st_mode):
        print path     # progress trace
        contents = []
        for n in os.listdir(path):
            uid = recurse(os.path.join(path, n))
            contents.append('\t'.join((n, uid)))
        contents = '\n'.join(contents)
        buf = sha.new(contents)
        uid = ','.join((buf.hexdigest(), str(len(contents))))
        store(uid)
        return uid
    else:
        fd = file(path, "rb")
        contents = fd.read()
        fd.close()
        buf = sha.new(contents)
        uid = ','.join((buf.hexdigest(), str(s.st_size)))
        store(uid)
        return uid

def store(uid):
    # create one directory level per subkey slice, then an empty leaf file
    p = archive
    if not os.path.exists(p):
        os.mkdir(p)
    for s in slices:
        p = os.path.join(p, uid[s[0]:s[1]])
        if not os.path.exists(p):
            os.mkdir(p)
    p = os.path.join(p, uid[lastslice[0]:lastslice[1]])
    fd = file(p, "wb")
    fd.close()

if __name__ == '__main__':
    import sys
    from optparse import OptionParser
    parser = OptionParser(usage="usage: %prog TREE ARCHIVE N1 ... Nk")
    (options, args) = parser.parse_args()
    if len(args) < 3:
        print >>sys.stderr, "expected at least 3 positional arguments"
        sys.exit(1)
    tree = args[0]
    archive = args[1]
    prev = 0
    for a in args[2:]:
        try:
            next = prev + int(a)
            slices.append((prev, next))
            prev = next
        except ValueError:
            print >>sys.stderr, "positional argument is not an integer:", a
            sys.exit(1)
    lastslice = (next, None)
    recurse(tree)
    sys.exit(0)

info.py:

import os, os.path, stat

info = []        # info[i] = max number of entries seen at level i
archive = None
total_keys = 0
total_dirs = 0

def collect_info(path, i):
    global total_dirs, total_keys
    s = os.stat(path)
    if stat.S_ISDIR(s.st_mode):
        total_dirs += 1
        l = os.listdir(path)
        n = len(l)
        if i == len(info):
            info.append(n)
        elif n > info[i]:
            info[i] = n
        i += 1
        for f in l:
            collect_info(os.path.join(path, f), i)
    else:
        total_keys += 1

def print_info():
    i = 0
    for n in info:
        print "Max keys at level %2s: %7s" % (i, n)
        i += 1
    print "Total number of dirs: %7s" % total_dirs
    print "Total number of keys: %7s" % total_keys
    fd = os.popen("du -csh %s" % archive, "r")
    s = fd.read()
    fd.close()
    s = s.split()[0]
    print "Disk footprint      : %7s" % s

if __name__ == '__main__':
    import sys
    from optparse import OptionParser
    parser = OptionParser(usage="usage: %prog ARCHIVE")
    (options, args) = parser.parse_args()
    if len(args) != 1:
        print >>sys.stderr, "expected exactly 1 positional argument"
        sys.exit(1)
    archive = args[0]
    collect_info(archive, 0)
    print_info()
Re: [Gnu-arch-users] Re: [GNU-arch-dev] [ANNOUNCEMENT] /Arch/ embraces `git'
On Wed, 2005-04-20 at 19:15 +0200, [EMAIL PROTECTED] wrote:
...
> As data, I used my /usr/src/linux which uses 301M and contains 20753
> files and 1389 directories. To compute the key for a directory, I
> considered that its contents were a mapping from names to keys.

I suppose if you used the blob archive for storing many revisions the
number of stored blobs would be much higher. However, even then we can
estimate that the maximum number of stored blobs will be in the order
of millions.

> When constructing the indexed archive, I actually stored empty files
> instead of blobs because I am only interested in overhead.
>
> Using your suggested indexing method that uses [0:4] as the 1st level
> key and

[0:3]

> [4:8] as the 2nd level key, I obtain an indexed archive that occupies
> 159M, where the top level contains 18665 1st level keys, the largest
> first level dir contains 5 entries, and all 2nd level dirs contain
> exactly 1 entry.

Yes, it really doesn't make much sense to have such big keys on the
directories. If we assume that SHA1 is a really good hashing function,
so that the probability of any hash value is the same, this would allow
storing 2^16 * 2^16 * 2^16 blobs with approximately the same directory
usage.

> Using Linus' suggested 1 level [0:2] indexing, I obtain an indexed
> archive that

[0:1] I suppose

> occupies 1.8M, where the top level contains 256 1st level keys, and
> where the largest 1st level dir contains 110 entries.

The question is how many entries in a directory is the optimal
compromise between space and the speed of access to its files. If we
suppose the maximum number of stored blobs in the order of millions,
probably the optimal indexing would be 1 level [0:2] indexing or 2
levels [0:1] [2:3]. However it would be necessary to do some
benchmarking first before setting this in stone.

--
Tomas Mraz <[EMAIL PROTECTED]>
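For scale, the capacity figure quoted above, computed out (a trivial
check added for illustration): two 4-hex-digit directory levels carry
16 bits each, and the leaves another 16 bits' worth of files.

# 16 bits per level, two directory levels plus leaf files:
print(2 ** 16 * 2 ** 16 * 2 ** 16)   # 281474976710656, ~2.8e14 blobs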