Re: Fw: ext3 dir_index causes an error
Goswin von Brederlow wrote:
> Eric Sandeen <[EMAIL PROTECTED]> writes:
>
>> Eric Sandeen wrote:
>>> Andrew Morton wrote:
>>>> Ted is dir_index maintainer ;)
>> ...
>>>> [1.] One line summary of the problem:
>>>> ext3 dir_index causes an error
>>> I'm looking at this now, FWIW... pretty easy to reproduce on ppc64,
>>> though I've not yet hit it on x86.
>> The issue here is that do_split() splits a leaf node at the entry with
>> the median hash value, after sorting by hash... but it pays no attention
>> to the resulting size of the records in the old & new blocks.
>
> http://en.wikipedia.org/wiki/Median
>
> | At most half the population have values less than the median and at
> | most half have values greater than the median. If both groups
> | contain less than half the population, then some of the population
> | is exactly equal to the median.
>
> That would mean that both records will be the same size and to have an
> overflow both would have to overflow. They should both be half full
> +-1.

No, it means that both blocks will have +/-1 the same *number* of
entries. It says nothing about how much space is used in each.

>> If you're unlucky, and your split is lopsided size-wise, you may not
>> have space in the block chosen for the new entry. This is not checked,
>> however, and things go bad quickly.
>
> Maybe you did not mean median although it would be the logical choice.

Semantics aside, we don't want the median hash value, the middle hash
value, or the average hash value... as far as I can see, we don't care
about the hash value when we make this decision. We care about the
sizes of the objects, not their hashes, and not where they fall in an
ordered list of hashes.

When deciding how many entries to move, we have to pay attention to how
much space they're taking up, not just how many of them there are. If
we only move the tiny entries, even if they account for half of the
entries in the dir, that may not create enough room for the big entry
we're trying to fit. Moving exactly half the entries may create a very
lopsided size distribution.

-Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Fw: ext3 dir_index causes an error
Eric Sandeen <[EMAIL PROTECTED]> writes:

> Eric Sandeen wrote:
>> Andrew Morton wrote:
>
>>> Ted is dir_index maintainer ;)
>
> ...
>
>>> [1.] One line summary of the problem:
>>> ext3 dir_index causes an error
>>
>> I'm looking at this now, FWIW... pretty easy to reproduce on ppc64,
>> though I've not yet hit it on x86.
>
> The issue here is that do_split() splits a leaf node at the entry with
> the median hash value, after sorting by hash... but it pays no attention
> to the resulting size of the records in the old & new blocks.

http://en.wikipedia.org/wiki/Median

| At most half the population have values less than the median and at
| most half have values greater than the median. If both groups
| contain less than half the population, then some of the population
| is exactly equal to the median.

That would mean that both records will be the same size and to have an
overflow both would have to overflow. They should both be half full
+-1.

> If you're unlucky, and your split is lopsided size-wise, you may not
> have space in the block chosen for the new entry. This is not checked,
> however, and things go bad quickly.

Maybe you did not mean median although it would be the logical choice.

> Talked with Andreas a little about this, looking into the best way to
> fix it up.
>
> -Eric

MfG
        Goswin
Re: Fw: ext3 dir_index causes an error
Eric Sandeen wrote:
> Andrew Morton wrote:
>> Ted is dir_index maintainer ;)

...

>> [1.] One line summary of the problem:
>> ext3 dir_index causes an error
>
> I'm looking at this now, FWIW... pretty easy to reproduce on ppc64,
> though I've not yet hit it on x86.

The issue here is that do_split() splits a leaf node at the entry with
the median hash value, after sorting by hash... but it pays no attention
to the resulting size of the records in the old & new blocks.

If you're unlucky, and your split is lopsided size-wise, you may not
have space in the block chosen for the new entry. This is not checked,
however, and things go bad quickly.

Talked with Andreas a little about this, looking into the best way to
fix it up.

-Eric
Re: Fw: ext3 dir_index causes an error
Andrew Morton wrote:
> Ted is dir_index maintainer ;)
>
> That's a nice-looking bug report, btw. Thanks.
>
>
> Begin forwarded message:
>
> Date: Fri, 01 Jun 2007 13:01:07 +0900
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
> Subject: ext3 dir_index causes an error
>
>
>
> Hello,
>
> First of all, I really appreciate your great work.
> Now I've found a problem around the dir_index feature.
> Here is a report following linux/REPORTING-BUGS.
>
>
> [1.] One line summary of the problem:
> ext3 dir_index causes an error

I'm looking at this now, FWIW... pretty easy to reproduce on ppc64,
though I've not yet hit it on x86.

-Eric
Fw: ext3 dir_index causes an error
Ted is dir_index maintainer ;)

That's a nice-looking bug report, btw. Thanks.


Begin forwarded message:

Date: Fri, 01 Jun 2007 13:01:07 +0900
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: ext3 dir_index causes an error



Hello,

First of all, I really appreciate your great work.
Now I've found a problem around the dir_index feature.
Here is a report following linux/REPORTING-BUGS.


[1.] One line summary of the problem:
ext3 dir_index causes an error

[2.] Full description of the problem/report:
This is my local test program to reproduce this problem.
readdir1.c calls creat(2), opendir(3) and readdir(3), and the shell
script executes it repeatedly with a brand-new ext3fs image on a
loopback device.
When the script adds '-O dir_index' to mkfs, some errors appear.

On a system with linux-2.6.21.3, ext3fs produces these error messages,
and the filesystem seems to be corrupted.
--
kjournald starting. Commit interval 5 seconds
EXT3 FS on loop0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
:::
EXT3-fs: mounted filesystem with ordered data mode.
EXT3-fs error (device loop0): htree_dirblock_to_tree: bad entry in directory #2: rec_len is too small for name_len - offset=6924, inode=26, rec_len=244, name_len=249
EXT3-fs error (device loop0): htree_dirblock_to_tree: bad entry in directory #2: rec_len is too small for name_len - offset=6924, inode=26, rec_len=244, name_len=249
EXT3-fs error (device loop0): htree_dirblock_to_tree: bad entry in directory #2: rec_len is too small for name_len - offset=6924, inode=26, rec_len=244, name_len=249
kjournald starting. Commit interval 5 seconds
:::
--

On the other system with linux-2.6.18 (debian etch kernel), the same
error appears.
When the script adds '-O ^dir_index' to mkfs, the problem never appears.
The errors do not appear on every run, so the shell script executes the
readdir1 test program repeatedly.
Recently I upgraded my debian system from version 3.1 'sarge' to 4.0
'etch'. Debian etch sets the dir_index feature by default, which is how
I found this problem.

[3.] Keywords (i.e., modules, networking, kernel):
ext3 dir_index
[4.] Kernel information
[4.1.] Kernel version (from /proc/version):
[4.2.] Kernel .config file:
[5.] Most recent kernel version which did not have the bug:
[6.] Output of Oops.. message (if applicable) with symbolic information
     resolved (see Documentation/oops-tracing.txt)
[7.] A small shell script or example program which triggers the
     problem (if possible)

(readdir1.c)
/* headers required by the calls below */
#include <sys/types.h>
#include <sys/stat.h>
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

void fin(char *s)
{
	perror(s);
	exit(1);
}

void msg(int found, char *fname)
{
	printf("%s%s found\n", fname, found?"":" not");
}

int main(int argc, char *argv[])
{
	DIR *dp;
	struct dirent *de;
	int err, found, i;
	char a[250];

	err = chdir(argv[1]);
	if (err)
		fin("chdir");

	/* create 17 files with 249-character names, plus argv[2] */
	memset(a, 'a', sizeof(a)-1);
	a[sizeof(a)-1] = 0;
	for (i = 0; i < 16+1; i++) {
		a[0]++;
		err = creat(a, 0644);
		if (err < 0)
			fin("creat");
		err = creat(argv[2], 0644);
		if (err < 0)
			fin("creat");
	}
#if 0
	err = unlink(argv[2]);
	if (err && errno != ENOENT)
		fin("unlink");
#endif

	dp = opendir(".");
	if (!dp)
		fin("opendir");
	de = readdir(dp);
	if (!de)
		fin("1st readdir");
	assert(strcmp(argv[2], de->d_name));

#if 0
	argv[2][0]++;
	err = creat(argv[2], 0644);
	if (err < 0)
		fin("creat");
	argv[2][0]--;
#endif
	err = creat(argv[2], 0644);
	if (err < 0)
		fin("creat");
#if 0
	err = unlink(argv[2]);
	if (err && errno != ENOENT)
		fin("unlink");
#endif

	/* look for argv[2] in the rest of the open directory stream */
	found = 0;
	while ((de = readdir(dp)) && !found)
		found = !strcmp(argv[2], de->d_name);
	msg(found, argv[2]);

	/* look again after rewinding the stream */
	found = 0;
	rewinddir(dp);
	while ((de = readdir(dp)) && !found)
		found = !strcmp(argv[2], de->d_name);
	msg(found, argv[2]);

	/* and once more after reopening the directory */
	closedir(dp);
	dp = opendir(".");
	if (!dp)
		fin("opendir");
	found = 0;
	while ((de = readdir(dp)) && !found)
		found = !strcmp(argv[2], de->d_name);
	msg(found, argv[2]);

	return 0;
}
--
#!/bin/sh

img=rw.img
dir=rw

set -e
make /tmp/readdir1
cd /dev/shm
dd if=/dev/zero of=$img bs=1k count=4k 2> /dev/null
mkdir -p $dir
ulimit -c un