Re: Fw: ext3 dir_index causes an error

2007-09-14 Thread Eric Sandeen
Goswin von Brederlow wrote:
> Eric Sandeen <[EMAIL PROTECTED]> writes:
> 
>> Eric Sandeen wrote:
>>> Andrew Morton wrote:
>>>> Ted is dir_index maintainer ;)
>> ...
>>
>>>> [1.] One line summary of the problem:
>>>> ext3 dir_index causes an error
>>> I'm looking at this now, FWIW... pretty easy to reproduce on ppc64,
>>> though I've not yet hit it on x86.
>> The issue here is that do_split() splits a leaf node at the entry with
>> the median hash value, after sorting by hash... but it pays no attention
>> to the resulting size of the records in the old & new blocks.
> 
> http://en.wikipedia.org/wiki/Median
> 
> | At most half the population have values less than the median and at
> | most half have values greater than the median. If both groups
> | contain less than half the population, then some of the population
> | is exactly equal to the median.
> 
> That would mean that both records will be the same size and to have an
> overflow both would have to overflow. They should both be half full
> +-1.

No, it means that both blocks will have +/-1 the same *number* of
entries.  It says nothing about how much space is used in each.
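To put rough numbers on that, here is a small sketch. It assumes a 1k block and a hypothetical leaf whose hash order happens to place four 1-character names (12-byte records) ahead of three 249-character names (260-byte records); the helper names are made up for illustration, and block overhead is ignored.

```c
#include <stddef.h>

/* Bytes used by entries [from, to), given each entry's on-disk record size. */
static unsigned bytes_used(const unsigned *rec_lens, size_t from, size_t to)
{
        unsigned total = 0;
        size_t i;

        for (i = from; i < to; i++)
                total += rec_lens[i];
        return total;
}

/* Does a new record of new_len bytes still fit next to entries [from, to)? */
static int fits_after_split(const unsigned *rec_lens, size_t from, size_t to,
                            unsigned new_len, unsigned block_size)
{
        return bytes_used(rec_lens, from, to) + new_len <= block_size;
}
```

Splitting those seven entries by count at index 3 leaves 36 bytes on one side and 792 on the other, so a new 260-byte record that hashes into the upper half no longer fits in 1024 bytes even though the entry counts are nearly equal.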

>> If you're unlucky, and your split is lopsided size-wise, you may not
>> have space in the block chosen for the new entry.  This is not checked,
>> however, and things go bad quickly.
> 
> Maybe you did not mean median although it would be the logical choice.

Semantics aside, we don't want the median hash value, the middle hash
value, or the average hash value... as far as I can see, we don't care
about the hash value when we make this decision.  We care about the
sizes of the objects, not their hashes, and not where they fall in an
ordered list of hashes.

When deciding how many entries to move, we have to pay attention to how
much space they're taking up, not just how many of them there are.  If
we only move the tiny entries, even if they account for half of the
entries in the dir, that may not create enough room for the big entry
we're trying to fit.   Moving exactly half the entries may create a very
lopsided size distribution.
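One way to pick the split point by size rather than by count might look like the sketch below. It is only an illustration of the idea, not the actual do_split() code or a proposed patch: split_by_size() is a hypothetical helper, it takes the record sizes already in hash order, and it assumes the leaf is close to full.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the index at which to split so that entries
 * [split, count) move to the new block and take up roughly half of
 * block_size. sizes[] holds the on-disk record sizes in hash order.
 */
static size_t split_by_size(const unsigned *sizes, size_t count,
                            unsigned block_size)
{
        unsigned moved = 0;
        size_t split = count;

        /* Walk down from the high-hash end, accumulating bytes to move. */
        while (split > 0) {
                unsigned next = sizes[split - 1];

                /* Stop when more than half of the next entry would
                 * land past the halfway mark of the block. */
                if (moved + next / 2 > block_size / 2)
                        break;
                moved += next;
                split--;
        }
        return split;
}
```

With record sizes 12, 12, 12, 12, 260, 260, 260 and a 1k block this moves only the last two entries (520 bytes) and keeps 308 bytes in the old block, whereas a split by count would leave one side nearly empty and the other nearly full.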

-Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fw: ext3 dir_index causes an error

2007-09-14 Thread Goswin von Brederlow
Eric Sandeen <[EMAIL PROTECTED]> writes:

> Eric Sandeen wrote:
>> Andrew Morton wrote:
>
>>> Ted is dir_index maintainer ;)
>
> ...
>
>>> [1.] One line summary of the problem:
>>> ext3 dir_index causes an error
>> 
>> I'm looking at this now, FWIW... pretty easy to reproduce on ppc64,
>> though I've not yet hit it on x86.
>
> The issue here is that do_split() splits a leaf node at the entry with
> the median hash value, after sorting by hash... but it pays no attention
> to the resulting size of the records in the old & new blocks.

http://en.wikipedia.org/wiki/Median

| At most half the population have values less than the median and at
| most half have values greater than the median. If both groups
| contain less than half the population, then some of the population
| is exactly equal to the median.

That would mean that both records will be the same size and to have an
overflow both would have to overflow. They should both be half full
+-1.

> If you're unlucky, and your split is lopsided size-wise, you may not
> have space in the block chosen for the new entry.  This is not checked,
> however, and things go bad quickly.

Maybe you did not mean median although it would be the logical choice.

> Talked with Andreas a little about this, looking into the best way to
> fix it up.
>
> -Eric

MfG
Goswin


Re: Fw: ext3 dir_index causes an error

2007-09-14 Thread Eric Sandeen
Eric Sandeen wrote:
> Andrew Morton wrote:

>> Ted is dir_index maintainer ;)

...

>> [1.] One line summary of the problem:
>> ext3 dir_index causes an error
> 
> I'm looking at this now, FWIW... pretty easy to reproduce on ppc64,
> though I've not yet hit it on x86.

The issue here is that do_split() splits a leaf node at the entry with
the median hash value, after sorting by hash... but it pays no attention
to the resulting size of the records in the old & new blocks.

If you're unlucky, and your split is lopsided size-wise, you may not
have space in the block chosen for the new entry.  This is not checked,
however, and things go bad quickly.

Talked with Andreas a little about this, looking into the best way to
fix it up.

-Eric


Re: Fw: ext3 dir_index causes an error

2007-09-12 Thread Eric Sandeen
Andrew Morton wrote:
> Ted is dir_index maintainer ;)
> 
> That's a nice-looking bug report, btw.  Thanks.
> 
> 
> Begin forwarded message:
> 
> Date: Fri, 01 Jun 2007 13:01:07 +0900
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
> Subject: ext3 dir_index causes an error
> 
> 
> 
> Hello,
> 
> First of all, I really appreciate your great work.
> Now I've found a problem around dir_index feature.
> Here is a report following linux/REPORTING-BUGS.
> 
> 
> [1.] One line summary of the problem:
> ext3 dir_index causes an error

I'm looking at this now, FWIW... pretty easy to reproduce on ppc64,
though I've not yet hit it on x86.

-Eric


Fw: ext3 dir_index causes an error

2007-05-31 Thread Andrew Morton

Ted is dir_index maintainer ;)

That's a nice-looking bug report, btw.  Thanks.


Begin forwarded message:

Date: Fri, 01 Jun 2007 13:01:07 +0900
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: ext3 dir_index causes an error



Hello,

First of all, I really appreciate your great work.
Now I've found a problem around dir_index feature.
Here is a report following linux/REPORTING-BUGS.


[1.] One line summary of the problem:
ext3 dir_index causes an error

[2.] Full description of the problem/report:
This is my local test program to reproduce the problem. readdir1.c
calls creat(2), opendir(3) and readdir(3), and the shell script
executes it repeatedly with a brand-new ext3fs image on a loopback
device.
When the script adds '-O dir_index' to mkfs, some errors appear.

On a system with linux-2.6.21.3, ext3fs produces these error messages,
and the filesystem seems to be corrupted.
--
kjournald starting.  Commit interval 5 seconds
EXT3 FS on loop0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
:::
EXT3-fs: mounted filesystem with ordered data mode.
EXT3-fs error (device loop0): htree_dirblock_to_tree: bad entry in directory 
#2: rec_len is too small for name_len - offset=6924, inode=26, rec_len=244, 
name_len=249
EXT3-fs error (device loop0): htree_dirblock_to_tree: bad entry in directory 
#2: rec_len is too small for name_len - offset=6924, inode=26, rec_len=244, 
name_len=249
EXT3-fs error (device loop0): htree_dirblock_to_tree: bad entry in directory 
#2: rec_len is too small for name_len - offset=6924, inode=26, rec_len=244, 
name_len=249
kjournald starting.  Commit interval 5 seconds
:::
--
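For reference, the numbers in this log are consistent with a record that is too short for its name. An ext3 directory entry has an 8-byte header followed by the name, padded to a multiple of 4 bytes, so a 249-character name needs a 260-byte record, while the log reports rec_len=244. A minimal sketch of that comparison, using hypothetical helper names rather than the kernel's own:

```c
/* Minimum on-disk record size for a name of the given length:
 * 8-byte entry header plus the name, rounded up to a multiple of 4. */
static unsigned min_rec_len(unsigned name_len)
{
        return (name_len + 8 + 3) & ~3u;
}

/* Returns 1 if rec_len is large enough to hold a name of name_len bytes. */
static int rec_len_fits(unsigned rec_len, unsigned name_len)
{
        return rec_len >= min_rec_len(name_len);
}
```

Calling rec_len_fits(244, 249) returns 0, which matches the "rec_len is too small for name_len" complaint above.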

On the other system with linux-2.6.18 (debian etch kernel), the same
error appears.
When the script adds '-O ^dir_index' to mkfs, the problem never appears.

These errors do not appear every time, which is why the shell script
executes the readdir1 test program repeatedly.
I recently upgraded my Debian system from version 3.1 'sarge' to 4.0
'etch'. Debian etch sets the dir_index feature by default, which is
how I found this problem.

[3.] Keywords (i.e., modules, networking, kernel):
ext3 dir_index

[4.] Kernel information
[4.1.] Kernel version (from /proc/version):
[4.2.] Kernel .config file:
[5.] Most recent kernel version which did not have the bug:
[6.] Output of Oops.. message (if applicable) with symbolic information
 resolved (see Documentation/oops-tracing.txt)
[7.] A small shell script or example program which triggers the
 problem (if possible)

(readdir1.c)

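#include <sys/types.h>
#include <sys/stat.h>
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>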

void fin(char *s)
{
        perror(s);
        exit(1);
}

void msg(int found, char *fname)
{
        printf("%s%s found\n", fname, found?"":" not");
}

int
main(int argc, char *argv[])
{
        DIR *dp;
        struct dirent *de;
        int err, found, i;
        char a[250];

        err = chdir(argv[1]);
        if (err)
                fin("chdir");

        memset(a, 'a', sizeof(a)-1);
        a[sizeof(a)-1] = 0;
        for (i = 0; i < 16+1; i++) {
                a[0]++;
                err = creat(a, 0644);
                if (err < 0)
                        fin("creat");

                err = creat(argv[2], 0644);
                if (err < 0)
                        fin("creat");
        }

#if 0
        err = unlink(argv[2]);
        if (err && errno != ENOENT)
                fin("unlink");
#endif

        dp = opendir(".");
        if (!dp)
                fin("opendir");

        de = readdir(dp);
        if (!de)
                fin("1st readdir");
        assert(strcmp(argv[2], de->d_name));

#if 0
        argv[2][0]++;
        err = creat(argv[2], 0644);
        if (err < 0)
                fin("creat");

        argv[2][0]--;
#endif
        err = creat(argv[2], 0644);
        if (err < 0)
                fin("creat");

#if 0
        err = unlink(argv[2]);
        if (err && errno != ENOENT)
                fin("unlink");
#endif

        found = 0;
        while ((de = readdir(dp)) && !found)
                found = !strcmp(argv[2], de->d_name);
        msg(found, argv[2]);

        found = 0;
        rewinddir(dp);
        while ((de = readdir(dp)) && !found)
                found = !strcmp(argv[2], de->d_name);
        msg(found, argv[2]);

        closedir(dp);
        dp = opendir(".");
        if (!dp)
                fin("opendir");

        found = 0;
        while ((de = readdir(dp)) && !found)
                found = !strcmp(argv[2], de->d_name);
        msg(found, argv[2]);

        return 0;
}
--
#!/bin/sh

img=rw.img
dir=rw
set -e
make /tmp/readdir1

cd /dev/shm
dd if=/dev/zero of=$img bs=1k count=4k 2> /dev/null
mkdir -p $dir
ulimit -c un