Re: [patch 0/2] i_version update
On Thu, 2007-05-31 at 10:01 +1000, Neil Brown wrote:

This will provide a change number that normally changes only when the file changes and doesn't require any extra storage on disk. The change number will change inappropriately only when the inode has fallen out of cache and is being reloaded, which is either after a crash (hopefully rare) or when a file hasn't been used for a while, implying that it is unlikely that any client has it in cache.

It will also change inappropriately if the server is under heavy load and needs to reclaim memory by tossing out inodes that are cached and still in use by the clients. That change will trigger clients to invalidate their caches and to refetch the data from the server, further cranking up the load. Not an ideal solution...

Trond

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: e2fsprogs coverity patch cid-33.diff
On Tue, May 29, 2007 at 02:51:41PM -0600, Andreas Dilger wrote:

Did cid-34.diff get lost? I still have it in my "apply atop 1.39-WIP" series, so it appears not to have made it into Ted's repo. I'm including the patch again for posterity.

Fix applied.

- Ted
Re: Oops on disk write (kernel 2.6.16.y)
[linux-ext4 added to CC]

Holger Eitzenberger wrote:

Hi,

I am currently experiencing the same kernel crash on several machines running kernel version 2.6.16.43. Attached are some dumps from one particular machine, which crashed several times because of this.

So far I have been unable to reproduce this behaviour in the test lab; putting some I/O load on the box did not help either. All of the crashes happened on UP kernels, but this may be just a coincidence.

From the logs I see that in at least one case the machine didn't stop immediately but kept working for a few hours from that point on until it hit the wall.

Looking at the traces I can say that all of them follow a code path from the block I/O layer downward to ext3, e.g. here in the page writeback path:

kernel: Unable to handle kernel NULL pointer dereference at virtual address 0004
kernel: printing eip:
kernel: c018e36a
kernel: *pde =
kernel: Oops: [#1]
kernel: Modules linked in: nfnetlink_queue ip_nat_ftp ip_conntrack_ftp edd sg sd_mod sr_mod scsi_mod ide_cd cdrom ipt_MASQUERADE ipt_hashlimit xt_condition ipt_REDIRECT xt_limit xt_conntrack ipt_esp xt_tcpudp ipt_psd ipt_addrtype ip_nat_mms ip_nat_pptp ip_nat_irc iptable_nat ebtable_nat ebtables iptable_ips ip_conntrack_mms ip_conntrack_pptp ip_conntrack_irc ppp_deflate zlib_deflate bsd_comp sha1 arc4 ppp_mppe ppp_async crc_ccitt ppp_generic slhc crypto_null blowfish cast5 serpent twofish ipsec af_packet ipt_logmark ipt_confirmed ipt_owner ipt_REJECT ipt_CONFIRMED evdev ehci_hcd uhci_hcd ohci_hcd parport_pc ppdev parport xt_state xt_NOTRACK iptable_raw iptable_filter ip_conntrack_netlink ip_nat ipt_LOG ip_conntrack ip_tables x_tables nfnetlink_log nfnetlink eepro100 mii e100 capability commoncap loop
kernel: CPU:0
kernel: EIP:0060:[c018e36a]Not tainted VLI
kernel: EFLAGS: 00010286 (2.6.16.43-46-default #1)
kernel: EIP is at walk_page_buffers+0x1a/0x70
kernel: eax: ebx: ecx: edx:
kernel: esi: edi: d5f3cb74 ebp: esp: c34a5d6c
kernel: ds: 007b es: 007b ss: 0068
kernel: Process pdflush (pid: 22753, threadinfo=c34a4000 task=cd7cb070)
kernel: Stack: 0d573c574 dbb7c720 dbb7c720 c1114540 dbb7c720 d5f3cb74 dbb7c720
kernel: c01919c3 1000 c018e3c0 cb6e1710 0246 c1114540 000a
kernel: c01918c0 c34a5f48 c0176979 c1114540 c34a5f48 c34a5e28 000e
kernel: Call Trace:
kernel: [c01919c3] ext3_ordered_writepage+0x103/0x1f0
kernel: [c018e3c0] bget_one+0x0/0x10
kernel: [c01918c0] ext3_ordered_writepage+0x0/0x1f0
kernel: [c0176979] mpage_writepages+0x1c9/0x3e0
kernel: [c01918c0] ext3_ordered_writepage+0x0/0x1f0
kernel: [c013aa79] do_writepages+0x49/0x50
kernel: [c017504c] __writeback_single_inode+0x8c/0x3c0
kernel: [c02fbafc] schedule_timeout+0x4c/0xc0
kernel: [c01755a8] sync_sb_inodes+0x178/0x230
kernel: [c0175b1f] writeback_inodes+0x6f/0x89
kernel: [c013ac59] wb_kupdate+0xf9/0x170
kernel: [c013b5ee] pdflush+0x8e/0x180
...

The disassembly of walk_page_buffers() is at [1].

At least some of the other crashes happen in the sys_write() path; I have attached some of them ([2], [3] and [4]).

Looking at the LKML archive I can say that http://lkml.org/lkml/2007/3/4/11 looks similar.

Any help appreciated. Thanks.

/holger

[1] http://ftp.astaromail.com/people/heitzenberger/v7/kernel/6313/walk_page_buffers.s
[2] http://ftp.astaromail.com/people/heitzenberger/v7/kernel/6313/kernel-2007-05-03.log.gz
[3] http://ftp.astaromail.com/people/heitzenberger/v7/kernel/6313/kernel-2007-05-10.log.gz
[4] http://ftp.astaromail.com/people/heitzenberger/v7/kernel/6313/kernel-2007-05-15.log.gz
Re: e2fsprogs coverity patch cid-33.diff
On Tue, May 29, 2007 at 05:10:25PM -0600, Andreas Dilger wrote:

I also have another outstanding patch:

===
Coverity ID: 6: Forward Null

At the second conditional iter->file could still be NULL. We need to check for it again.

Also applied.

- Ted
Re: [RESEND e2progs] get_backup_sb: Consider block size when searching for backups
On Tue, May 29, 2007 at 10:26:44PM +0100, Daniel Drake wrote:

I sent this in a few weeks ago, and it hasn't been applied to the hg tree. Any comments? Thanks.

Thanks, applied. Sorry for the delay.

- Ted
Re: [RFC][PATCH] Multiple mount protection
On Thu, May 31, 2007 at 02:28:33AM +0530, Kalpak Shah wrote:

So can I assume that the INCOMPAT_MMP flag and the s_mmp_interval and s_mmp_block superblock fields will be reserved regardless of whether the patches go into ext4? I had attached the patches in the last mail so you can share your views on them.

Yes, I've reserved the code point and superblock fields. I'm not going to add the INCOMPAT_MMP flag to the supported file until I get and integrate the patch to ext2fs_open() that actually tests for the flag, though, since that would be a bit silly.

I assume the patch will add a flag to ext2fs_open() which skips the MMP checking. After all, tune2fs is allowed to make changes to the superblock while the filesystem is mounted. So it needs to be able to open the filesystem read/only even if it is mounted.

Regards,

- Ted
Re: [RFC] store RAID stride in superblock
On Thu, May 24, 2007 at 07:45:32PM +0530, Rupesh Thakare wrote:

Hello,

I've added s_raid_stripe_width parameter in superblock. I've also incorporated s_raid_stride and s_raid_stripe_width parameters in tune2fs. The new options can be specified using '-E options' in both mke2fs and tune2fs. Both the man pages (mke2fs and tune2fs) are updated accordingly. Patch is attached herewith.

Thanks. I've used a different offset for the raid_stripe_width, to avoid conflicting with Kalpak's mmp patch.

Could you send me a signed-off-by for your patch?

Thanks,

- Ted
Re: [RFC] store RAID stride in superblock
On May 31, 2007 12:21 -0400, Theodore Tso wrote:

On Thu, May 24, 2007 at 07:45:32PM +0530, Rupesh Thakare wrote:

I've added s_raid_stripe_width parameter in superblock. I've also incorporated s_raid_stride and s_raid_stripe_width parameters in tune2fs. The new options can be specified using '-E options' in both mke2fs and tune2fs. Both the man pages (mke2fs and tune2fs) are updated accordingly. Patch is attached herewith.

Thanks. I've used a different offset for the raid_stripe_width, to avoid conflicting with Kalpak's mmp patch.

Ah, we've been doing it the other way around here. It makes sense to keep the s_raid_stripe_width fields together. I think this code is preliminary enough that nobody has actually started using it yet.

Can you please post what the end of ext2_super_block looks like (whether you decide to reorder the fields or not).

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
Re: [RFC] store RAID stride in superblock
On Thu, 2007-05-31 at 14:19 -0600, Andreas Dilger wrote:

On May 31, 2007 12:21 -0400, Theodore Tso wrote:

On Thu, May 24, 2007 at 07:45:32PM +0530, Rupesh Thakare wrote:

I've added s_raid_stripe_width parameter in superblock. I've also incorporated s_raid_stride and s_raid_stripe_width parameters in tune2fs. The new options can be specified using '-E options' in both mke2fs and tune2fs. Both the man pages (mke2fs and tune2fs) are updated accordingly. Patch is attached herewith.

Thanks. I've used a different offset for the raid_stripe_width, to avoid conflicting with Kalpak's mmp patch.

Ah, we've been doing it the other way around here. It makes sense to keep the s_raid_stripe_width fields together. I think this code is preliminary enough that nobody has actually started using it yet.

Can you please post what the end of ext2_super_block looks like (whether you decide to reorder the fields or not).

I can update the MMP patches when I actually send them for inclusion. So I think it makes sense to keep the s_raid_* fields together.

Thanks,
Kalpak.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
Re: [RFC][PATCH] Multiple mount protection
On Thu, 2007-05-31 at 12:16 -0400, Theodore Tso wrote:

On Thu, May 31, 2007 at 02:28:33AM +0530, Kalpak Shah wrote:

So can I assume that the INCOMPAT_MMP flag and the s_mmp_interval and s_mmp_block superblock fields will be reserved regardless of whether the patches go into ext4? I had attached the patches in the last mail so you can share your views on them.

Yes, I've reserved the code point and superblock fields.

Thanks.

I'm not going to add the INCOMPAT_MMP flag to the supported file until I get and integrate the patch to ext2fs_open() that actually tests for the flag, though, since that would be a bit silly. I assume the patch will add a flag to ext2fs_open() which skips the MMP checking.

Yes, I have added an EXT2_FLAG_SKIP_MMP flag to ext2fs_open() to bypass MMP; it will be set if tune2fs is used with the -f option. Also, the MMP check will not be run if the filesystem is being opened read-only.

Thanks,
Kalpak.

After all, tune2fs is allowed to make changes to the superblock while the filesystem is mounted. So it needs to be able to open the filesystem read/only even if it is mounted.

Regards,

- Ted
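The rule Kalpak describes (skip the MMP check when the skip flag is set or when the filesystem is opened read-only) can be sketched roughly as follows. This is only an illustration of the decision, not the code from the patch: the function name and the SKETCH_* constants with their values are assumptions made up for this example and are not taken from e2fsprogs headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants only; names and values are assumptions for this
 * sketch, not the definitions used by the MMP patch or by e2fsprogs. */
#define SKETCH_FEATURE_INCOMPAT_MMP 0x0100u /* filesystem uses MMP */
#define SKETCH_FLAG_RW              0x01u   /* open for writing */
#define SKETCH_FLAG_SKIP_MMP        0x02u   /* caller asked to bypass MMP */

/* Hypothetical helper: decide whether an open should run the MMP check.
 * The check runs only if the filesystem has the MMP feature, the caller
 * did not ask to skip it, and the open is read/write. */
bool mmp_check_needed(uint32_t feature_incompat, uint32_t open_flags)
{
    if (!(feature_incompat & SKETCH_FEATURE_INCOMPAT_MMP))
        return false;           /* feature not enabled on this filesystem */
    if (open_flags & SKETCH_FLAG_SKIP_MMP)
        return false;           /* e.g. tune2fs -f, per the mail above */
    if (!(open_flags & SKETCH_FLAG_RW))
        return false;           /* read-only opens bypass the check */
    return true;
}
```

Under these assumptions a read-only open of a mounted MMP filesystem, which is the tune2fs case Ted raises, never triggers the check, while a plain read/write open does.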
[GIT] Please pull ext4 bug fixes
Hi Linus,

Please pull from the for_linus branch at:

git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git for_linus

It contains the following fixes, which (except for the superblock field reservations) have all been in the -mm tree for quite a while.

Thanks, regards,

- Ted

 fs/ext4/balloc.c                |    6 -
 fs/ext4/extents.c               |  148 +++-
 fs/ext4/inode.c                 |    4 -
 fs/ext4/namei.c                 |    4 -
 fs/ext4/super.c                 |    2
 include/linux/ext4_fs.h         |   33 +---
 include/linux/ext4_fs_extents.h |    5 -
 include/linux/ext4_fs_i.h       |    6 -
 8 files changed, 134 insertions(+), 74 deletions(-)

Alex Tomas (1):
      When ext4_ext_insert_extent() fails to insert new blocks

Amit Arora (1):
      ext4: Extent overlap bugfix

Dave Kleikamp (1):
      EXT4: Fix whitespace

Mingming Cao (1):
      Remove unnecessary exported symbols.

Theodore Ts'o (1):
      Define/reserve new ext4 superblock fields
Re: [RFC] store RAID stride in superblock
On May 31, 2007 17:33 -0400, Theodore Tso wrote:

Oops, I just pushed a set of bugfixes to Linus that included the superblock field reservations. Oh well. What is in the e2fsprogs hg repository ... is:

	..
	__u16	s_raid_stride;		/* RAID stride */
	__u16	s_mmp_interval;		/* # seconds to wait in MMP checking */
	__u64	s_mmp_block;		/* Block for multi-mount protection */
	__u32	s_raid_stripe_width;	/* blocks on all data disks (N*stride)*/
	__u32	s_reserved[163];	/* Padding to the end of the block */
};

We're updating our patches to be based on the new HG code.

One question which does come to mind: is there any reason why we might want to know the RAID level and/or the number of disks (as opposed to just the stripe width)?

Not so far. The raid_stride is for bitmap placement (and could also be used for alignment of random IOs to avoid making 2 disks busy when 1 would do). The raid_stripe_width is the amount that delalloc+mballoc will use for allocations+writes to avoid read-modify-write of RAID stripes. It doesn't really matter what the RAID level is.

And has anyone investigated whether there are magic ioctls or libdevmapper APIs so we can get the RAID parameters automatically? If so, patches so that mke2fs can get the information automatically (as opposed to forcing the user to have to specify lots of annoying options) would be most welcome.

For now we will specify this via mke2fs, or via tune2fs for existing filesystems. The XFS folks mentioned they have a library to extract this info for Linux devices (e.g. DM, MD, etc.), but of course that still won't work for e.g. external RAID devices.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
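As a rough illustration of how the two fields relate, the sketch below computes a stripe width from a stride using the "N*stride" comment in the struct, and shows the kind of alignment arithmetic an allocator could use to avoid read-modify-write of partial stripes. All three helper names are hypothetical and invented for this example; they are not e2fsprogs or kernel functions, and the actual mballoc logic is not shown in this thread.

```c
#include <stdint.h>

/* Hypothetical helper: stripe width in filesystem blocks for an array with
 * `data_disks` data-bearing disks, following the "N*stride" comment. */
uint32_t stripe_width_blocks(uint16_t stride, uint32_t data_disks)
{
    return (uint32_t)stride * data_disks;
}

/* Hypothetical helper: round a block number up to the next stripe boundary.
 * A stripe width of 0 is treated as "not set" and leaves the block as is. */
uint64_t align_to_stripe(uint64_t block, uint32_t stripe_width)
{
    uint64_t rem;

    if (stripe_width == 0)
        return block;
    rem = block % stripe_width;
    return rem ? block + (stripe_width - rem) : block;
}

/* Hypothetical helper: does the range [start, start + len) cover only whole
 * stripes?  Such a write needs no read-modify-write on a parity RAID. */
int is_full_stripe_write(uint64_t start, uint64_t len, uint32_t stripe_width)
{
    if (stripe_width == 0 || len == 0)
        return 0;
    return start % stripe_width == 0 && len % stripe_width == 0;
}
```

For example, with a stride of 16 blocks and 4 data disks the stripe width comes out as 64 blocks, and an allocation that would otherwise start at block 65 would be moved up to block 128. Note that only the stripe width enters the alignment math, which matches Andreas's point that the RAID level itself does not matter here.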
Fw: ext3 dir_index causes an error
Ted is dir_index maintainer ;)

That's a nice-looking bug report, btw. Thanks.

Begin forwarded message:

Date: Fri, 01 Jun 2007 13:01:07 +0900
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: ext3 dir_index causes an error

Hello,

First of all, I really appreciate your great work. Now I've found a problem around the dir_index feature. Here is a report following linux/REPORTING-BUGS.

[1.] One line summary of the problem:

ext3 dir_index causes an error

[2.] Full description of the problem/report:

This is my local test program to reproduce this problem. The readdir1.c calls creat(2), opendir(3) and readdir(3). The shell script executes it repeatedly with a brand-new ext3fs image on a loopback device. When the script adds '-O dir_index' to mkfs, some errors appear.

On a system with linux-2.6.21.3, ext3fs produces these error messages, and the filesystem seems to be corrupted.

--
kjournald starting. Commit interval 5 seconds
EXT3 FS on loop0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
:::
EXT3-fs: mounted filesystem with ordered data mode.
EXT3-fs error (device loop0): htree_dirblock_to_tree: bad entry in directory #2: rec_len is too small for name_len - offset=6924, inode=26, rec_len=244, name_len=249
EXT3-fs error (device loop0): htree_dirblock_to_tree: bad entry in directory #2: rec_len is too small for name_len - offset=6924, inode=26, rec_len=244, name_len=249
EXT3-fs error (device loop0): htree_dirblock_to_tree: bad entry in directory #2: rec_len is too small for name_len - offset=6924, inode=26, rec_len=244, name_len=249
kjournald starting. Commit interval 5 seconds
:::
--

On the other system with linux-2.6.18 (debian etch kernel), the same error appears. When the script adds '-O ^dir_index' to mkfs, the problem never appears.

These errors do not appear every time, which is why the shell script executes the readdir1 test program repeatedly.
Recently I upgraded my debian system from version 3.1 'sarge' to 4.0 'etch'. The debian etch sets the dir_index feature by default. So I found this problem.

[3.] Keywords (i.e., modules, networking, kernel):

ext3 dir_index

[4.] Kernel information
[4.1.] Kernel version (from /proc/version):
[4.2.] Kernel .config file:
[5.] Most recent kernel version which did not have the bug:
[6.] Output of Oops.. message (if applicable) with symbolic information resolved (see Documentation/oops-tracing.txt)
[7.] A small shell script or example program which triggers the problem (if possible)

(readdir1.c)
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>
#include <string.h>
#include <errno.h>

void fin(char *s)
{
	perror(s);
	exit(1);
}

void msg(int found, char *fname)
{
	printf("%s%s found\n", fname, found ? "" : " not");
}

int main(int argc, char *argv[])
{
	DIR *dp;
	struct dirent *de;
	int err, found, i;
	char a[250];

	err = chdir(argv[1]);
	if (err)
		fin("chdir");

	/* 249-character file names, matching name_len=249 in the log */
	memset(a, 'a', sizeof(a)-1);
	a[sizeof(a)-1] = 0;
	for (i = 0; i < 16+1; i++) {
		a[0]++;
		err = creat(a, 0644);
		if (err < 0)
			fin("creat");
		err = creat(argv[2], 0644);
		if (err < 0)
			fin("creat");
	}
#if 0
	err = unlink(argv[2]);
	if (err && errno != ENOENT)
		fin("unlink");
#endif

	dp = opendir(".");
	if (!dp)
		fin("opendir");
	de = readdir(dp);
	if (!de)
		fin("1st readdir");
	assert(strcmp(argv[2], de->d_name));

#if 0
	argv[2][0]++;
	err = creat(argv[2], 0644);
	if (err < 0)
		fin("creat");
	argv[2][0]--;
#endif
	err = creat(argv[2], 0644);
	if (err < 0)
		fin("creat");
#if 0
	err = unlink(argv[2]);
	if (err && errno != ENOENT)
		fin("unlink");
#endif

	/* continue reading the directory stream that is already open */
	found = 0;
	while ((de = readdir(dp)) && !found)
		found = !strcmp(argv[2], de->d_name);
	msg(found, argv[2]);

	/* rewind and scan again */
	found = 0;
	rewinddir(dp);
	while ((de = readdir(dp)) && !found)
		found = !strcmp(argv[2], de->d_name);
	msg(found, argv[2]);

	/* reopen the directory and scan once more */
	closedir(dp);
	dp = opendir(".");
	if (!dp)
		fin("opendir");
	found = 0;
	while ((de = readdir(dp)) && !found)
		found = !strcmp(argv[2], de->d_name);
	msg(found, argv[2]);

	return 0;
}

--
#!/bin/sh

img=rw.img
dir=rw

set -e
make /tmp/readdir1
cd /dev/shm
dd if=/dev/zero of=$img bs=1k count=4k 2