Re: inquiry on btrfs send/receive
On Mon, Jun 4, 2012 at 7:33 PM, Alex Lyakas alex.bolshoy.bt...@gmail.com wrote: Yes, I also noticed that sometimes transid gets bumped up, but there is no actual change. So let's say you identify that a particular part of the tree is not shared anymore, and, say, eventually you get to a particular leaf within that part of the tree. How would you detect that, say, a particular INODE_ITEM (or, more difficult, an INODE_REF) is missing from that leaf WRT to previous tree? The property you described perhaps suggests another method to find leafs, in which *something* has changed. (Although within a leaf, does it make sense to decode all items and to look at their transid - those that have them - to filter out even more?) And yes, perhaps, looking at transid alone will bring more such potential leafs into consideration. However, how does this property help to determine *what* actually has changed between the two trees? Like, for example, being able to tell over which range of keys there possibly was a change, and iterate within that range? When doing incremental sends, we always have two trees at hand. One where we know that it is already on the receiving side (we did already send it) and the one that we want to send now. To find the changes, we simply compare those trees. If an items misses on one tree, we know it's either new or deleted (depending on the tree the item lies in). I would suggest you to not put too much work into send. As already said, the btrfs send/receive patches are close to be posted to the mailing list. It's currently reviewed and when I get a looks good now, I'll post it. Alex. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/8] btrfs: Use printk_get_level and printk_skip_level, add __printf, fix fallout
Use the generic printk_get_level function to search a message for a kern_level. Add __printf to verify format and arguments. Fix a few messages that had mismatches in format and arguments. Add #ifdef CONFIG_PRINTK blocks to shrink the object size a bit when not using printk. Signed-off-by: Joe Perches j...@perches.com --- fs/btrfs/ctree.h | 13 + fs/btrfs/disk-io.c|2 +- fs/btrfs/relocation.c |2 +- fs/btrfs/super.c | 41 +++-- 4 files changed, 50 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 0236d03..590b519 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3046,10 +3046,22 @@ ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size); /* super.c */ int btrfs_parse_options(struct btrfs_root *root, char *options); int btrfs_sync_fs(struct super_block *sb, int wait); + +#ifdef CONFIG_PRINTK +__printf(2, 3) void btrfs_printk(struct btrfs_fs_info *fs_info, const char *fmt, ...); +#else +static inline __printf(2, 3) +void btrfs_printk(struct btrfs_fs_info *fs_info, const char *fmt, ...) +{ +} +#endif + +__printf(5, 6) void __btrfs_std_error(struct btrfs_fs_info *fs_info, const char *function, unsigned int line, int errno, const char *fmt, ...); + void __btrfs_abort_transaction(struct btrfs_trans_handle *trans, struct btrfs_root *root, const char *function, unsigned int line, int errno); @@ -3073,6 +3085,7 @@ do { \ (errno), fmt, ##args);\ } while (0) +__printf(5, 6) void __btrfs_panic(struct btrfs_fs_info *fs_info, const char *function, unsigned int line, int errno, const char *fmt, ...); diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 7ae51de..f02bba5 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1113,7 +1113,7 @@ void clean_tree_block(struct btrfs_trans_handle *trans, struct btrfs_root *root, spin_unlock(root-fs_info-delalloc_lock); btrfs_panic(root-fs_info, -EOVERFLOW, Can't clear %lu bytes from - dirty_mdatadata_bytes (%lu), + dirty_mdatadata_bytes (%llu), buf-len, root-fs_info-dirty_metadata_bytes); } diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 646ee21..790f492 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -1242,7 +1242,7 @@ static int __must_check __add_reloc_root(struct btrfs_root *root) kfree(node); btrfs_panic(root-fs_info, -EEXIST, Duplicate root found for start=%llu while inserting into relocation - tree\n); + tree\n, node-bytenr); } list_add_tail(root-root_list, rc-reloc_roots); diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 96eb9fe..a84529d 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -124,6 +124,7 @@ static void btrfs_handle_error(struct btrfs_fs_info *fs_info) } } +#ifdef CONFIG_PRINTK /* * __btrfs_std_error decodes expected errors from the caller and * invokes the approciate error response. @@ -166,7 +167,7 @@ void __btrfs_std_error(struct btrfs_fs_info *fs_info, const char *function, va_end(args); } -const char *logtypes[] = { +static const char *logtypes[] = { emergency, alert, critical, @@ -184,22 +185,50 @@ void btrfs_printk(struct btrfs_fs_info *fs_info, const char *fmt, ...) struct va_format vaf; va_list args; const char *type = logtypes[4]; + int kern_level; va_start(args, fmt); - if (fmt[0] == '' isdigit(fmt[1]) fmt[2] == '') { - memcpy(lvl, fmt, 3); - lvl[3] = '\0'; - fmt += 3; - type = logtypes[fmt[1] - '0']; + kern_level = printk_get_level(fmt); + if (kern_level) { + size_t size = printk_skip_level(fmt) - fmt; + memcpy(lvl, fmt, size); + lvl[size] = '\0'; + fmt += size; + type = logtypes[kern_level - '0']; } else *lvl = '\0'; vaf.fmt = fmt; vaf.va = args; + printk(%sBTRFS %s (device %s): %pV, lvl, type, sb-s_id, vaf); + + va_end(args); } +#else + +void __btrfs_std_error(struct btrfs_fs_info *fs_info, const char *function, + unsigned int line, int errno, const char *fmt, ...) +{ + struct super_block *sb = fs_info-sb; + + /* +* Special case: if the error is EROFS, and we're already +
[PATCH 0/8] Rework KERN_LEVEL
KERN_LEVEL currently takes up 3 bytes. Shrink the kernel size by using an ASCII SOH and then the level byte. Remove the need for KERN_CONT. Convert directly embedded uses of . to KERN_LEVEL Joe Perches (8): printk: Add generic functions to find KERN_LEVEL headers printk: Add kern_levels.h to make KERN_LEVEL available for asm use arch: Remove direct definitions of KERN_LEVEL uses btrfs: Use printk_get_level and printk_skip_level, add __printf, fix fallout sound: Use printk_get_level and printk_skip_level staging: wlags49_h2: Remove direct declarations of KERN_LEVEL prefixes printk: Convert the format for KERN_LEVEL to a 2 byte pattern printk: Remove the now unnecessary C annotation for KERN_CONT arch/arm/lib/io-acorn.S |3 +- arch/arm/vfp/vfphw.S |7 +++-- arch/frv/kernel/kernel_thread.S |2 +- drivers/staging/wlags49_h2/hcf.c |8 +++--- drivers/staging/wlags49_h2/wl_main.c |4 +- fs/btrfs/ctree.h | 13 ++ fs/btrfs/disk-io.c |2 +- fs/btrfs/relocation.c|2 +- fs/btrfs/super.c | 41 +- include/linux/kern_levels.h | 25 include/linux/printk.h | 41 +++-- kernel/printk.c | 14 +++ sound/core/misc.c| 13 +++--- 13 files changed, 130 insertions(+), 45 deletions(-) create mode 100644 include/linux/kern_levels.h -- 1.7.8.111.gad25c.dirty -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Help with data recovering
Am Montag, 4. Juni 2012 schrieb Hugo Mills: On Mon, Jun 04, 2012 at 12:24:05PM -0400, Maxim Mikheev wrote: I run through all potential tree roots. It gave me everytime messages like these: parent transid verify failed on 3405159735296 wanted 9096 found 5263 parent transid verify failed on 3405159735296 wanted 9096 found 5263 […] The largest recovered data is 12Kb. max@s0:~/btrfs-recovering./recovered$ ls -lahs 3728819929088 total 28K 4.0K drwxr-xr-x 3 root root 4.0K Jun 4 12:06 . […] What can I do next? I'm out of ideas. At this point, though, you're probably looking at somebody writing custom code to scan the FS and attempt to find and retrieve anything that's recoverable. You might try writing a tool to scan all the disks for useful fragments of old trees, and see if you can find some of the tree roots independently of the tree of tree roots (which clearly isn't particularly functional right now). You might try simply scanning the disks looking for your lost data, and try to reconstruct as much of it as you can from that. You could try to find a company specialising in data recovery and pay them to try to get your data back. Or you might just have to accept that the data's gone and work on reconstructing it. Only thing that comes to my mind thats still tryable without involving a data recover firm or engage a developer for an improved recovery tool is: PhotoRec from testdisk package or some other data recovery tool that looks for headers for known fileformats like I think foremost. It has some drawbacks: - AFAIK it has no means to glue back together fragmented files, so these are likely gone or truncated - filenames are lost - directory structure is lost I think it has been said, but I think its important to repeat it: BTRFS - or any other filesystem - with RAID 0 without backup is not for important production data. Not ever. Maxim, I suggest if you learn anything out of this let it be at least this. When I think about your setup, Maxim, the sentence I want to have my data destroyed comes to my mind. I would try with photorec from testdisk first. Its quite easy to use. -- Martin 'Helios' Steigerwald - http://www.Lichtvoll.de GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Help with data recovering
Am Dienstag, 5. Juni 2012 schrieb Martin Steigerwald: Am Montag, 4. Juni 2012 schrieb Hugo Mills: On Mon, Jun 04, 2012 at 12:24:05PM -0400, Maxim Mikheev wrote: I run through all potential tree roots. It gave me everytime messages like these: parent transid verify failed on 3405159735296 wanted 9096 found 5263 parent transid verify failed on 3405159735296 wanted 9096 found 5263 […] The largest recovered data is 12Kb. max@s0:~/btrfs-recovering./recovered$ ls -lahs 3728819929088 total 28K 4.0K drwxr-xr-x 3 root root 4.0K Jun 4 12:06 . […] What can I do next? I'm out of ideas. At this point, though, you're probably looking at somebody writing custom code to scan the FS and attempt to find and retrieve anything that's recoverable. You might try writing a tool to scan all the disks for useful fragments of old trees, and see if you can find some of the tree roots independently of the tree of tree roots (which clearly isn't particularly functional right now). You might try simply scanning the disks looking for your lost data, and try to reconstruct as much of it as you can from that. You could try to find a company specialising in data recovery and pay them to try to get your data back. Or you might just have to accept that the data's gone and work on reconstructing it. Only thing that comes to my mind thats still tryable without involving a data recover firm or engage a developer for an improved recovery tool is: PhotoRec from testdisk package or some other data recovery tool that looks for headers for known fileformats like I think foremost. It has some drawbacks: - AFAIK it has no means to glue back together fragmented files, so these are likely gone or truncated - filenames are lost - directory structure is lost It won´t work for striped files either, so it may only help for rather small files depending on BTRFS RAID 0 stripe size. -- Martin 'Helios' Steigerwald - http://www.Lichtvoll.de GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Help with data recovering
Am Montag, 4. Juni 2012 schrieb Maxim Mikheev: --super works but my root tree 2 has many errors too. What can I do next? Have a data recovery company try to physically recover the bad harddisk to a good one and then try to mount BTRFS again and hope that your previous attempts to repair the filesystem didn´t make things worse? -- Martin 'Helios' Steigerwald - http://www.Lichtvoll.de GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [btrfs-progs] [bug][patch] Leaking file handle in scrub_fs_info()
Hi, Goffredo, I'm trying to do a new integration branch, and the patch inline in the text is corrupt (plus it has a massively verbose commit message): Applying: Leaking file handle in scrub_fs_info() fatal: corrupt patch at line 74 Patch failed at 0001 Leaking file handle in scrub_fs_info() Normally, I'd use your git repo to get round the fact that every single patch you send by mail is whitespace-damaged, but: error: Unable to find 337bfb3250203a66b953ffcf7804418cb7c72052 under http://cassiopea.homelinux.net/git/btrfs-progs-unstable.git Cannot obtain needed commit 337bfb3250203a66b953ffcf7804418cb7c72052 while processing commit ee3832b47a4b1d1dce8d1cacc802db32901b8c45. error: Fetch failed. Plus Jan had some suggested tweaks regarding whitespace -- could you re-roll the patch and maybe see what you can do to fix your git repo? (Or at least use git-send-email to send the patch so it doesn't get mangled). Thanks, Hugo. On Tue, Apr 24, 2012 at 08:43:23PM +0200, Goffredo Baroncelli wrote: I was giving a look to the function scrub_fs_info( ), and to me it seems that could be a potential file handle leaking problem. In fact: static int scrub_fs_info(int fd, char *path, struct btrfs_ioctl_fs_info_args *fi_args, struct btrfs_ioctl_dev_info_args **di_ret) { [...] ret = ioctl(fd, BTRFS_IOC_FS_INFO, fi_args); if (ret errno == EINVAL) { /* path is no mounted btrfs. try if it's a device */ [...] close(fd);--- Here the file handle is closed fd = open_file_or_dir(mp);--- then it is re-opened if (fd 0) return -errno; } else if (ret) { return -errno; } [...] But in the rest of the function: a) the file handle is not closed b) the (new) file handle isn't returned The function scrub_fs_info() is called from the functions 1) cmd_scrub_status(), which doesn't use the file handle after the call to the cmd_scrub_status() [except for a close()]. So no problem at all. 2) scrub_start(), which uses the file handle after the call to the cmd_scrub_status() functions. My suggestions is to change scrub_fs_info() to accept only the path. Then it open (and closes) its own (and private) the file descriptor. Instead scrub_start(), opens a file descriptor after the call to the scrub_fs_info() function. What do you think ? BR G.Baroncelli You can pull the patch below from http://cassiopea.homelinux.net/git/btrfs-progs-unstable.git branch fd-leaking - diff --git a/cmds-scrub.c b/cmds-scrub.c index c4503f4..486768c 100644 --- a/cmds-scrub.c +++ b/cmds-scrub.c @@ -979,19 +979,26 @@ static int scrub_device_info(int fd, u64 devid, return ret ? -errno : 0; } -static int scrub_fs_info(int fd, char *path, +static int scrub_fs_info( char *path, struct btrfs_ioctl_fs_info_args *fi_args, struct btrfs_ioctl_dev_info_args **di_ret) { int ret = 0; int ndevs = 0; int i = 1; + int fd; struct btrfs_fs_devices *fs_devices_mnt = NULL; struct btrfs_ioctl_dev_info_args *di_args; char mp[BTRFS_PATH_NAME_MAX + 1]; memset(fi_args, 0, sizeof(*fi_args)); + fd = open_file_or_dir(path); + if (fd 0) { + fprintf(stderr, ERROR: can't access to '%s'\n, path); + return -1; + } + ret = ioctl(fd, BTRFS_IOC_FS_INFO, fi_args); if (ret errno == EINVAL) { /* path is no mounted btrfs. try if it's a device */ @@ -1010,28 +1017,36 @@ static int scrub_fs_info(int fd, char *path, if (fd 0) return -errno; } else if (ret) { + close(fd); return -errno; } - if (!fi_args-num_devices) + if (!fi_args-num_devices){ + close(fd); return 0; + } di_args = *di_ret = malloc(fi_args-num_devices * sizeof(*di_args)); - if (!di_args) + if (!di_args){ + close(fd); return -errno; + } for (; i = fi_args-max_id; ++i) { BUG_ON(ndevs = fi_args-num_devices); ret = scrub_device_info(fd, i, di_args[ndevs]); if (ret == -ENODEV) continue; - if (ret) + if (ret){ + close(fd); return ret; + } ++ndevs; } BUG_ON(ndevs == 0); + close(fd); return 0; } @@ -1155,7 +1170,7 @@ static int scrub_start(int argc, char **argv, int resume) return 12;
Re: Help with data recovering
Hallo, Martin, Du meintest am 05.06.12: --super works but my root tree 2 has many errors too. What can I do next? Have a data recovery company try to physically recover the bad harddisk to a good one About 1 year ago I asked Kroll-Ontrack. They told me they couldn't (yet) recover btrfs. Maybe not only time has changed ... Viele Gruesse! Helmut -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Recovering a file from a snapshot without duplicating its blocks (--reflink across devs?)
On Tue, Jun 05, 2012 at 01:42:50PM +1000, Chris Samuel wrote: On 05/06/12 13:01, Marc MERLIN wrote: First I though, I sure would be nice if I could take btrfs to reference the same blocks from the snapshot to my current image. But, --reflink failed across devices nodes, so I was forced to copy/duplicate the blocks (36GB). Patches for this were posted over a year ago, but it was NAK'd by Christoph Hellwig. I don't know if it's got any further since then. :-( Patch description: http://www.spinics.net/lists/linux-btrfs/msg09226.html NAK: http://www.spinics.net/lists/linux-btrfs/msg09229.html Thanks for that. So, I'm not sure how common my use case is, but obviously for cases were recovering a huge file is important, like disk images, being able to re-link blocks without copying would be fantastic. So here's my vote for that. Thanks, Marc -- A mouse is a device used to point at the xterm you want to type in - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [systemd-devel] systemd-udevd: excessive I/O usage
[resend, for some reason kmail formatted the previous message with html] On Martes, 5 de junio de 2012 09:50:56 Alexander E. Patrakov escribió: Result: boot from ext4 takes less than 15 seconds, while boot from btrfs takes 9 minutes (or 5 minutes if I disable readahead - the data file is not valid anyway on btrfs). I also had this problem. It turns out that btrfs was creating the space cache from scratch (which takes several minutes) on each boot, for some reason. I added the space_cache mount option to fstab, and now my system boots fast. I suggest trying it. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
delete disk proceedure
Good morning btrfs list, I had written about 2 weeks ago about using extra btrfs space in an nfs file system setup. Nfs seems to export the files but the mounts don't work on older machines without btrfs kernels. So I am down to deleting several drives from btrfs to setup a standard raid 1 array for storage and export via nfs. My system stats are: [root@advanced ~]# df -h FilesystemSize Used Avail Use% Mounted on /dev/sdm2 196G 50G 137G 27% / tmpfs 16G 0 16G 0% /dev/shm /dev/sdm1 2.0G 141M 1.8G 8% /boot /dev/sdm5 1.2T 20G 1.1T 2% /var 10.2.0.40:/data/sites 2.6T 2.4T 155G 94% /nfs1/data/sites 10.2.0.42:/data/sites 2.6T 2.2T 328G 87% /nfs2/data/sites /dev/sda 11T 4.9T 6.0T 46% /btrfs [root@advanced ~]# btrfs fi show failed to read /dev/sr0 Label: none uuid: c21f1221-a224-4ba4-92e5-cdea0fa6d0f9 Total devices 12 FS bytes used 4.76TB devid6 size 930.99GB used 429.32GB path /dev/sdf devid5 size 930.99GB used 429.32GB path /dev/sde devid8 size 930.99GB used 429.32GB path /dev/sdh devid9 size 930.99GB used 429.32GB path /dev/sdi devid4 size 930.99GB used 429.32GB path /dev/sdd devid3 size 930.99GB used 429.32GB path /dev/sdc devid 11 size 930.99GB used 429.08GB path /dev/sdk devid2 size 930.99GB used 429.32GB path /dev/sdb devid 10 size 930.99GB used 429.32GB path /dev/sdj devid 12 size 930.99GB used 429.33GB path /dev/sdl devid7 size 930.99GB used 429.32GB path /dev/sdg devid1 size 930.99GB used 429.09GB path /dev/sda Btrfs v0.19-35-g1b444cd df -h and btrfs fi show seem to be in good size agreement. Btrfs was created as raid1 metadata and raid0 data. I would like to delete the last 4 drives leaving 7T of space to hold 4.9T of data. My plan would be to remove /dev/sdi, j, k, l one at a time. After all are deleted run btrfs fi balance /btrfs. The data is not critical and can be lost but I am really trying to avoid the hassle of having to completely redo the filesystem. Does my deletion plan seem reasonable. Please, I'm really swimming alone here and would value some advice. Jim Maloney -- -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [systemd-devel] systemd-udevd: excessive I/O usage
2012/6/5 Diego Calleja dieg...@gmail.com: [resend, for some reason kmail formatted the previous message with html] On Martes, 5 de junio de 2012 09:50:56 Alexander E. Patrakov escribió: Result: boot from ext4 takes less than 15 seconds, while boot from btrfs takes 9 minutes (or 5 minutes if I disable readahead - the data file is not valid anyway on btrfs). I also had this problem. It turns out that btrfs was creating the space cache from scratch (which takes several minutes) on each boot, for some reason. I added the space_cache mount option to fstab, and now my system boots fast. I suggest trying it. It helped me, too - but ext4 is still faster under the typical startup under systemd load type. The difference manifests itself as GDM login screen sometimes timing out some components of its fancy version (due to something resembling kernel bug 12309) and falling back to the simple non-gnome-shell version. Sorry for hijacking the thread, but the amount of parallelization achieved by systemd is way too much for a rotating drive from 2007, especially since some system components like gdm have aggressive timeouts easily triggered by disk i/o. -- Alexander E. Patrakov -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delete disk proceedure
On Tue, Jun 05, 2012 at 10:38:11AM -0400, Jim wrote: Good morning btrfs list, I had written about 2 weeks ago about using extra btrfs space in an nfs file system setup. Nfs seems to export the files but the mounts don't work on older machines without btrfs kernels. The mounts don't work -- can you be more specific here? It would seem that if we can get to the bottom of that problem, you won't have to muck around with your current set-up at all. Hugo. -- === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk === PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk --- I am an opera lover from planet Zog. Take me to your lieder. --- signature.asc Description: Digital signature
PATCH: btrfs defrag ioctl, override extent count and size checks compression enabled.
I noticed a few days ago that btrfs fi defrag -cXXX can not be used to compress files unless they are fragmented. The attached patch passes the compression flag to should_defrag_range, where it disables the adjacent-extent and extent size checks if set. The inline/sparse extent check is not modified - I assumed it would not be useful to compress inline extents. -- Andrew Mahone andrew DOT mahone AT gmail DOT com\ btrfs_defrag_compress.patch Description: Binary data
Re: delete disk proceedure
Hallo, Jim, Du meintest am 05.06.12: /dev/sda 11T 4.9T 6.0T 46% /btrfs [root@advanced ~]# btrfs fi show failed to read /dev/sr0 Label: none uuid: c21f1221-a224-4ba4-92e5-cdea0fa6d0f9 Total devices 12 FS bytes used 4.76TB devid6 size 930.99GB used 429.32GB path /dev/sdf devid5 size 930.99GB used 429.32GB path /dev/sde devid8 size 930.99GB used 429.32GB path /dev/sdh devid9 size 930.99GB used 429.32GB path /dev/sdi devid4 size 930.99GB used 429.32GB path /dev/sdd devid3 size 930.99GB used 429.32GB path /dev/sdc devid 11 size 930.99GB used 429.08GB path /dev/sdk devid2 size 930.99GB used 429.32GB path /dev/sdb devid 10 size 930.99GB used 429.32GB path /dev/sdj devid 12 size 930.99GB used 429.33GB path /dev/sdl devid7 size 930.99GB used 429.32GB path /dev/sdg devid1 size 930.99GB used 429.09GB path /dev/sda Btrfs v0.19-35-g1b444cd df -h and btrfs fi show seem to be in good size agreement. Btrfs was created as raid1 metadata and raid0 data. I would like to delete the last 4 drives leaving 7T of space to hold 4.9T of data. My plan would be to remove /dev/sdi, j, k, l one at a time. After all are deleted run btrfs fi balance /btrfs. I'd prefer btrfs device delete /dev/sdi btrfs filesystem balance /btrfs btrfs device delete /dev/sdj btrfs filesystem balance /btrfs etc. - after every delete its balance run. That may take a lot of hours - I use the last lines of dmesg to extrapolate the needed time (btrfs produces a message about every minute). And you can't use the console from where you have started the balance command. Therefore I wrap this command: echo 'btrfs filesystem balance /btrfs' | at now Viele Gruesse! Helmut -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delete disk proceedure
[sorry for the resend, signature again] I am waiting for a window (later tonight) when I can try mounting the btrfs export. Am I reading you guys correctly, that you think I should be deleting drives from the array? Or is this a just in case? Thanks. Jim Maloney On 06/05/2012 01:04 PM, Hugo Mills wrote: On Tue, Jun 05, 2012 at 06:19:00PM +0200, Helmut Hullen wrote: Hallo, Jim, Du meintest am 05.06.12: /dev/sda 11T 4.9T 6.0T 46% /btrfs [root@advanced ~]# btrfs fi show failed to read /dev/sr0 Label: none uuid: c21f1221-a224-4ba4-92e5-cdea0fa6d0f9 Total devices 12 FS bytes used 4.76TB devid6 size 930.99GB used 429.32GB path /dev/sdf devid5 size 930.99GB used 429.32GB path /dev/sde devid8 size 930.99GB used 429.32GB path /dev/sdh devid9 size 930.99GB used 429.32GB path /dev/sdi devid4 size 930.99GB used 429.32GB path /dev/sdd devid3 size 930.99GB used 429.32GB path /dev/sdc devid 11 size 930.99GB used 429.08GB path /dev/sdk devid2 size 930.99GB used 429.32GB path /dev/sdb devid 10 size 930.99GB used 429.32GB path /dev/sdj devid 12 size 930.99GB used 429.33GB path /dev/sdl devid7 size 930.99GB used 429.32GB path /dev/sdg devid1 size 930.99GB used 429.09GB path /dev/sda Btrfs v0.19-35-g1b444cd df -h and btrfs fi show seem to be in good size agreement. Btrfs was created as raid1 metadata and raid0 data. I would like to delete the last 4 drives leaving 7T of space to hold 4.9T of data. My plan would be to remove /dev/sdi, j, k, l one at a time. After all are deleted run btrfs fi balance /btrfs. I'd prefer btrfs device delete /dev/sdi btrfs filesystem balance /btrfs btrfs device delete /dev/sdj btrfs filesystem balance /btrfs etc. - after every delete its balance run. That's not necessary. Delete will move the blocks from the device being removed into spare space on the other devices. The balance is unnecessary. (In fact, delete and balance share quite a lot of code) That may take a lot of hours - I use the last lines of dmesg to extrapolate the needed time (btrfs produces a message about every minute). And you can't use the console from where you have started the balance command. Therefore I wrap this command: echo 'btrfs filesystem balance /btrfs' | at now ... or just put it into the background with btrfs bal start /mountpoint. You know, like everyone else does. :) Hugo. -- -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: delete disk proceedure
On Tue, Jun 05, 2012 at 01:12:17PM -0400, Jim wrote: [sorry for the resend, signature again] I am waiting for a window (later tonight) when I can try mounting the btrfs export. Am I reading you guys correctly, that you think I should be deleting drives from the array? Or is this a just in case? Thanks. Try the modified exports as I suggested in the other part of the thread first. If that turns out to be problematic still, then we can discuss any migration strategies. Hugo. Jim Maloney On 06/05/2012 01:04 PM, Hugo Mills wrote: On Tue, Jun 05, 2012 at 06:19:00PM +0200, Helmut Hullen wrote: Hallo, Jim, Du meintest am 05.06.12: /dev/sda 11T 4.9T 6.0T 46% /btrfs [root@advanced ~]# btrfs fi show failed to read /dev/sr0 Label: none uuid: c21f1221-a224-4ba4-92e5-cdea0fa6d0f9 Total devices 12 FS bytes used 4.76TB devid6 size 930.99GB used 429.32GB path /dev/sdf devid5 size 930.99GB used 429.32GB path /dev/sde devid8 size 930.99GB used 429.32GB path /dev/sdh devid9 size 930.99GB used 429.32GB path /dev/sdi devid4 size 930.99GB used 429.32GB path /dev/sdd devid3 size 930.99GB used 429.32GB path /dev/sdc devid 11 size 930.99GB used 429.08GB path /dev/sdk devid2 size 930.99GB used 429.32GB path /dev/sdb devid 10 size 930.99GB used 429.32GB path /dev/sdj devid 12 size 930.99GB used 429.33GB path /dev/sdl devid7 size 930.99GB used 429.32GB path /dev/sdg devid1 size 930.99GB used 429.09GB path /dev/sda Btrfs v0.19-35-g1b444cd df -h and btrfs fi show seem to be in good size agreement. Btrfs was created as raid1 metadata and raid0 data. I would like to delete the last 4 drives leaving 7T of space to hold 4.9T of data. My plan would be to remove /dev/sdi, j, k, l one at a time. After all are deleted run btrfs fi balance /btrfs. I'd prefer btrfs device delete /dev/sdi btrfs filesystem balance /btrfs btrfs device delete /dev/sdj btrfs filesystem balance /btrfs etc. - after every delete its balance run. That's not necessary. Delete will move the blocks from the device being removed into spare space on the other devices. The balance is unnecessary. (In fact, delete and balance share quite a lot of code) That may take a lot of hours - I use the last lines of dmesg to extrapolate the needed time (btrfs produces a message about every minute). And you can't use the console from where you have started the balance command. Therefore I wrap this command: echo 'btrfs filesystem balance /btrfs' | at now ... or just put it into the background with btrfs bal start /mountpoint. You know, like everyone else does. :) Hugo. -- === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk === PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk --- Quantum est ille canis in fenestra? --- signature.asc Description: Digital signature
Re: [systemd-devel] systemd-udevd: excessive I/O usage
On Mon, Jun 4, 2012 at 8:50 PM, Alexander E. Patrakov patra...@gmail.com wrote: 2012/6/5 Kok, Auke-jan H auke-jan.h@intel.com wrote on systemd-devel list: It seems your system is taking well into 15+ seconds before btrfs is actually *ready* on your system, which seems to be the main hiccup (note, speculation here). I've personally become a bit displeased with btrfs performance recently myself, so, I'm wondering if you should try ext4 for now. Other than that, after btrfs/udev finally pops to life, things seem to start relatively quickly. I think btrfs is to blame here, because I think my system started to be affected by this problem after ext4 - btrfs conversion. I recently changed my ext4-on-lvm gentoo system at home by defragmenting the LVM (http://bisqwit.iki.fi/source/lvm2defrag.html), converting the biggest (200 GB, 80% used, mostly video files, git trees and SVN checkouts of various projects) logical volume to btrfs, making the backup of metadata, deleting the LVM partition and creating ordinary partitions in places that were occupied by LVM volumes before according to the backup. It worked. Then I made a btrfs subvolume and transferred the contents of the former root partition there using tar. So now I have two copies of my root filesystem - one on ext4 and one on btrfs. I recreated an initramfs for each of them using dracut. Result: boot from ext4 takes less than 15 seconds, while boot from btrfs takes 9 minutes (or 5 minutes if I disable readahead - the data file is not valid anyway on btrfs). One problem is btrfsck in the dracut-created initramfs - it fires every time (with btrfs mounted read-only?). The other problem is the btrfs-cache-1 kernel thread - I was told on #btrfs that it is a one-time thing, but apparently it wants to do its caching every boot due to some breakage. During the boot, there are also warnings about hung tasks with some locks held. I am attaching a dmesg file illustrating all of the problems mentioned above. I've had the same (bad) experiences since 3.3. The one time thing is creating the free space cache. On my home systems, with 3.4.x, it's still creating them *every* boot, which certainly accounts for IO busy, which on a sluggish spinning rust is disastrous, to say the least. Hung tasks in btrfs have been present since it was merged. Remember that they're only a warning - eventially almost always they will unhang. But still ver frustrating. I'm currently dropping btrfs from my home development system because I've spent too much time in the last month trying to get my btrfs volumes back up after a kernel upgrade. So, I feel your pain. Auke -- Alexander E. Patrakov -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [btrfs-progs] [bug][patch V2] Leaking file handle in scrub_fs_info()
Hi Hugo, I was not able to reproduce your error with my repository: I pulled it without problem. However, enclosed you can find the patch (as attachment) with the Jan suggestions about the spaces. You can pull the patch also from my repo ( same branch: fd-leaking). Moreover I inserted the patch as inlined. Hoping that helps. commit b73abc9465d3a105b5f8f464f2867543a638e5e2 Author: Goffredo Baroncelli kreij...@inwind.it Date: Tue Jun 5 19:11:06 2012 +0200 scrub_fs_info( ) file handle leaking The function scrub_fs_info( ) closes and reopen a file handle passed as argument, when a caller uses the file handle even after the call. The function scrub_fs_info( ) is updated to remove the file handle argument, and instead uses a private own file handle. The callers are updated to not pass the argument. diff --git a/cmds-scrub.c b/cmds-scrub.c index c4503f4..6313e02 100644 --- a/cmds-scrub.c +++ b/cmds-scrub.c @@ -979,19 +979,26 @@ static int scrub_device_info(int fd, u64 devid, return ret ? -errno : 0; } -static int scrub_fs_info(int fd, char *path, +static int scrub_fs_info(char *path, struct btrfs_ioctl_fs_info_args *fi_args, struct btrfs_ioctl_dev_info_args **di_ret) { int ret = 0; int ndevs = 0; int i = 1; + int fd; struct btrfs_fs_devices *fs_devices_mnt = NULL; struct btrfs_ioctl_dev_info_args *di_args; char mp[BTRFS_PATH_NAME_MAX + 1]; memset(fi_args, 0, sizeof(*fi_args)); + fd = open_file_or_dir(path); + if (fd 0) { + fprintf(stderr, ERROR: can't access to '%s'\n, path); + return -1; + } + ret = ioctl(fd, BTRFS_IOC_FS_INFO, fi_args); if (ret errno == EINVAL) { /* path is no mounted btrfs. try if it's a device */ @@ -1010,28 +1017,36 @@ static int scrub_fs_info(int fd, char *path, if (fd 0) return -errno; } else if (ret) { + close(fd); return -errno; } - if (!fi_args-num_devices) + if (!fi_args-num_devices) { + close(fd); return 0; + } di_args = *di_ret = malloc(fi_args-num_devices * sizeof(*di_args)); - if (!di_args) + if (!di_args) { + close(fd); return -errno; + } for (; i = fi_args-max_id; ++i) { BUG_ON(ndevs = fi_args-num_devices); ret = scrub_device_info(fd, i, di_args[ndevs]); if (ret == -ENODEV) continue; - if (ret) + if (ret) { + close(fd); return ret; + } ++ndevs; } BUG_ON(ndevs == 0); + close(fd); return 0; } @@ -1155,7 +1170,7 @@ static int scrub_start(int argc, char **argv, int resume) return 12; } - ret = scrub_fs_info(fdmnt, path, fi_args, di_args); + ret = scrub_fs_info(path, fi_args, di_args); if (ret) { ERR(!do_quiet, ERROR: getting dev info for scrub failed: %s\n, strerror(-ret)); @@ -1586,7 +1601,6 @@ static int cmd_scrub_status(int argc, char **argv) .sun_family = AF_UNIX, }; int ret; - int fdmnt; int i; int print_raw = 0; int do_stats_per_dev = 0; @@ -1615,13 +1629,7 @@ static int cmd_scrub_status(int argc, char **argv) path = argv[optind]; - fdmnt = open_file_or_dir(path); - if (fdmnt 0) { - fprintf(stderr, ERROR: can't access to '%s'\n, path); - return 12; - } - - ret = scrub_fs_info(fdmnt, path, fi_args, di_args); + ret = scrub_fs_info(path, fi_args, di_args); if (ret) { fprintf(stderr, ERROR: getting dev info for scrub failed: %s\n, strerror(-ret)); @@ -1698,7 +1706,6 @@ static int cmd_scrub_status(int argc, char **argv) out: free_history(past_scrubs); free(di_args); - close(fdmnt); if (fdres -1) close(fdres); On 06/05/2012 01:01 PM, Hugo Mills wrote: Hi, Goffredo, I'm trying to do a new integration branch, and the patch inline in the text is corrupt (plus it has a massively verbose commit message): Applying: Leaking file handle in scrub_fs_info() fatal: corrupt patch at line 74 Patch failed at 0001 Leaking file handle in scrub_fs_info() Normally, I'd use your git repo to get round the fact that every single patch you send by mail is whitespace-damaged, but: error: Unable to find 337bfb3250203a66b953ffcf7804418cb7c72052 under http://cassiopea.homelinux.net/git/btrfs-progs-unstable.git Cannot obtain needed commit 337bfb3250203a66b953ffcf7804418cb7c72052 while processing commit
Re: delete disk proceedure
Hallo, Hugo, Du meintest am 05.06.12: [...] And you can't use the console from where you have started the balance command. Therefore I wrap this command: echo 'btrfs filesystem balance /btrfs' | at now ... or just put it into the background with btrfs bal start /mountpoint . You know, like everyone else does. :) I know that possibility too. My proposal puts every message (normal messages and error messages) into a mail to root (when root has started this command). Viele Gruesse! Helmut -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] Btrfs: use rcu to protect device-name
Al pointed out that we can just toss out the old name on a device and add a new one arbitrarily, so anybody who uses device-name in printk could possibly use free'd memory. Instead of adding locking around all of this he suggested doing it with RCU, so I've introduced a struct rcu_string that does just that and have gone through and protected all accesses to device-name that aren't under the uuid_mutex with rcu_read_lock(). This protects us and I will use it for dealing with removing the device that we used to mount the file system in a later patch. Thanks, Signed-off-by: Josef Bacik jo...@redhat.com --- fs/btrfs/check-integrity.c | 11 - fs/btrfs/disk-io.c | 14 +- fs/btrfs/extent_io.c |7 +++- fs/btrfs/ioctl.c | 14 +- fs/btrfs/scrub.c | 39 ++--- fs/btrfs/volumes.c | 102 fs/btrfs/volumes.h |2 +- 7 files changed, 147 insertions(+), 42 deletions(-) diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c index 9cebb1f..9f69855 100644 --- a/fs/btrfs/check-integrity.c +++ b/fs/btrfs/check-integrity.c @@ -93,6 +93,7 @@ #include print-tree.h #include locking.h #include check-integrity.h +#include rcu-string.h #define BTRFSIC_BLOCK_HASHTABLE_SIZE 0x1 #define BTRFSIC_BLOCK_LINK_HASHTABLE_SIZE 0x1 @@ -842,14 +843,20 @@ static int btrfsic_process_superblock_dev_mirror( superblock_tmp-is_iodone = 1; superblock_tmp-never_written = 0; superblock_tmp-mirror_num = 1 + superblock_mirror_num; - if (state-print_mask BTRFSIC_PRINT_MASK_SUPERBLOCK_WRITE) + if (state-print_mask BTRFSIC_PRINT_MASK_SUPERBLOCK_WRITE) { + struct rcu_string *name; + + rcu_read_lock(); + name = rcu_dereference(device-name); printk(KERN_INFO New initial S-block (bdev %p, %s) @%llu (%s/%llu/%d)\n, - superblock_bdev, device-name, + superblock_bdev, name-str, (unsigned long long)dev_bytenr, dev_state-name, (unsigned long long)dev_bytenr, superblock_mirror_num); + rcu_read_unlock(); + } list_add(superblock_tmp-all_blocks_node, state-all_blocks_list); btrfsic_block_hashtable_add(superblock_tmp, diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index e39a3b9..c3fa508 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -44,6 +44,7 @@ #include free-space-cache.h #include inode-map.h #include check-integrity.h +#include rcu-string.h static struct extent_io_ops btree_extent_io_ops; static void end_workqueue_fn(struct btrfs_work *work); @@ -2574,12 +2575,16 @@ static void btrfs_end_buffer_write_sync(struct buffer_head *bh, int uptodate) } else { struct btrfs_device *device = (struct btrfs_device *) bh-b_private; + struct rcu_string *name; + rcu_read_lock(); + name = rcu_dereference(device-name); printk_ratelimited(KERN_WARNING lost page write due to - I/O error on %s\n, device-name); + I/O error on %s\n, name-str); /* note, we dont' set_buffer_write_io_error because we have * our own ways of dealing with the IO errors */ + rcu_read_unlock(); clear_buffer_uptodate(bh); btrfs_dev_stat_inc_and_print(device, BTRFS_DEV_STAT_WRITE_ERRS); } @@ -2749,8 +2754,13 @@ static int write_dev_flush(struct btrfs_device *device, int wait) wait_for_completion(device-flush_wait); if (bio_flagged(bio, BIO_EOPNOTSUPP)) { + struct rcu_string *name; + + rcu_read_lock(); + name = rcu_dereference(device-name); printk(btrfs: disabling barriers on dev %s\n, - device-name); + name-str); + rcu_read_unlock(); device-nobarriers = 1; } if (!bio_flagged(bio, BIO_UPTODATE)) { diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 2c8f7b2..d79a815 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -20,6 +20,7 @@ #include volumes.h #include check-integrity.h #include locking.h +#include rcu-string.h static struct kmem_cache *extent_state_cache; static struct kmem_cache *extent_buffer_cache; @@ -1874,6 +1875,7 @@ int repair_io_failure(struct btrfs_mapping_tree *map_tree, u64 start, {
[PATCH 2/2] Btrfs: implement -show_devname
Because btrfs can remove the device that was mounted we need to have a -show_devname so that in this case we can print out some other device in the file system to /proc/mount. We keep track of what device we called mount() with so that we can print out the correct one if it is still available, but otherwise we just pick the first device that has the lowest device id. This was inspired (and copied in the case of btrfs_show_devname) from Miao Xie's patch. Thanks, Signed-off-by: Josef Bacik jo...@redhat.com --- fs/btrfs/super.c | 43 --- fs/btrfs/volumes.c |9 ++--- fs/btrfs/volumes.h |5 - 3 files changed, 50 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 85cef50..2f36f28 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -54,6 +54,7 @@ #include version.h #include export.h #include compression.h +#include rcu-string.h #define CREATE_TRACE_POINTS #include trace/events/btrfs.h @@ -647,7 +648,7 @@ static int btrfs_parse_early_options(const char *options, fmode_t flags, goto out; } error = btrfs_scan_one_device(device_name, - flags, holder, fs_devices); + flags, holder, fs_devices, 0); kfree(device_name); if (error) goto out; @@ -1034,7 +1035,7 @@ static struct dentry *btrfs_mount(struct file_system_type *fs_type, int flags, return root; } - error = btrfs_scan_one_device(device_name, mode, fs_type, fs_devices); + error = btrfs_scan_one_device(device_name, mode, fs_type, fs_devices, 1); if (error) return ERR_PTR(error); @@ -1448,7 +1449,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd, switch (cmd) { case BTRFS_IOC_SCAN_DEV: ret = btrfs_scan_one_device(vol-name, FMODE_READ, - btrfs_fs_type, fs_devices); + btrfs_fs_type, fs_devices, 0); break; } @@ -1472,12 +1473,48 @@ static int btrfs_unfreeze(struct super_block *sb) return 0; } +static int btrfs_show_devname(struct seq_file *m, struct dentry *root) +{ + struct btrfs_fs_info *fs_info = btrfs_sb(root-d_sb); + struct btrfs_fs_devices *cur_devices; + struct btrfs_device *dev, *first_dev = NULL; + struct list_head *head; + struct rcu_string *name; + + mutex_lock(fs_info-fs_devices-device_list_mutex); + cur_devices = fs_info-fs_devices; + while (cur_devices) { + head = cur_devices-devices; + list_for_each_entry(dev, head, dev_list) { + if (dev-mounted) { + first_dev = dev; + goto out; + } + if (!first_dev || dev-devid first_dev-devid) + first_dev = dev; + } + cur_devices = cur_devices-seed; + } +out: + if (first_dev) { + rcu_read_lock(); + name = rcu_dereference(first_dev-name); + seq_escape(m, name-str, \t\n\\); + rcu_read_unlock(); + } else { + WARN_ON(1); + } + mutex_unlock(fs_info-fs_devices-device_list_mutex); + return 0; +} + static const struct super_operations btrfs_super_ops = { .drop_inode = btrfs_drop_inode, .evict_inode= btrfs_evict_inode, .put_super = btrfs_put_super, .sync_fs= btrfs_sync_fs, .show_options = btrfs_show_options, + .show_devname = btrfs_show_devname, .write_inode= btrfs_write_inode, .alloc_inode= btrfs_alloc_inode, .destroy_inode = btrfs_destroy_inode, diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 1eaa495..5e72fea 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -331,7 +331,8 @@ static void pending_bios_fn(struct btrfs_work *work) static noinline int device_list_add(const char *path, struct btrfs_super_block *disk_super, - u64 devid, struct btrfs_fs_devices **fs_devices_ret) + u64 devid, struct btrfs_fs_devices **fs_devices_ret, + int mount) { struct btrfs_device *device; struct btrfs_fs_devices *fs_devices; @@ -405,6 +406,7 @@ static noinline int device_list_add(const char *path, } } + device-mounted = mount; if (found_transid fs_devices-latest_trans) { fs_devices-latest_devid = devid; fs_devices-latest_trans = found_transid; @@ -562,6 +564,7 @@ static int
New btrfs-progs integration branch
I've just pushed out a new integration branch to my git repo. This is purely bugfix patches -- there are no new features in this issue of the integration branch. I've got a stack of about a dozen more patches with new features in them still to go. I'll be working on those tomorrow. As always, there's minimal testing involved here, but it does at least compile on my system(*). The branch is fetchable with git from: http://git.darksatanic.net/repo/btrfs-progs-unstable.git/ integration-20120605 And viewable in human-readable form at: http://git.darksatanic.net/cgi/gitweb.cgi?p=btrfs-progs-unstable.git Shortlog is below. Hugo. (*) I don't care about works-on-my-machine. We are not shipping your machine! Akira Fujita (1): Btrfs-progs: Fix manual of btrfs command Chris Samuel (1): Fix set-dafault typo in cmds-subvolume.c Csaba Tóth (1): mkfs.btrfs on ARM Goffredo Baroncelli (1): scrub_fs_info( ) file handle leaking Hubert Kario (2): Fix segmentation fault when opening invalid file system man: fix btrfs man page formatting Jan Kara (1): mkfs: Handle creation of filesystem larger than the first device Jim Meyering (5): btrfs_scan_one_dir: avoid use-after-free on error path mkfs: use strdup in place of strlen,malloc,strcpy sequence restore: don't corrupt stack for a zero-length command-line argument avoid several strncpy-induced buffer overruns mkfs: avoid heap-buffer-read-underrun for zero-length size arg Josef Bacik (3): Btrfs-progs: make btrfsck aware of free space inodes Btrfs-progs: make btrfs filesystem show uuid actually work btrfs-progs: enforce block count on all devices in mkfs Miao Xie (3): Btrfs-progs: fix btrfsck's snapshot wrong unresolved refs Btrfs-progs, btrfs-corrupt-block: fix the wrong usage Btrfs-progs, btrfs-map-logical: Fix typo in usage Phillip Susi (2): btrfs-progs: removed extraneous whitespace from mkfs man page btrfs-progs: document --rootdir mkfs switch Sergei Trofimovich (2): Makefile: use $(CC) as a compilers instead of $(CC)/gcc Makefile: use $(MAKE) instead of hardcoded 'make' Shawn Bohrer (1): btrfs-progs: Update resize documentation Wang Sheng-Hui (1): btrfs-progs: cleanup: remove the redundant BTRFS_CSUM_TYPE_CRC32 macro def -- === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk === PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk --- Great oxymorons of the world, no. 5: Manifesto Promise --- signature.asc Description: Digital signature
Re: [btrfs-progs] [bug][patch V2] Leaking file handle in scrub_fs_info()
Hi Hugo, On 06/05/2012 08:19 PM, Hugo Mills wrote: On Tue, Jun 05, 2012 at 07:26:34PM +0200, Goffredo Baroncelli wrote: Hi Hugo, I was not able to reproduce your error with my repository: I pulled it without problem. Well, I'd have normally written it off as a corrupt local repo at this end -- except that David Sterba said he'd also seen a similar problem pulling from your repo. However, enclosed you can find the patch (as attachment) with the Jan suggestions about the spaces. You can pull the patch also from my repo ( same branch: fd-leaking). Moreover I inserted the patch as inlined. Hoping that helps. Not really. Your mailer is breaking things (there's a wrapped line in there, and it's heavily whitespace-damaged). Could you *please* learn to use git send-email? Every single patch of yours that I've seen on the mailing list is badly broken, and takes an age to get it to the point where git am will look at it cleanly. It's incredibly frustrating. This time around, I think I've managed to get it to work OK, but it's taken me 10 minutes of buggering around, and I'm getting irritable, which isn't a good thing when I'm meant to be having a quiet relaxing week off work... I perfectly understand what means loosing 10 minutes instead of 1 seconds :-) But was you able to use the patch attached (not the one inlined ) ? Hugo. commit b73abc9465d3a105b5f8f464f2867543a638e5e2 Author: Goffredo Baroncelli kreij...@inwind.it Date: Tue Jun 5 19:11:06 2012 +0200 scrub_fs_info( ) file handle leaking The function scrub_fs_info( ) closes and reopen a file handle passed as argument, when a caller uses the file handle even after the call. The function scrub_fs_info( ) is updated to remove the file handle argument, and instead uses a private own file handle. The callers are updated to not pass the argument. diff --git a/cmds-scrub.c b/cmds-scrub.c index c4503f4..6313e02 100644 --- a/cmds-scrub.c +++ b/cmds-scrub.c @@ -979,19 +979,26 @@ static int scrub_device_info(int fd, u64 devid, return ret ? -errno : 0; } -static int scrub_fs_info(int fd, char *path, +static int scrub_fs_info(char *path, struct btrfs_ioctl_fs_info_args *fi_args, struct btrfs_ioctl_dev_info_args **di_ret) { int ret = 0; int ndevs = 0; int i = 1; + int fd; struct btrfs_fs_devices *fs_devices_mnt = NULL; struct btrfs_ioctl_dev_info_args *di_args; char mp[BTRFS_PATH_NAME_MAX + 1]; memset(fi_args, 0, sizeof(*fi_args)); + fd = open_file_or_dir(path); + if (fd 0) { + fprintf(stderr, ERROR: can't access to '%s'\n, path); + return -1; + } + ret = ioctl(fd, BTRFS_IOC_FS_INFO, fi_args); if (ret errno == EINVAL) { /* path is no mounted btrfs. try if it's a device */ @@ -1010,28 +1017,36 @@ static int scrub_fs_info(int fd, char *path, if (fd 0) return -errno; } else if (ret) { + close(fd); return -errno; } - if (!fi_args-num_devices) + if (!fi_args-num_devices) { + close(fd); return 0; + } di_args = *di_ret = malloc(fi_args-num_devices * sizeof(*di_args)); - if (!di_args) + if (!di_args) { + close(fd); return -errno; + } for (; i = fi_args-max_id; ++i) { BUG_ON(ndevs = fi_args-num_devices); ret = scrub_device_info(fd, i, di_args[ndevs]); if (ret == -ENODEV) continue; - if (ret) + if (ret) { + close(fd); return ret; + } ++ndevs; } BUG_ON(ndevs == 0); + close(fd); return 0; } @@ -1155,7 +1170,7 @@ static int scrub_start(int argc, char **argv, int resume) return 12; } - ret = scrub_fs_info(fdmnt, path, fi_args, di_args); + ret = scrub_fs_info(path, fi_args, di_args); if (ret) { ERR(!do_quiet, ERROR: getting dev info for scrub failed: %s\n, strerror(-ret)); @@ -1586,7 +1601,6 @@ static int cmd_scrub_status(int argc, char **argv) .sun_family = AF_UNIX, }; int ret; - int fdmnt; int i; int print_raw = 0; int do_stats_per_dev = 0; @@ -1615,13 +1629,7 @@ static int cmd_scrub_status(int argc, char **argv) path = argv[optind]; - fdmnt = open_file_or_dir(path); - if (fdmnt 0) { - fprintf(stderr, ERROR: can't access to '%s'\n, path); - return 12; - } - - ret = scrub_fs_info(fdmnt, path, fi_args, di_args); + ret = scrub_fs_info(path, fi_args, di_args); if (ret) { fprintf(stderr, ERROR: getting dev info for scrub failed: %s\n, strerror(-ret)); @@ -1698,7 +1706,6 @@ static int cmd_scrub_status(int argc, char **argv) out: free_history(past_scrubs); free(di_args); - close(fdmnt); if (fdres -1) close(fdres); On 06/05/2012 01:01 PM, Hugo Mills wrote: Hi, Goffredo, I'm trying to do a new integration branch, and the patch inline in the text is corrupt (plus it has a massively verbose commit message): Applying: Leaking file handle in scrub_fs_info() fatal: corrupt patch at line 74 Patch
Re: inquiry on btrfs send/receive
On Tue, Jun 5, 2012 at 12:16 PM, Alexander Block abloc...@googlemail.com wrote: On Mon, Jun 4, 2012 at 7:33 PM, Alex Lyakas alex.bolshoy.bt...@gmail.com wrote: Yes, I also noticed that sometimes transid gets bumped up, but there is no actual change. So let's say you identify that a particular part of the tree is not shared anymore, and, say, eventually you get to a particular leaf within that part of the tree. How would you detect that, say, a particular INODE_ITEM (or, more difficult, an INODE_REF) is missing from that leaf WRT to previous tree? The property you described perhaps suggests another method to find leafs, in which *something* has changed. (Although within a leaf, does it make sense to decode all items and to look at their transid - those that have them - to filter out even more?) And yes, perhaps, looking at transid alone will bring more such potential leafs into consideration. However, how does this property help to determine *what* actually has changed between the two trees? Like, for example, being able to tell over which range of keys there possibly was a change, and iterate within that range? When doing incremental sends, we always have two trees at hand. One where we know that it is already on the receiving side (we did already send it) and the one that we want to send now. To find the changes, we simply compare those trees. If an items misses on one tree, we know it's either new or deleted (depending on the tree the item lies in). I would suggest you to not put too much work into send. As already said, the btrfs send/receive patches are close to be posted to the mailing list. It's currently reviewed and when I get a looks good now, I'll post it. Alex. Thanks for the update, Alex. Efficiently comparing the trees is the crux of what we were looking into. Thanks, Alex. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, 5 Jun 2012 02:46:29 -0700 Joe Perches j...@perches.com wrote: KERN_LEVEL currently takes up 3 bytes. Shrink the kernel size by using an ASCII SOH and then the level byte. Remove the need for KERN_CONT. Convert directly embedded uses of . to KERN_LEVEL What an epic patchset. I guess that saving a byte per printk does make the world a better place, and forcibly ensuring that nothing is dependent upon the internal format of the KERN_foo strings is nice. Unfortunately the n thing is part of the kernel ABI: echo 4foo /dev/kmsg devkmsg_writev() does weird and wonderful things with facilities/levels. That function incorrectly returns success when copy_from_user() faults, btw. It also babbles on about LOG_USER and LOG_KERN without ever defining these things. I guess they're userspace-only concepts and are hardwired to 0 and 1 in the kernel. Or not. So what to do about /dev/kmsg? I'd say nothing: we retain n as the externally-presented kernel format for a facility level, and the fact that the kernel internally uses a different encoding is hidden from userspace. And if the user does echo \0014foo /dev/kmsg then I guess we should pass it straight through, retaining the \0014. But from my reading of your code, this doesn't work - vprintk_emit() will go ahead and strip and interpret the \0014, evading the stuff which devkmsg_writev() did. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, Jun 5, 2012 at 11:28 PM, Andrew Morton a...@linux-foundation.org wrote: On Tue, 5 Jun 2012 02:46:29 -0700 Joe Perches j...@perches.com wrote: KERN_LEVEL currently takes up 3 bytes. Shrink the kernel size by using an ASCII SOH and then the level byte. Remove the need for KERN_CONT. Convert directly embedded uses of . to KERN_LEVEL What an epic patchset. I guess that saving a byte per printk does make the world a better place, and forcibly ensuring that nothing is dependent upon the internal format of the KERN_foo strings is nice. Unfortunately the n thing is part of the kernel ABI: echo 4foo /dev/kmsg devkmsg_writev() does weird and wonderful things with facilities/levels. That function incorrectly returns success when copy_from_user() faults, btw. It also babbles on about LOG_USER and LOG_KERN without ever defining these things. I guess they're userspace-only concepts and are hardwired to 0 and 1 in the kernel. Or not. It's as old as BSD, defined by syslog(3), used by glibc. The whole %u prefix notation and the LOG_* names come from there. The kernel is just user/facility == 0, so it never really was apparent that the whole concept has more than a log level in that number. Userspace syslog defines these pretty stupid numbers: /* facility codes */ #define LOG_KERN(03) /* kernel messages */ #define LOG_USER(13) /* random user-level messages */ #define LOG_MAIL(23) /* mail system */ #define LOG_DAEMON (33) /* system daemons */ #define LOG_AUTH(43) /* security/authorization messages */ #define LOG_SYSLOG (53) /* messages generated internally by syslogd */ #define LOG_LPR (63) /* line printer subsystem */ #define LOG_NEWS(73) /* network news subsystem */ #define LOG_UUCP(83) /* UUCP subsystem */ #define LOG_CRON(93) /* clock daemon */ #define LOG_AUTHPRIV(103) /* security/authorization messages (private) */ #define LOG_FTP (113) /* ftp daemon */ but it *can* still all be pretty useful, and people *can* get creative with facility numbers if they want to, as we have like 13 bits at the moment to use for the facility which is stored in the kernel log buffer. :) /dev/kmsg just enforces LOG_USER, if userspace tries to inject stuff with LOG_KERN, which it should not be allowed. The non-LOG_KERN number itself has not much meaning it just says: this is not from the kernel which is important to keep in the message. Als, dmesg(1) has a -k option, that filters out all userspace-injected stuff. So what to do about /dev/kmsg? I'd say nothing: we retain n as the externally-presented kernel format for a facility level, and the fact that the kernel internally uses a different encoding is hidden from userspace. Yeah, I think so. Yeah, we strip the at printk() time, add the back at output time; they are not stored internally anymore, so that should not affect the current behaviour. And if the user does echo \0014foo /dev/kmsg then I guess we should pass it straight through, retaining the \0014. But from my reading of your code, this doesn't work - vprintk_emit() will go ahead and strip and interpret the \0014, evading the stuff which devkmsg_writev() did. We should make it not accept faked prefixes, yes. It should be impossible to let messages look like they originated from the kernel, just like the current code enforces a non-LOG_KERN prefix. Kay -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, 2012-06-05 at 14:28 -0700, Andrew Morton wrote: Unfortunately the n thing is part of the kernel ABI: echo 4foo /dev/kmsg Which works the same way it did before. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, 05 Jun 2012 15:11:43 -0700 Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 14:28 -0700, Andrew Morton wrote: Unfortunately the n thing is part of the kernel ABI: echo 4foo /dev/kmsg Which works the same way it did before. I didn't say it didn't. What I did say is that echo \0014/dev/kmsg will subvert the intent of the new logging code. Or might. But you just ignored all that, forcing me to repeat myself, irritatedly. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, 2012-06-05 at 15:17 -0700, Andrew Morton wrote: On Tue, 05 Jun 2012 15:11:43 -0700 Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 14:28 -0700, Andrew Morton wrote: Unfortunately the n thing is part of the kernel ABI: echo 4foo /dev/kmsg Which works the same way it did before. I didn't say it didn't. What I did say is that echo \0014/dev/kmsg will subvert the intent of the new logging code. Or might. But you just ignored all that, forcing me to repeat myself, irritatedly. It works the same way before and after the patch. Any write to /dev/kmsg without a KERN_LEVEL emits at (1 3) + KERN_DEFAULT. Writes with n values = 8 are emitted at that level. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, 2012-06-05 at 14:28 -0700, Andrew Morton wrote: devkmsg_writev() does weird and wonderful things with facilities/levels. That function incorrectly returns success when copy_from_user() faults, btw. Oh. Better? Thanks, Kay From: Kay Sievers k...@vrfy.org Subject: kmsg: /dev/kmsg - properly return possible copy_from_user() failure Reported-By: Andrew Morton a...@linux-foundation.org Signed-off-by: Kay Sievers k...@vrfy.org --- printk.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/kernel/printk.c b/kernel/printk.c index 32462d2..6bdacab 100644 --- a/kernel/printk.c +++ b/kernel/printk.c @@ -365,8 +365,10 @@ static ssize_t devkmsg_writev(struct kiocb *iocb, const struct iovec *iv, line = buf; for (i = 0; i count; i++) { - if (copy_from_user(line, iv[i].iov_base, iv[i].iov_len)) + if (copy_from_user(line, iv[i].iov_base, iv[i].iov_len)) { + ret = -EFAULT; goto out; + } line += iv[i].iov_len; } -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Wed, 06 Jun 2012 00:55:00 +0200 Kay Sievers k...@vrfy.org wrote: On Tue, 2012-06-05 at 14:28 -0700, Andrew Morton wrote: devkmsg_writev() does weird and wonderful things with facilities/levels. That function incorrectly returns success when copy_from_user() faults, btw. Oh. Better? Thanks, Kay From: Kay Sievers k...@vrfy.org Subject: kmsg: /dev/kmsg - properly return possible copy_from_user() failure Reported-By: Andrew Morton a...@linux-foundation.org Signed-off-by: Kay Sievers k...@vrfy.org --- printk.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/kernel/printk.c b/kernel/printk.c index 32462d2..6bdacab 100644 --- a/kernel/printk.c +++ b/kernel/printk.c @@ -365,8 +365,10 @@ static ssize_t devkmsg_writev(struct kiocb *iocb, const struct iovec *iv, line = buf; for (i = 0; i count; i++) { - if (copy_from_user(line, iv[i].iov_base, iv[i].iov_len)) + if (copy_from_user(line, iv[i].iov_base, iv[i].iov_len)) { + ret = -EFAULT; goto out; + } line += iv[i].iov_len; } Strictly speaking, when write() encounters an error it should return number-of-bytes-written, or an errno if it wrote zero bytes. So something like --- a/kernel/printk.c~a +++ a/kernel/printk.c @@ -355,7 +355,7 @@ static ssize_t devkmsg_writev(struct kio int level = default_message_loglevel; int facility = 1; /* LOG_USER */ size_t len = iov_length(iv, count); - ssize_t ret = len; + ssize_t ret; if (len LOG_LINE_MAX) return -EINVAL; @@ -365,8 +365,12 @@ static ssize_t devkmsg_writev(struct kio line = buf; for (i = 0; i count; i++) { - if (copy_from_user(line, iv[i].iov_base, iv[i].iov_len)) + if (copy_from_user(line, iv[i].iov_base, iv[i].iov_len)) { + ret = line - buf; + if (!ret) + ret = -EFAULT; goto out; + } line += iv[i].iov_len; } @@ -396,6 +400,7 @@ static ssize_t devkmsg_writev(struct kio line[len] = '\0'; printk_emit(facility, level, NULL, 0, %s, line); + ret = 0; out: kfree(buf); return ret; _ -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, 2012-06-05 at 16:29 -0700, Andrew Morton wrote: What about writes starting with \001n? AFACIT, that will be stripped away and the printk level will be altered. This is new behavior. Nope. # echo \001Hello Andrew /dev/kmsg /dev/kmsg has 12,774,2462339252;\001Hello Andrew 12 is 8 + KERN_DEFAULT # echo 1Hello Andrew /dev/kmsg /dev/kmsg has: 9,775,2516023444;Hello Andrew -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Wed, Jun 6, 2012 at 1:35 AM, Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 16:29 -0700, Andrew Morton wrote: What about writes starting with \001n? AFACIT, that will be stripped away and the printk level will be altered. This is new behavior. Nope. # echo \001Hello Andrew /dev/kmsg /dev/kmsg has 12,774,2462339252;\001Hello Andrew Try echo -e? The stuff is copied verbatim otherwise. Kay -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Wed, 2012-06-06 at 01:39 +0200, Kay Sievers wrote: On Wed, Jun 6, 2012 at 1:35 AM, Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 16:29 -0700, Andrew Morton wrote: What about writes starting with \001n? AFACIT, that will be stripped away and the printk level will be altered. This is new behavior. Nope. # echo \001Hello Andrew /dev/kmsg /dev/kmsg has 12,774,2462339252;\001Hello Andrew Try echo -e? The stuff is copied verbatim otherwise. # echo -e \001Hello Kay /dev/kmsg gives 12,776,3046752764;\x01Hello Kay -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Wed, 2012-06-06 at 01:39 +0200, Kay Sievers wrote: Try echo -e? The stuff is copied verbatim otherwise. btw: I didn't change the devkmsg_writev function which does the decoding of any n prefix for printk_emit. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Wed, Jun 6, 2012 at 1:43 AM, Joe Perches j...@perches.com wrote: On Wed, 2012-06-06 at 01:39 +0200, Kay Sievers wrote: # echo \001Hello Andrew /dev/kmsg /dev/kmsg has 12,774,2462339252;\001Hello Andrew Try echo -e? The stuff is copied verbatim otherwise. # echo -e \001Hello Kay /dev/kmsg gives 12,776,3046752764;\x01Hello Kay Don't you need two bytes to trigger the logic? Kay -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, 05 Jun 2012 16:52:25 -0700 Joe Perches j...@perches.com wrote: On Wed, 2012-06-06 at 01:48 +0200, Kay Sievers wrote: On Wed, Jun 6, 2012 at 1:43 AM, Joe Perches j...@perches.com wrote: On Wed, 2012-06-06 at 01:39 +0200, Kay Sievers wrote: # echo \001Hello Andrew /dev/kmsg /dev/kmsg has 12,774,2462339252;\001Hello Andrew Try echo -e? The stuff is copied verbatim otherwise. # echo -e \001Hello Kay /dev/kmsg gives 12,776,3046752764;\x01Hello Kay Don't you need two bytes to trigger the logic? Yes. Angle brackets fore and aft. he means echo \0014Hello Joe /dev/kmsg ^ It needs one of [0-7cd] to trigger the prefix handling. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, 2012-06-05 at 16:58 -0700, Andrew Morton wrote: echo \0014Hello Joe /dev/kmsg # echo -e \x014Hello Me /dev/kmsg gives: 12,778,4057982669;Hello Me #echo -e \x011Hello Me_2 /dev/kmsg gives: 12,779,4140452093;Hello Me_2 I didn't change devkmsg_writev so the original parsing style for . is unchanged. from printk.c: static ssize_t devkmsg_writev(struct kiocb *iocb, const struct iovec *iv, unsigned long count, loff_t pos) [...] int level = default_message_loglevel; [...] if (line[0] == '') { char *endp = NULL; i = simple_strtoul(line+1, endp, 10); if (endp endp[0] == '') { level = i 7; if (i 3) facility = i 3; endp++; len -= endp - line; line = endp; } } line[len] = '\0'; printk_emit(facility, level, NULL, 0, %s, line); [] level is what matters. from dmesg -r 12[ 2462.339252] \001Hello Andrew 9[ 2516.023444] Hello Andrew 12[ 3046.752764] \x01Hello Kay 12[ 3940.871850] \x01Hello Kay 12[ 4057.982669] Hello Me 12[ 4140.452093] Hello Me_2 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Wed, Jun 6, 2012 at 2:07 AM, Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 16:58 -0700, Andrew Morton wrote: echo \0014Hello Joe /dev/kmsg # echo -e \x014Hello Me /dev/kmsg gives: 12,778,4057982669;Hello Me #echo -e \x011Hello Me_2 /dev/kmsg gives: 12,779,4140452093;Hello Me_2 I didn't change devkmsg_writev so the original parsing style for . is unchanged. from printk.c: static ssize_t devkmsg_writev(struct kiocb *iocb, const struct iovec *iv, unsigned long count, loff_t pos) [...] int level = default_message_loglevel; [...] if (line[0] == '') { char *endp = NULL; i = simple_strtoul(line+1, endp, 10); if (endp endp[0] == '') { level = i 7; if (i 3) facility = i 3; endp++; len -= endp - line; line = endp; } } line[len] = '\0'; printk_emit(facility, level, NULL, 0, %s, line); [] level is what matters. from dmesg -r 12[ 2462.339252] \001Hello Andrew 9[ 2516.023444] Hello Andrew 12[ 3046.752764] \x01Hello Kay 12[ 3940.871850] \x01Hello Kay 12[ 4057.982669] Hello Me 12[ 4140.452093] Hello Me_2 The question is what happens if you inject your new binary two-byte prefix, like: echo -e \x01\x02Hello /dev/kmsg And if that changes the log-level to 2 instead of the default 4? (assuming that I read your patch right, otherwise please correct the bytes, but use the full sequence which your patch will recognize as an internal level marker; seems your examples are all not triggering that) Kay -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Wed, 2012-06-06 at 02:13 +0200, Kay Sievers wrote: The question is what happens if you inject your new binary two-byte prefix, like: echo -e \x01\x02Hello /dev/kmsg It's not a 2 byte binary. It's a leading ascii SOH and a standard ascii char '0' ... '7' or 'd'. #define KERN_EMERG KERN_SOH 0/* system is unusable */ #define KERN_ALERT KERN_SOH 1/* action must be taken immediately */ etc... And if that changes the log-level to 2 instead of the default 4? No it doesn't. It's not triggering that because devkmsg_writev does prefix parsing only on the old n form. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Wed, Jun 6, 2012 at 2:19 AM, Joe Perches j...@perches.com wrote: On Wed, 2012-06-06 at 02:13 +0200, Kay Sievers wrote: The question is what happens if you inject your new binary two-byte prefix, like: echo -e \x01\x02Hello /dev/kmsg It's not a 2 byte binary. It's a leading ascii SOH and a standard ascii char '0' ... '7' or 'd'. #define KERN_EMERG KERN_SOH 0 /* system is unusable */ #define KERN_ALERT KERN_SOH 1 /* action must be taken immediately */ etc... Ok. And if that changes the log-level to 2 instead of the default 4? No it doesn't. So: echo -e \x012Hello /dev/kmsg is still level 4? Sounds all fine then. It's not triggering that because devkmsg_writev does prefix parsing only on the old n form. Yeah, but printk_emit() will not try to parse it? I did not check, but with your change, the prefix parsing in printk_emit() is still skipped if a level is given as a parameter to printk_emit(), right? Kay -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, 05 Jun 2012 17:07:27 -0700 Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 16:58 -0700, Andrew Morton wrote: echo \0014Hello Joe /dev/kmsg # echo -e \x014Hello Me /dev/kmsg gives: 12,778,4057982669;Hello Me That's changed behavior. On Wed, 6 Jun 2012 02:28:39 +0200 Kay Sievers k...@vrfy.org wrote: Yeah, but printk_emit() will not try to parse it? I did not check, but with your change, the prefix parsing in printk_emit() is still skipped if a level is given as a parameter to printk_emit(), right? printk_emit() does parse the leading \0014, and then skips over it, removing it from the output stream. printk_emit() then throws away the resulting level because devkmsg_writev() did not pass in level==-1. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Wed, 2012-06-06 at 02:28 +0200, Kay Sievers wrote: On Wed, Jun 6, 2012 at 2:19 AM, Joe Perches j...@perches.com wrote: On Wed, 2012-06-06 at 02:13 +0200, Kay Sievers wrote: The question is what happens if you inject your new binary two-byte prefix, like: echo -e \x01\x02Hello /dev/kmsg It's not a 2 byte binary. It's a leading ascii SOH and a standard ascii char '0' ... '7' or 'd'. #define KERN_EMERG KERN_SOH 0/* system is unusable */ #define KERN_ALERT KERN_SOH 1/* action must be taken immediately */ etc... Ok. And if that changes the log-level to 2 instead of the default 4? No it doesn't. So: echo -e \x012Hello /dev/kmsg is still level 4? Sounds all fine then. Yes. # echo -e \x012Hello again Kay /dev/kmsg gives: 12,780,6031964979;Hello again Kay It's not triggering that because devkmsg_writev does prefix parsing only on the old n form. Yeah, but printk_emit() will not try to parse it? I did not check, but with your change, the prefix parsing in printk_emit() is still skipped if a level is given as a parameter to printk_emit(), right? If level is not -1, then whatever prefix level the string has is ignored by vprintk_emit. from vprintk_emit: /* strip syslog prefix and extract log level or control flags */ kern_level = printk_get_level(text); if (kern_level) { const char *end_of_header = printk_skip_level(text); switch (kern_level) { case '0' ... '7': if (level == -1) level = kern_level - '0'; case 'd': /* Strip d KERN_DEFAULT, start new line */ plen = 0; default: if (!new_text_line) { log_buf_emit_char('\n'); new_text_line = 1; } } text_len -= end_of_header - text; text = (char *)end_of_header; } Only level == -1 will use the prefix level. devkmsg_writev always passes a non -1 level. cheers, Joe -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, 2012-06-05 at 17:37 -0700, Andrew Morton wrote: On Tue, 05 Jun 2012 17:07:27 -0700 Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 16:58 -0700, Andrew Morton wrote: echo \0014Hello Joe /dev/kmsg # echo -e \x014Hello Me /dev/kmsg gives: 12,778,4057982669;Hello Me That's changed behavior. Which is an improvement too. I very much doubt a single app will change because of this. printk_emit() does parse the leading \0014, and then skips over it, removing it from the output stream. printk_emit() then throws away the resulting level because devkmsg_writev() did not pass in level==-1. I'm glad you know how it works now. cheers, Joe -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Tue, 05 Jun 2012 17:40:05 -0700 Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 17:37 -0700, Andrew Morton wrote: On Tue, 05 Jun 2012 17:07:27 -0700 Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 16:58 -0700, Andrew Morton wrote: echo \0014Hello Joe /dev/kmsg # echo -e \x014Hello Me /dev/kmsg gives: 12,778,4057982669;Hello Me That's changed behavior. Which is an improvement too. No it isn't. It exposes internal kernel implementation details in random weird inexplicable ways. It doesn't seem at all important though. I very much doubt a single app will change because of this. I doubt it as well. printk_emit() does parse the leading \0014, and then skips over it, removing it from the output stream. printk_emit() then throws away the resulting level because devkmsg_writev() did not pass in level==-1. I'm glad you know how it works now. It's nice to see you learning about it as well. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Wed, Jun 6, 2012 at 2:46 AM, Andrew Morton a...@linux-foundation.org wrote: On Tue, 05 Jun 2012 17:40:05 -0700 Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 17:37 -0700, Andrew Morton wrote: On Tue, 05 Jun 2012 17:07:27 -0700 Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 16:58 -0700, Andrew Morton wrote: echo \0014Hello Joe /dev/kmsg # echo -e \x014Hello Me /dev/kmsg gives: 12,778,4057982669;Hello Me That's changed behavior. Which is an improvement too. No it isn't. It exposes internal kernel implementation details in random weird inexplicable ways. It doesn't seem at all important though. I very much doubt a single app will change because of this. I doubt it as well. Yeah, the value of injecting such binary data is kind of questionable. :) Joe, maybe you can change printk_emit() to skip the prefix detection/stripping if a prefix is already passed to the function? Kay -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] Rework KERN_LEVEL
On Wed, 2012-06-06 at 03:10 +0200, Kay Sievers wrote: On Wed, Jun 6, 2012 at 2:46 AM, Andrew Morton a...@linux-foundation.org wrote: On Tue, 05 Jun 2012 17:40:05 -0700 Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 17:37 -0700, Andrew Morton wrote: On Tue, 05 Jun 2012 17:07:27 -0700 Joe Perches j...@perches.com wrote: On Tue, 2012-06-05 at 16:58 -0700, Andrew Morton wrote: echo \0014Hello Joe /dev/kmsg # echo -e \x014Hello Me /dev/kmsg gives: 12,778,4057982669;Hello Me That's changed behavior. Which is an improvement too. No it isn't. It exposes internal kernel implementation details in random weird inexplicable ways. It doesn't seem at all important though. I very much doubt a single app will change because of this. I doubt it as well. Yeah, the value of injecting such binary data is kind of questionable. :) Joe, maybe you can change printk_emit() to skip the prefix detection/stripping if a prefix is already passed to the function? Sure, it's pretty trivial. Perhaps all binary data data should be elided. Maybe print . for all non-printable ascii chars. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] btrfs: Drop unused function btrfs_abort_devices()
Hello, Jeff On 05/28/2012 09:41 PM, David Sterba wrote: On Fri, May 25, 2012 at 11:10:21AM +0800, Asias He wrote: 1) This function is not used anywhere. This was added in 49b25e0540904be0bf558b84475c69d72e4de66e adding a transaction abort infrastructure. I'm not sure if Jeff had some intentions with it or whether it got obsolete during the patchset long evolution, CCed. Any comments? david 2) Using the blk_abort_queue() to abort the queue seems not correct. blk_abort_queue() is used for timeout handling (block/blk-timeout.c). Cc: Chris Masonchris.ma...@oracle.com Cc: linux-btrfs@vger.kernel.org Cc: Jens Axboeax...@kernel.dk Cc: linux-ker...@vger.kernel.org Signed-off-by: Asias Heas...@redhat.com --- fs/btrfs/disk-io.c | 13 - fs/btrfs/disk-io.h |1 - 2 files changed, 14 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index e1fe74a..3521866 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2902,19 +2902,6 @@ int write_ctree_super(struct btrfs_trans_handle *trans, return ret; } -/* Kill all outstanding I/O */ -void btrfs_abort_devices(struct btrfs_root *root) -{ - struct list_head *head; - struct btrfs_device *dev; - mutex_lock(root-fs_info-fs_devices-device_list_mutex); - head =root-fs_info-fs_devices-devices; - list_for_each_entry_rcu(dev, head, dev_list) { - blk_abort_queue(dev-bdev-bd_disk-queue); - } - mutex_unlock(root-fs_info-fs_devices-device_list_mutex); -} - void btrfs_free_fs_root(struct btrfs_fs_info *fs_info, struct btrfs_root *root) { spin_lock(fs_info-fs_roots_radix_lock); diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h index ab1830a..05b3fab 100644 --- a/fs/btrfs/disk-io.h +++ b/fs/btrfs/disk-io.h @@ -89,7 +89,6 @@ int btrfs_add_log_tree(struct btrfs_trans_handle *trans, int btrfs_cleanup_transaction(struct btrfs_root *root); void btrfs_cleanup_one_transaction(struct btrfs_transaction *trans, struct btrfs_root *root); -void btrfs_abort_devices(struct btrfs_root *root); #ifdef CONFIG_DEBUG_LOCK_ALLOC void btrfs_init_lockdep(void); -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- Asias -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 9/8] printk: Only look for prefix levels in kernel messages
vprintk_emit prefix parsing should only be done for internal kernel messages. This allows existing behavior to be kept in all cases. Signed-off-by: Joe Perches j...@perches.com --- kernel/printk.c | 32 +--- 1 files changed, 17 insertions(+), 15 deletions(-) diff --git a/kernel/printk.c b/kernel/printk.c index 5cd73f7..4e72c07 100644 --- a/kernel/printk.c +++ b/kernel/printk.c @@ -1267,7 +1267,6 @@ asmlinkage int vprintk_emit(int facility, int level, static char cont_buf[LOG_LINE_MAX]; static size_t cont_len; static int cont_level; - int kern_level; static struct task_struct *cont_task; static char textbuf[LOG_LINE_MAX]; char *text = textbuf; @@ -1329,21 +1328,24 @@ asmlinkage int vprintk_emit(int facility, int level, newline = true; } - /* strip syslog prefix and extract log level or control flags */ - kern_level = printk_get_level(text); - if (kern_level) { - const char *end_of_header = printk_skip_level(text); - switch (kern_level) { - case '0' ... '7': - if (level == -1) - level = kern_level - '0'; - case 'd': /* KERN_DEFAULT */ - prefix = true; - case 'c': /* KERN_CONT */ - break; + /* strip kernel syslog prefix and extract log level or control flags */ + if (facility == 0) { + int kern_level = printk_get_level(text); + + if (kern_level) { + const char *end_of_header = printk_skip_level(text); + switch (kern_level) { + case '0' ... '7': + if (level == -1) + level = kern_level - '0'; + case 'd': /* KERN_DEFAULT */ + prefix = true; + case 'c': /* KERN_CONT */ + break; + } + text_len -= end_of_header - text; + text = (char *)end_of_header; } - text_len -= end_of_header - text; - text = (char *)end_of_header; } if (level == -1) -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html