re: btrfs: initial readahead code and prototypes
Hi,

I'm working on some new Smatch code and it complains about this patch from last year. -Dan

This is a semi-automatic email about new static checker warnings.

The patch 7414a03fbf9e: "btrfs: initial readahead code and prototypes" from May 23, 2011, leads to the following Smatch complaint:

fs/btrfs/reada.c:147 __readahead_hook()
         error: we previously assumed 'eb' could be null (see line 122)

fs/btrfs/reada.c
   121
   122          if (eb)
Checked here.

   123                  level = btrfs_header_level(eb);
   124
   125          /* find extent */
   126          spin_lock(&fs_info->reada_lock);
   127          re = radix_tree_lookup(&fs_info->reada_tree, index);
   128          if (re)
   129                  kref_get(&re->refcnt);
   130          spin_unlock(&fs_info->reada_lock);
   131
   132          if (!re)
   133                  return -1;
   134
   135          spin_lock(&re->lock);
   136          /*
   137           * just take the full list from the extent. afterwards we
   138           * don't need the lock anymore
   139           */
   140          list_replace_init(&re->extctl, &list);
   141          for_dev = re->scheduled_for;
   142          re->scheduled_for = NULL;
   143          spin_unlock(&re->lock);
   144
   145          if (err == 0) {
   146                  nritems = level ? btrfs_header_nritems(eb) : 0;
                                  ^
Checked here again indirectly.

   147                  generation = btrfs_header_generation(eb);
                                                             ^^^
Dereferenced inside function without checking.

   148                  /*
   149                   * FIXME: currently we just set nritems to 0 if this is a leaf,

regards,
dan carpenter

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
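For readers unfamiliar with this class of warning, the pattern can be reproduced in a few lines of user-space C. This is only a sketch: struct demo_buf and demo_hook() are hypothetical stand-ins, not the btrfs types or the real __readahead_hook(), and the explicit NULL guard is one possible way to make the "err == 0 means the buffer is valid" contract visible to a checker, not necessarily what the btrfs code does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct extent_buffer; not the btrfs type. */
struct demo_buf {
	int level;
	unsigned long generation;
};

/*
 * Same shape as the flagged code: buf is tested for NULL once, then used
 * again on a path that is only guarded by err == 0.
 * Returns the generation, or 0 when there is nothing to read.
 */
unsigned long demo_hook(const struct demo_buf *buf, int err)
{
	int level = 0;

	/* Assumed convention for this sketch: 0 or a negative errno. */
	assert(err <= 0);

	if (buf)		/* first check, as on line 122 */
		level = buf->level;
	(void)level;

	if (err != 0)
		return 0;

	/*
	 * Explicit guard: without it, a checker only sees that buf "could
	 * be NULL" above and is dereferenced below.
	 */
	if (buf == NULL)
		return 0;

	return buf->generation;
}
```

With the guard in place the dereference is provably safe on every path; without it, safety rests on a contract between caller and callee that the checker cannot see.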
Re: [PATCH v3 0/3] Btrfs-progs: support get/reset device stats via ioctl
On 05/16/2012 19:03, Andrei Popa wrote:
> It would be nice if this function could show the file names affected by
> errors, in case of a single, non-redundant drive, btrfs-progs should
> show what files are affected by errors. Then, an admin could restore
> only those files from backup.
>
> On Wed, 2012-05-16 at 18:50 +0200, Stefan Behrens wrote:
>> "btrfs device stats" is used to retrieve and print the device stats.
>> "btrfs device stats -z" is used to atomically retrieve, reset and
>> print the stats.

In case of disk errors, it is recommended to run scrub on that disk. It checks the in-use disk contents for errors, repairs errors where possible, and the scrub tool does print the paths and filenames of errored files into the kernel log.
Re: [PATCH v3 3/3] Btrfs: read device stats on mount, write modified ones during commit
On 05/17/2012 03:52, Liu Bo wrote:
> On 05/17/2012 12:50 AM, Stefan Behrens wrote:
>> The device statistics are written into the device tree with each
>> transaction commit. Only modified statistics are written.
>> When a filesystem is mounted, the device statistics for each involved
>> device are read from the device tree and used to initialize the
>> counters.
>
> Hi Stefan,
>
> Just scanned the patch for a while and got a question:
> Adding a new key type usually means changing the disk format, so have
> you done something for this?

Hi Liu,

New tree entries with new keys are added to the device tree. Old kernels do not search for these keys and therefore ignore them. New kernels (with this patch) search for these keys and read and write them, or create them when required. That works fine.
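The compatibility argument can be sketched in user-space C. None of this is btrfs code: the key-type values, the struct names and the flat array standing in for the device tree are assumptions made up for the illustration. The point is only that a reader which looks items up by exact key never visits item types it does not ask for, and that a newer reader can treat a missing item as "counter starts at zero".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key types; the values are invented for this sketch. */
enum { DEMO_DEV_ITEM = 1, DEMO_DEV_STATS = 2 };

struct demo_key {
	unsigned long long objectid;
	unsigned char type;
	unsigned long long offset;
};

struct demo_item {
	struct demo_key key;
	unsigned long long value;
};

/* Exact-match lookup; a flat array stands in for the on-disk tree. */
const struct demo_item *demo_lookup(const struct demo_item *items, size_t n,
				    struct demo_key want)
{
	assert(items != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct demo_key *k = &items[i].key;

		if (k->objectid == want.objectid && k->type == want.type &&
		    k->offset == want.offset)
			return &items[i];
	}
	return NULL;
}

/*
 * "New reader" behaviour: use the stats item if it exists, otherwise
 * start the counter at zero (the item would be created on a later write).
 */
unsigned long long demo_read_stat(const struct demo_item *items, size_t n,
				  unsigned long long devid)
{
	struct demo_key want = { 0, DEMO_DEV_STATS, devid };
	const struct demo_item *it = demo_lookup(items, n, want);

	return it ? it->value : 0;
}
```

An "old reader" in this model simply never calls demo_read_stat(); its lookups for DEMO_DEV_ITEM keys return the same results whether or not stats items are present.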
Re: 3.4.0-rc6: WARNING: at fs/btrfs/super.c:219 __btrfs_abort_transaction+0xae/0xc0 [btrfs]()
Hi,

I got the same warning but triggered it differently: I created a new cephfs on top of btrfs via mkcephfs, and the command then hangs.

[ 100.643838] Btrfs loaded
[ 100.644313] device fsid 49b89a47-76a0-45cf-9e4a-a7e1f4c64bb8 devid 1 transid 4 /dev/sdc
[ 100.645523] btrfs: setting nodatacow
[ 100.645527] btrfs: enabling auto defrag
[ 100.645529] btrfs: disk space caching is enabled
[ 100.645531] btrfs flagging fs with big metadata feature
...
[ 2501.141664] ------------[ cut here ]------------
[ 2501.141700] WARNING: at fs/btrfs/super.c:219 __btrfs_abort_transaction+0xae/0xc0 [btrfs]()
[ 2501.141714] Hardware name: X9SRi
[ 2501.141721] btrfs: Transaction aborted
[ 2501.141722] Modules linked in: btrfs zlib_deflate libcrc32c ext2 bonding coretemp ghash_clmulni_intel aesni_intel cryptd aes_x86_64 microcode sb_edac psmouse serio_raw edac_core joydev mei(C) ioatdma mac_hid lp parport ses enclosure usbhid isci hid libsas scsi_transport_sas megaraid_sas ixgbe igb mdio dca
[ 2501.141892] Pid: 12129, comm: ceph-osd Tainted: G C 3.4.0-rc7+ #10
[ 2501.141910] Call Trace:
[ 2501.141927] [810504ef] warn_slowpath_common+0x7f/0xc0
[ 2501.141945] [810505e6] warn_slowpath_fmt+0x46/0x50
[ 2501.141972] [a01ffbde] __btrfs_abort_transaction+0xae/0xc0 [btrfs]
[ 2501.142024] [a026913a] ? btrfs_add_delayed_tree_ref+0x8a/0x1c0 [btrfs]
[ 2501.142090] [a022b70b] cow_file_range_inline+0x1bb/0x1c0 [btrfs]
[ 2501.142137] [a022b82f] cow_file_range+0x11f/0x480 [btrfs]
[ 2501.142187] [a024a31f] ? free_extent_buffer+0x2f/0x70 [btrfs]
[ 2501.142235] [a022bf77] run_delalloc_nocow+0x3e7/0x8c0 [btrfs]
[ 2501.142281] [a022c749] run_delalloc_range+0x2f9/0x360 [btrfs]
[ 2501.142331] [a024919d] __extent_writepage+0x61d/0x760 [btrfs]
[ 2501.142366] [81165f9f] ? kmem_cache_free+0x2f/0x110
[ 2501.142412] [a02495aa] extent_write_cache_pages.isra.25.constprop.39+0x2ca/0x3f0 [btrfs]
[ 2501.142477] [a0249915] extent_writepages+0x45/0x60 [btrfs]
[ 2501.142524] [a0228980] ? btrfs_submit_direct+0x640/0x640 [btrfs]
[ 2501.142570] [a0226d08] btrfs_writepages+0x28/0x30 [btrfs]
[ 2501.142604] [81125b41] do_writepages+0x21/0x40
[ 2501.142635] [8111b18b] __filemap_fdatawrite_range+0x5b/0x60
[ 2501.142669] [8111c05c] filemap_flush+0x1c/0x20
[ 2501.142713] [a02328b9] btrfs_start_delalloc_inodes+0xc9/0x1f0 [btrfs]
[ 2501.142763] [8107cc13] ? __wake_up+0x53/0x70
[ 2501.142806] [a02244bd] btrfs_commit_transaction+0x3bd/0xa60 [btrfs]
[ 2501.142856] [810737c0] ? add_wait_queue+0x60/0x60
[ 2501.142896] [a020982a] ? block_rsv_migrate_bytes+0x3a/0x50 [btrfs]
[ 2501.142946] [a0258106] btrfs_mksubvol+0x356/0x3a0 [btrfs]
[ 2501.142991] [a025827a] btrfs_ioctl_snap_create_transid+0x12a/0x190 [btrfs]
[ 2501.143053] [a0258336] btrfs_ioctl_snap_create+0x56/0x80 [btrfs]
[ 2501.143099] [a025a40d] btrfs_ioctl+0x44d/0x1320 [btrfs]
[ 2501.143134] [81140dd8] ? handle_mm_fault+0x1f8/0x310
[ 2501.143166] [81189dd2] ? do_filp_open+0x42/0xa0
[ 2501.143197] [8118be98] do_vfs_ioctl+0x98/0x550
[ 2501.143228] [81165f9f] ? kmem_cache_free+0x2f/0x110
[ 2501.143259] [8118c3e1] sys_ioctl+0x91/0xa0
[ 2501.143291] [8165fd29] system_call_fastpath+0x16/0x1b
[ 2501.143321] ---[ end trace 7d4c76238d6eae30 ]---
[ 2501.143350] BTRFS error (device sdc) in cow_file_range_inline:261: error 28
[ 2501.143381] btrfs is forced readonly
[ 2501.143407] BTRFS error (device sdc) in cow_file_range:871: error 28
[ 2501.143444] BTRFS error (device sdc) in run_delalloc_nocow:1333: error 28

btrfs filesystem df /data/osd.0/
Data: total=112.01GB, used=1.02MB
System, DUP: total=8.00MB, used=32.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=112.00GB, used=288.00KB
Metadata: total=8.00MB, used=0.00

e5:~$ df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/redundant-root   85G  3,9G   77G   5% /
udev                         32G  4,0K   32G   1% /dev
tmpfs                        13G  292K   13G   1% /run
none                        5,0M     0  5,0M   0% /run/lock
none                         32G     0   32G   0% /run/shm
/dev/sda1                   228M   68M  148M  32% /boot
/dev/sdc                    5,5T  1,7M  5,3T   1% /data/osd.0
/dev/sdd                    5,5T  1,7M  5,3T   1% /data/osd.1
/dev/sde                    5,5T  1,6M  5,3T   1% /data/osd.2
/dev/sdf                    5,5T  1,7M  5,3T   1% /data/osd.3

Ceph command.

mkcephfs -c /etc/ceph/ceph.conf -a
temp dir is /tmp/mkcephfs.2kN76CD9ut
preparing monmap in /tmp/mkcephfs.2kN76CD9ut/monmap
/usr/bin/monmaptool --create --clobber --add a 192.168.125.10:6789 --print /tmp/mkcephfs.2kN76CD9ut/monmap
/usr/bin/monmaptool: monmap file /tmp/mkcephfs.2kN76CD9ut/monmap
/usr/bin/monmaptool:
Re: [PATCH v3 0/3] Btrfs-progs: support get/reset device stats via ioctl
On Thu, 2012-05-17 at 10:44 +0200, Stefan Behrens wrote:
> On 05/16/2012 19:03, Andrei Popa wrote:
>> It would be nice if this function could show the file names affected by
>> errors, in case of a single, non-redundant drive, btrfs-progs should
>> show what files are affected by errors. Then, an admin could restore
>> only those files from backup.
>>
>> On Wed, 2012-05-16 at 18:50 +0200, Stefan Behrens wrote:
>>> "btrfs device stats" is used to retrieve and print the device stats.
>>> "btrfs device stats -z" is used to atomically retrieve, reset and
>>> print the stats.
>
> In case of disk errors, it is recommended to run scrub on that disk. It
> checks the in-use disk contents for errors, repairs errors where
> possible, and the scrub tool does print the paths and filenames of
> errored files into the kernel log.

In an automated script, or for the ordinary user, it would be easier to get the list of affected files from the output of the btrfs-progs scrub command than from the kernel log.
How can I create useful bug reports? Here are 50 lines of messages including Call Trace from a system crash.
Hi,

I am using btrfs at home on my root system because I want to be able to send useful bug reports when things go wrong.

And I have 3 questions:

What kernel should I be using?

And how do I create good bug reports? Is a Call Trace that I find in /var/log/messages enough, or do I need to install some debug packages and run some tools?

Can someone also tell me how to find device error counts? (like what ZFS's "zpool status" shows under the read and write columns, not scrub/checksums on data, but the device errors)

I am using version 3.4.0-rc7-1-default which I got using openSUSE KOTD. Is that a good choice? It would be convenient to use these openSUSE repositories. Someone in the #btrfs IRC channel told me to use this:
git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git

I managed to crash my system today with a USB stick that is defective. (dd to the direct device hangs and causes it to change device names, eg. from sdb to sdc.) I would like to properly report this so it can be fixed. A bad disk should not take down the system.

All I did was create a btrfs fs, dd a file onto the disk which caused an Input/Output error, and then try to run "btrfs scrub start" and "btrfs filesystem show" to try to find an error count like you would find in a "zpool status" using the ZFS file system.

Here are 50 lines from /var/log/messages, starting with what I assume is the first error.

May 17 10:30:01 peterlaptop kernel: [ 3119.537086] lost page write due to I/O error on sdb1
May 17 10:30:01 peterlaptop kernel: [ 3119.537100] lost page write due to I/O error on sdb1
May 17 10:30:01 peterlaptop kernel: [ 3119.537105] BTRFS error (device sdb1) in write_all_supers:2890: IO failure (1 errors while writing supers)
May 17 10:30:01 peterlaptop kernel: [ 3119.537108] btrfs: commit super ret -5
May 17 10:30:01 peterlaptop kernel: [ 3119.542081] ------------[ cut here ]------------
May 17 10:30:01 peterlaptop kernel: [ 3119.542135] WARNING: at /home/abuild/rpmbuild/BUILD/kernel-default-3.4.rc7/linux-3.4-rc7/fs/btrfs/extent-tree.c:124 btrfs_put_block_group+0x5a/0x60 [btrfs]()
May 17 10:30:01 peterlaptop kernel: [ 3119.542139] Hardware name: HP Compaq nx8220 (PY522ET#ABD)
May 17 10:30:01 peterlaptop kernel: [ 3119.542141] Modules linked in: loop nls_iso8859_1 nls_cp437 reiserfs minix hfs vfat fat usb_storage uas xt_tcpudp xt_pkttype xt_LOG xt_limit af_packet ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf fuse snd_intel8x0m snd_intel8x0 snd_ac97_codec sr_mod sg ppdev firewire_ohci iTCO_wdt iTCO_vendor_support ac97_bus snd_pcm ipw2200 snd_timer firewire_core cdrom hp_wmi tifm_7xx1 tifm_core parport_pc parport joydev pcmcia snd sparse_keymap libipw tg3 pcspkr irda wmi sdhci_pci serio_raw yenta_socket pcmcia_rsrc pcmcia_core sdhci microcode mmc_core video cfg80211 rfkill button ac soundcore crc_ccitt container battery snd_page_alloc lib80211 crc_itu_t autofs4 btrfs zlib_deflate usbhid hid a
May 17 10:30:01 peterlaptop kernel: ta_generic uhci_hcd ehci_hcd ata_piix ahci libahci radeon ttm drm_kms_helper libata rtc_cmos fan thermal drm i2c_algo_bit i2c_core processor thermal_sys hwmon usbcore usb_common
May 17 10:30:01 peterlaptop kernel: [ 3119.542221] Pid: 7558, comm: umount Tainted: G W 3.4.0-rc7-1-default #1
May 17 10:30:01 peterlaptop kernel: [ 3119.542224] Call Trace:
May 17 10:30:01 peterlaptop kernel: [ 3119.542239] [c0205359] try_stack_unwind+0x199/0x1b0
May 17 10:30:01 peterlaptop kernel: [ 3119.542248] [c02041d7] dump_trace+0x47/0xf0
May 17 10:30:01 peterlaptop kernel: [ 3119.542253] [c02053bb] show_trace_log_lvl+0x4b/0x60
May 17 10:30:01 peterlaptop kernel: [ 3119.542258] [c02053e8] show_trace+0x18/0x20
May 17 10:30:01 peterlaptop kernel: [ 3119.542264] [c06952f1] dump_stack+0x6d/0x72
May 17 10:30:01 peterlaptop kernel: [ 3119.542271] [c0231058] warn_slowpath_common+0x78/0xb0
May 17 10:30:01 peterlaptop kernel: [ 3119.542276] [c02310ab] warn_slowpath_null+0x1b/0x20
May 17 10:30:01 peterlaptop kernel: [ 3119.542295] [f82ab0ba] btrfs_put_block_group+0x5a/0x60 [btrfs]
May 17 10:30:01 peterlaptop kernel: [ 3119.542350] [f82b4dbe] btrfs_free_block_groups+0x7e/0x2f0 [btrfs]
May 17 10:30:01 peterlaptop kernel: [ 3119.542412] [f82c02a4] close_ctree+0x184/0x390 [btrfs]
May 17 10:30:01 peterlaptop kernel: [ 3119.542475] [c03236aa] generic_shutdown_super+0x4a/0xc0
May 17 10:30:01 peterlaptop kernel: [ 3119.542481] [c0323799] kill_anon_super+0x9/0x20
May 17 10:30:01 peterlaptop kernel: [ 3119.542498] [f829d3ac] btrfs_kill_super+0xc/0x70 [btrfs]
May 17 10:30:01 peterlaptop kernel: [ 3119.542516] [c0323214] deactivate_locked_super+0x44/0x70
May 17 10:30:01 peterlaptop kernel: [ 3119.542522] [c033abd9]
Re: Ceph on btrfs 3.4rc
Hi Josef,

somehow I still get the kernel BUG messages. I used your patch from the 16th against rc7.

-martin

On 16.05.2012 21:20, Josef Bacik wrote:
> Hrm ok so I finally got some time to try and debug it and let the test
> run a good long while (5 hours almost) and I couldn't hit either the
> original bug or the one you guys were hitting. So either my extra
> little bit of locking did the trick or I get to keep my "Worst
> reproducer ever" award. Can you guys give this one a whirl and if it
> panics send the entire dmesg since it should spit out a WARN_ON() to
> let me know what I thought was the problem was it. Thanks,

[ 2868.813236] ------------[ cut here ]------------
[ 2868.813297] kernel BUG at fs/btrfs/inode.c:2220!
[ 2868.813355] invalid opcode: [#2] SMP
[ 2868.813479] CPU 2
[ 2868.813516] Modules linked in: btrfs zlib_deflate libcrc32c ext2 bonding coretemp ghash_clmulni_intel aesni_intel cryptd aes_x86_64 microcode psmouse serio_raw sb_edac edac_core joydev mei(C) ses ioatdma enclosure mac_hid lp parport isci libsas scsi_transport_sas usbhid hid ixgbe igb megaraid_sas dca mdio
[ 2868.814871]
[ 2868.814925] Pid: 5325, comm: ceph-osd Tainted: G D C 3.4.0-rc7+ #10 Supermicro X9SRi/X9SRi
[ 2868.815108] RIP: 0010:[a02212f2] [a02212f2] btrfs_orphan_del+0xe2/0xf0 [btrfs]
[ 2868.815236] RSP: 0018:880296e89d18 EFLAGS: 00010282
[ 2868.815294] RAX: fffe RBX: 88101ef3c390 RCX: 00562497
[ 2868.815355] RDX: 00562496 RSI: 88101ef1 RDI: ea00407bc400
[ 2868.815416] RBP: 880296e89d58 R08: 60ef8fd0 R09: a01f8c6a
[ 2868.815476] R10: R11: 011d R12: 880fdf602790
[ 2868.815537] R13: 880fdf602400 R14: 0001 R15: 0001
[ 2868.815598] FS: 7f07d5512700() GS:88107fc4() knlGS:
[ 2868.815675] CS: 0010 DS: ES: CR0: 80050033
[ 2868.815734] CR2: 0ab16000 CR3: 00082a6b2000 CR4: 000407e0
[ 2868.815796] DR0: DR1: DR2:
[ 2868.815858] DR3: DR6: 0ff0 DR7: 0400
[ 2868.815920] Process ceph-osd (pid: 5325, threadinfo 880296e88000, task 8810170616e0)
[ 2868.815997] Stack:
[ 2868.816049] 0c07 88101ef12960 880296e89d38 88101ef12960
[ 2868.816262] 880fdf602400 88101ef3c390 880b4ce2f260
[ 2868.816485] 880296e89e08 a0225628 88101ef3c390
[ 2868.816694] Call Trace:
[ 2868.816755] [a0225628] btrfs_truncate+0x4d8/0x650 [btrfs]
[ 2868.816817] [81188afd] ? path_lookupat+0x6d/0x750
[ 2868.816880] [a0227021] btrfs_setattr+0xc1/0x1b0 [btrfs]
[ 2868.816940] [811955c3] notify_change+0x183/0x320
[ 2868.816998] [8117889e] do_truncate+0x5e/0xa0
[ 2868.817056] [81178a24] sys_truncate+0x144/0x1b0
[ 2868.817115] [8165fd29] system_call_fastpath+0x16/0x1b
[ 2868.817173] Code: e8 4c 8b 75 f0 4c 8b 7d f8 c9 c3 66 0f 1f 44 00 00 80 bb 60 fe ff ff 84 75 b4 eb ae 0f 1f 44 00 00 48 89 df e8 50 73 fe ff eb b8 0f 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec
[ 2868.819501] RIP [a02212f2] btrfs_orphan_del+0xe2/0xf0 [btrfs]
[ 2868.819602] RSP 880296e89d18
[ 2868.819703] ---[ end trace 94d17b770b376c84 ]---
[ 3249.857453] ------------[ cut here ]------------
[ 3249.857481] kernel BUG at fs/btrfs/inode.c:2220!
[ 3249.857506] invalid opcode: [#3] SMP
[ 3249.857534] CPU 0
[ 3249.857538] Modules linked in: btrfs zlib_deflate libcrc32c ext2 bonding coretemp ghash_clmulni_intel aesni_intel cryptd aes_x86_64 microcode psmouse serio_raw sb_edac edac_core joydev mei(C) ses ioatdma enclosure mac_hid lp parport isci libsas scsi_transport_sas usbhid hid ixgbe igb megaraid_sas dca mdio
[ 3249.857721]
[ 3249.857740] Pid: 5384, comm: ceph-osd Tainted: G D C 3.4.0-rc7+ #10 Supermicro X9SRi/X9SRi
[ 3249.857791] RIP: 0010:[a02212f2] [a02212f2] btrfs_orphan_del+0xe2/0xf0 [btrfs]
[ 3249.857847] RSP: 0018:880abe8b5d18 EFLAGS: 00010282
[ 3249.857873] RAX: fffe RBX: 8807eb8b6670 RCX: 0077a084
[ 3249.857902] RDX: 0077a083 RSI: 88101ee497e0 RDI: ea00407b9240
[ 3249.857931] RBP: 880abe8b5d58 R08: 60ef8fd0 R09: a01f8c6a
[ 3249.857959] R10: R11: 0153 R12: 880d56825390
[ 3249.857988] R13: 880d56825000 R14: 0001 R15: 0001
[ 3249.858017] FS: 7f06bd13b700() GS:88107fc0() knlGS:
[ 3249.858062] CS: 0010 DS: ES: CR0: 80050033
[ 3249.858088] CR2: 043d2000 CR3: 000e7ebe5000 CR4: 000407f0
[ 3249.858117] DR0: DR1: DR2:
[ 3249.858146] DR3: DR6: 0ff0 DR7:
Re: How can I create useful bug reports? Here are 50 lines of messages including Call Trace from a system crash.
On Thu, May 17, 2012 at 12:26:42PM +0200, Peter Maloney wrote:
> I am using btrfs at home on my root system because I want to be able to
> send useful bug reports when things go wrong.
>
> And I have 3 questions:
>
> What kernel should I be using?

One of:

 - josef's btrfs-next[1],
 - Chris's main repo[2], or
 - kernel.org mainline -rc kernels[3].

The latter two will generally be carrying identical btrfs code. The first one is rather more experimental.

> And how do I create good bug reports? Is a Call Trace that I find in
> /var/log/messages enough, or do I need to install some debug packages
> and run some tools?

If you have a backtrace in /var/log/messages, yes, that's a good start. Generally, state what you did to get the error, whether it's repeatable, what kernel version you're using, and any error messages you got. If there's extra info needed, whoever picks it up will ask.

> Can someone also tell me how to find device error counts? (like what
> ZFS's "zpool status" shows under the read and write columns, not
> scrub/checksums on data, but the device errors)

We don't have those right now -- Stefan Behrens posted a patch here yesterday to keep track of them. :)

> I am using version 3.4.0-rc7-1-default which I got using openSUSE KOTD.
> Is that a good choice? It would be convenient to use these openSUSE
> repositories.

Yes, that's reasonable.

> Someone in the #btrfs IRC channel told me to use this:
> git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git
>
> I managed to crash my system today with a USB stick that is defective.
> (dd to the direct device hangs and causes it to change device names,
> eg. from sdb to sdc.) I would like to properly report this so it can be
> fixed. A bad disk should not take down the system.

Proper error handling is an ongoing work. It's a lot better than it used to be (back in 2.6.32 days, if you ran out of space, the whole system could come down :) ), but there's still quite a few things left to deal with. USB is distinctly unreliable, and seems to cause more problems than most other block stacks right now.

Hugo.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git
[2] git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
[3] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- "There's more than one way to do it" is not a commandment. It ---
is a dire warning.
[PATCH 1/5] Btrfs: stop defrag the files automatically when doin readonly remount or umount
If we remount the fs to be readonly or umount it, we should not continue defragging the files, because
- the auto defragment will introduce lots of dirty pages, which breaks the rule of a readonly file system.
- it makes the remount/umount take longer.

Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 fs/btrfs/disk-io.c |   12 +++-
 fs/btrfs/file.c    |    3 ++-
 fs/btrfs/super.c   |    5 +
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 20196f4..9a571f7 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1529,6 +1529,9 @@ static int cleaner_kthread(void *arg)
 	do {
 		vfs_check_frozen(root->fs_info->sb, SB_FREEZE_WRITE);
 
+		if (!down_read_trylock(&root->fs_info->sb->s_umount))
+			goto skip;
+
 		if (!(root->fs_info->sb->s_flags & MS_RDONLY) &&
 		    mutex_trylock(&root->fs_info->cleaner_mutex)) {
 			btrfs_run_delayed_iputs(root);
@@ -1536,7 +1539,8 @@ static int cleaner_kthread(void *arg)
 			mutex_unlock(&root->fs_info->cleaner_mutex);
 			btrfs_run_defrag_inodes(root->fs_info);
 		}
-
+		up_read(&root->fs_info->sb->s_umount);
+skip:
 		if (!try_to_freeze()) {
 			set_current_state(TASK_INTERRUPTIBLE);
 			if (!kthread_should_stop())
@@ -3049,13 +3053,11 @@ int close_ctree(struct btrfs_root *root)
 
 	btrfs_scrub_cancel(root);
 
-	/* wait for any defraggers to finish */
-	wait_event(fs_info->transaction_wait,
-		   (atomic_read(&fs_info->defrag_running) == 0));
-
 	/* clear out the rbtree of defraggable inodes */
 	btrfs_run_defrag_inodes(fs_info);
 
+	BUG_ON(atomic_read(&fs_info->defrag_running));
+
 	/*
 	 * Here come 2 situations when btrfs is broken to flip readonly:
 	 *
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index d83260d..23364c1 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -230,7 +230,8 @@ int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info)
 		first_ino = defrag->ino + 1;
 		rb_erase(&defrag->rb_node, &fs_info->defrag_inodes);
 
-		if (btrfs_fs_closing(fs_info))
+		if (btrfs_fs_closing(fs_info) ||
+		    (fs_info->sb->s_flags & MS_RDONLY))
 			goto next_free;
 
 		spin_unlock(&fs_info->defrag_inodes_lock);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 84571d7..7deb00e 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1151,6 +1151,11 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 		ret = btrfs_commit_super(root);
 		if (ret)
 			goto restore;
+
+		/* clear out the rbtree of defraggable inodes */
+		btrfs_run_defrag_inodes(fs_info);
+
+		BUG_ON(atomic_read(&fs_info->defrag_running));
 	} else {
 		if (fs_info->fs_devices->rw_devices == 0)
 			ret = -EACCES;
--
1.7.6.5
[PATCH 2/5] Btrfs: count the chunks which will be relocated at first
The balance function should first count the chunks which will be relocated, and then relocate those chunks one by one.

Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 fs/btrfs/volumes.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 759d024..91da8a2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2580,7 +2580,7 @@ again:
 
 		chunk = btrfs_item_ptr(leaf, slot, struct btrfs_chunk);
 
-		if (!counting) {
+		if (counting) {
 			spin_lock(&fs_info->balance_lock);
 			bctl->stat.considered++;
 			spin_unlock(&fs_info->balance_lock);
--
1.7.6.5
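The count-first, relocate-second idea can be illustrated with a small user-space sketch. This is not the kernel's balance loop: struct demo_progress, demo_balance() and the matches[] array (standing in for the chunk filter) are hypothetical, and the placement of the "considered" counter in the counting pass follows what this patch proposes rather than any confirmed kernel behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical progress counters, loosely modelled on the balance stats. */
struct demo_progress {
	unsigned considered;	/* chunks looked at during the counting pass */
	unsigned expected;	/* chunks the filter selected for relocation */
	unsigned completed;	/* chunks "relocated" in the second pass */
};

/*
 * Two passes over the same chunk list: the first only counts, the second
 * does the work. matches[i] != 0 means chunk i passes the filter.
 */
void demo_balance(const int *matches, size_t n, struct demo_progress *p)
{
	assert(p != NULL);
	assert(matches != NULL || n == 0);

	for (int counting = 1; counting >= 0; counting--) {
		for (size_t i = 0; i < n; i++) {
			if (counting)
				p->considered++;

			if (!matches[i])
				continue;

			if (counting) {
				p->expected++;
				continue;
			}

			/* Stand-in for relocating one chunk. */
			p->completed++;
		}
	}
}
```

Because the totals are known before any relocation starts, a progress report can show "completed out of expected" from the first relocated chunk onwards.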
[PATCH 3/5] Btrfs: pause/recover the space balance when doing remount
Pause the space balance threads when remounting the fs to be readonly, and recover them when remounting it from r/o to r/w.

Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 fs/btrfs/super.c   |    9 -
 fs/btrfs/volumes.c |    8 +++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 7deb00e..ea17f0a 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1148,6 +1148,9 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 	if (*flags & MS_RDONLY) {
 		sb->s_flags |= MS_RDONLY;
 
+		/* pause restriper - we want to resume on remount to r/w */
+		btrfs_pause_balance(root->fs_info);
+
 		ret = btrfs_commit_super(root);
 		if (ret)
 			goto restore;
@@ -1174,7 +1177,10 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 		if (ret)
 			goto restore;
 
-		sb->s_flags &= ~MS_RDONLY;
+		if (sb->s_flags & MS_RDONLY) {
+			sb->s_flags &= ~MS_RDONLY;
+			btrfs_recover_balance(fs_info->tree_root);
+		}
 	}
 
 	return 0;
@@ -1190,6 +1196,7 @@ restore:
 	fs_info->alloc_start = old_alloc_start;
 	fs_info->thread_pool_size = old_thread_pool_size;
 	fs_info->metadata_ratio = old_metadata_ratio;
+
 	return ret;
 }
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 91da8a2..c536d52 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2833,7 +2833,13 @@ static int balance_kthread(void *data)
 	mutex_lock(&fs_info->volume_mutex);
 	mutex_lock(&fs_info->balance_mutex);
 
-	set_balance_control(bctl);
+	if (fs_info->balance_ctl) {
+		kfree(bctl);
+		bctl = fs_info->balance_ctl;
+		bctl->flags = bctl->flags | BTRFS_BALANCE_RESUME;
+	} else {
+		set_balance_control(bctl);
+	}
 
 	if (btrfs_test_opt(fs_info->tree_root, SKIP_BALANCE)) {
 		printk(KERN_INFO "btrfs: force skipping balance\n");
--
1.7.6.5
[PATCH 4/5] Btrfs: cancel the scrub when remounting a fs to ro
If the filesystem is remounted readonly, we should not run scrub.

Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 fs/btrfs/super.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index ea17f0a..817b3a7 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1151,6 +1151,8 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 		/* pause restriper - we want to resume on remount to r/w */
 		btrfs_pause_balance(root->fs_info);
 
+		btrfs_scrub_cancel(root);
+
 		ret = btrfs_commit_super(root);
 		if (ret)
 			goto restore;
--
1.7.6.5
[PATCH 5/5] Btrfs: fix memory leak in btrfs_pause_balance()
We forget to free fs_info->balance_ctl in btrfs_pause_balance() when umounting the fs.

Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 fs/btrfs/volumes.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c536d52..fd7fe80 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2937,6 +2937,9 @@ int btrfs_pause_balance(struct btrfs_fs_info *fs_info)
 		ret = -ENOTCONN;
 	}
 
+	if (btrfs_fs_closing(fs_info) && fs_info->balance_ctl)
+		unset_balance_control(fs_info);
+
 	mutex_unlock(&fs_info->balance_mutex);
 	return ret;
 }
--
1.7.6.5
[PATCH 1/2] Btrfs: do not resize a seeding device
Seeding devices are not supposed to change any more.

Signed-off-by: Liu Bo <liubo2...@cn.fujitsu.com>
---
 fs/btrfs/ioctl.c |    7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index f056469..ec2245d 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1303,6 +1303,13 @@ static noinline int btrfs_ioctl_resize(struct btrfs_root *root,
 		ret = -EINVAL;
 		goto out_free;
 	}
+	if (device->fs_devices && device->fs_devices->seeding) {
+		printk(KERN_INFO "btrfs: resizer unable to apply on "
+		       "seeding device %s\n", device->name);
+		ret = -EACCES;
+		goto out_free;
+	}
+
 	if (!strcmp(sizestr, "max"))
 		new_size = device->bdev->bd_inode->i_size;
 	else {
--
1.6.5.2
[PATCH 2/2] Btrfs: resize all devices when we dont assign a specific device id
This patch fixes two bugs:

When we do not assign a device id for the resizer,
- it will only take one device to resize, whereas it is supposed to apply to all available devices.
- it will take the 'id 1' device as default, and this will cause a bug as we may have removed the 'id 1' device from the filesystem.

After this patch, we can find all available devices by searching the chunk tree and resize them:

$ mkfs.btrfs /dev/sdb7
$ mount /dev/sdb7 /mnt/btrfs/
$ btrfs dev add /dev/sdb8 /mnt/btrfs/

$ btrfs fi resize -100m /mnt/btrfs/
then we can get from dmesg:
btrfs: new size for /dev/sdb7 is 980844544
btrfs: new size for /dev/sdb8 is 980844544

$ btrfs fi resize max /mnt/btrfs
then we can get from dmesg:
btrfs: new size for /dev/sdb7 is 1085702144
btrfs: new size for /dev/sdb8 is 1085702144

$ btrfs fi resize 1:-100m /mnt/btrfs
then we can get from dmesg:
btrfs: resizing devid 1
btrfs: new size for /dev/sdb7 is 980844544

$ btrfs fi resize 2:-100m /mnt/btrfs
then we can get from dmesg:
btrfs: resizing devid 2
btrfs: new size for /dev/sdb8 is 980844544

Signed-off-by: Liu Bo <liubo2...@cn.fujitsu.com>
---
 fs/btrfs/ioctl.c |  101 --
 1 files changed, 83 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index ec2245d..d9a4fa8 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1250,12 +1250,51 @@ out_ra:
 	return ret;
 }
 
+static struct btrfs_device *get_avail_device(struct btrfs_root *root, u64 devid)
+{
+	struct btrfs_key key;
+	struct btrfs_path *path;
+	struct btrfs_dev_item *dev_item;
+	struct btrfs_device *device = NULL;
+	int ret;
+
+	path = btrfs_alloc_path();
+	if (!path)
+		return ERR_PTR(-ENOMEM);
+
+	key.objectid = BTRFS_DEV_ITEMS_OBJECTID;
+	key.offset = devid;
+	key.type = BTRFS_DEV_ITEM_KEY;
+
+	ret = btrfs_search_slot(NULL, root->fs_info->chunk_root, &key,
+				path, 0, 0);
+	if (ret < 0) {
+		device = ERR_PTR(ret);
+		goto out;
+	}
+	btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
+	if (key.objectid != BTRFS_DEV_ITEMS_OBJECTID ||
+	    key.type != BTRFS_DEV_ITEM_KEY) {
+		device = NULL;
+		goto out;
+	}
+	dev_item = btrfs_item_ptr(path->nodes[0], path->slots[0],
+				  struct btrfs_dev_item);
+	devid = btrfs_device_id(path->nodes[0], dev_item);
+
+	device = btrfs_find_device(root, devid, NULL, NULL);
+out:
+	btrfs_free_path(path);
+	return device;
+}
+
 static noinline int btrfs_ioctl_resize(struct btrfs_root *root,
 					void __user *arg)
 {
-	u64 new_size;
+	u64 new_size = 0;
 	u64 old_size;
-	u64 devid = 1;
+	u64 orig_new_size = 0;
+	u64 devid = (-1ULL);
 	struct btrfs_ioctl_vol_args *vol_args;
 	struct btrfs_trans_handle *trans;
 	struct btrfs_device *device = NULL;
@@ -1263,6 +1302,8 @@ static noinline int btrfs_ioctl_resize(struct btrfs_root *root,
 	char *devstr = NULL;
 	int ret = 0;
 	int mod = 0;
+	int scan_all = 1;
+	int use_max = 0;
 
 	if (root->fs_info->sb->s_flags & MS_RDONLY)
 		return -EROFS;
@@ -1295,8 +1336,31 @@ static noinline int btrfs_ioctl_resize(struct btrfs_root *root,
 		devid = simple_strtoull(devstr, &end, 10);
 		printk(KERN_INFO "btrfs: resizing devid %llu\n",
 		       (unsigned long long)devid);
+		scan_all = 0;
 	}
-	device = btrfs_find_device(root, devid, NULL, NULL);
+
+	if (!strcmp(sizestr, "max")) {
+		use_max = 1;
+	} else {
+		if (sizestr[0] == '-') {
+			mod = -1;
+			sizestr++;
+		} else if (sizestr[0] == '+') {
+			mod = 1;
+			sizestr++;
+		}
+		orig_new_size = memparse(sizestr, NULL);
+		if (orig_new_size == 0) {
+			ret = -EINVAL;
+			goto out_free;
+		}
+	}
+
+	if (devid < (-1ULL))
+		device = btrfs_find_device(root, devid, NULL, NULL);
+	else
+		device = get_avail_device(root, 0);
+again:
 	if (!device) {
 		printk(KERN_INFO "btrfs: resizer unable to find device %llu\n",
 		       (unsigned long long)devid);
@@ -1310,22 +1374,10 @@ static noinline int btrfs_ioctl_resize(struct btrfs_root *root,
 		goto out_free;
 	}
 
-	if (!strcmp(sizestr, "max"))
+	if (use_max)
 		new_size = device->bdev->bd_inode->i_size;
-	else {
-		if (sizestr[0] == '-') {
-			mod = -1;
-			sizestr++;
-		} else if (sizestr[0] == '+') {
-
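The argument handling that this patch moves ahead of the device lookup can be tried out in user space. The sketch below is an approximation under stated assumptions, not the ioctl code: demo_parse_resize(), struct demo_resize and DEMO_ALL_DEVICES are hypothetical names, strtoull() with a hand-written k/m/g suffix check stands in for the kernel's memparse(), and overflow of very large sizes is not handled.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Sentinel meaning "no devid: prefix given, apply to every device". */
#define DEMO_ALL_DEVICES (~0ULL)

struct demo_resize {
	unsigned long long devid;	/* DEMO_ALL_DEVICES or the given id */
	int use_max;			/* 1 when the size string is "max" */
	int mod;			/* -1 shrink by, +1 grow by, 0 absolute */
	unsigned long long bytes;	/* parsed amount; 0 when use_max */
};

/* Parse "<digits>[k|m|g]"; rejects empty, zero and malformed input. */
static int demo_parse_size(const char *s, unsigned long long *out)
{
	char *end;
	unsigned long long v;
	unsigned shift = 0;

	if (*s < '0' || *s > '9')
		return -EINVAL;

	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno)
		return -EINVAL;

	switch (*end) {
	case 'k': case 'K': shift = 10; break;
	case 'm': case 'M': shift = 20; break;
	case 'g': case 'G': shift = 30; break;
	default: break;
	}
	if (shift) {
		v <<= shift;
		end++;
	}

	if (*end != '\0' || v == 0)
		return -EINVAL;

	*out = v;
	return 0;
}

/* Parse "[devid:]max" or "[devid:][+|-]size". Returns 0 or -EINVAL. */
int demo_parse_resize(const char *arg, struct demo_resize *r)
{
	const char *colon;
	const char *sizestr = arg;

	assert(arg != NULL && r != NULL);

	r->devid = DEMO_ALL_DEVICES;
	r->use_max = 0;
	r->mod = 0;
	r->bytes = 0;

	colon = strchr(arg, ':');
	if (colon) {
		char *end;

		if (arg[0] < '0' || arg[0] > '9')
			return -EINVAL;
		r->devid = strtoull(arg, &end, 10);
		if (end != colon)
			return -EINVAL;
		sizestr = colon + 1;
	}

	if (strcmp(sizestr, "max") == 0) {
		r->use_max = 1;
		return 0;
	}

	if (*sizestr == '-') {
		r->mod = -1;
		sizestr++;
	} else if (*sizestr == '+') {
		r->mod = 1;
		sizestr++;
	}

	return demo_parse_size(sizestr, &r->bytes);
}
```

Parsing the size once, before any device is looked up, is what lets the same request be applied to each device in turn when no devid is given.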
Re: btrfs: initial readahead code and prototypes
On Thu, May 17, 2012 at 03:31:50PM +0200, Arne Jansen wrote:
> The assumption here is that if err == 0, eb is always != NULL.
> There's even a tiny comment above the function stating this:
>
> 107 /* in case of err, eb might be NULL */

Ah, right. I missed the comment.

> This code changes significantly with the patch
> btrfs: extend readahead interface
> Where it is written in a more obvious way.

Cool.

regards,
dan carpenter
Re: [PATCH 1/5] btrfs: extend readahead interface
On 05/09/12 16:48, David Sterba wrote:
> On Thu, Apr 12, 2012 at 05:54:38PM +0200, Arne Jansen wrote:
> > @@ -97,30 +119,87 @@ struct reada_machine_work {
> > +/*
> > + * this is the default callback for readahead. It just descends into the
> > + * tree within the range given at creation. if an error occurs, just cut
> > + * this part of the tree
> > + */
> > +static void readahead_descend(struct btrfs_root *root, struct reada_control *rc,
> > +			      u64 wanted_generation, struct extent_buffer *eb,
> > +			      u64 start, int err, struct btrfs_key *top,
> > +			      void *ctx)
> > +{
> > +	int nritems;
> > +	u64 generation;
> > +	int level;
> > +	int i;
> > +
> > +	BUG_ON(err == -EAGAIN); /* FIXME: not yet implemented, don't cancel
> > +				 * readahead with default callback */
> > +
> > +	if (err || eb == NULL) {
> > +		/*
> > +		 * this is the error case, the extent buffer has not been
> > +		 * read correctly. We won't access anything from it and
> > +		 * just cleanup our data structures. Effectively this will
> > +		 * cut the branch below this node from read ahead.
> > +		 */
> > +		return;
> > +	}
> > +
> > +	level = btrfs_header_level(eb);
> > +	if (level == 0) {
> > +		/*
> > +		 * if this is a leaf, ignore the content.
> > +		 */
> > +		return;
> > +	}
> > +
> > +	nritems = btrfs_header_nritems(eb);
> > +	generation = btrfs_header_generation(eb);
> > +
> > +	/*
> > +	 * if the generation doesn't match, just ignore this node.
> > +	 * This will cut off a branch from prefetch. Alternatively one could
> > +	 * start a new (sub-) prefetch for this branch, starting again from
> > +	 * root.
> > +	 */
> > +	if (wanted_generation != generation)
> > +		return;
>
> I think I saw passing wanted_generation = 0 somewhere, but cannot find
> it now again. Is it an expected value for the default RA callback,
> meaning eg. 'any generation I find' ?

No. This here is just the default callback. You've seen
wanted_generation = 0 in the droptree code, where a custom callback is
set that doesn't check the generation.

> > +
> > +	for (i = 0; i < nritems; i++) {
> > +		u64 n_gen;
> > +		struct btrfs_key key;
> > +		struct btrfs_key next_key;
> > +		u64 bytenr;
> > +
> > +		btrfs_node_key_to_cpu(eb, &key, i);
> > +		if (i + 1 < nritems)
> > +			btrfs_node_key_to_cpu(eb, &next_key, i + 1);
> > +		else
> > +			next_key = *top;
> > +		bytenr = btrfs_node_blockptr(eb, i);
> > +		n_gen = btrfs_node_ptr_generation(eb, i);
> > +
> > +		if (btrfs_comp_cpu_keys(&key, &rc->key_end) < 0 &&
> > +		    btrfs_comp_cpu_keys(&next_key, &rc->key_start) > 0)
> > +			reada_add_block(rc, bytenr, &next_key,
> > +					level - 1, n_gen, ctx);
> > +	}
> > +}
> > @@ -142,65 +221,21 @@ static int __readahead_hook(struct btrfs_root *root, struct extent_buffer *eb,
> >  	re->scheduled_for = NULL;
> >  	spin_unlock(&re->lock);
> >  
> > -	if (err == 0) {
> > -		nritems = level ? btrfs_header_nritems(eb) : 0;
> > -		generation = btrfs_header_generation(eb);
> > -		/*
> > -		 * FIXME: currently we just set nritems to 0 if this is a leaf,
> > -		 * effectively ignoring the content. In a next step we could
> > -		 * trigger more readahead depending from the content, e.g.
> > -		 * fetch the checksums for the extents in the leaf.
> > -		 */
> > -	} else {
> > +	/*
> > +	 * call hooks for all registered readaheads
> > +	 */
> > +	list_for_each_entry(rec, &list, list) {
> > +		btrfs_tree_read_lock(eb);
> >  		/*
> > -		 * this is the error case, the extent buffer has not been
> > -		 * read correctly. We won't access anything from it and
> > -		 * just cleanup our data structures. Effectively this will
> > -		 * cut the branch below this node from read ahead.
> > +		 * we set the lock to blocking, as the callback might want to
> > +		 * sleep on allocations.
>
> What about a more finer control given to the callbacks? The blocking
> lock may be unnecessary if the callback does not sleep.

I thought about that, but it would add a bit more complexity. So I
decided for the simpler version in the first run. There is definitely
room for optimization here.

> My idea is to add a field to 'struct reada_uptodate_ctx', preset with
> BTRFS_READ_LOCK by default, but let the RA user to set it to its
> needs.

The struct is only used in the special case the extent is already
uptodate, to pass the parameters to the worker in this case. The user
has no influence on that.
It could either be stored per request in struct reada_extctl or per
reada in struct reada_control. But this would also not be optimal.
There better way
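The descent rule in the quoted readahead_descend() can be modelled in a few lines. This is a rough userspace sketch, not the kernel function: keys are plain `(objectid, type, offset)` tuples compared lexicographically, which is how btrfs_comp_cpu_keys() orders them, and it assumes the first comparison in the patch reads `< 0` and the second `> 0`. `children_to_read` is a hypothetical name.

```python
from typing import List, Tuple

# (objectid, type, offset); Python compares tuples field by field, which
# matches the ordering btrfs_comp_cpu_keys() gives for btrfs keys.
Key = Tuple[int, int, int]


def children_to_read(node_keys: List[Key], top: Key,
                     key_start: Key, key_end: Key) -> List[int]:
    """Return the slots of an inner node whose subtrees may hold wanted keys.

    Slot i covers keys from node_keys[i] up to the key of the next slot;
    the last slot is bounded by `top`, the upper bound inherited from the
    parent node.
    """
    wanted = []
    for i, key in enumerate(node_keys):
        next_key = node_keys[i + 1] if i + 1 < len(node_keys) else top
        # Descend only if the child's key range overlaps the wanted range.
        if key < key_end and next_key > key_start:
            wanted.append(i)
    return wanted


if __name__ == "__main__":
    keys = [(256, 1, 0), (300, 1, 0), (400, 1, 0)]
    print(children_to_read(keys, (500, 1, 0), (310, 0, 0), (350, 0, 0)))
```

Children entirely below `key_start` or entirely above `key_end` are skipped, which is what keeps the default callback inside the range given when the readahead was created.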
Btrfs storage advice
Hi btrfs list,

I am looking for some counsel regarding how to best (and most safely)
utilize extra space on my btrfs installation.

I set up a btrfs installation about 6 months ago. I wanted to test the
system while waiting for mainline acceptance and support. The machine
being used has 13 1TB drives: 12 as a btrfs collection (stripe data,
mirror metadata) and 1 ext4 as a system drive. We are running kernel
3.2.0-rc4. I know that it is not the latest, but it has been extremely
stable for our needs.

Currently the system holds backup files. 2 other filesystems are nfs
mounts on the machine and backups are created by rsyncing these mounts
onto btrfs. The btrfs copies are also snapshotted, so 2 copies exist of
backup data.

I have added the output of btrfs fi show and btrfs fi df below so you
can see the layout, as well as a standard df -h. As will be readily
apparent, my nfs disks are approaching storage limits. Due to financial
constraints I must use the space on the btrfs system for nfs storage.

My first thought is to take 3 or 4T as a subvol and export it as nfs. I
have not heard of anyone else exporting btrfs, is it possible?

Next idea is to split several drives off the btrfs system. I have
removed drives and replaced them as experiments with the fs but had much
less data on them when I was trying that. I have read many times on the
list about size issues with btrfs, and filesystems reporting full when
they were far from it. As my system has been very stable just r/w data
and creating and removing subvols, I am reluctant to change the disk
layout, but we will do what we have to.

Also, if I split disks out they could be mirrored, like our other nfs
systems. However, I can stand a small amount of filesystem downtime.
Therefore to maximize space we may look at not mirroring the segment but
just mount a backup snapshot if a main fs drive goes out.

Final question is what about backup space. Regardless of how I structure
the new storage segment, it will need to be backed up with the rest of
the system. Once again, I am between maximizing available storage and
leaving breathing room for btrfs. As I currently back up over 4T on
btrfs, perhaps I should only allocate 2T for new storage, thus creating
2T storage, 6+T backup and 1+T breathing room.

I am not in a panic situation, but I will need to create the new storage
over the next 2 months. I would really appreciate any feedback and
comments concerning this operation.

Thanks in advance.

Jim Maloney

[root@btrfs ~]# btrfs fi show
failed to read /dev/sr0
Label: none  uuid: c21f1221-a224-4ba4-92e5-cdea0fa6d0f9
	Total devices 12 FS bytes used 4.62TB
	devid   12 size 930.99GB used 414.75GB path /dev/sdl
	devid   11 size 930.99GB used 414.75GB path /dev/sdk
	devid   10 size 930.99GB used 414.99GB path /dev/sdj
	devid    9 size 930.99GB used 414.99GB path /dev/sdi
	devid    5 size 930.99GB used 414.99GB path /dev/sde
	devid    2 size 930.99GB used 414.74GB path /dev/sdb
	devid    1 size 930.99GB used 414.76GB path /dev/sda
	devid    7 size 930.99GB used 414.99GB path /dev/sdg
	devid    3 size 930.99GB used 414.74GB path /dev/sdc
	devid    4 size 930.99GB used 414.74GB path /dev/sdd
	devid    6 size 930.99GB used 414.99GB path /dev/sdf
	devid    8 size 930.99GB used 414.99GB path /dev/sdh

[root@btrfs ~]# btrfs fi df /btrfs
Data, RAID0: total=4.54TB, used=4.50TB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=324.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=164.25GB, used=122.97GB
Metadata: total=8.00MB, used=0.00

[root@btrfs ~]# df -h
Filesystem             Size  Used Avail Use% Mounted on
/dev/sdm2              196G   49G  138G  26% /
tmpfs                   16G     0   16G   0% /dev/shm
/dev/sdm1              2.0G  137M  1.8G   8% /boot
/dev/sdm5              1.2T   19G  1.1T   2% /var
/dev/sda                11T  4.8T  6.1T  44% /btrfs
10.2.0.42:/data/sites  2.6T  2.1T  388G  85% /nfs2/data/sites
10.2.0.40:/data/sites  2.6T  2.3T  218G  92% /nfs1/data/sites
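As a rough cross-check of the figures in the mail above, the per-profile totals from btrfs fi df can be added up and compared with the per-device "used" column from btrfs fi show. This is a back-of-the-envelope sketch under two stated assumptions: the tool's "GB"/"TB" are binary units, and each RAID1 chunk occupies twice its logical size on disk while RAID0 and single chunks occupy it once. The numbers are copied from the mail; the function names are hypothetical.

```python
# Assumption: 1 TB = 1024 GB and 1 GB = 1024 MB in the tool's output.
GB_PER_TB = 1024.0
MB_PER_GB = 1024.0

# (raw copies per logical byte, logical GB allocated), one row per fi df line.
ALLOCATED = [
    (1, 4.54 * GB_PER_TB),  # Data, RAID0
    (1, 8 / MB_PER_GB),     # Data (single)
    (2, 8 / MB_PER_GB),     # System, RAID1
    (1, 4 / MB_PER_GB),     # System (single)
    (2, 164.25),            # Metadata, RAID1
    (1, 8 / MB_PER_GB),     # Metadata (single)
]

# "used" column of fi show for devid 1 through 12, in GB.
DEVICE_USED = [414.76, 414.74, 414.74, 414.74, 414.99, 414.99,
               414.99, 414.99, 414.99, 414.99, 414.75, 414.75]
DEVICE_SIZE = 930.99  # every device reports the same size, in GB


def raw_allocated_gb() -> float:
    """Raw disk space claimed by chunks, derived from the fi df totals."""
    return sum(copies * size for copies, size in ALLOCATED)


def unallocated_gb() -> float:
    """Raw space not yet handed out to any chunk, across all devices."""
    return DEVICE_SIZE * len(DEVICE_USED) - sum(DEVICE_USED)


if __name__ == "__main__":
    print(f"allocated per fi df:   {raw_allocated_gb():9.2f} GB")
    print(f"allocated per fi show: {sum(DEVICE_USED):9.2f} GB")
    print(f"unallocated:           {unallocated_gb() / GB_PER_TB:9.2f} TB")
```

The two allocation figures agree to within the rounding of the 4.54TB data total, and roughly 6 TB of raw space remains unallocated, in line with the "Avail" column df -h reports for /btrfs. Because data is striped without redundancy, that raw figure is also close to the usable figure; it would roughly halve for anything stored mirrored.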
Re: Ceph on btrfs 3.4rc
On Thu, May 17, 2012 at 12:29:32PM +0200, Martin Mailand wrote:
> Hi Josef,
>
> somehow I still get the kernel Bug messages, I used your patch from the
> 16th against rc7.

Was there anything above those messages? There should have been a
WARN_ON() or something. If not that's fine, I just need to know one way
or the other so I can figure out what to do next. Thanks,

Josef
SSD format/mount parameters questions
For using SSDs:

Are there any format/mount parameters that should be set for using btrfs
on SSDs (other than the ssd mount option)?

General questions:

How long is the 'delay' for the delayed alloc?

Are file allocations aligned to 4 KiB boundaries, or larger?

What byte value is used to pad unused space?
(Aside: For some, the erased state reads all 0x00, and for others the
erased state reads all 0xff.)

Background:

I've got a mix of various 120/128GB SSDs to newly set up. I will be
using ext4 on the critical ones, but also wish to compare with btrfs...

The mix includes some SSDs with the Sandforce controller that implements
its own data compression and data deduplication. How well does btrfs fit
with those compared to other non-data-compression controllers?

Regards,
Martin
Re: [PATCH 2/5] Btrfs: count the chunks which will be relocated at first
On Thu, May 17, 2012 at 07:56:53PM +0800, Miao Xie wrote:
> the balance function should count the chunks which will be relocated at
> first, and then relocate those chunks one by one.
>
> Signed-off-by: Miao Xie mi...@cn.fujitsu.com
> ---
>  fs/btrfs/volumes.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 759d024..91da8a2 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -2580,7 +2580,7 @@ again:
>  
>  		chunk = btrfs_item_ptr(leaf, slot, struct btrfs_chunk);
>  
> -		if (!counting) {
> +		if (counting) {
>  			spin_lock(&fs_info->balance_lock);
>  			bctl->stat.considered++;
>  			spin_unlock(&fs_info->balance_lock);

__btrfs_balance() already calculates the approximate number of chunks
that will be relocated and stores that value in bctl->stat.expected.
The stat.considered counter OTOH is supposed to reflect the number of
chunks processed through balance filters and it is meant to be updated
at relocation pass, so AFAICS if (!counting) is the right test.

What exactly are you trying to fix here ?

Thanks,

Ilya
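The difference between the two counters discussed above can be shown with a toy two-pass loop. This is a simplified model, not __btrfs_balance() itself: `run_balance` and `should_balance` are hypothetical names, the `completed` field is an assumption added so the relocation pass has something to record, and real relocation is replaced by a counter increment.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class BalanceStat:
    expected: int = 0    # estimate gathered during the counting pass
    considered: int = 0  # chunks run through the filters while relocating
    completed: int = 0   # chunks relocated (assumed field, for illustration)


def run_balance(chunks: Iterable[int],
                should_balance: Callable[[int], bool]) -> BalanceStat:
    """Walk all chunks twice: first count candidates, then relocate them."""
    chunks = list(chunks)
    stat = BalanceStat()
    for counting in (True, False):
        for chunk in chunks:
            if not counting:
                # Updated only in the relocation pass, for every chunk seen,
                # so it grows while the balance is running.
                stat.considered += 1
            if not should_balance(chunk):
                continue
            if counting:
                stat.expected += 1   # known before any relocation starts
            else:
                stat.completed += 1  # stand-in for relocating the chunk
    return stat


if __name__ == "__main__":
    print(run_balance(range(10), lambda chunk: chunk % 2 == 0))
```

In this model `expected` is fixed once the first pass ends, while `considered` keeps moving during the second pass, which is the behaviour the `if (!counting)` test produces.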
Re: [PATCH 3/5] Btrfs: pause/recover the space balance when doing remount
On Thu, May 17, 2012 at 07:57:40PM +0800, Miao Xie wrote:
> pause the space balance threads when remounting the fs to be readonly,
> and recover it when remounting it from r/o to r/w
>
> Signed-off-by: Miao Xie mi...@cn.fujitsu.com
> ---
>  fs/btrfs/super.c   |    9 ++++++++-
>  fs/btrfs/volumes.c |    8 +++++++-
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 7deb00e..ea17f0a 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1148,6 +1148,9 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
>  	if (*flags & MS_RDONLY) {
>  		sb->s_flags |= MS_RDONLY;
>  
> +		/* pause restriper - we want to resume on remount to r/w */
> +		btrfs_pause_balance(root->fs_info);
> +
>  		ret = btrfs_commit_super(root);
>  		if (ret)
>  			goto restore;
> @@ -1174,7 +1177,10 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
>  		if (ret)
>  			goto restore;
>  
> -		sb->s_flags &= ~MS_RDONLY;
> +		if (sb->s_flags & MS_RDONLY) {
> +			sb->s_flags &= ~MS_RDONLY;
> +			btrfs_recover_balance(fs_info->tree_root);
> +		}
>  	}
>  
>  	return 0;
> @@ -1190,6 +1196,7 @@ restore:
>  	fs_info->alloc_start = old_alloc_start;
>  	fs_info->thread_pool_size = old_thread_pool_size;
>  	fs_info->metadata_ratio = old_metadata_ratio;
> +
>  	return ret;
>  }
>  
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 91da8a2..c536d52 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -2833,7 +2833,13 @@ static int balance_kthread(void *data)
>  	mutex_lock(&fs_info->volume_mutex);
>  	mutex_lock(&fs_info->balance_mutex);
>  
> -	set_balance_control(bctl);
> +	if (fs_info->balance_ctl) {
> +		kfree(bctl);
> +		bctl = fs_info->balance_ctl;
> +		bctl->flags = bctl->flags | BTRFS_BALANCE_RESUME;
> +	} else {
> +		set_balance_control(bctl);
> +	}
>  
>  	if (btrfs_test_opt(fs_info->tree_root, SKIP_BALANCE)) {
>  		printk(KERN_INFO "btrfs: force skipping balance\n");

This is a known bug. There is a deeper problem here, related to the
fact that we restore balancing state not early enough and that we don't
restore it on ro mounts at all.

I have a patch in the works to fix that problem, and it also fixes this
one the right way. I'm backed up with other things right now, but I'll
post it as soon as I get a chance.

Thanks,

Ilya
Re: Ceph on btrfs 3.4rc
Hi Josef,

no there was nothing above. Here is another dmesg output.

> Was there anything above those messages? There should have been a
> WARN_ON() or something. If not that's fine, I just need to know one way
> or the other so I can figure out what to do next. Thanks,
>
> Josef

-martin

[   63.027277] Btrfs loaded
[   63.027485] device fsid 266726e1-439f-4d89-a374-7ef92d355daf devid 1 transid 4 /dev/sdc
[   63.027750] btrfs: setting nodatacow
[   63.027752] btrfs: enabling auto defrag
[   63.027753] btrfs: disk space caching is enabled
[   63.027754] btrfs flagging fs with big metadata feature
[   63.036347] device fsid 070e2c6c-2ea5-478d-bc07-7ce3a954e2e4 devid 1 transid 4 /dev/sdd
[   63.036624] btrfs: setting nodatacow
[   63.036626] btrfs: enabling auto defrag
[   63.036627] btrfs: disk space caching is enabled
[   63.036628] btrfs flagging fs with big metadata feature
[   63.045628] device fsid 6f7b82a9-a1b7-40c6-8b00-2c2a44481066 devid 1 transid 4 /dev/sde
[   63.045910] btrfs: setting nodatacow
[   63.045912] btrfs: enabling auto defrag
[   63.045913] btrfs: disk space caching is enabled
[   63.045914] btrfs flagging fs with big metadata feature
[   63.831278] device fsid 46890b76-45c2-4ea2-96ee-2ea88e29628b devid 1 transid 4 /dev/sdf
[   63.831577] btrfs: setting nodatacow
[   63.831579] btrfs: enabling auto defrag
[   63.831579] btrfs: disk space caching is enabled
[   63.831580] btrfs flagging fs with big metadata feature
[ 1521.820412] ------------[ cut here ]------------
[ 1521.820424] kernel BUG at fs/btrfs/inode.c:2220!
[ 1521.820433] invalid opcode: [#1] SMP
[ 1521.820448] CPU 4
[ 1521.820452] Modules linked in: btrfs zlib_deflate libcrc32c ext2 ses enclosure bonding coretemp ghash_clmulni_intel aesni_intel cryptd aes_x86_64 psmouse microcode serio_raw sb_edac edac_core mei(C) joydev ioatdma mac_hid lp parport isci libsas scsi_transport_sas usbhid hid ixgbe igb dca megaraid_sas mdio
[ 1521.820562]
[ 1521.820567] Pid: 3095, comm: ceph-osd Tainted: G C 3.4.0-rc7+ #10 Supermicro X9SRi/X9SRi
[ 1521.820591] RIP: 0010:[a02532f2] [a02532f2] btrfs_orphan_del+0xe2/0xf0 [btrfs]
[ 1521.820616] RSP: 0018:881013da9d18 EFLAGS: 00010282
[ 1521.820626] RAX: fffe RBX: 881013a3b7f0 RCX: 00395dcf
[ 1521.820640] RDX: 00395dce RSI: 88101df77480 RDI: ea004077ddc0
[ 1521.820654] RBP: 881013da9d58 R08: 60ef800010d0 R09: a022ac6a
[ 1521.820667] R10: R11: 010a R12: 88101e378790
[ 1521.820681] R13: 88101e378400 R14: 0001 R15: 0001
[ 1521.820695] FS: 7faa45d30700() GS:88107fc8() knlGS:
[ 1521.820710] CS: 0010 DS: ES: CR0: 80050033
[ 1521.820738] CR2: 7fe0efba6010 CR3: 001016fec000 CR4: 000407e0
[ 1521.820767] DR0: DR1: DR2:
[ 1521.820796] DR3: DR6: 0ff0 DR7: 0400
[ 1521.820825] Process ceph-osd (pid: 3095, threadinfo 881013da8000, task 881013da44a0)
[ 1521.820870] Stack:
[ 1521.820889] 0c05 88101df9c230 881013da9d38 88101df9c230
[ 1521.820939] 88101e378400 881013a3b7f0 880c6880f840
[ 1521.820988] 881013da9e08 a0257628 881013a3b7f0
[ 1521.821038] Call Trace:
[ 1521.821066] [a0257628] btrfs_truncate+0x4d8/0x650 [btrfs]
[ 1521.821096] [81188afd] ? path_lookupat+0x6d/0x750
[ 1521.821128] [a0259021] btrfs_setattr+0xc1/0x1b0 [btrfs]
[ 1521.821156] [811955c3] notify_change+0x183/0x320
[ 1521.821183] [8117889e] do_truncate+0x5e/0xa0
[ 1521.821209] [81178a24] sys_truncate+0x144/0x1b0
[ 1521.821237] [8165fd29] system_call_fastpath+0x16/0x1b
[ 1521.821265] Code: e8 4c 8b 75 f0 4c 8b 7d f8 c9 c3 66 0f 1f 44 00 00 80 bb 60 fe ff ff 84 75 b4 eb ae 0f 1f 44 00 00 48 89 df e8 50 73 fe ff eb b8 0f 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec
[ 1521.821458] RIP [a02532f2] btrfs_orphan_del+0xe2/0xf0 [btrfs]
[ 1521.821492] RSP 881013da9d18
[ 1521.821758] ---[ end trace aee4c5fe92ee2a67 ]---
[ 6888.637508] btrfs: truncated 1 orphans
[ 7641.701736] ------------[ cut here ]------------
[ 7641.701764] kernel BUG at fs/btrfs/inode.c:2220!
[ 7641.701789] invalid opcode: [#2] SMP
[ 7641.701816] CPU 3
[ 7641.701819] Modules linked in: btrfs zlib_deflate libcrc32c ext2 ses enclosure bonding coretemp ghash_clmulni_intel aesni_intel cryptd aes_x86_64 psmouse microcode serio_raw sb_edac edac_core mei(C) joydev ioatdma mac_hid lp parport isci libsas scsi_transport_sas usbhid hid ixgbe igb dca megaraid_sas mdio
[ 7641.702000]
[ 7641.702030] Pid: 3064, comm: ceph-osd Tainted: G D C 3.4.0-rc7+ #10 Supermicro X9SRi/X9SRi
[ 7641.702081] RIP: 0010:[a02532f2] [a02532f2]
Re: [PATCH 5/5] Btrfs: fix memory leak in btrfs_pause_balance()
On Thu, May 17, 2012 at 07:58:53PM +0800, Miao Xie wrote:
> We forget to free fs_info->balance_ctl in the btrfs_pause_balance() when
> umounting the fs.
>
> Signed-off-by: Miao Xie mi...@cn.fujitsu.com
> ---
>  fs/btrfs/volumes.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index c536d52..fd7fe80 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -2937,6 +2937,9 @@ int btrfs_pause_balance(struct btrfs_fs_info *fs_info)
>  		ret = -ENOTCONN;
>  	}
>  
> +	if (btrfs_fs_closing(fs_info) && fs_info->balance_ctl)
> +		unset_balance_control(fs_info);
> +
>  	mutex_unlock(&fs_info->balance_mutex);
>  	return ret;
>  }

It is kfree()'d in free_fs_info(), which should be called on unmount.
Am I missing something here ?

Thanks,

Ilya
Re: btrfs: initial readahead code and prototypes
On 05/17/12 15:46, Dan Carpenter wrote:
> On Thu, May 17, 2012 at 03:31:50PM +0200, Arne Jansen wrote:
> > The assumption here is that if err == 0, eb is always != NULL.
> > There's even a tiny comment above the function stating this:
> >
> > 107 /* in case of err, eb might be NULL */
>
> Ah, right. I missed the comment.

Thanks for doing this kind of sanity checking :)

-Arne

> > This code changes significantly with the patch
> > btrfs: extend readahead interface
> > Where it is written in a more obvious way.
>
> Cool.
>
> regards,
> dan carpenter
trim malfunction in linux 3.3.6
Hello.

I've been running the Ubuntu 12.04 kernel and btrfs on two partitions of
two GPT-partitioned SSDs. Rootfs was btrfs subvol @ and homes were at
@home.

When I was batch trimming with fstrim / using Ubuntu's standard kernel
3.2.0, everything was fine. Then I compiled a vanilla 3.3.6 kernel and
tried to fstrim again, and the fs got severely damaged. It seems that
batch trim miscalculates ranges and trims some occupied space. I can't
say if GPT or other partitioning details matter.

I will try to provide any info possible, but the fs is trimmed badly,
and I need this machine to be up and running, so I will have to
mkfs.btrfs again and use the 3.2.0 kernel.

Steps that caused corruption:

1. Created partitions on two (say /dev/sd[ab]) SSD drives with about 1G
   offset from the beginning (first partition is ext4 for /boot)
2. mkfs.btrfs /dev/sd[ab]2
3. created subvolumes @ and @home for mountpoints / and /home
   respectively
4. installed xubuntu 12.04
5. fstrim /
6. everything is ok
7. compiled and installed vanilla 3.3.6 kernel
8. reboot into 3.3.6
9. btrfs scrub - ok
10. fstrim /
11. fs got baaadly corrupted
Re: Ceph on btrfs 3.4rc
On Thu, May 17, 2012 at 05:12:55PM +0200, Martin Mailand wrote:
> Hi Josef,
> no there was nothing above. Here is another dmesg output.

Hrm ok give this a try and hopefully this is it, still couldn't
reproduce. Thanks,

Josef

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 3771b85..559e716 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -57,9 +57,6 @@ struct btrfs_inode {
 	/* used to order data wrt metadata */
 	struct btrfs_ordered_inode_tree ordered_tree;
 
-	/* for keeping track of orphaned inodes */
-	struct list_head i_orphan;
-
 	/* list of all the delalloc inodes in the FS. There are times we need
 	 * to write all the delalloc pages to disk, and this list is used
 	 * to walk them all.
@@ -153,6 +150,7 @@ struct btrfs_inode {
 	unsigned dummy_inode:1;
 	unsigned in_defrag:1;
 	unsigned delalloc_meta_reserved:1;
+	unsigned has_orphan_item:1;
 
 	/*
 	 * always compress this one file
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index ba8743b..72cdf98 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1375,7 +1375,7 @@ struct btrfs_root {
 	struct list_head root_list;
 
 	spinlock_t orphan_lock;
-	struct list_head orphan_list;
+	atomic_t orphan_inodes;
 	struct btrfs_block_rsv *orphan_block_rsv;
 	int orphan_item_inserted;
 	int orphan_cleanup_state;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 19f5b45..25dba7a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1153,7 +1153,6 @@ static void __setup_root(u32 nodesize, u32 leafsize, u32 sectorsize,
 	root->orphan_block_rsv = NULL;
 
 	INIT_LIST_HEAD(&root->dirty_list);
-	INIT_LIST_HEAD(&root->orphan_list);
 	INIT_LIST_HEAD(&root->root_list);
 	spin_lock_init(&root->orphan_lock);
 	spin_lock_init(&root->inode_lock);
@@ -1166,6 +1165,7 @@ static void __setup_root(u32 nodesize, u32 leafsize, u32 sectorsize,
 	atomic_set(&root->log_commit[0], 0);
 	atomic_set(&root->log_commit[1], 0);
 	atomic_set(&root->log_writers, 0);
+	atomic_set(&root->orphan_inodes, 0);
 	root->log_batch = 0;
 	root->log_transid = 0;
 	root->last_log_commit = 0;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 54ae3df..7cc1c96 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2104,12 +2104,12 @@ void btrfs_orphan_commit_root(struct btrfs_trans_handle *trans,
 	struct btrfs_block_rsv *block_rsv;
 	int ret;
 
-	if (!list_empty(&root->orphan_list) ||
+	if (atomic_read(&root->orphan_inodes) ||
 	    root->orphan_cleanup_state != ORPHAN_CLEANUP_DONE)
 		return;
 
 	spin_lock(&root->orphan_lock);
-	if (!list_empty(&root->orphan_list)) {
+	if (atomic_read(&root->orphan_inodes)) {
 		spin_unlock(&root->orphan_lock);
 		return;
 	}
@@ -2166,8 +2166,8 @@ int btrfs_orphan_add(struct btrfs_trans_handle *trans, struct inode *inode)
 		block_rsv = NULL;
 	}
 
-	if (list_empty(&BTRFS_I(inode)->i_orphan)) {
-		list_add(&BTRFS_I(inode)->i_orphan, &root->orphan_list);
+	if (!BTRFS_I(inode)->has_orphan_item) {
+		BTRFS_I(inode)->has_orphan_item = 1;
 #if 0
 		/*
 		 * For proper ENOSPC handling, we should do orphan
@@ -2180,6 +2180,7 @@ int btrfs_orphan_add(struct btrfs_trans_handle *trans, struct inode *inode)
 		insert = 1;
 #endif
 		insert = 1;
+		atomic_inc(&root->orphan_inodes);
 	}
 
 	if (!BTRFS_I(inode)->orphan_meta_reserved) {
@@ -2198,6 +2199,9 @@ int btrfs_orphan_add(struct btrfs_trans_handle *trans, struct inode *inode)
 	if (insert >= 1) {
 		ret = btrfs_insert_orphan_item(trans, root, btrfs_ino(inode));
 		if (ret && ret != -EEXIST) {
+			spin_lock(&root->orphan_lock);
+			BTRFS_I(inode)->has_orphan_item = 0;
+			spin_unlock(&root->orphan_lock);
 			btrfs_abort_transaction(trans, root, ret);
 			return ret;
 		}
@@ -2227,13 +2231,21 @@ int btrfs_orphan_del(struct btrfs_trans_handle *trans, struct inode *inode)
 	int release_rsv = 0;
 	int ret = 0;
 
+	/*
+	 * evict_inode gets called without holding the i_mutex so we need to
+	 * take the orphan lock to make sure we are safe in messing with these.
+	 */
 	spin_lock(&root->orphan_lock);
-	if (!list_empty(&BTRFS_I(inode)->i_orphan)) {
-		list_del_init(&BTRFS_I(inode)->i_orphan);
-		delete_item = 1;
+	if (BTRFS_I(inode)->has_orphan_item) {
+		if (trans) {
+			BTRFS_I(inode)->has_orphan_item = 0;
+			delete_item = 1;
+		} else {
+			WARN_ON(1);
+		}
 	}
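The bookkeeping change in the patch above, replacing the per-root orphan list with a per-inode flag plus a per-root counter, can be modelled in userspace as follows. This is a loose sketch and not the kernel code: a `threading.Lock` stands in for the orphan spinlock, a plain integer stands in for the atomic_t, and the counter decrement in `orphan_del` is an assumption, because the archived diff is cut off before that part.

```python
import threading


class Root:
    def __init__(self) -> None:
        self.orphan_lock = threading.Lock()  # stands in for the spinlock
        self.orphan_inodes = 0               # stands in for atomic_t


class Inode:
    def __init__(self, root: Root) -> None:
        self.root = root
        self.has_orphan_item = False


def orphan_add(inode: Inode) -> bool:
    """Mark the inode orphaned; True means an orphan item must be inserted."""
    root = inode.root
    with root.orphan_lock:
        if inode.has_orphan_item:
            return False  # already tracked, nothing to insert
        inode.has_orphan_item = True
        root.orphan_inodes += 1
        return True


def orphan_del(inode: Inode, have_trans: bool) -> bool:
    """Clear the flag; True means the on-disk orphan item should be deleted.

    Without a transaction the flag is left alone, mirroring the WARN_ON(1)
    branch in the patch.
    """
    root = inode.root
    with root.orphan_lock:
        if not inode.has_orphan_item or not have_trans:
            return False
        inode.has_orphan_item = False
        root.orphan_inodes -= 1  # assumed: not visible in the truncated diff
        return True
```

The point of the design is that the commit path only needs to know whether any orphans exist, which a counter answers without walking or locking a list, while the flag keeps add and delete idempotent per inode.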
Re: Ceph on btrfs 3.4rc
2012/5/17 Josef Bacik jo...@redhat.com:
> On Thu, May 17, 2012 at 05:12:55PM +0200, Martin Mailand wrote:
> > Hi Josef,
> > no there was nothing above. Here is another dmesg output.
>
> Hrm ok give this a try and hopefully this is it, still couldn't
> reproduce. Thanks,
>
> Josef

Well, I hate to say it, but the new patch doesn't seem to change much...

Regards,
Christian

[  123.507444] Btrfs loaded
[  202.683630] device fsid 2aa7531c-0e3c-4955-8542-6aed7ab8c1a2 devid 1 transid 4 /dev/sda
[  202.693704] btrfs: use lzo compression
[  202.697999] btrfs: enabling inode map caching
[  202.702989] btrfs: enabling auto defrag
[  202.707190] btrfs: disk space caching is enabled
[  202.712721] btrfs flagging fs with big metadata feature
[  207.839761] device fsid f81ff6a1-c333-4daf-989f-a28139f15f08 devid 1 transid 4 /dev/sdb
[  207.849681] btrfs: use lzo compression
[  207.853987] btrfs: enabling inode map caching
[  207.858970] btrfs: enabling auto defrag
[  207.863173] btrfs: disk space caching is enabled
[  207.868635] btrfs flagging fs with big metadata feature
[  210.857328] device fsid 9b905faa-f4fa-4626-9cae-2cd0287b30f7 devid 1 transid 4 /dev/sdc
[  210.867265] btrfs: use lzo compression
[  210.871560] btrfs: enabling inode map caching
[  210.876550] btrfs: enabling auto defrag
[  210.880757] btrfs: disk space caching is enabled
[  210.886228] btrfs flagging fs with big metadata feature
[  214.296287] device fsid f7990e4c-90b0-4691-9502-92b60538574a devid 1 transid 4 /dev/sdd
[  214.306510] btrfs: use lzo compression
[  214.310855] btrfs: enabling inode map caching
[  214.315905] btrfs: enabling auto defrag
[  214.320174] btrfs: disk space caching is enabled
[  214.325706] btrfs flagging fs with big metadata feature
[ 1337.937379] ------------[ cut here ]------------
[ 1337.942526] kernel BUG at fs/btrfs/inode.c:2224!
[ 1337.947671] invalid opcode: [#1] SMP
[ 1337.952255] CPU 5
[ 1337.954300] Modules linked in: btrfs zlib_deflate libcrc32c xfs exportfs sunrpc bonding ipv6 sg pcspkr serio_raw iTCO_wdt iTCO_vendor_support iomemory_vsl(PO) ixgbe dca mdio i7core_edac edac_core hpsa squashfs [last unloaded: scsi_wait_scan]
[ 1337.978570]
[ 1337.980230] Pid: 6812, comm: ceph-osd Tainted: P O 3.3.5-1.fits.1.el6.x86_64 #1 HP ProLiant DL180 G6
[ 1337.991592] RIP: 0010:[a035675c] [a035675c] btrfs_orphan_del+0x14c/0x150 [btrfs]
[ 1338.001897] RSP: 0018:8805e1171d38 EFLAGS: 00010282
[ 1338.007815] RAX: fffe RBX: 88061c3c8400 RCX: 00b37f48
[ 1338.015768] RDX: 00b37f47 RSI: 8805ec2a1cf0 RDI: ea0017b0a840
[ 1338.023724] RBP: 8805e1171d68 R08: 60f9d88028a0 R09: a033016a
[ 1338.031675] R10: R11: 0004 R12: 8805de7f57a0
[ 1338.039629] R13: 0001 R14: 0001 R15: 8805ec2a5280
[ 1338.047584] FS: 7f4bffc6e700() GS:8806272a() knlGS:
[ 1338.056600] CS: 0010 DS: ES: CR0: 80050033
[ 1338.063003] CR2: ff600400 CR3: 0005e34c3000 CR4: 06e0
[ 1338.070954] DR0: DR1: DR2:
[ 1338.078909] DR3: DR6: 0ff0 DR7: 0400
[ 1338.086865] Process ceph-osd (pid: 6812, threadinfo 8805e117, task 88060fa81940)
[ 1338.096268] Stack:
[ 1338.098509] 8805e1171d68 8805ec2a5280 88051235b920
[ 1338.106795] 88051235b920 0008 8805e1171e08 a036043c
[ 1338.115082] 00011000
[ 1338.123367] Call Trace:
[ 1338.126111] [a036043c] btrfs_truncate+0x5bc/0x640 [btrfs]
[ 1338.133213] [a03605b6] btrfs_setattr+0xf6/0x1a0 [btrfs]
[ 1338.140105] [811816fb] notify_change+0x18b/0x2b0
[ 1338.146320] [81276541] ? selinux_inode_permission+0xd1/0x130
[ 1338.153699] [81165f44] do_truncate+0x64/0xa0
[ 1338.159527] [81172669] ? inode_permission+0x49/0x100
[ 1338.166128] [81166197] sys_truncate+0x137/0x150
[ 1338.172244] [8158b1e9] system_call_fastpath+0x16/0x1b
[ 1338.178936] Code: 89 e7 e8 88 7d fe ff eb 89 66 0f 1f 44 00 00 be a4 08 00 00 48 c7 c7 59 49 3b a0 45 31 ed e8 5c 78 cf e0 45 31 f6 e9 30 ff ff ff 0f 0b eb fe 55 48 89 e5 48 83 ec 40 48 89 5d d8 4c 89 65 e0 4c
[ 1338.200623] RIP [a035675c] btrfs_orphan_del+0x14c/0x150 [btrfs]
[ 1338.208317] RSP 8805e1171d38
[ 1338.212681] ---[ end trace 86be14f0f863ea79 ]---
Re: Ceph on btrfs 3.4rc
Hi Josef,

I hit exactly the same bug as Christian with your last patch.

-martin
Re: [PATCH 2/5] Btrfs: count the chunks which will be relocated at first
On Thu, 17 May 2012 17:58:56 +0300, Ilya Dryomov wrote:
> On Thu, May 17, 2012 at 07:56:53PM +0800, Miao Xie wrote:
> > the balance function should count the chunks which will be relocated at
> > first, and then relocate those chunks one by one.
> >
> > Signed-off-by: Miao Xie mi...@cn.fujitsu.com
> > ---
> >  fs/btrfs/volumes.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index 759d024..91da8a2 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -2580,7 +2580,7 @@ again:
> >  
> >  		chunk = btrfs_item_ptr(leaf, slot, struct btrfs_chunk);
> >  
> > -		if (!counting) {
> > +		if (counting) {
> >  			spin_lock(&fs_info->balance_lock);
> >  			bctl->stat.considered++;
> >  			spin_unlock(&fs_info->balance_lock);
>
> __btrfs_balance() already calculates the approximate number of chunks
> that will be relocated and stores that value in bctl->stat.expected.
> The stat.considered counter OTOH is supposed to reflect the number of
> chunks processed through balance filters and it is meant to be updated
> at relocation pass, so AFAICS if (!counting) is the right test.
>
> What exactly are you trying to fix here ?

In fact, this number reflects the number of all the chunks that may be
relocated. Since we can know the approximate number of chunks that will
be relocated before the relocation starts, why can't we know this one at
the beginning as well?

Besides that, as a user I find it very strange that this counter changes
every time I query the status of the balance. It should be a fixed
number, since it reflects the number of all the chunks that may be
relocated.

Thanks
Miao
Re: [PATCH 5/5] Btrfs: fix memory leak in btrfs_pause_balance()
On Thu, 17 May 2012 18:20:04 +0300, Ilya Dryomov wrote:
> On Thu, May 17, 2012 at 07:58:53PM +0800, Miao Xie wrote:
> > We forget to free fs_info->balance_ctl in the btrfs_pause_balance() when
> > umounting the fs.
> >
> > Signed-off-by: Miao Xie mi...@cn.fujitsu.com
> > ---
> >  fs/btrfs/volumes.c |    3 +++
> >  1 files changed, 3 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index c536d52..fd7fe80 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -2937,6 +2937,9 @@ int btrfs_pause_balance(struct btrfs_fs_info *fs_info)
> >  		ret = -ENOTCONN;
> >  	}
> >  
> > +	if (btrfs_fs_closing(fs_info) && fs_info->balance_ctl)
> > +		unset_balance_control(fs_info);
> > +
> >  	mutex_unlock(&fs_info->balance_mutex);
> >  	return ret;
> >  }
>
> It is kfree()'d in free_fs_info(), which should be called on unmount.
> Am I missing something here ?

It is my mistake. Sorry.

BTW, I think freeing it in btrfs_pause_balance() is better because it
belongs to the balance code; otherwise readability will get worse.

Thanks
Miao