Re: [SOLVED] BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi guys!

After a week without experiencing the problem, I think we can mark it as solved. I want to thank all the devs on this list. You were always very helpful.

For anyone who is still experiencing the reported problem: upgrade to kernel 4.7.3 and I think you will be fine :)

Best regards and thank you all,
Ronan Arraes
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi Josef,

On Thu, 2016-09-22 at 13:49 -0400, Josef Bacik wrote:
> That patch fixed a problem where we would screw up the ENOSPC accounting, and would slowly leak space into one of the counters. So eventually (or often in your case) you'd hit ENOSPC, but have plenty of space available. If you unmounted and mounted again, or simply rebooted, everything would have been fine. You can still use the fs, the accounting is purely in memory so it's not like your FS is permanently screwed. Thanks,

Thank you very much for the explanation. I am very glad it is finally fixed here :)

Best regards,
Ronan Arraes
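As a rough illustration of the kind of leak Josef describes (this is not the kernel's code), the toy model below keeps an in-memory reservation counter, fails to release part of each reservation on a buggy path, and ends up reporting ENOSPC while very little is actually used. The class `ToySpaceInfo`, its `write()` and `remount()` methods, and the helper `writes_until_enospc()` are all hypothetical names invented for this sketch; the field names only loosely echo the `bytes_used` and `bytes_may_use` entries shown under /sys/fs/btrfs elsewhere in this thread.

```python
import errno
import os


class ToySpaceInfo:
    """Toy in-memory space accounting. Hypothetical, not a kernel structure."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.bytes_used = 0      # space really occupied
        self.bytes_may_use = 0   # space reserved for writes still in flight

    def write(self, reserved, copied, buggy=False):
        """Reserve `reserved` bytes, but only `copied` bytes end up written."""
        if self.bytes_used + self.bytes_may_use + reserved > self.total_bytes:
            raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
        self.bytes_may_use += reserved
        # The bytes that were really written move from "reserved" to "used".
        self.bytes_may_use -= copied
        self.bytes_used += copied
        if not buggy:
            # Correct path: give back the part of the reservation not needed.
            self.bytes_may_use -= reserved - copied
        # Buggy path: the unused part stays in bytes_may_use forever.

    def remount(self):
        """Simplification: a remount starts with fresh in-memory counters."""
        self.bytes_may_use = 0


def writes_until_enospc(fs, reserved, copied, buggy, limit=10000):
    """Count how many writes succeed before ENOSPC (or until `limit`)."""
    count = 0
    for _ in range(limit):
        try:
            fs.write(reserved, copied, buggy=buggy)
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise
            break
        count += 1
    return count
```

In this model the buggy path runs out of "space" after far fewer writes than the correct path, and `remount()` makes writes succeed again, which matches the observation in the thread that a reboot temporarily cleared the problem.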
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi Josef,

On Thu, 2016-09-22 at 10:39 -0400, Josef Bacik wrote:
> This is what fixed it. I thought it was in 4.7 which is why I started paying attention, but I guess I was wrong. Glad your problem is resolved. Thanks,

Do you have an explanation for why the problem fixed by the patch was causing the ENOSPC for me? Also, is it necessary to reformat my partition, or can I consider it good for use after installing the new kernel?

Best regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
On Thu, 2016-09-22 at 09:41 -0400, Austin S. Hemmelgarn wrote:
> Most likely the kernel upgrade fixed things. It's possible that the large allocation is impacting something and making it work, but I don't think that that is very likely.

The btrfs-related patches I could find in the kernel 4.7.2 and 4.7.3 changelogs are:

commit 8d32aaa89067225d4202a362dc201280e2514952
Author: Chris Mason
Date: Tue Jul 19 05:52:36 2016 -0700

    Btrfs: fix delalloc accounting after copy_from_user faults

commit f495a60eb6351bf2f29fdbc1854375df9fe4022b
Author: Paolo Valente
Date: Wed Jul 27 07:22:05 2016 +0200

    block: add missing group association in bio-cloning functions

    Fixes: da2f0f74cf7d ("Btrfs: add support for blkio controllers")

commit ff3235105fc7e4ecf04eb308940821d4a098c08d
Author: Jeff Mahoney
Date: Wed Aug 17 21:58:33 2016 -0400

    btrfs: don't create or leak aliased root while cleaning up orphans

commit 64563a38fde57a26f4d68d488d0d4918f843547c
Author: Jeff Mahoney
Date: Mon Aug 15 12:10:33 2016 -0400

    btrfs: properly track when rescan worker is running

commit 69b69167965e108a775ef20decabcc76fbe4fc08
Author: Jeff Mahoney
Date: Mon Aug 8 22:08:06 2016 -0400

    btrfs: waiting on qgroup rescan should not always be interruptible

Best regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Guys,

Something very strange happened. I have not seen the problem since Monday, which is pretty much the first time ever that I have worked for more than 3 days without seeing it. OK, it could be a coincidence. Note that I did not change anything about the way I work. However, I did do two things:

- Updated the kernel to 4.7.2; and
- Created 50 dummy files of 3.0 GiB each.

Can anyone please tell me whether either of these could be related to the problem going away?

Best regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi Chris,

On Wed, 2016-09-14 at 16:25 -0600, Chris Murphy wrote:
> All I can think of is the file system has gotten into a unique state through a combination of events. I'm still suspicious that qgroups is contributing to the problem even after being disabled. The workload you're talking about is completely ordinary and trivial.

This seems reasonable. However, I reformatted the computer and after two days, if I remember correctly, I started to see the problems again. I still think it may also be related to my HDD (7200 RPM). On all my other computers, which use SSDs, everything is fine.

> The openSUSE layout is basically impossible to backup and restore, there's astrometric tons of snapshots, there's no recursive btrfs send/receive to try and migrate it to a new file system intact, so you'd pretty much just have to reinstall it no matter what. If it were me, reinstall with Btrfs same as now, and first thing before anything else I'd disable quotas. Or yeah, it's completely reasonable for you to move to a different file system, it's really a coin toss for ext4 vs XFS, but at least XFS now checksums metadata and the journal by default so if I thought about it at the time of the installation I'd do that.

Thanks!

> Yeah FWIW, the devs seem to prefer the output from 'grep . -IR /sys/fs/btrfs//allocation/' so for these kinds of problems I'd report that.

Yeah, unfortunately I forgot this one today :(

> If you *really* want to, you could grab a Fedora Rawhide nightly that has kernel 4.8 rc6 on it, with debug stuff enabled. If it face plants, it should catch useful stuff for Josef. If it doesn't, maybe it fixes enough things that you can get back to work for a while longer until a long term fix becomes available. The only way to know for sure is to test it. But it's completely sane to just switch to XFS and get back to work also.
>
> Current
> https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20160914.n.0/compose/Everything/x86_64/iso/Fedora-Everything-netinst-x86_64-Rawhide-20160914.n.0.iso.n.0.iso
>
> Use 'dd if=ISO of=USBstick bs=256K' that will boot anything, BIOS or UEFI. At the menu, choose Troubleshooting, then the Rescue option, at the next text menu choose 3 to get to a shell. And from there you can mount with enospc_debug, and do a balance of the file system. To get logs off the system, use a 2nd USB stick, or if you have wired ethernet use scp, or if you know nmcli you can maybe get the wireless up by command line.

This seems good. However, I only have access to that machine during working hours, and I just do not have time to test this, sorry :(

Nevertheless, when you mentioned the `dd` command, I had an idea that can help me live with this problem until I have access to kernel 4.8. I will use `dd` to create, let's say, 100 files of 3 GiB each in my /home directory. Then, when I see ENOSPC, I will just need to delete some of these files. I think this should work.

Thanks for all the advice, Chris!

Best regards,
Ronan Arraes
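The placeholder-file idea above could be sketched roughly as follows. This is only an illustration under assumptions: the function names, the file prefix and the chunk size are invented here, and the sketch writes real zero bytes rather than sparse files so that blocks are actually allocated on a filesystem without compression. Whether deleting such files really clears this particular ENOSPC is not established by the sketch.

```python
import os

CHUNK = 1024 * 1024  # write in 1 MiB pieces to keep memory use small


def create_placeholders(directory, count, size_bytes, prefix="placeholder"):
    """Create `count` zero-filled files of `size_bytes` each; return their paths."""
    os.makedirs(directory, exist_ok=True)
    zeros = bytes(CHUNK)
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"{prefix}-{i:03d}.bin")
        remaining = size_bytes
        with open(path, "wb") as fh:
            while remaining > 0:
                n = min(CHUNK, remaining)
                fh.write(zeros[:n])
                remaining -= n
            fh.flush()
            os.fsync(fh.fileno())  # push the data to disk, not just the cache
        paths.append(path)
    return paths


def release_placeholders(paths, how_many):
    """Delete the first `how_many` placeholder files; return the removed paths."""
    removed = []
    for path in paths[:how_many]:
        os.remove(path)
        removed.append(path)
    return removed
```

For the numbers mentioned above, a call such as `create_placeholders("/home/user/placeholders", 100, 3 * 1024**3)` would be the equivalent of the planned `dd` runs; the directory name is a made-up example.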
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi guys,

The problem happened again, but this time it was far more serious. I was doing a big Tumbleweed update (4680 packages) and got ENOSPC during the update. To avoid being left with a broken system, as has already happened in the past, I unfortunately had to delete data that I really was not planning to delete. This is a disaster, because I have more than 1 TiB of **free space**.

After deleting 7 GiB of data, I could run a rebalance and the update finished successfully. However, the ENOSPC happened 3 more times (!) and each time I had to run a rebalance to keep the update going. Sometimes, during the rebalance, I saw this message:

[28736.688266] BTRFS info (device sda6): relocating block group 389998968832 flags 34
[28737.376302] BTRFS info (device sda6): found 4 extents
[28737.712815] BTRFS info (device sda6): relocating block group 343760961536 flags 36
[28738.010030] BTRFS info (device sda6): relocating block group 343224090624 flags 36
[28738.343461] BTRFS info (device sda6): relocating block group 342687219712 flags 36
[28738.660023] BTRFS info (device sda6): relocating block group 342150348800 flags 36
[28738.665241] use_block_rsv: 11 callbacks suppressed
[28738.665247] ------------[ cut here ]------------
[28738.665290] WARNING: CPU: 10 PID: 639 at ../fs/btrfs/extent-tree.c:8097 btrfs_alloc_tree_block+0x3f1/0x4c0 [btrfs]
[28738.665292] BTRFS: block rsv returned -28
[28738.665295] Modules linked in: dm_mod fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT nf_reject_ipv4 iptable_raw xt_CT snd_hda_codec_hdmi snd_hda_codec_realtek nvidia_drm(PO) snd_hda_codec_generic snd_hda_intel nvidia_modeset(PO) snd_hda_codec snd_hda_core snd_hwdep iptable_filter nvidia(PO) joydev drm_kms_helper intel_rapl drm fb_sys_fops iTCO_wdt mei_wdt syscopyarea snd_pcm snd_timer iTCO_vendor_support sysfillrect sb_edac snd i2c_i801 mei_me lpc_ich edac_core sysimgblt ip6table_mangle x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel soundcore mei aes_x86_64
[28738.665359] lrw gf128mul glue_helper ablk_helper cryptd e1000e hp_wmi ioatdma fjes nf_conntrack_netbios_ns ptp shpchp pps_core sparse_keymap pcspkr mfd_core nf_conntrack_broadcast rfkill tpm_infineon tpm_tis dca tpm nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables btrfs xor raid6_pq hid_generic usbhid crc32c_intel serio_raw xhci_pci ehci_pci sr_mod firewire_ohci xhci_hcd ehci_hcd cdrom firewire_core crc_itu_t usbcore isci usb_common libsas ata_generic mpt3sas raid_class scsi_transport_sas wmi button sg
[28738.665419] CPU: 10 PID: 639 Comm: systemd-journal Tainted: PW O 4.7.1-1-default #1
[28738.665421] Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
[28738.665425] 81393104 88080bc63a68
[28738.665430] 8107ca1e 8804eaa73300 88080bc63ab8 4000
[28738.665434] 88017be9a000 880f51b31760 8107ca8f
[28738.665438] Call Trace:
[28738.665464] [] dump_trace+0x5e/0x320
[28738.665472] [] show_stack_log_lvl+0x10c/0x180
[28738.665478] [] show_stack+0x21/0x40
[28738.665486] [] dump_stack+0x5c/0x78
[28738.665496] [] __warn+0xbe/0xe0
[28738.665503] [] warn_slowpath_fmt+0x4f/0x60
[28738.665529] [] btrfs_alloc_tree_block+0x3f1/0x4c0 [btrfs]
[28738.665560] [] btrfs_copy_root+0xf2/0x280 [btrfs]
[28738.665593] [] create_reloc_root+0x171/0x1e0 [btrfs]
[28738.665623] [] btrfs_init_reloc_root+0x8f/0xa0 [btrfs]
[28738.665652] [] record_root_in_trans+0xb2/0x110 [btrfs]
[28738.665679] [] btrfs_record_root_in_trans+0x41/0x70 [btrfs]
[28738.665704] [] start_transaction+0xa0/0x4f0 [btrfs]
[28738.665732] [] btrfs_dirty_inode+0x33/0xc0 [btrfs]
[28738.665741] [] file_update_time+0x99/0xf0
[28738.665770] [] btrfs_page_mkwrite+0xa3/0x450 [btrfs]
[28738.665779] [] do_page_mkwrite+0x69/0xc0
[28738.665785] [] handle_pte_fault+0xf4/0x1760
[28738.665792] [] handle_mm_fault+0x29e/0x5a0
[28738.665798] [] __do_page_fault+0x1e0/0x510
[28738.665809] [] page_fault+0x28/0x30
[28738.669296] DWARF2 unwinder stuck at page_fault+0x28/0x30
[28738.669300] Leftover inexact backtrace:
[28738.669327] ---[ end trace 8ef9cfba38cc9bfc ]---

Look at what happened to my METADATA during the update:

1) When the problem occurred:

# btrfs fi usage /
Overall:
    Device size: 1.26TiB
    Device allocated: 63.07GiB
    Device unallocated: 1.20TiB
    Device missing: 0.00B
    Used: 50.21GiB
    Free (estimated): 1.20TiB (min: 612.49GiB)
    Data ratio: 1.00
    Metadata ratio: 2.00
    Global reserve: 400.00MiB (used:
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi Josef,

On Tue, 2016-09-13 at 17:01 -0400, Josef Bacik wrote:
> I just started paying attention to this, the last kernel I saw you were running was 4.7. Have you tried a recent kernel, like chris's tree?
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus-4.8
>
> is what I would like you to try if not. Thanks,
>
> Josef

Unfortunately, since this is a production machine, I am not allowed to install unreleased kernels. If this is the only solution, I will need to wait for 4.8 or find out whether anyone has already backported the BTRFS patches to 4.7.

Best regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi guys,

I saw the problem once more. It is now starting to happen on a daily basis.

Unfortunately, the `enospc_debug` flag did not help. I did not see any new information in the logs. This time, only a hard reset worked. I could not even reboot using the GNOME panel.

Best regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi!

On Tue, 2016-09-13 at 11:17 +0800, Wang Xiaoguang wrote:
> It maybe a irrelevant question, but do you have compression enabled?
>
> Regards,
> Xiaoguang Wang

No, I do not have compression enabled. I'm using openSUSE's default configuration.

By the way, I had not actually been mounting the filesystem with `enospc_debug`: it turns out that I had modified the fstab in a backup directory, sorry :) Now I have done it correctly so, hopefully, we will have much more information about the problem the next time I see it!

Best regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi all!

On Mon, 2016-09-05 at 16:49 +0800, Qu Wenruo wrote:
> Just like what Wang has mentioned, would you please paste all the output of the contents of /sys/fs/btrfs//allocation?
>
> It's recommended to use "grep . -IR " to get all the data as it will show the file name.

So, I am seeing the problem once more. This time I was just using Firefox, and I cannot recover using `btrfs balance`. I think that, once again, I will need to reboot this machine. This problem is really causing me a lot of trouble :(

I have disabled quotas, and the first error message after the problem appeared was:

[ 2444.592255] ------------[ cut here ]------------
[ 2444.592314] WARNING: CPU: 4 PID: 289 at ../fs/btrfs/extent-tree.c:4303 btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
[ 2444.592317] Modules linked in: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw nvidia_drm(PO) ipt_REJECT nf_reject_ipv4 snd_hda_codec_hdmi nvidia_modeset(PO) intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp nvidia(PO) coretemp snd_hda_codec_realtek iTCO_wdt snd_hda_codec_generic iptable_raw drm_kms_helper snd_hda_intel drm xt_CT snd_hda_codec snd_hda_core snd_hwdep kvm_intel snd_pcm snd_timer joydev mei_wdt fb_sys_fops iTCO_vendor_support i2c_i801 lpc_ich kvm syscopyarea snd sysfillrect irqbypass mei_me hp_wmi sysimgblt iptable_filter crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper
[ 2444.592386] cryptd soundcore mei sparse_keymap rfkill e1000e shpchp pcspkr ioatdma mfd_core tpm_infineon tpm_tis dca tpm fjes ptp pps_core ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables btrfs xor raid6_pq hid_generic usbhid crc32c_intel serio_raw xhci_pci ehci_pci xhci_hcd ehci_hcd firewire_ohci sr_mod firewire_core cdrom crc_itu_t usbcore isci usb_common libsas ata_generic mpt3sas raid_class scsi_transport_sas wmi button sg
[ 2444.592447] CPU: 4 PID: 289 Comm: kworker/u65:7 Tainted: PW O 4.7.1-1-default #1
[ 2444.592450] Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
[ 2444.592458] Workqueue: writeback wb_workfn (flush-btrfs-1)
[ 2444.592462] 81393104
[ 2444.592468] 8107ca1e 88080de6d800 9000 88080c437a00
[ 2444.592472] 880634b379ac 9000 88080dcfb73c a02af98e
[ 2444.592477] Call Trace:
[ 2444.592499] [] dump_trace+0x5e/0x320
[ 2444.592507] [] show_stack_log_lvl+0x10c/0x180
[ 2444.592514] [] show_stack+0x21/0x40
[ 2444.592523] [] dump_stack+0x5c/0x78
[ 2444.592531] [] __warn+0xbe/0xe0
[ 2444.592561] [] btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
[ 2444.592602] [] btrfs_clear_bit_hook+0x296/0x380 [btrfs]
[ 2444.592642] [] clear_state_bit+0x55/0x1d0 [btrfs]
[ 2444.592676] [] __clear_extent_bit+0x13d/0x3f0 [btrfs]
[ 2444.592707] [] extent_clear_unlock_delalloc+0x62/0x280 [btrfs]
[ 2444.592739] [] cow_file_range+0x299/0x440 [btrfs]
[ 2444.592768] [] run_delalloc_range+0x392/0x3b0 [btrfs]
[ 2444.592801] [] writepage_delalloc.isra.40+0x100/0x170 [btrfs]
[ 2444.592834] [] __extent_writepage+0xc3/0x340 [btrfs]
[ 2444.592864] [] extent_write_cache_pages.isra.36.constprop.53+0x23b/0x350 [btrfs]
[ 2444.592894] [] extent_writepages+0x4e/0x60 [btrfs]
[ 2444.592900] [] __writeback_single_inode+0x3d/0x3b0
[ 2444.592907] [] writeback_sb_inodes+0x20a/0x440
[ 2444.592914] [] __writeback_inodes_wb+0x87/0xb0
[ 2444.592921] [] wb_writeback+0x28d/0x330
[ 2444.592927] [] wb_workfn+0x222/0x3f0
[ 2444.592934] [] process_one_work+0x1ed/0x4e0
[ 2444.592942] [] worker_thread+0x47/0x4c0
[ 2444.592947] [] kthread+0xbd/0xe0
[ 2444.592954] [] ret_from_fork+0x1f/0x40
[ 2444.596679] DWARF2 unwinder stuck at ret_from_fork+0x1f/0x40
[ 2444.596683] Leftover inexact backtrace:
[ 2444.596689] [] ? kthread_worker_fn+0x170/0x170

I will also provide the information requested by Qu:

grep . -IR /sys/fs/btrfs/e9efaa0c-d477-4249-830f-ee5956768b29/allocation
allocation/data/flags:1
allocation/data/bytes_pinned:0
allocation/data/bytes_may_use:0
allocation/data/total_bytes_pinned:202973265920
allocation/data/bytes_reserved:0
allocation/data/bytes_used:45623730176
allocation/data/single/used_bytes:45623730176
allocation/data/single/total_bytes:46179287040
allocation/data/total_bytes:46179287040
allocation/data/disk_total:46179287040
allocation/data/disk_used:45623730176
allocation/metadata/dup/used_bytes:1120698368
allocation/metadata/dup/total_bytes:6979321856
allocation/metadata/flags:4
allocation/metadata/bytes_pinned:0
allocation/metadata/bytes_may_use:88521768960
allocation/metadata/total_bytes_pinned:-44285952
allocation/metadata/bytes_res
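One way to read the allocation dump above is to compare the reserved-but-unwritten counter against the size of the space it is reserved from. The sketch below parses `key:value` lines like those shown and computes that ratio; the function names `parse_allocation` and `may_use_ratio` are invented for this example, and the sample contains only lines that appear in the message (including the truncated last one). For the metadata values shown, `bytes_may_use` comes out at more than 12 times the DUP `total_bytes`, which looks consistent with the accounting leak Josef describes elsewhere in this thread, although this comparison alone does not prove it.

```python
def parse_allocation(text):
    """Parse 'path/to/key:value' lines (as printed by grep . -IR) into a dict."""
    values = {}
    for line in text.splitlines():
        key, sep, raw = line.strip().rpartition(":")
        if not sep:
            continue  # no colon at all, e.g. a truncated line
        try:
            values[key] = int(raw)
        except ValueError:
            continue  # value is not a plain integer
    return values


def may_use_ratio(values, may_use_key, total_key):
    """Return reserved-but-unwritten bytes as a fraction of the total size."""
    return values[may_use_key] / values[total_key]


# Lines copied from the message above; the last one is truncated there too.
SAMPLE = """\
allocation/data/bytes_may_use:0
allocation/data/total_bytes:46179287040
allocation/metadata/dup/total_bytes:6979321856
allocation/metadata/bytes_may_use:88521768960
allocation/metadata/bytes_res
"""
```

A ratio of 0 for data and a ratio far above 1 for metadata is what `may_use_ratio(parse_allocation(SAMPLE), ...)` gives for the respective key pairs.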
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi Chris,

On Fri, 2016-09-02 at 21:41 -0600, Chris Murphy wrote:
> I suggest removing the hardware, and the proprietary driver, and retest the system with the existing Tumbleweed 4.7.0 kernel; and if that still fails, then try the Leap 4.4 kernel.
>
> Proprietary kernels can do all kinds of crazy things they shouldn't so it's entirely possible that driver is a factor in the problem.

Actually, it is just a module that I load, and it is only loaded when I need to work with it. However, I can assure you this is not the problem, because I installed the board roughly one month ago, but I have been seeing ENOSPC since the beginning of the year, IIRC.

I am using the default Tumbleweed kernel right now, and I can only try Leap once 42.2 is released.
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi guys!

On Fri, 2016-09-02 at 16:39 -0600, Chris Murphy wrote:
> Worth a shot, considering the opensuse/SLE 4.4 kernel has a shittonne of backports. It seems unlikely to me opensuse intends to not support your hardware (skylake?)

Actually, it is a peripheral we use to program embedded systems here, and its (proprietary) driver requires kernel >= 4.6. I barely use it. I am seriously thinking of moving it to another machine just to be able to change my kernel.

I will repeat here something I already posted on the openSUSE mailing list:

I think I forgot to mention one very important thing: I have been using Tumbleweed+BTRFS on this machine for a very, very long time. I think I installed it just after Tumbleweed changed to its current model. At that time, I was using the same machine but without the one peripheral that requires a "new" kernel (HDD, processor, RAM, everything else was the same). AFAIK, the first time I saw this problem was this year. So, I think it must be a regression introduced by some kernel / btrfs-progs update.

Best regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi!

On Fri, 2016-09-02 at 15:34 -0600, Chris Murphy wrote:
> Except for your software build case, I have about the same workload you have with two machines, one SSD one HDD, using 4.7.0 for a month, and then 4.7.2 for the last week. I haven't had any enospc on these two systems.
>
> I think for you the path of least resistance that also permits further testing is to see if you can track down the leap 42.2 beta kernel which is 4.4.19-1-default. I'm not easily finding that particular one, but I did find something a bit more recent:
> http://download.opensuse.org/repositories/Kernel:/openSUSE-42.2/standard/x86_64/

Unfortunately, that will not be possible, since my current hardware depends on kernel >= 4.6 :(

Just now, I saw the problem again. For the first time, it happened twice within a short period. I was copying e-mail from an IMAP server to my local HD. I use offlineimap, but this time it changed its backend to sqlite and started to create tons of database files, I think. My HDD I/O stayed at 60-70% for a very long time.

Hence, let's review the situations in which I have seen the problem:

1) Local builds using `osc`;
2) During `zypper dup`;
3) When offlineimap created tons of database files;
4) While rsync-ing /home;
5) While using a virtual machine (the disk image was on an EXT4 partition).

I think we can conclude that this problem is tightly coupled with actions that require a lot of writing to the HDD.
Here is the specification of my HDD:

hdparm -I /dev/sda

/dev/sda:

ATA device, with non-removable media
    Model Number: ST2000DM001-1CH164
    Serial Number: W1E73CF5
    Firmware Revision: HP34
    Transport: Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
    Used: unknown (minor revision code 0x001f)
    Supported: 9 8 7 6 5
    Likely used: 9
Configuration:
    Logical         max     current
    cylinders       16383   16383
    heads           16      16
    sectors/track   63      63
    --
    CHS current addressable sectors: 16514064
    LBA user addressable sectors: 268435455
    LBA48 user addressable sectors: 3907029168
    Logical Sector size: 512 bytes
    Physical Sector size: 4096 bytes
    Logical Sector-0 offset: 0 bytes
    device size with M = 1024*1024: 1907729 MBytes
    device size with M = 1000*1000: 2000398 MBytes (2000 GB)
    cache/buffer size = unknown
    Form Factor: 3.5 inch
    Nominal Media Rotation Rate: 7200
Capabilities:
    LBA, IORDY(can be disabled)
    Queue depth: 32
    Standby timer values: spec'd by Standard, no device specific minimum
    R/W multiple sector transfer: Max = 16 Current = ?
    Advanced power management level: 128
    Recommended acoustic management value: 208, current value: 0
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
        Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4
        Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
    Enabled Supported:
      * SMART feature set
        Security Mode feature set
      * Power Management feature set
      * Write cache
      * Look-ahead
      * WRITE_BUFFER command
      * READ_BUFFER command
      * DOWNLOAD_MICROCODE
      * Advanced Power Management feature set
        Power-Up In Standby feature set
      * SET_FEATURES required to spinup after power up
      * 48-bit Address feature set
      * Device Configuration Overlay feature set
      * Mandatory FLUSH_CACHE
      * FLUSH_CACHE_EXT
      * SMART error logging
      * SMART self-test
      * General Purpose Logging feature set
      * 64-bit World wide name
      * WRITE_UNCORRECTABLE_EXT command
      * {READ,WRITE}_DMA_EXT_GPL commands
      * Segmented DOWNLOAD_MICROCODE
      * Gen1 signaling speed (1.5Gb/s)
      * Gen2 signaling speed (3.0Gb/s)
      * Gen3 signaling speed (6.0Gb/s)
      * Native Command Queueing (NCQ)
      * Phy event counters
      * READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
      * DMA Setup Auto-Activate optimization
        Device-initiated interface power management
      * Software settings preservation
      * SMART Command Transport (SCT) feature set
      * SCT Read/Write Long (AC1), obsolete
      * SCT Error Recovery Control (AC3)
      * SCT Features Control (AC4)
      * SCT Data Tables (AC5)
        unknown 206[12] (vendor specific)
        unknown 206[13] (vendor specific)
Secu
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi again guys!

After I rebooted the computer, I still can't run balance on metadata:

btrfs balance start -musage=1 /
ERROR: error during balancing '/': No space left on device
There may be more info in syslog - try dmesg | tail

dmesg shows:

[ 2022.530285] BTRFS info (device sda6): relocating block group 128509280256 flags 36
[ 2023.355206] BTRFS info (device sda6): relocating block group 127972409344 flags 36
[ 2024.265313] BTRFS info (device sda6): relocating block group 127435538432 flags 36
[ 2025.646712] BTRFS info (device sda6): relocating block group 126898667520 flags 36
[ 2026.794791] BTRFS info (device sda6): relocating block group 126361796608 flags 36
[ 2028.023517] BTRFS info (device sda6): relocating block group 125824925696 flags 36
[ 2028.881287] BTRFS info (device sda6): relocating block group 125288054784 flags 36
[ 2029.739342] BTRFS info (device sda6): relocating block group 124751183872 flags 36
[ 2030.631990] BTRFS info (device sda6): relocating block group 124214312960 flags 36
[ 2031.523176] BTRFS info (device sda6): relocating block group 123677442048 flags 36
[ 2032.407859] BTRFS info (device sda6): relocating block group 123140571136 flags 36
[ 2033.806672] BTRFS info (device sda6): relocating block group 122603700224 flags 36
[ 2035.237712] BTRFS info (device sda6): relocating block group 122066829312 flags 36
[ 2038.257268] BTRFS info (device sda6): relocating block group 122033274880 flags 34
[ 2039.911443] BTRFS info (device sda6): relocating block group 121496403968 flags 36
[ 2040.958106] BTRFS info (device sda6): relocating block group 120959533056 flags 36
[ 2041.841051] BTRFS info (device sda6): relocating block group 120422662144 flags 36
[ 2042.828359] BTRFS info (device sda6): relocating block group 119885791232 flags 36
[ 2044.297744] BTRFS info (device sda6): relocating block group 119348920320 flags 36
[ 2045.684932] BTRFS info (device sda6): relocating block group 118812049408 flags 36
[ 2046.761787] BTRFS info (device sda6): relocating block group 118275178496 flags 36
[ 2048.200756] BTRFS info (device sda6): relocating block group 117738307584 flags 36
[ 2049.806986] BTRFS info (device sda6): relocating block group 117201436672 flags 36
[ 2051.170470] BTRFS info (device sda6): relocating block group 116664565760 flags 36
[ 2051.910536] BTRFS info (device sda6): relocating block group 116127694848 flags 36
[ 2052.678395] BTRFS info (device sda6): relocating block group 115590823936 flags 36
[ 2053.737959] BTRFS info (device sda6): relocating block group 106363355136 flags 36
[ 2054.852065] BTRFS info (device sda6): relocating block group 105826484224 flags 36
[ 2055.911187] BTRFS info (device sda6): relocating block group 105222504448 flags 36
[ 2057.047407] BTRFS info (device sda6): 4 enospc errors during balance

and I have:

btrfs fi usage /
Overall:
    Device size: 1.26TiB
    Device allocated: 80.07GiB
    Device unallocated: 1.18TiB
    Device missing: 0.00B
    Used: 41.95GiB
    Free (estimated): 1.18TiB (min: 603.95GiB)
    Data ratio: 1.00
    Metadata ratio: 2.00
    Global reserve: 352.00MiB (used: 576.00KiB)

Data,single: Size:40.01GiB, Used:39.95GiB
   /dev/sda6 40.01GiB

Metadata,DUP: Size:20.00GiB, Used:1.00GiB
   /dev/sda6 40.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/sda6 64.00MiB

Unallocated:
   /dev/sda6 1.18TiB

Hope this brings new information!

Best regards,
Ronan Arraes
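To make the numbers in the `btrfs fi usage` output above easier to compare, the sketch below extracts the per-type `Size`/`Used` pairs and computes how full each chunk type is. It is a minimal parser written for this thread only: the names `parse_chunk_usage` and `fill_fraction` and the regular expression are assumptions of this example, not part of btrfs-progs, and the sample string is copied from the message. With these values metadata is only about 5% full, which illustrates that the ENOSPC here is not explained by metadata chunks being exhausted.

```python
import re

# Binary unit suffixes as they appear in the output above.
UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

# Matches e.g. "Metadata,DUP: Size:20.00GiB, Used:1.00GiB"
CHUNK_RE = re.compile(
    r"(Data|Metadata|System),(\w+):\s*"
    r"Size:([\d.]+)([KMGT]iB|B),\s*"
    r"Used:([\d.]+)([KMGT]iB|B)"
)


def parse_chunk_usage(text):
    """Return {type: {'profile', 'size', 'used'}} with sizes in bytes."""
    usage = {}
    for kind, profile, size, size_unit, used, used_unit in CHUNK_RE.findall(text):
        usage[kind] = {
            "profile": profile,
            "size": float(size) * UNITS[size_unit],
            "used": float(used) * UNITS[used_unit],
        }
    return usage


def fill_fraction(entry):
    """Fraction of the allocated chunk space that is actually used."""
    return entry["used"] / entry["size"]


# Per-type lines copied from the message above.
SAMPLE = (
    "Data,single: Size:40.01GiB, Used:39.95GiB /dev/sda6 40.01GiB "
    "Metadata,DUP: Size:20.00GiB, Used:1.00GiB /dev/sda6 40.00GiB "
    "System,DUP: Size:32.00MiB, Used:16.00KiB /dev/sda6 64.00MiB"
)
```

Because the pattern is not anchored to line starts, it works both on the original multi-line output and on a flattened single-line paste.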
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi guys!

Jeff was right. I had the problem again today, and quotas are disabled now. I couldn't get any useful message in the log this time. Look at the metadata:

btrfs fi usage /
Overall:
    Device size: 1.26TiB
    Device allocated: 43.07GiB
    Device unallocated: 1.21TiB
    Device missing: 0.00B
    Used: 41.94GiB
    Free (estimated): 1.21TiB (min: 622.46GiB)
    Data ratio: 1.00
    Metadata ratio: 2.00
    Global reserve: 352.00MiB (used: 0.00B)

Data,single: Size:40.01GiB, Used:39.94GiB
   /dev/sda6 40.01GiB

Metadata,DUP: Size:1.50GiB, Used:1.00GiB
   /dev/sda6 3.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/sda6 64.00MiB

Unallocated:
   /dev/sda6 1.21TiB

Any ideas that could help me?

Regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi Jeff,

On Fri, 2016-09-02 at 10:48 -0400, Jeff Mahoney wrote:
> Sorry, I miscommunicated there. The WARN_ON is annoying. It's the underlying issue that's causing you to lose work that is the one that concerns me.

Oh, OK, I see, sorry about that :)

So, if disabling quotas does not fix my problem, is there any workaround you can think of to avoid the underlying issue you described in the previous e-mail?

Best regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi Jeff,

On Fri, 2016-09-02 at 10:26 -0400, Jeff Mahoney wrote:
> I explained what I think Ronan's issue is in another part of the thread just now. I don't think that's a severe issue at all. Annoying? Sure, but I'm more concerned with the underlying ENOSPC issue. Without more info, I don't know what the cause of it is and when it was introduced.

Sorry, but I really have to humbly disagree with you. Look at what has already happened to me when the problem occurred (which is almost every day):

1) Firefox crashes;
2) LibreOffice crashes (auto-save stops working);
3) I can't save my work in any text editor (vim, neovim, gedit, etc.);
4) Sometimes I can't even log in as root (on a TTY or via `su`);
5) Sometimes only a hard reset solves the problem;
6) I was left with a broken operating system when the problem occurred during a `zypper dup`.

I just can't tell you how much work I lost in those situations. So, I think we cannot call this issue just annoying. I think it is very severe.

Best regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi Jeff,

On Thu, 2016-09-01 at 13:12 -0400, Jeff Mahoney wrote:
> It's not. We use qgroups because that's the only way we can track how
> much space each subvolume is using, regardless of whether anyone wants
> to do enforcement. When it's working properly, snapper can make use of
> that information to make informed decisions on how much space will
> actually be released when removing old snapshots.

Given that, what am I losing by disabling qgroups here? Will I still be able to recover my machine using snapshots (this has saved me two or three times)?

Best regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi Jeff,

On Thu, 2016-09-01 at 13:43 -0400, Jeff Mahoney wrote:
> Absolutely. It doesn't affect the ability to take, retain, or recover
> using snapshots. It only affects the ability to see how much space a
> particular snapshot is using on disk, both from the user wanting to know
> and snapper using it to make retention decisions. Snapper can handle
> qgroups not being there.

Thanks for the prompt answer. I'm glad, because space is not a concern here, at least for now :) Hence, I have plenty of time to wait for a proper fix. Until then, I will try to keep my snapshot count low.
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
On Thu, 2016-09-01 at 09:21 -0400, Austin S. Hemmelgarn wrote:
> Yes, you can just run `btrfs quota disable /` and it should work. This
> ironically reiterates that one of the bigger problems with BTRFS is that
> distros are enabling unstable and known broken features by default on
> install. I was pretty much dumbfounded when I first learned that
> OpenSUSE is enabling BTRFS qgroups by default since they are known to
> not work reliably and cause all kinds of issues.

Thanks Austin! I executed the command and now I get:

    btrfs qgroup show /
    ERROR: can't perform the search - No such file or directory
    ERROR: can't list qgroups: No such file or directory

as expected. Now I will wait about a week to see whether the problem occurs again and, if not, I will send an e-mail to the openSUSE Factory mailing list to start a discussion about whether it is better not to enable qgroups by default.

Best regards and thanks everyone for the help,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi!

On Wed, 2016-08-31 at 17:09 -0600, Chris Murphy wrote:
> OK so Ronan, I'm gonna guess the simplest work around for your problem
> is to disable quota support, and see if the problem happens again.

Look at the output of the command proposed by Jeff:

    btrfs qgroup show /
    qgroupid         rfer         excl
    0/5          16.00KiB     16.00KiB
    0/257        16.00KiB     16.00KiB
    0/258        16.30MiB     16.30MiB
    0/259        11.65GiB    309.67MiB
    0/260         2.34MiB      2.34MiB
    0/261        16.00KiB     16.00KiB
    0/262        13.19GiB     13.19GiB
    0/263        16.00KiB     16.00KiB
    0/264        60.00KiB     60.00KiB
    0/265       480.00KiB    480.00KiB
    0/266        16.00KiB     16.00KiB
    0/267         2.00GiB      2.00GiB
    0/268        16.00KiB     16.00KiB
    0/269        16.00KiB     16.00KiB
    0/270        16.00KiB     16.00KiB
    0/271        16.00KiB     16.00KiB
    0/272        16.00KiB     16.00KiB
    0/273        16.00KiB     16.00KiB
    0/274        16.00KiB     16.00KiB
    0/275       205.78MiB    205.78MiB
    0/276        16.00KiB     16.00KiB
    0/277        48.00KiB     48.00KiB
    0/278       328.41MiB    328.41MiB
    0/283         3.92GiB     26.63MiB
    0/285         3.93GiB      4.10MiB
    0/294         7.84GiB    100.59MiB
    0/330         7.98GiB      6.61MiB
    0/332         8.32GiB     69.17MiB
    0/353         9.53GiB     49.46MiB
    0/355        10.51GiB    235.39MiB
    0/415        11.54GiB      3.38MiB
    0/416        11.54GiB    896.00KiB
    0/417        11.57GiB      2.68MiB
    0/418        11.57GiB    160.00KiB
    0/419        11.54GiB      2.40MiB
    0/420        11.54GiB    192.00KiB
    0/421        11.62GiB      4.61MiB
    0/422        11.83GiB    212.93MiB
    0/427        11.64GiB      1.27MiB
    0/428        11.65GiB      4.25MiB
    1/0          16.11GiB      4.77GiB
    255/262      13.19GiB     13.19GiB

This system was installed from the Tumbleweed ISO and I did not change any btrfs options. Hence, it seems that openSUSE is enabling quotas by default.

Now, I need to disable it and avoid triggering the problem. What is the best way to do this? Is it OK to just run:

    btrfs quota disable /

? Or do I need to format and recreate btrfs without quotas?

> If it doesn't happen again then it sounds like the reproduce steps are:
>
> a. enable quota support
> b. do something metadata heavy workload that's also maybe hitting
> fsync; from opensuse list the example that sometimes causes it:
>
> osc co home:Ronis_BR/julia
> cd home:Ronis_BR/julia
> osc build --root=`pwd`/jail openSUSE_Tumbleweed x86_64
>
> I wonder if it's easier to hit it on a hard drive, slower fsyncs?

This sounds good! Actually, I'm using a 7200 RPM hard drive.

Thank you all very much for all the help,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi guys!

And the problem happened again. This time, I was only using Mozilla Firefox. I could get the very first message after the error. I hope it brings more information:

[28039.672199] ------------[ cut here ]------------
[28039.672253] WARNING: CPU: 3 PID: 31800 at ../fs/btrfs/qgroup.c:2667 btrfs_qgroup_free_meta+0x88/0x90 [btrfs]
[28039.672255] Modules linked in: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT nf_reject_ipv4 iptable_raw xt_CT nvidia_drm(PO) nvidia_modeset(PO) iptable_filter nvidia(PO) ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast drm_kms_helper nf_conntrack_ipv4 drm nf_defrag_ipv4 fb_sys_fops snd_hda_codec_hdmi joydev snd_hda_codec_realtek ip_tables syscopyarea snd_hda_codec_generic xt_conntrack snd_hda_intel sysfillrect intel_rapl sb_edac edac_core snd_hda_codec hp_wmi x86_pkg_temp_thermal intel_powerclamp snd_hda_core snd_hwdep nf_conntrack sparse_keymap sysimgblt coretemp kvm_intel kvm rfkill irqbypass snd_pcm snd_timer crct10dif_pclmul
[28039.672305] e1000e crc32_pclmul ghash_clmulni_intel snd aesni_intel ip6table_filter aes_x86_64 lrw gf128mul glue_helper ablk_helper iTCO_wdt iTCO_vendor_support mei_wdt ioatdma pcspkr cryptd ip6_tables ptp lpc_ich fjes i2c_i801 dca mfd_core soundcore pps_core shpchp tpm_infineon tpm_tis tpm mei_me mei x_tables btrfs xor raid6_pq hid_generic usbhid crc32c_intel serio_raw xhci_pci ehci_pci sr_mod firewire_ohci xhci_hcd ehci_hcd cdrom firewire_core crc_itu_t isci usbcore usb_common libsas ata_generic mpt3sas raid_class scsi_transport_sas wmi button sg
[28039.672373] CPU: 3 PID: 31800 Comm: gnome-terminal- Tainted: PW O 4.7.1-1-default #1
[28039.672375] Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
[28039.672378] 81393104
[28039.672382] 8107ca1e 881008780800 00014000 881008780800
[28039.672386] ffe4 88100b297c00 88053b7e3540 a02c9f58
[28039.672390] Call Trace:
[28039.672406] [] dump_trace+0x5e/0x320
[28039.672413] [] show_stack_log_lvl+0x10c/0x180
[28039.672419] [] show_stack+0x21/0x40
[28039.672425] [] dump_stack+0x5c/0x78
[28039.672430] [] __warn+0xbe/0xe0
[28039.672461] [] btrfs_qgroup_free_meta+0x88/0x90 [btrfs]
[28039.672492] [] start_transaction+0x3c3/0x4f0 [btrfs]
[28039.672521] [] btrfs_create+0x38/0x1d0 [btrfs]
[28039.672528] [] path_openat+0x139b/0x14a0
[28039.672535] [] do_filp_open+0x7e/0xe0
[28039.672541] [] do_sys_open+0x124/0x1f0
[28039.672547] [] entry_SYSCALL_64_fastpath+0x1e/0xa8
[28039.676186] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x1e/0xa8

Best regards,
Ronan
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
On Tue, 2016-08-30 at 10:44 -0600, Chris Murphy wrote:
> It sounds related to read-only snapshots to me. I wonder if this
> system has something busy that's writing to a file, database, even
> maybe something just spamming journald, and then there's a read-only
> snapshot during the write, which then triggers the enospc.

I saw the problem yesterday after lunch time (13:00) and the last snapper snapshot was taken at 10:17:

snapper list
Type   | #  | Pre # | Date                         | User | Cleanup | Description           | Userdata
-------+----+-------+------------------------------+------+---------+-----------------------+--------------
single | 0  |       |                              | root |         | current               |
single | 1  |       | Tue 16 Aug 2016 15:07:25 BRT | root |         | first root filesystem |
single | 2  |       | Tue 16 Aug 2016 15:15:57 BRT | root | number  | after installation    | important=yes
pre    | 4  |       | Tue 16 Aug 2016 15:26:44 BRT | root | number  | zypp(y2base)          | important=yes
post   | 5  | 4     | Tue 16 Aug 2016 16:12:46 BRT | root | number  |                       | important=yes
pre    | 29 |       | Tue 16 Aug 2016 18:02:43 BRT | root | number  | zypp(zypper)          | important=yes
post   | 30 | 29    | Tue 16 Aug 2016 18:07:34 BRT | root | number  |                       | important=yes
pre    | 45 |       | Mon 22 Aug 2016 13:59:45 BRT | root | number  | zypp(zypper)          | important=yes
post   | 46 | 45    | Mon 22 Aug 2016 14:11:17 BRT | root | number  |                       | important=yes
pre    | 89 |       | Mon 29 Aug 2016 09:56:19 BRT | root | number  | yast sw_single        |
pre    | 90 |       | Mon 29 Aug 2016 10:00:00 BRT | root | number  | zypp(y2base)          | important=no
post   | 91 | 90    | Mon 29 Aug 2016 10:01:11 BRT | root | number  |                       | important=no
pre    | 92 |       | Mon 29 Aug 2016 10:07:01 BRT | root | number  | zypp(y2base)          | important=no
post   | 93 | 92    | Mon 29 Aug 2016 10:07:10 BRT | root | number  |                       | important=no
pre    | 94 |       | Mon 29 Aug 2016 10:12:32 BRT | root | number  | zypp(y2base)          | important=no
post   | 95 | 94    | Mon 29 Aug 2016 10:14:25 BRT | root | number  |                       | important=no
post   | 96 | 89    | Mon 29 Aug 2016 10:17:17 BRT | root | number  |                       |

> Ronan, if you're given a work around, then it's even less likely the
> bug gets fixed. But if you can disable snapper snapshots entirely and
> the problem doesn't happen; or if you can increase the frequency of
> snapper snapshots and the problem happens more often, that might help
> narrow it down to a point where it's more easily reproduced. If it's
> not related, that's still useful to know.

I agree with you. The problem is that, since this is a production machine, it is rather problematic to have so many reboots that occur randomly.

I will install something using zypper, which will trigger snapper, and see if the problem is triggered. I will be out of the office this afternoon, so the machine will be idle.

Best regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi!

On Tue, 2016-08-30 at 10:12 +0800, Wang Xiaoguang wrote:
> For metadata, "bytes_may_use" is about 80GB, it's very big,
> I think this value is very abnormal.
>
> So this explains why you have huge unallocated space, you still
> get ENOSPC error. In kernel btrfs, there is a function
> should_alloc_chunk()
> to determine whether to allocate new chunks(new device space)
> num_bytes = total_bytes - bytes_readonly; it's 2147483648
> num_allocated = bytes_used + bytes_reserved; it's 977354752
>
> if num_allocated < num_bytes * 0.8, it will not allocate new device
> space :) even you
> have huge unallocated space.
>
> I think the root reason is that bytes_may_use has some computation
> error and
> is not be converted to bytes_used or bytes_reserved.
>
> I just explain why you get ENOSPC error even with huge unallocated
> space
> from
> codes :)

Thanks! At least we know why ENOSPC is happening.

> Can you work out a reproducer for this ENOSPC error, then I can
> dig into codes to figure out the true reason.

Unfortunately, I failed in every attempt to trigger the problem. It happens randomly and I have not yet figured out what triggers it. At first, I thought it was related to a build process inside a chroot jail, but then I saw the problem happen after the computer had been idle for a long time (about 1 h). So, no clues yet :(

Is there any workaround I can use?

Best regards,
Ronan Arraes
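
The check Wang describes can be sketched in a few lines of Python. This is only an illustration of the arithmetic quoted in the message, not the kernel's actual should_alloc_chunk(), which may consider more than this. The function name `would_allocate_chunk` and its parameters are made up for the example, and `bytes_readonly` is assumed to be 0 because the quoted num_bytes equals the metadata total_bytes reported in the thread.

```python
GIB = 1024 ** 3


def would_allocate_chunk(total_bytes, bytes_readonly, bytes_used,
                         bytes_reserved, threshold=0.8):
    """Apply the rule quoted in the message: no new chunk is allocated
    while num_allocated < num_bytes * threshold."""
    num_bytes = total_bytes - bytes_readonly
    num_allocated = bytes_used + bytes_reserved
    return num_allocated >= num_bytes * threshold


# Metadata counters as reported in this thread (sysfs allocation/metadata).
metadata = {
    "total_bytes": 2147483648,
    "bytes_used": 977354752,
    "bytes_reserved": 0,
    "bytes_may_use": 84974452736,
}

# bytes_readonly is an assumption (0); it is not shown in the thread.
allocate = would_allocate_chunk(
    metadata["total_bytes"], 0,
    metadata["bytes_used"], metadata["bytes_reserved"],
)
may_use_gib = metadata["bytes_may_use"] / GIB

print(f"allocate new chunk: {allocate}")
print(f"bytes_may_use: {may_use_gib:.2f} GiB")
```

With these numbers the check declines to allocate a new chunk (977354752 is below 0.8 * 2147483648), while bytes_may_use alone is roughly 79 GiB, far larger than the 2 GiB of metadata space. That matches the symptom: reservations fail even though the device has over 1 TiB unallocated.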
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi guys,

I just had the problem again. This time, it happened during lunch time when the machine was idle. Only the system processes were running. It was not the first time that I saw this problem just after lunch, when the machine had stayed idle for a long period (about 1 h).

Here is the information requested:

/sys/fs/btrfs/$UUID/allocation/data
    ./bytes_may_use 0
    ./bytes_pinned 0
    ./bytes_reserved 0
    ./bytes_used 36128374784
    ./disk_total 37589352448
    ./disk_used 36128374784
    ./flags 1
    ./total_bytes 37589352448
    ./total_bytes_pinned 20339560448
    ./single/total_bytes 37589352448
    ./single/used_bytes 36128374784

/sys/fs/btrfs/$UUID/allocation/metadata
    ./bytes_may_use 84974452736
    ./bytes_pinned 0
    ./bytes_reserved 0
    ./bytes_used 977354752
    ./disk_total 4294967296
    ./disk_used 1954709504
    ./flags 4
    ./total_bytes 2147483648
    ./total_bytes_pinned -57851904
    ./dup/total_bytes 2147483648
    ./dup/used_bytes 977354752

# btrfs fi usage /
Overall:
    Device size:           1.26TiB
    Device allocated:      39.07GiB
    Device unallocated:    1.22TiB
    Device missing:        0.00B
    Used:                  35.29GiB
    Free (estimated):      1.22TiB (min: 625.93GiB)
    Data ratio:            1.00
    Metadata ratio:        2.00
    Global reserve:        320.00MiB (used: 0.00B)

Data,single: Size:35.01GiB, Used:33.47GiB
    /dev/sda6 35.01GiB

Metadata,DUP: Size:2.00GiB, Used:932.00MiB
    /dev/sda6 4.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
    /dev/sda6 64.00MiB

Unallocated:
    /dev/sda6 1.22TiB

# btrfs fi df /
Data, single: total=35.01GiB, used=33.47GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=2.00GiB, used=932.09MiB
GlobalReserve, single: total=320.00MiB, used=0.0

I also saw the following information in `journalctl`:

Aug 29 10:25:33 ronanarraes-osd kernel: ------------[ cut here ]------------
Aug 29 10:25:33 ronanarraes-osd kernel: WARNING: CPU: 4 PID: 30424 at ../fs/btrfs/extent-tree.c:4303 btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: Modules linked in: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_
Aug 29 10:25:33 ronanarraes-osd kernel: mei_wdt sysimgblt iTCO_vendor_support i2c_i801 tpm_infineon tpm_tis tpm ioatdma crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw sparse_keymap
Aug 29 10:25:33 ronanarraes-osd kernel: CPU: 4 PID: 30424 Comm: kworker/u65:1 Tainted: P O 4.7.1-1-default #1
Aug 29 10:25:33 ronanarraes-osd kernel: Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
Aug 29 10:25:33 ronanarraes-osd kernel: Workqueue: writeback wb_workfn (flush-btrfs-1)
Aug 29 10:25:33 ronanarraes-osd kernel: 81393104
Aug 29 10:25:33 ronanarraes-osd kernel: 8107ca1e 88100027c800 1000 88082ff06400
Aug 29 10:25:33 ronanarraes-osd kernel: 88100c7af784 1000 8805bd60f6cc a025098e
Aug 29 10:25:33 ronanarraes-osd kernel: Call Trace:
Aug 29 10:25:33 ronanarraes-osd kernel: [] dump_trace+0x5e/0x320
Aug 29 10:25:33 ronanarraes-osd kernel: [] show_stack_log_lvl+0x10c/0x180
Aug 29 10:25:33 ronanarraes-osd kernel: [] show_stack+0x21/0x40
Aug 29 10:25:33 ronanarraes-osd kernel: [] dump_stack+0x5c/0x78
Aug 29 10:25:33 ronanarraes-osd kernel: [] __warn+0xbe/0xe0
Aug 29 10:25:33 ronanarraes-osd kernel: [] btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: [] btrfs_clear_bit_hook+0x296/0x380 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: [] clear_state_bit+0x55/0x1d0 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: [] __clear_extent_bit+0x13d/0x3f0 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: [] extent_clear_unlock_delalloc+0x62/0x280 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: [] run_delalloc_nocow+0x962/0xba0 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: [] run_delalloc_range+0x35f/0x3b0 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: [] writepage_delalloc.isra.40+0x100/0x170 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: [] __extent_writepage+0xc3/0x340 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: [] extent_write_cache_pages.isra.36.constprop.53+0x23b/0x350 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: [] extent_writepages+0x4e/0x60 [btrfs]
Aug 29 10:25:33 ronanarraes-osd kernel: [] __writeback_single_inode+0x3d/0x3b0
Aug 29 10:25:33 ronanarraes-osd kernel: [] writeback_sb_inodes+0x20a/0x440
Aug 29 10:25:33 ronanarraes-osd kernel: [] __writeback_inodes_wb+0x87/0xb0
Aug 29 10:25:33 ronanarraes-osd kernel: [] wb_writeback+0x28d/0x330
Aug 29 10:25:33 ronanarraes-osd kernel: [] wb_workfn+0x222/0x3f0
Aug 29 10:25:33 ronanarraes-osd kernel: [] process_one_work+0x1ed/0x4e0
Aug 29 10:25:33 r
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi!

On Mon, 2016-08-29 at 20:12 +0800, Wang Xiaoguang wrote:
> When strange ENOSPC errors occur, I think "btrfs fi usage"
> or "btrfs di df" do not help too much. Their output do not
> reflect btrfs kernel current status :)
>
> Would you please provide attribute files' values in
> /sys/fs/btrfs/$UUID/allocation/data
> and /sys/fs/btrfs/$UUID/allocation/metadata when ENOSPC error occurs.

Sure! As soon as I see the error again, I will send these results.

For now, I see that if I move my jail directory to an ext4 partition, I do not see the problem anymore, but I need more tests to validate this assumption.

Best regards,
Ronan Arraes
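
For anyone who wants to capture the attribute files Wang asks for, here is a minimal Python sketch. It assumes the files under the allocation directories each hold a single integer, as in the values posted in this thread; the helper name `read_allocation` is hypothetical, and the filesystem UUID has to be filled in by hand.

```python
from pathlib import Path


def read_allocation(directory):
    """Return {relative file name: integer value} for every attribute
    file below `directory`. Files that cannot be read or do not hold a
    single integer are skipped."""
    root = Path(directory)
    values = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            values[path.relative_to(root).as_posix()] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Unreadable or non-numeric attribute: ignore it.
            continue
    return values


def format_allocation(values):
    """Format the values one per line, similar to the listings in this thread."""
    return "\n".join(f"./{name} {value}" for name, value in values.items())
```

A possible use, run as root while the ENOSPC condition is present, would be `print(format_allocation(read_allocation("/sys/fs/btrfs/<UUID>/allocation/metadata")))`, where `<UUID>` is the one shown by `btrfs fi show`. Saving the output to a partition that is not affected is advisable, since writes to the btrfs filesystem itself may fail at that point.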
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
On Mon, 2016-08-22 at 14:49 -0600, Chris Murphy wrote:
> This is really weird. I'm running 4.7.0 (Fedora) and I'm not
> experiencing problems, let alone this. What is this kernel's
> provenance? Is it a plain mainline 4.7.0 that you built? I'm not
> really sure what to recommend except maybe going back to 4.5.7 or
> 4.6.7 as it's a production machine. Heck even 4.4.19 is OK for me in
> this regard.

Well, I'm using the default openSUSE kernel here. And I have been seeing these errors for some time. When I reported it, I was using v4.6.1. Hence, I think the version of btrfs-progs is not the problem.
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
The same thing just happened again! It was also fixed automatically, but now I have:

    Metadata,DUP: Size:33.50GiB, Used:812.78MiB
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
New information, guys!

I formatted using the latest Tumbleweed snapshot (btrfs-progs v4.7+20160729) and I still have the same problem.

I noticed two things. First, when I see the "No space left on device" error, it is fixed when the Metadata space increases **a lot**. For example, when the error first occurred, I had:

    Metadata, DUP: total=2.00GiB, used=811.52MiB

After waiting a while (I could not run balance), it was automatically fixed and then I had:

    Metadata, DUP: total=9.50GiB, used=811.52MiB

During the error, when I ran the balance command, I saw these messages in `dmesg`:

Aug 22 16:00:03 ronanarraes-osd kernel: BTRFS info (device sda6): relocating block group 9323937792 flags 34
Aug 22 16:00:04 ronanarraes-osd kernel: BTRFS info (device sda6): found 1 extents
Aug 22 16:00:04 ronanarraes-osd kernel: BTRFS info (device sda6): 1 enospc errors during balance
Aug 22 16:00:24 ronanarraes-osd kernel: BTRFS info (device sda6): relocating block group 36201037824 flags 34
Aug 22 16:00:24 ronanarraes-osd kernel: BTRFS info (device sda6): 2 enospc errors during balance
Aug 22 16:00:45 ronanarraes-osd kernel: BTRFS info (device sda6): relocating block group 36234592256 flags 34
Aug 22 16:00:46 ronanarraes-osd kernel: BTRFS info (device sda6): found 1 extents
Aug 22 16:00:46 ronanarraes-osd kernel: BTRFS info (device sda6): 4 enospc errors during balance
Aug 22 16:01:20 ronanarraes-osd kernel: BTRFS info (device sda6): relocating block group 38415630336 flags 34
Aug 22 16:01:21 ronanarraes-osd kernel: BTRFS info (device sda6): found 1 extents
Aug 22 16:01:21 ronanarraes-osd kernel: BTRFS info (device sda6): 8 enospc errors during balance

Does it add anything relevant to the problem?

Regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
On Mon, 2016-08-15 at 17:24 -0600, Chris Murphy wrote:
> On Mon, Aug 15, 2016 at 5:12 PM, Ronan Chagas wrote:
> >
> > Hi guys!
> >
> > It happened again. The computer was completely unusable. The only
> > useful message I saw was this one:
> >
> > http://img.ctrlv.in/img/16/08/16/57b24b0bb2243.jpg
> >
> > Does it help?
> >
> > I decided to format and reinstall tomorrow. This is a production
> > machine and I have to fix this ASAP.
>
> Looks similar to this:
> https://lkml.org/lkml/2016/3/28/230
>
> Can you describe the workload happening at the time?

I was copying my /home using rsync when this happened.

Unfortunately, I needed to format this machine because it is a production system. If I see any problems related to that, I will report them to this mailing list.

Regards,
Ronan Arraes
Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space
On Fri, 2016-08-12 at 12:02 -0600, Chris Murphy wrote:
> Tons of unallocated space. What kernel messages do you get for the
> enospc? It sounds like this will be one of the mystery -28 error file
> systems. So far as I recall the only work around is recreating the
> file system. There are two additional things you can try: mount with
> enospc_debug mount option and see if you can gather more information
> about the problem. Or try a 4.8rc1 kernel which as a large number of
> enospc changes.

Unfortunately, no log was written due to the lack of space :) Next time it happens, I will take a screenshot of the message.

Do you think that it will be fixed if I reinstall my openSUSE?

Regards,
Ronan Arraes
BTRFS constantly reports "No space left on device" even with a huge unallocated space
Hi guys,

I'm facing a daily problem with BTRFS. Almost every day, I get the message "No space left on device". Sometimes I can recover by balancing the filesystem, but sometimes even balancing does not work due to the lack of space. In that case, only a hard reset works if I can't delete some files. The problem is that I have a huge amount of unallocated space, as you can see here:

# btrfs fi usage /
Overall:
    Device size:           1.26TiB
    Device allocated:      119.07GiB
    Device unallocated:    1.14TiB
    Device missing:        0.00B
    Used:                  115.08GiB
    Free (estimated):      1.14TiB (min: 586.21GiB)
    Data ratio:            1.00
    Metadata ratio:        2.00
    Global reserve:        512.00MiB (used: 0.00B)

Data,single: Size:113.01GiB, Used:111.19GiB
    /dev/sda6 113.01GiB

Metadata,DUP: Size:3.00GiB, Used:1.94GiB
    /dev/sda6 6.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
    /dev/sda6 64.00MiB

Unallocated:
    /dev/sda6 1.14TiB

It is not easy to trigger the problem, but I do see a correlation with two things:

1) When I started to create jails to build openSUSE packages locally, the problem started happening more often. In these jails, some directories like /dev/, /dev/pts and /proc are mounted inside the jail.

2) When I open my KVM, I also see this problem more often. Notice, however, that the KVM disk is stored on another partition, which is EXT4.

I would be glad if anyone could help me fix it. Below, I'm providing more information about my system:

# uname -a
Linux ronanarraes-osd 4.7.0-1-default #1 SMP PREEMPT Mon Jul 25 08:42:47 UTC 2016 (89a2ada) x86_64 x86_64 x86_64 GNU/Linux

# btrfs --version
btrfs-progs v4.6.1+20160714

# btrfs fi show
Label: none uuid: 80381f7f-8cef-4bd8-bdbc-3487253ee566
    Total devices 1 FS bytes used 113.13GiB
    devid 1 size 1.26TiB used 119.07GiB path /dev/sda6

# btrfs fi df /
Data, single: total=113.01GiB, used=111.19GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=3.00GiB, used=1.94GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

Regards,
Ronan Arraes