Re: [SOLVED] BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-23 Thread Ronan Arraes Jardim Chagas
Hi guys!

After a week without experiencing the problem, I think we can mark it
as solved. I want to thank all the devs on this list. You were
always very helpful. For anyone who is still experiencing the reported
problem, upgrade to kernel 4.7.3 and I think you will be fine :)

Best regards and thank you all,
Ronan Arraes
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-22 Thread Ronan Arraes Jardim Chagas
Hi Josef,

On Thu, 2016-09-22 at 13:49 -0400, Josef Bacik wrote:
> That patch fixed a problem where we would screw up the ENOSPC accounting,
> and would slowly leak space into one of the counters.  So eventually (or
> often in your case) you'd hit ENOSPC, but have plenty of space available.
> If you unmounted and mounted again, or simply rebooted, everything would
> have been fine.  You can still use the fs, the accounting is purely in
> memory so it's not like your FS is permanently screwed.  Thanks,


Thank you very much for the explanation. I am very glad it is finally
fixed here :)
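
The mechanism Josef describes above can be made concrete with a toy model in Python. This is only a sketch under loud assumptions: the class `ToySpaceInfo`, its fields, and the fixed number of bytes leaked per release are hypothetical and are neither the kernel's data structures nor the actual bug. It shows how a reservation counter that is never fully decremented eventually refuses new reservations although almost nothing is really used, and why a remount, which rebuilds the in-memory state, clears the condition.

```python
import errno


class ToySpaceInfo:
    """Hypothetical in-memory space accounting; not btrfs's real structures."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.bytes_used = 0      # space really occupied on disk
        self.bytes_may_use = 0   # reservations that exist only in memory

    def reserve(self, nbytes):
        # Refuse the reservation if used + reserved space would exceed the total.
        if self.bytes_used + self.bytes_may_use + nbytes > self.total_bytes:
            raise OSError(errno.ENOSPC, "No space left on device")
        self.bytes_may_use += nbytes

    def release(self, nbytes, leaked=0):
        # A correct release gives back all nbytes; the assumed bug keeps `leaked`.
        self.bytes_may_use -= nbytes - leaked

    def remount(self):
        # Reservations live only in memory, so a fresh mount starts from zero.
        self.bytes_may_use = 0


def cycles_until_enospc(info, nbytes, leaked, limit=1_000_000):
    """Count reserve/release cycles that succeed before ENOSPC is raised."""
    for done in range(limit):
        try:
            info.reserve(nbytes)
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise
            return done
        info.release(nbytes, leaked)
    return limit
```

In this toy model, with a total of 1000 bytes, reservations of 100 bytes and 10 bytes leaked per cycle, the reservation fails after 91 successful cycles while `bytes_used` is still 0; calling `remount()` makes reservations succeed again.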

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-22 Thread Ronan Arraes Jardim Chagas
Hi Josef,

On Thu, 2016-09-22 at 10:39 -0400, Josef Bacik wrote:
> This is what fixed it.  I thought it was in 4.7 which is why I started
> paying attention, but I guess I was wrong.  Glad your problem is
> resolved.  Thanks,

Do you have an explanation for why the problem fixed by the patch was
causing the ENOSPC errors on my machine? Also, do I need to reformat my
partition, or can I consider it safe to use once the new kernel is
installed?

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-22 Thread Ronan Arraes Jardim Chagas
On Thu, 2016-09-22 at 09:41 -0400, Austin S. Hemmelgarn wrote:
> Most likely the kernel upgrade fixed things.  It's possible that the
> large allocation is impacting something and making it work, but I don't
> think that that is very likely.

The btrfs-related patches I could find in the kernel 4.7.2 and 4.7.3
changelogs are:

commit 8d32aaa89067225d4202a362dc201280e2514952
Author: Chris Mason
Date:   Tue Jul 19 05:52:36 2016 -0700

    Btrfs: fix delalloc accounting after copy_from_user faults

commit f495a60eb6351bf2f29fdbc1854375df9fe4022b
Author: Paolo Valente
Date:   Wed Jul 27 07:22:05 2016 +0200

    block: add missing group association in bio-cloning functions
    Fixes: da2f0f74cf7d ("Btrfs: add support for blkio controllers")

commit ff3235105fc7e4ecf04eb308940821d4a098c08d
Author: Jeff Mahoney
Date:   Wed Aug 17 21:58:33 2016 -0400

    btrfs: don't create or leak aliased root while cleaning up orphans

commit 64563a38fde57a26f4d68d488d0d4918f843547c
Author: Jeff Mahoney
Date:   Mon Aug 15 12:10:33 2016 -0400

    btrfs: properly track when rescan worker is running

commit 69b69167965e108a775ef20decabcc76fbe4fc08
Author: Jeff Mahoney
Date:   Mon Aug 8 22:08:06 2016 -0400

    btrfs: waiting on qgroup rescan should not always be interruptible

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-22 Thread Ronan Arraes Jardim Chagas
Guys,

Something very strange happened. I have not seen the problem since
Monday, which is pretty much the first time I have ever worked more than
3 days without seeing it.

OK, it could be a coincidence. Note that I did not change anything about
the way I work. However, I did do two things:

- Updated the kernel to 4.7.2; and
- Created 50 dummy files of 3.0 GiB each.

Can anyone please tell me whether these two things could be related to
the problem going away?

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-14 Thread Ronan Arraes Jardim Chagas
Hi Chris,

On Wed, 2016-09-14 at 16:25 -0600, Chris Murphy wrote:
> All I can think of is the file system has gotten into a unique state
> through a combination of events. I'm still suspicious that qgroups is
> contributing to the problem even after being disabled. The workload
> you're talking about is completely ordinary and trivial.

This seems reasonable. However, I reformatted the computer and, if I
remember correctly, started to see the problem again after two days. I
still think it may also be related to my HDD (7200 RPM). On all my other
computers, which use SSDs, everything is fine.

> The openSUSE layout is basically impossible to backup and restore,
> there's astronomic tons of snapshots, there's no recursive btrfs
> send/receive to try and migrate it to a new file system intact, so
> you'd pretty much just have to reinstall it no matter what. If it were
> me, reinstall with Btrfs same as now, and first thing before anything
> else I'd disable quotas. Or yeah, it's completely reasonable for you
> to move to a different file system, it's really a coin toss for ext4
> vs XFS, but at least XFS now checksums metadata and the journal by
> default so if I thought about it at the time of the installation I'd
> do that.

Thanks!

> Yeah FWIW, the devs seem to prefer the output from 'grep . -IR
> /sys/fs/btrfs//allocation/' so for these kinds of problems I'd
> report that.

Yeah, unfortunately I forgot this one today :(

> If you *really* want to, you could grab a Fedora Rawhide nightly that
> has kernel 4.8 rc6 on it, with debug stuff enabled. If it face plants,
> it should catch useful stuff for Josef. If it doesn't, maybe it fixes
> enough things that you can get back to work for a while longer until a
> long term fix becomes available. The only way to know for sure is to
> test it. But it's completely sane to just switch to XFS and get back
> to work also.
>
> Current
> https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20160914.n.0/compose/Everything/x86_64/iso/Fedora-Everything-netinst-x86_64-Rawhide-20160914.n.0.iso.n.0.iso
> 
> Use 'dd if=ISO of=USBstick bs=256K' that will boot anything, BIOS or
> UEFI. At the menu, choose Troubleshooting, then the Rescue option, at
> the next text menu choose 3 to get to a shell. And from there you can
> mount with enospc_debug, and do a balance of the file system. To get
> logs off the system, use a 2nd USB stick, or if you have wired
> ethernet use scp, or if you know nmcli you can maybe get the wireless
> up by command line.

This seems good. However, I only have access to that machine during
working hours, and I do not have time to test this, sorry :(

Nevertheless, when you mentioned the `dd` command, I had a great idea
that can help me to live with this problem until I have access to
kernel 4.8. I will use `dd` to create, let's say, 100 files with 3 GiB
each in my /home directory. Hence, when I see ENOSPC, I will just need
to delete some of these files. I think this should work.
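
The ballast-file idea can be sketched in Python as well as with `dd`. This is only an illustration: the function names, the file naming scheme, and the target directory are hypothetical, and the thread does not establish that freeing ballast reliably clears the ENOSPC condition.

```python
import os


def create_ballast(directory, count, size_bytes, chunk=1 << 20):
    """Write `count` zero-filled files of `size_bytes` each, like repeated `dd if=/dev/zero`."""
    os.makedirs(directory, exist_ok=True)
    zeros = bytes(chunk)
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"ballast_{i:03d}.bin")
        with open(path, "wb") as f:
            remaining = size_bytes
            while remaining > 0:
                n = min(chunk, remaining)
                f.write(zeros[:n])
                remaining -= n
            # Push the data to disk so the space is really allocated.
            f.flush()
            os.fsync(f.fileno())
        paths.append(path)
    return paths


def release_ballast(paths, n):
    """Delete the first `n` ballast files and return the paths that remain."""
    for path in paths[:n]:
        os.remove(path)
    return paths[n:]
```

A call such as `create_ballast("/home/user/ballast", 100, 3 * 1024**3)` (the path is a placeholder) would correspond to the 100 files of 3 GiB each mentioned above; `release_ballast(paths, 5)` would then free five of them when ENOSPC appears.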

Thanks for all the advice, Chris!

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-14 Thread Ronan Arraes Jardim Chagas
Hi guys,

The problem happened again, but now it was way more serious. I was
doing a big Tumbleweed update (4680 packages) and I got the ENOSPC
during the update. To avoid being left with a broken system, as it has
already happened in the past, I, unfortunately, needed to delete data
that I really was not planning to. This is a disaster, because I have
more than 1 TiB of **free space**.

After deleting 7GiB of data, I could run rebalance and the update
finished successfully. However, the ENOSPC happened 3 more times (!)
and I always needed to run rebalance to keep the update going.

Sometimes, during the rebalance, I saw the message:

[28736.688266] BTRFS info (device sda6): relocating block group 389998968832 flags 34
[28737.376302] BTRFS info (device sda6): found 4 extents
[28737.712815] BTRFS info (device sda6): relocating block group 343760961536 flags 36
[28738.010030] BTRFS info (device sda6): relocating block group 343224090624 flags 36
[28738.343461] BTRFS info (device sda6): relocating block group 342687219712 flags 36
[28738.660023] BTRFS info (device sda6): relocating block group 342150348800 flags 36
[28738.665241] use_block_rsv: 11 callbacks suppressed
[28738.665247] [ cut here ]
[28738.665290] WARNING: CPU: 10 PID: 639 at ../fs/btrfs/extent-tree.c:8097 btrfs_alloc_tree_block+0x3f1/0x4c0 [btrfs]
[28738.665292] BTRFS: block rsv returned -28
[28738.665295] Modules linked in: dm_mod fuse nf_log_ipv6 xt_pkttype
nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft
iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp
nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT nf_reject_ipv4
iptable_raw xt_CT snd_hda_codec_hdmi snd_hda_codec_realtek
nvidia_drm(PO) snd_hda_codec_generic snd_hda_intel nvidia_modeset(PO)
snd_hda_codec snd_hda_core snd_hwdep iptable_filter nvidia(PO) joydev
drm_kms_helper intel_rapl drm fb_sys_fops iTCO_wdt mei_wdt syscopyarea
snd_pcm snd_timer iTCO_vendor_support sysfillrect sb_edac snd i2c_i801
mei_me lpc_ich edac_core sysimgblt ip6table_mangle x86_pkg_temp_thermal
intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel aesni_intel soundcore mei aes_x86_64
[28738.665359]  lrw gf128mul glue_helper ablk_helper cryptd e1000e
hp_wmi ioatdma fjes nf_conntrack_netbios_ns ptp shpchp pps_core
sparse_keymap pcspkr mfd_core nf_conntrack_broadcast rfkill
tpm_infineon tpm_tis dca tpm nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables
xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables btrfs xor
raid6_pq hid_generic usbhid crc32c_intel serio_raw xhci_pci ehci_pci
sr_mod firewire_ohci xhci_hcd ehci_hcd cdrom firewire_core crc_itu_t
usbcore isci usb_common libsas ata_generic mpt3sas raid_class
scsi_transport_sas wmi button sg
[28738.665419] CPU: 10 PID: 639 Comm: systemd-journal Tainted:
PW  O4.7.1-1-default #1
[28738.665421] Hardware name: Hewlett-Packard HP Z820 Workstation/158B,
BIOS J63 v03.65 12/19/2013
[28738.665425]   81393104 88080bc63a68

[28738.665430]  8107ca1e 8804eaa73300 88080bc63ab8
4000
[28738.665434]   88017be9a000 880f51b31760
8107ca8f
[28738.665438] Call Trace:
[28738.665464]  [] dump_trace+0x5e/0x320
[28738.665472]  [] show_stack_log_lvl+0x10c/0x180
[28738.665478]  [] show_stack+0x21/0x40
[28738.665486]  [] dump_stack+0x5c/0x78
[28738.665496]  [] __warn+0xbe/0xe0
[28738.665503]  [] warn_slowpath_fmt+0x4f/0x60
[28738.665529]  [] btrfs_alloc_tree_block+0x3f1/0x4c0
[btrfs]
[28738.665560]  [] btrfs_copy_root+0xf2/0x280 [btrfs]
[28738.665593]  [] create_reloc_root+0x171/0x1e0
[btrfs]
[28738.665623]  [] btrfs_init_reloc_root+0x8f/0xa0
[btrfs]
[28738.665652]  [] record_root_in_trans+0xb2/0x110
[btrfs]
[28738.665679]  []
btrfs_record_root_in_trans+0x41/0x70 [btrfs]
[28738.665704]  [] start_transaction+0xa0/0x4f0
[btrfs]
[28738.665732]  [] btrfs_dirty_inode+0x33/0xc0
[btrfs]
[28738.665741]  [] file_update_time+0x99/0xf0
[28738.665770]  [] btrfs_page_mkwrite+0xa3/0x450
[btrfs]
[28738.665779]  [] do_page_mkwrite+0x69/0xc0
[28738.665785]  [] handle_pte_fault+0xf4/0x1760
[28738.665792]  [] handle_mm_fault+0x29e/0x5a0
[28738.665798]  [] __do_page_fault+0x1e0/0x510
[28738.665809]  [] page_fault+0x28/0x30
[28738.669296] DWARF2 unwinder stuck at page_fault+0x28/0x30

[28738.669300] Leftover inexact backtrace:

[28738.669327] ---[ end trace 8ef9cfba38cc9bfc ]---

Look what happened to my METADATA during the update:

1) When the problem occurred:

# btrfs fi usage /
Overall:
Device size:   1.26TiB
Device allocated:     63.07GiB
Device unallocated:    1.20TiB
Device missing:  0.00B
Used:     50.21GiB
Free (estimated):      1.20TiB  (min: 612.49GiB)
Data ratio:   1.00
Metadata ratio:   2.00
Global reserve:  400.00MiB  (used: 

Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-14 Thread Ronan Arraes Jardim Chagas
Hi Josef,

On Tue, 2016-09-13 at 17:01 -0400, Josef Bacik wrote:
> I just started paying attention to this, the last kernel I saw you were
> running was 4.7.  Have you tried a recent kernel, like chris's tree?
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
> for-linus-4.8
>
> is what I would like you to try if not.  Thanks,
>
> Josef

Unfortunately, since this is a production machine, I am not allowed to
install unreleased kernels. If this is the only solution, I will need
to wait for 4.8 or check whether anyone has already backported the BTRFS
patches to 4.7.

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-13 Thread Ronan Arraes Jardim Chagas
Hi guys,

I saw the problem once more. It is now starting to happen on a daily
basis. Unfortunately, the `enospc_debug` flag did not help: I did not see
any new information in the logs. This time, only a hard reset worked. I
could not even reboot from the GNOME panel.

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-13 Thread Ronan Arraes Jardim Chagas
Hi!

On Tue, 2016-09-13 at 11:17 +0800, Wang Xiaoguang wrote:
> It may be an irrelevant question, but do you have compression enabled?
> 
> Regards,
> Xiaoguang Wang

No, I do not have compression enabled. I'm using openSUSE's default
configuration.

By the way, I had not actually been mounting the filesystem with
`enospc_debug`: it turns out that I had modified the fstab in a backup
directory, sorry :) Now I have done it correctly so, hopefully, we will
have much more information about the problem the next time I see it!

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-08 Thread Ronan Arraes Jardim Chagas
Hi all!

On Mon, 2016-09-05 at 16:49 +0800, Qu Wenruo wrote:
> Just like what Wang has mentioned, would you please paste all the output
> of the contents of /sys/fs/btrfs//allocation?
>
> It's recommended to use "grep . -IR " to get all the data as it
> will show the file name.

So I am seeing the problem once again. This time I was just using
Firefox, and I cannot recover using `btrfs balance`. I think that, once
again, I will need to reboot this machine. This problem is really
causing me a lot of trouble :(

I have disabled the quotas and the first error message after the
problem was:

[ 2444.592255] [ cut here ]
[ 2444.592314] WARNING: CPU: 4 PID: 289 at ../fs/btrfs/extent-tree.c:4303 btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
[ 2444.592317] Modules linked in: fuse nf_log_ipv6 xt_pkttype
nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft
iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp
nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw nvidia_drm(PO) ipt_REJECT
nf_reject_ipv4 snd_hda_codec_hdmi nvidia_modeset(PO) intel_rapl sb_edac
edac_core x86_pkg_temp_thermal intel_powerclamp nvidia(PO) coretemp
snd_hda_codec_realtek iTCO_wdt snd_hda_codec_generic iptable_raw
drm_kms_helper snd_hda_intel drm xt_CT snd_hda_codec snd_hda_core
snd_hwdep kvm_intel snd_pcm snd_timer joydev mei_wdt fb_sys_fops
iTCO_vendor_support i2c_i801 lpc_ich kvm syscopyarea snd sysfillrect
irqbypass mei_me hp_wmi sysimgblt iptable_filter crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul
glue_helper ablk_helper
[ 2444.592386]  cryptd soundcore mei sparse_keymap rfkill e1000e shpchp
pcspkr ioatdma mfd_core tpm_infineon tpm_tis dca tpm fjes ptp pps_core
ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast
nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack
ip6table_filter ip6_tables x_tables btrfs xor raid6_pq hid_generic
usbhid crc32c_intel serio_raw xhci_pci ehci_pci xhci_hcd ehci_hcd
firewire_ohci sr_mod firewire_core cdrom crc_itu_t usbcore isci
usb_common libsas ata_generic mpt3sas raid_class scsi_transport_sas wmi
button sg
[ 2444.592447] CPU: 4 PID: 289 Comm: kworker/u65:7 Tainted:
PW  O4.7.1-1-default #1
[ 2444.592450] Hardware name: Hewlett-Packard HP Z820 Workstation/158B,
BIOS J63 v03.65 12/19/2013
[ 2444.592458] Workqueue: writeback wb_workfn (flush-btrfs-1)
[ 2444.592462]   81393104 

[ 2444.592468]  8107ca1e 88080de6d800 9000
88080c437a00
[ 2444.592472]  880634b379ac 9000 88080dcfb73c
a02af98e
[ 2444.592477] Call Trace:
[ 2444.592499]  [] dump_trace+0x5e/0x320
[ 2444.592507]  [] show_stack_log_lvl+0x10c/0x180
[ 2444.592514]  [] show_stack+0x21/0x40
[ 2444.592523]  [] dump_stack+0x5c/0x78
[ 2444.592531]  [] __warn+0xbe/0xe0
[ 2444.592561]  []
btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
[ 2444.592602]  [] btrfs_clear_bit_hook+0x296/0x380
[btrfs]
[ 2444.592642]  [] clear_state_bit+0x55/0x1d0 [btrfs]
[ 2444.592676]  [] __clear_extent_bit+0x13d/0x3f0
[btrfs]
[ 2444.592707]  []
extent_clear_unlock_delalloc+0x62/0x280 [btrfs]
[ 2444.592739]  [] cow_file_range+0x299/0x440 [btrfs]
[ 2444.592768]  [] run_delalloc_range+0x392/0x3b0
[btrfs]
[ 2444.592801]  []
writepage_delalloc.isra.40+0x100/0x170 [btrfs]
[ 2444.592834]  [] __extent_writepage+0xc3/0x340
[btrfs]
[ 2444.592864]  []
extent_write_cache_pages.isra.36.constprop.53+0x23b/0x350 [btrfs]
[ 2444.592894]  [] extent_writepages+0x4e/0x60
[btrfs]
[ 2444.592900]  []
__writeback_single_inode+0x3d/0x3b0
[ 2444.592907]  [] writeback_sb_inodes+0x20a/0x440
[ 2444.592914]  [] __writeback_inodes_wb+0x87/0xb0
[ 2444.592921]  [] wb_writeback+0x28d/0x330
[ 2444.592927]  [] wb_workfn+0x222/0x3f0
[ 2444.592934]  [] process_one_work+0x1ed/0x4e0
[ 2444.592942]  [] worker_thread+0x47/0x4c0
[ 2444.592947]  [] kthread+0xbd/0xe0
[ 2444.592954]  [] ret_from_fork+0x1f/0x40
[ 2444.596679] DWARF2 unwinder stuck at ret_from_fork+0x1f/0x40

[ 2444.596683] Leftover inexact backtrace:

[ 2444.596689]  [] ? kthread_worker_fn+0x170/0x170

I will also provide the information requested by Qu:

grep . -IR /sys/fs/btrfs/e9efaa0c-d477-4249-830f-ee5956768b29/allocation
allocation/data/flags:1
allocation/data/bytes_pinned:0
allocation/data/bytes_may_use:0
allocation/data/total_bytes_pinned:202973265920
allocation/data/bytes_reserved:0
allocation/data/bytes_used:45623730176
allocation/data/single/used_bytes:45623730176
allocation/data/single/total_bytes:46179287040
allocation/data/total_bytes:46179287040
allocation/data/disk_total:46179287040
allocation/data/disk_used:45623730176
allocation/metadata/dup/used_bytes:1120698368
allocation/metadata/dup/total_bytes:6979321856
allocation/metadata/flags:4
allocation/metadata/bytes_pinned:0
allocation/metadata/bytes_may_use:88521768960
allocation/metadata/total_bytes_pinned:-44285952
allocation/metadata/bytes_res
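
The figures above are easier to compare once converted to GiB. The following Python sketch (the function names are illustrative and not part of any btrfs tool) parses `path:value` lines as printed by `grep . -IR` and relates the metadata `bytes_may_use` to the metadata `total_bytes`. It only rearranges numbers already posted; how the kernel interprets these counters is not shown in this thread.

```python
GIB = 1024 ** 3

# Three lines copied from the output posted above.
SAMPLE = """\
allocation/metadata/dup/used_bytes:1120698368
allocation/metadata/dup/total_bytes:6979321856
allocation/metadata/bytes_may_use:88521768960
"""


def parse_allocation(text):
    """Turn `path:value` lines into a dict of integers, skipping anything else."""
    values = {}
    for line in text.splitlines():
        key, sep, raw = line.strip().rpartition(":")
        if not sep:
            continue  # e.g. a truncated line without a value
        try:
            values[key] = int(raw)
        except ValueError:
            continue  # non-numeric attribute
    return values


def metadata_may_use_ratio(values):
    """How many times larger bytes_may_use is than the allocated metadata."""
    total = values["allocation/metadata/dup/total_bytes"]
    may_use = values["allocation/metadata/bytes_may_use"]
    return may_use / total


if __name__ == "__main__":
    vals = parse_allocation(SAMPLE)
    total = vals["allocation/metadata/dup/total_bytes"]
    may_use = vals["allocation/metadata/bytes_may_use"]
    print(f"metadata total_bytes:   {total / GIB:.2f} GiB")
    print(f"metadata bytes_may_use: {may_use / GIB:.2f} GiB")
    print(f"ratio: {metadata_may_use_ratio(vals):.1f}x")
```

With the posted values, the metadata `bytes_may_use` is roughly 82.4 GiB against 6.5 GiB of allocated metadata, a factor of about 12.7. That appears consistent with a reservation counter that grows without being released, although this excerpt alone does not prove it.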

Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-02 Thread Ronan Arraes Jardim Chagas
Hi Chris,

On Fri, 2016-09-02 at 21:41 -0600, Chris Murphy wrote:
> I suggest removing the hardware, and the proprietary driver, and
> retest the system with the existing Tumbleweed 4.7.0 kernel; and if
> that still fails, then try the Leap 4.4 kernel.
>
> Proprietary kernels can do all kinds of crazy things they shouldn't so
> it's entirely possible that driver is a factor in the problem.

Actually it is just a module, and it is only loaded when I need to work
with the board. However, I can assure you this is not the problem,
because I installed the board about one month ago, but I have been
seeing ENOSPC since the beginning of the year IIRC. I am using the
default Tumbleweed kernel right now, and I can only try Leap once 42.2
is released.


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-02 Thread Ronan Arraes Jardim Chagas
Hi guys!

On Fri, 2016-09-02 at 16:39 -0600, Chris Murphy wrote:
> Worth a shot, considering the opensuse/SLE 4.4 kernel has a shittonne
> of backports. It seems unlikely to me opensuse intends to not support
> your hardware (skylake?)

Actually it is a peripheral we use to program embedded systems here, and
the (proprietary) driver requires kernel >= 4.6. I barely use it. I am
seriously thinking of moving it to another machine just to be able to
change my kernel.

I will repeat here one thing I already posted on the openSUSE mailing list:

I think I forgot to mention one very important thing: I have been using
Tumbleweed+BTRFS on this machine for a very, very long time. I think I
installed it just after it changed to the current model. At that time, I
was using the same machine but without the one peripheral that requires a
"new" kernel (HDD, processor, RAM, everything was the same). AFAIK, the
first time I saw the problem was this year. So, I think it must be a
regression after some kernel / btrfs-progs update.

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-02 Thread Ronan Arraes Jardim Chagas
Hi!

On Fri, 2016-09-02 at 15:34 -0600, Chris Murphy wrote:
> Except for your software build case, I have about the same workload
> you have with two machines, one SSD one HDD, using 4.7.0 for a month,
> and then 4.7.2 for the last week. I haven't had any enospc on these
> two systems.
>
> I think for you the path of least resistance that also permits further
> testing is to see if you can track down the leap 42.2 beta kernel
> which is 4.4.19-1-default. I'm not easily finding that particular one,
> but I did find something a bit more recent:
> http://download.opensuse.org/repositories/Kernel:/openSUSE-42.2/standard/x86_64/

Unfortunately, that will not be possible, since my current hardware
depends on kernel >= 4.6 :(

Just now, I saw the problem again. For the first time, it happened
twice in a short period. I was copying e-mail from an IMAP server to my
local HDD. I use offlineimap, but this time it changed the backend to
sqlite and started to create tons of database files, I think. My HDD
I/O stayed at 60-70% for a very long period.

So, let's review the situations in which I saw the problem:

1) Local builds using `osc`;
2) During `zypper dup`;
3) When offlineimap created tons of database files;
4) While rsync-ing /home;
5) While using a virtual machine (the disk image was on an EXT4
partition).

I think we can conclude that this problem is tightly coupled with
actions that require a lot of writing to the HDD. Here is the
specification of my HDD:

hdparm -I /dev/sda

/dev/sda:

ATA device, with non-removable media
Model Number:   ST2000DM001-1CH164  
Serial Number:  W1E73CF5
Firmware Revision:  HP34
Transport:  Serial, SATA 1.0a, SATA II Extensions, SATA
Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
Used: unknown (minor revision code 0x001f) 
Supported: 9 8 7 6 5 
Likely used: 9
Configuration:
Logical         max     current
cylinders       16383   16383
heads           16      16
sectors/track   63      63
--
CHS current addressable sectors:   16514064
LBAuser addressable sectors:  268435455
LBA48  user addressable sectors: 3907029168
Logical  Sector size:   512 bytes
Physical Sector size:  4096 bytes
Logical Sector-0 offset:  0 bytes
device size with M = 1024*1024: 1907729 MBytes
device size with M = 1000*1000: 2000398 MBytes (2000 GB)
cache/buffer size  = unknown
Form Factor: 3.5 inch
Nominal Media Rotation Rate: 7200
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, no device specific
minimum
R/W multiple sector transfer: Max = 16  Current = ?
Advanced power management level: 128
Recommended acoustic management value: 208, current value: 0
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 
 Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4 
 Cycle time: no flow control=120ns  IORDY flow
control=120ns
Commands/features:
Enabled Supported:
   *SMART feature set
Security Mode feature set
   *Power Management feature set
   *Write cache
   *Look-ahead
   *WRITE_BUFFER command
   *READ_BUFFER command
   *DOWNLOAD_MICROCODE
   *Advanced Power Management feature set
Power-Up In Standby feature set
   *SET_FEATURES required to spinup after power up
   *48-bit Address feature set
   *Device Configuration Overlay feature set
   *Mandatory FLUSH_CACHE
   *FLUSH_CACHE_EXT
   *SMART error logging
   *SMART self-test
   *General Purpose Logging feature set
   *64-bit World wide name
   *WRITE_UNCORRECTABLE_EXT command
   *{READ,WRITE}_DMA_EXT_GPL commands
   *Segmented DOWNLOAD_MICROCODE
   *Gen1 signaling speed (1.5Gb/s)
   *Gen2 signaling speed (3.0Gb/s)
   *Gen3 signaling speed (6.0Gb/s)
   *Native Command Queueing (NCQ)
   *Phy event counters
   *READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
   *DMA Setup Auto-Activate optimization
Device-initiated interface power management
   *Software settings preservation
   *SMART Command Transport (SCT) feature set
   *SCT Read/Write Long (AC1), obsolete
   *SCT Error Recovery Control (AC3)
   *SCT Features Control (AC4)
   *SCT Data Tables (AC5)
unknown 206[12] (vendor specific)
unknown 206[13] (vendor specific)
Secu

Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-02 Thread Ronan Arraes Jardim Chagas
Hi again guys!

After I rebooted the computer, I still can't run balance on metadata:

btrfs balance start -musage=1 /
ERROR: error during balancing '/': No space left on device
There may be more info in syslog - try dmesg | tail

dmesg shows:

[ 2022.530285] BTRFS info (device sda6): relocating block group 128509280256 flags 36
[ 2023.355206] BTRFS info (device sda6): relocating block group 127972409344 flags 36
[ 2024.265313] BTRFS info (device sda6): relocating block group 127435538432 flags 36
[ 2025.646712] BTRFS info (device sda6): relocating block group 126898667520 flags 36
[ 2026.794791] BTRFS info (device sda6): relocating block group 126361796608 flags 36
[ 2028.023517] BTRFS info (device sda6): relocating block group 125824925696 flags 36
[ 2028.881287] BTRFS info (device sda6): relocating block group 125288054784 flags 36
[ 2029.739342] BTRFS info (device sda6): relocating block group 124751183872 flags 36
[ 2030.631990] BTRFS info (device sda6): relocating block group 124214312960 flags 36
[ 2031.523176] BTRFS info (device sda6): relocating block group 123677442048 flags 36
[ 2032.407859] BTRFS info (device sda6): relocating block group 123140571136 flags 36
[ 2033.806672] BTRFS info (device sda6): relocating block group 122603700224 flags 36
[ 2035.237712] BTRFS info (device sda6): relocating block group 122066829312 flags 36
[ 2038.257268] BTRFS info (device sda6): relocating block group 122033274880 flags 34
[ 2039.911443] BTRFS info (device sda6): relocating block group 121496403968 flags 36
[ 2040.958106] BTRFS info (device sda6): relocating block group 120959533056 flags 36
[ 2041.841051] BTRFS info (device sda6): relocating block group 120422662144 flags 36
[ 2042.828359] BTRFS info (device sda6): relocating block group 119885791232 flags 36
[ 2044.297744] BTRFS info (device sda6): relocating block group 119348920320 flags 36
[ 2045.684932] BTRFS info (device sda6): relocating block group 118812049408 flags 36
[ 2046.761787] BTRFS info (device sda6): relocating block group 118275178496 flags 36
[ 2048.200756] BTRFS info (device sda6): relocating block group 117738307584 flags 36
[ 2049.806986] BTRFS info (device sda6): relocating block group 117201436672 flags 36
[ 2051.170470] BTRFS info (device sda6): relocating block group 116664565760 flags 36
[ 2051.910536] BTRFS info (device sda6): relocating block group 116127694848 flags 36
[ 2052.678395] BTRFS info (device sda6): relocating block group 115590823936 flags 36
[ 2053.737959] BTRFS info (device sda6): relocating block group 106363355136 flags 36
[ 2054.852065] BTRFS info (device sda6): relocating block group 105826484224 flags 36
[ 2055.911187] BTRFS info (device sda6): relocating block group 105222504448 flags 36
[ 2057.047407] BTRFS info (device sda6): 4 enospc errors during balance

and I have:

btrfs fi usage /
Overall:
Device size:   1.26TiB
Device allocated:     80.07GiB
Device unallocated:    1.18TiB
Device missing:  0.00B
Used:     41.95GiB
Free (estimated):      1.18TiB  (min: 603.95GiB)
Data ratio:   1.00
Metadata ratio:   2.00
Global reserve:  352.00MiB  (used: 576.00KiB)

Data,single: Size:40.01GiB, Used:39.95GiB
   /dev/sda6  40.01GiB

Metadata,DUP: Size:20.00GiB, Used:1.00GiB
   /dev/sda6  40.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/sda6  64.00MiB

Unallocated:
   /dev/sda6   1.18TiB
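
As a cross-check of this report, the raw allocation can be recomputed from the per-profile sizes: `single` chunks count once and `DUP` chunks count twice. The small Python sketch below (the function name is hypothetical) uses only the numbers printed by `btrfs fi usage`.

```python
def device_allocated_gib(data_single_gib, metadata_dup_gib, system_dup_mib):
    """Raw space allocated on the device, in GiB."""
    dup_copies = 2  # DUP keeps two copies of every chunk
    return (data_single_gib
            + dup_copies * metadata_dup_gib
            + dup_copies * system_dup_mib / 1024)


if __name__ == "__main__":
    # Values from this report: Data 40.01GiB, Metadata 20.00GiB, System 32.00MiB.
    print(round(device_allocated_gib(40.01, 20.00, 32.0), 2))
```

With this report's values the result rounds to 80.07 GiB, which matches the `Device allocated` line. Of the 20.00GiB of metadata chunks only 1.00GiB is actually used, so nearly all of the metadata allocation is empty chunk space.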

Hope this brings new information!

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-02 Thread Ronan Arraes Jardim Chagas
Hi guys!

Jeff was right. I had the problem again today, even though quotas are
now disabled. I couldn't get any useful message in the log this time.
Look at the metadata:

btrfs fi usage /
Overall:
Device size:   1.26TiB
Device allocated:     43.07GiB
Device unallocated:    1.21TiB
Device missing:  0.00B
Used:     41.94GiB
Free (estimated):      1.21TiB  (min: 622.46GiB)
Data ratio:   1.00
Metadata ratio:   2.00
Global reserve:  352.00MiB  (used: 0.00B)

Data,single: Size:40.01GiB, Used:39.94GiB
   /dev/sda6  40.01GiB

Metadata,DUP: Size:1.50GiB, Used:1.00GiB
   /dev/sda6   3.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/sda6  64.00MiB

Unallocated:
   /dev/sda6   1.21TiB

Any ideas to help me?

Regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-02 Thread Ronan Arraes Jardim Chagas
Hi Jeff,

On Fri, 2016-09-02 at 10:48 -0400, Jeff Mahoney wrote:
> Sorry, I miscommunicated there.  The WARN_ON is annoying.  It's the
> underlying issue that's causing you to lose work that is the one that
> concerns me.
> 

Oh, OK, I see, sorry about that :)

Thus, if disabling quotas does not help to fix my problem, is there any
workaround you can think of to avoid the problem you suggested in the
previous e-mail?

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-02 Thread Ronan Arraes Jardim Chagas
Hi Jeff,

On Fri, 2016-09-02 at 10:26 -0400, Jeff Mahoney wrote:
> I explained what I think Ronan's issue is in another part of the
> thread
> just now.  I don't think that's a severe issue at
> all.  Annoying?  Sure,
> but I'm more concerned with the underlying ENOSPC issue.  Without
> more
> info, I don't know what the cause of it is and when it was
> introduced.

Sorry, but I really have to humbly disagree with you. Look at what has
already happened to me when the problem occurred (which is almost every
day):

1) Firefox crashes;
2) LibreOffice crashes (auto-save stops working);
3) I can't save my work in any text editor (vim, neovim, gedit, etc.);
4) Sometimes I can't even log in as root (on a TTY or via `su`);
5) Sometimes only a hard reset solves the problem;
6) I was left with a broken operating system when the problem
occurred during a `zypper dup`.

I just can't tell you how much work I lost in those situations. So,
I think we cannot call this issue just annoying. I think it is very
severe.

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-01 Thread Ronan Arraes Jardim Chagas
Hi Jeff,

On Thu, 2016-09-01 at 13:12 -0400, Jeff Mahoney wrote:
> It's not.  We use qgroups because that's the only way we can track
> how
> much space each subvolume is using, regardless of whether anyone
> wants
> to do enforcement.  When it's working properly, snapper can make use
> of
> that information to make informed decisions on how much space will
> actually be released when removing old snapshots.
> 

Given that, what am I losing by disabling qgroups here? Will I still
be able to recover my machine using snapshots (this has saved me two or
three times)?

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-01 Thread Ronan Arraes Jardim Chagas
Hi Jeff,

On Thu, 2016-09-01 at 13:43 -0400, Jeff Mahoney wrote:
> Absolutely.  It doesn't affect the ability to take, retain, or
> recover
> using snapshots.  It only affects the ability to see how much space a
> particular snapshot is using on disk, both from the user wanting to
> know
> and snapper using it to make retention decisions.  Snapper can handle
> qgroups not being there.
> 

Thanks for the prompt answer. I'm glad, because space is not a concern
here, at least for now :) Hence, I have plenty of time to wait for a
proper fix. Until then, I will try to keep my snapshot count low.


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-01 Thread Ronan Arraes Jardim Chagas
On Thu, 2016-09-01 at 09:21 -0400, Austin S. Hemmelgarn wrote:
> Yes, you can just run `btrfs quota disable /` and it should
> work.  This 
> ironically reiterates that one of the bigger problems with BTRFS is
> that 
> distros are enabling unstable and known broken features by default
> on 
> install.  I was pretty much dumbfounded when I first learned that 
> OpenSUSE is enabling BTRFS qgroups by default since they are known
> to 
> not work reliably and cause all kinds of issues.

Thanks Austin! I executed the command and now I get:

btrfs qgroup show /
ERROR: can't perform the search - No such file or directory
ERROR: can't list qgroups: No such file or directory

as expected. Now I will wait for about a week to see if the problem
occurs again and, if not, I will send an e-mail to the openSUSE factory
mailing list to start a discussion on whether it is better not to enable
qgroups by default.

Best regards and thanks everyone for the help,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-09-01 Thread Ronan Arraes Jardim Chagas
Hi!

On Wed, 2016-08-31 at 17:09 -0600, Chris Murphy wrote:
> OK so Ronan, I'm gonna guess the simplest work around for your
> problem
> is to disable quota support, and see if the problem happens again.
> 

Look at the output of the command proposed by Jeff:

btrfs qgroup show /
qgroupid         rfer         excl
--------         ----         ----
0/5          16.00KiB     16.00KiB
0/257        16.00KiB     16.00KiB
0/258        16.30MiB     16.30MiB
0/259        11.65GiB    309.67MiB
0/260         2.34MiB      2.34MiB
0/261        16.00KiB     16.00KiB
0/262        13.19GiB     13.19GiB
0/263        16.00KiB     16.00KiB
0/264        60.00KiB     60.00KiB
0/265       480.00KiB    480.00KiB
0/266        16.00KiB     16.00KiB
0/267         2.00GiB      2.00GiB
0/268        16.00KiB     16.00KiB
0/269        16.00KiB     16.00KiB
0/270        16.00KiB     16.00KiB
0/271        16.00KiB     16.00KiB
0/272        16.00KiB     16.00KiB
0/273        16.00KiB     16.00KiB
0/274        16.00KiB     16.00KiB
0/275       205.78MiB    205.78MiB
0/276        16.00KiB     16.00KiB
0/277        48.00KiB     48.00KiB
0/278       328.41MiB    328.41MiB
0/283         3.92GiB     26.63MiB
0/285         3.93GiB      4.10MiB
0/294         7.84GiB    100.59MiB
0/330         7.98GiB      6.61MiB
0/332         8.32GiB     69.17MiB
0/353         9.53GiB     49.46MiB
0/355        10.51GiB    235.39MiB
0/415        11.54GiB      3.38MiB
0/416        11.54GiB    896.00KiB
0/417        11.57GiB      2.68MiB
0/418        11.57GiB    160.00KiB
0/419        11.54GiB      2.40MiB
0/420        11.54GiB    192.00KiB
0/421        11.62GiB      4.61MiB
0/422        11.83GiB    212.93MiB
0/427        11.64GiB      1.27MiB
0/428        11.65GiB      4.25MiB
1/0          16.11GiB      4.77GiB
255/262      13.19GiB     13.19GiB

This system was installed with Tumbleweed ISO and I did not change
anything in btrfs options. Hence, it seems that openSUSE is enabling
quotas by default. Now, I need to disable it and avoid triggering the
problem. What is the best way I can do this? Is it OK to do just:

btrfs quota disable /

? Or do I need to format and recreate btrfs without quotas?

> If it doesn't happen again then it sounds like the reproduce steps
> are:
> 
> a. enable quota support
> b. do something metadata heavy workload that's also maybe hitting
> fsync; from opensuse list the example that sometimes causes it:
> 
> 
>   osc co home:Ronis_BR/julia
>   cd home:Ronis_BR/julia
>   osc build --root=`pwd`/jail openSUSE_Tumbleweed x86_64
> 
> I wonder if it's easier to hit it on a hard drive, slower fsyncs?

This sounds good! Actually, I'm using a 7200RPM hard drive.

Thank you all very much for all the help,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-31 Thread Ronan Arraes Jardim Chagas
Hi guys!

And the problem happened again. This time, I was only using Mozilla
Firefox. I could get the very first message after the error. I hope it
brings more information:

[28039.672199] ------------[ cut here ]------------
[28039.672253] WARNING: CPU: 3 PID: 31800 at ../fs/btrfs/qgroup.c:2667
btrfs_qgroup_free_meta+0x88/0x90 [btrfs]
[28039.672255] Modules linked in: fuse nf_log_ipv6 xt_pkttype
nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft
iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp
nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT nf_reject_ipv4
iptable_raw xt_CT nvidia_drm(PO) nvidia_modeset(PO) iptable_filter
nvidia(PO) ip6table_mangle nf_conntrack_netbios_ns
nf_conntrack_broadcast drm_kms_helper nf_conntrack_ipv4 drm
nf_defrag_ipv4 fb_sys_fops snd_hda_codec_hdmi joydev
snd_hda_codec_realtek ip_tables syscopyarea snd_hda_codec_generic
xt_conntrack snd_hda_intel sysfillrect intel_rapl sb_edac edac_core
snd_hda_codec hp_wmi x86_pkg_temp_thermal intel_powerclamp snd_hda_core
snd_hwdep nf_conntrack sparse_keymap sysimgblt coretemp kvm_intel kvm
rfkill irqbypass snd_pcm snd_timer crct10dif_pclmul
[28039.672305]  e1000e crc32_pclmul ghash_clmulni_intel snd aesni_intel
ip6table_filter aes_x86_64 lrw gf128mul glue_helper ablk_helper
iTCO_wdt iTCO_vendor_support mei_wdt ioatdma pcspkr cryptd ip6_tables
ptp lpc_ich fjes i2c_i801 dca mfd_core soundcore pps_core shpchp
tpm_infineon tpm_tis tpm mei_me mei x_tables btrfs xor raid6_pq
hid_generic usbhid crc32c_intel serio_raw xhci_pci ehci_pci sr_mod
firewire_ohci xhci_hcd ehci_hcd cdrom firewire_core crc_itu_t isci
usbcore usb_common libsas ata_generic mpt3sas raid_class
scsi_transport_sas wmi button sg
[28039.672373] CPU: 3 PID: 31800 Comm: gnome-terminal- Tainted:
PW  O4.7.1-1-default #1
[28039.672375] Hardware name: Hewlett-Packard HP Z820 Workstation/158B,
BIOS J63 v03.65 12/19/2013
[28039.672378]   81393104 

[28039.672382]  8107ca1e 881008780800 00014000
881008780800
[28039.672386]  ffe4 88100b297c00 88053b7e3540
a02c9f58
[28039.672390] Call Trace:
[28039.672406]  [] dump_trace+0x5e/0x320
[28039.672413]  [] show_stack_log_lvl+0x10c/0x180
[28039.672419]  [] show_stack+0x21/0x40
[28039.672425]  [] dump_stack+0x5c/0x78
[28039.672430]  [] __warn+0xbe/0xe0
[28039.672461]  [] btrfs_qgroup_free_meta+0x88/0x90
[btrfs]
[28039.672492]  [] start_transaction+0x3c3/0x4f0
[btrfs]
[28039.672521]  [] btrfs_create+0x38/0x1d0 [btrfs]
[28039.672528]  [] path_openat+0x139b/0x14a0
[28039.672535]  [] do_filp_open+0x7e/0xe0
[28039.672541]  [] do_sys_open+0x124/0x1f0
[28039.672547]  []
entry_SYSCALL_64_fastpath+0x1e/0xa8
[28039.676186] DWARF2 unwinder stuck at
entry_SYSCALL_64_fastpath+0x1e/0xa8

Best regards,
Ronan


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-30 Thread Ronan Arraes Jardim Chagas
On Tue, 2016-08-30 at 10:44 -0600, Chris Murphy wrote:
> It sounds related to read-only snapshots to me. I wonder if this
> system has something busy that's writing to a file, database, even
> maybe something just spamming journald, and then there's a read-only
> snapshot during the write, which then triggers the enospc.
> 

I saw the problem yesterday after lunch time (13:00) and the last
snapper snapshot was taken at 10:17:

snapper list
Type   | #  | Pre # | Date                         | User | Cleanup | Description           | Userdata
-------+----+-------+------------------------------+------+---------+-----------------------+--------------
single | 0  |       |                              | root |         | current               |
single | 1  |       | Tue 16 Aug 2016 15:07:25 BRT | root |         | first root filesystem |
single | 2  |       | Tue 16 Aug 2016 15:15:57 BRT | root | number  | after installation    | important=yes
pre    | 4  |       | Tue 16 Aug 2016 15:26:44 BRT | root | number  | zypp(y2base)          | important=yes
post   | 5  | 4     | Tue 16 Aug 2016 16:12:46 BRT | root | number  |                       | important=yes
pre    | 29 |       | Tue 16 Aug 2016 18:02:43 BRT | root | number  | zypp(zypper)          | important=yes
post   | 30 | 29    | Tue 16 Aug 2016 18:07:34 BRT | root | number  |                       | important=yes
pre    | 45 |       | Mon 22 Aug 2016 13:59:45 BRT | root | number  | zypp(zypper)          | important=yes
post   | 46 | 45    | Mon 22 Aug 2016 14:11:17 BRT | root | number  |                       | important=yes
pre    | 89 |       | Mon 29 Aug 2016 09:56:19 BRT | root | number  | yast sw_single        |
pre    | 90 |       | Mon 29 Aug 2016 10:00:00 BRT | root | number  | zypp(y2base)          | important=no
post   | 91 | 90    | Mon 29 Aug 2016 10:01:11 BRT | root | number  |                       | important=no
pre    | 92 |       | Mon 29 Aug 2016 10:07:01 BRT | root | number  | zypp(y2base)          | important=no
post   | 93 | 92    | Mon 29 Aug 2016 10:07:10 BRT | root | number  |                       | important=no
pre    | 94 |       | Mon 29 Aug 2016 10:12:32 BRT | root | number  | zypp(y2base)          | important=no
post   | 95 | 94    | Mon 29 Aug 2016 10:14:25 BRT | root | number  |                       | important=no
post   | 96 | 89    | Mon 29 Aug 2016 10:17:17 BRT | root | number  |                       |

> Ronan, if you're given a work around, then it's even less likely the
> bug gets fixed. But if you can disable snapper snapshots entirely and
> the problem doesn't happen; or if you can increase the frequency of
> snapper snapshots and the problem happens more often, that might help
> narrow it down to a point where it's more easily reproduced. If it's
> not related, that's still useful to know.

I agree with you. The problem is that, since this is a production
machine, it is very problematic to have so many reboots that
occur randomly.

I will install something using zypper, which will trigger snapper, and
see if the problem is triggered. I will be out of the office this
afternoon, so the machine will be idle.

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-30 Thread Ronan Arraes Jardim Chagas
Hi!

On Tue, 2016-08-30 at 10:12 +0800, Wang Xiaoguang wrote:
> For metadata, "bytes_may_use" is about 80GB, it's very big,
> I think this value is very abnormal.
> 
> So this explains why, even though you have huge unallocated space, you
> still get the ENOSPC error. In kernel btrfs, there is a function
> should_alloc_chunk() that determines whether to allocate new chunks
> (new device space):
>   num_bytes = total_bytes - bytes_readonly; it's 2147483648
>   num_allocated = bytes_used + bytes_reserved; it's 977354752
> 
> If num_allocated < num_bytes * 0.8, it will not allocate new device
> space :) even if you have huge unallocated space.
> 
> I think the root cause is that bytes_may_use has some computation
> error and is not being converted to bytes_used or bytes_reserved.
> 
> I am just explaining, from the code, why you get the ENOSPC error even
> with huge unallocated space :)
> 

Thanks! At least now we know why ENOSPC is happening.
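
For readers following along, the check described in the quoted message
can be sketched roughly as below. This is only a simplified Python model
of that description, not the kernel code: the function name, the
`threshold` parameter and the return convention are made up for
illustration, and bytes_readonly is taken as 0 because the quoted
num_bytes equals the reported total_bytes.

```python
def should_alloc_chunk_sketch(total_bytes, bytes_readonly,
                              bytes_used, bytes_reserved, threshold=0.8):
    """Simplified model of the check described in the thread (not kernel code).

    Returns True when a new chunk allocation would be considered.
    """
    num_bytes = total_bytes - bytes_readonly
    num_allocated = bytes_used + bytes_reserved
    # Per the quoted explanation: below 80% allocated, no new chunk is
    # allocated, regardless of how much device space is unallocated.
    return num_allocated >= num_bytes * threshold


# Metadata counters reported in this thread.
decision = should_alloc_chunk_sketch(
    total_bytes=2147483648,
    bytes_readonly=0,
    bytes_used=977354752,
    bytes_reserved=0,
)
print("allocate new metadata chunk:", decision)
```

Note that bytes_may_use does not appear in this comparison at all, which
is consistent with the explanation above: a runaway bytes_may_use can
exhaust the reservable space without pushing this check over the 80%
mark.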

> Can you work out a reproducer for this ENOSPC error, then I can
> dig into codes to figure out the true reason.

Unfortunately, I failed in every attempt to trigger the problem. It
happens randomly, and I have not yet figured out what triggers it.
At first, I thought it was related to a build process inside a chroot
jail, but then I saw the problem happen after the computer had been
idle for a long time (about 1 h). So, no clues yet :(

Is there any workaround I can do?

Best regards,
Ronan Arraes




Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-29 Thread Ronan Arraes Jardim Chagas
Hi guys,

I just had the problem again. This time, it happened during lunchtime,
when the machine was idle. Only the system processes were running. It
was not the first time that I saw this problem just after lunch, when
the machine had stayed idle for a long period (about 1 h).

Here is the information requested:

/sys/fs/btrfs/$UUID/allocation/data

./bytes_may_use
0
./bytes_pinned
0
./bytes_reserved
0
./bytes_used
36128374784
./disk_total
37589352448
./disk_used
36128374784
./flags
1
./total_bytes
37589352448
./total_bytes_pinned
20339560448
./single/total_bytes
37589352448
./single/used_bytes
36128374784

/sys/fs/btrfs/$UUID/allocation/metadata

./bytes_may_use
84974452736
./bytes_pinned
0
./bytes_reserved
0
./bytes_used
977354752
./disk_total
4294967296
./disk_used
1954709504
./flags
4
./total_bytes
2147483648
./total_bytes_pinned
-57851904
./dup/total_bytes
2147483648
./dup/used_bytes
977354752
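
As a rough aid for reading the metadata counters above, the following
Python sketch converts them to GiB and flags values that look
implausible. The two rules it applies (a reservation counter larger than
total_bytes, and any negative counter) are assumptions chosen for
illustration, not checks that btrfs itself performs.

```python
GIB = 1024 ** 3


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


def find_anomalies(counters):
    """Return human-readable notes for counters that look implausible.

    The rules here are illustrative assumptions, not btrfs semantics.
    """
    problems = []
    if counters["bytes_may_use"] > counters["total_bytes"]:
        problems.append("bytes_may_use exceeds total_bytes")
    for name, value in counters.items():
        if value < 0:
            problems.append(f"{name} is negative")
    return problems


# Values copied from the metadata allocation directory listed above.
metadata = {
    "bytes_may_use": 84974452736,
    "bytes_pinned": 0,
    "bytes_reserved": 0,
    "bytes_used": 977354752,
    "total_bytes": 2147483648,
    "total_bytes_pinned": -57851904,
}

print(f"bytes_may_use = {to_gib(metadata['bytes_may_use']):.2f} GiB")
print(f"total_bytes   = {to_gib(metadata['total_bytes']):.2f} GiB")
for note in find_anomalies(metadata):
    print("suspicious:", note)
```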

# btrfs fi usage /
Overall:
Device size:   1.26TiB
Device allocated:     39.07GiB
Device unallocated:    1.22TiB
Device missing:  0.00B
Used:     35.29GiB
Free (estimated):      1.22TiB  (min: 625.93GiB)
Data ratio:   1.00
Metadata ratio:   2.00
Global reserve:  320.00MiB  (used: 0.00B)

Data,single: Size:35.01GiB, Used:33.47GiB
   /dev/sda6  35.01GiB

Metadata,DUP: Size:2.00GiB, Used:932.00MiB
   /dev/sda6   4.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/sda6  64.00MiB

Unallocated:
   /dev/sda6   1.22TiB

# btrfs fi df /
Data, single: total=35.01GiB, used=33.47GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=2.00GiB, used=932.09MiB
GlobalReserve, single: total=320.00MiB, used=0.0

I also saw the following information in `journalctl`:

Ago 29 10:25:33 ronanarraes-osd kernel: ------------[ cut here ]------------
Ago 29 10:25:33 ronanarraes-osd kernel: WARNING: CPU: 4 PID: 30424 at
../fs/btrfs/extent-tree.c:4303
btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: Modules linked in: fuse
nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit
af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6
xt_tcpudp nf_
Ago 29 10:25:33 ronanarraes-osd kernel:  mei_wdt sysimgblt
iTCO_vendor_support i2c_i801 tpm_infineon tpm_tis tpm ioatdma
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
aes_x86_64 lrw sparse_keymap
Ago 29 10:25:33 ronanarraes-osd kernel: CPU: 4 PID: 30424 Comm:
kworker/u65:1 Tainted: P   O4.7.1-1-default #1
Ago 29 10:25:33 ronanarraes-osd kernel: Hardware name: Hewlett-Packard
HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
Ago 29 10:25:33 ronanarraes-osd kernel: Workqueue: writeback wb_workfn
(flush-btrfs-1)
Ago 29 10:25:33 ronanarraes-osd kernel:  
81393104  
Ago 29 10:25:33 ronanarraes-osd kernel:  8107ca1e
88100027c800 1000 88082ff06400
Ago 29 10:25:33 ronanarraes-osd kernel:  88100c7af784
1000 8805bd60f6cc a025098e
Ago 29 10:25:33 ronanarraes-osd kernel: Call Trace:
Ago 29 10:25:33 ronanarraes-osd kernel:  []
dump_trace+0x5e/0x320
Ago 29 10:25:33 ronanarraes-osd kernel:  []
show_stack_log_lvl+0x10c/0x180
Ago 29 10:25:33 ronanarraes-osd kernel:  []
show_stack+0x21/0x40
Ago 29 10:25:33 ronanarraes-osd kernel:  []
dump_stack+0x5c/0x78
Ago 29 10:25:33 ronanarraes-osd kernel:  []
__warn+0xbe/0xe0
Ago 29 10:25:33 ronanarraes-osd kernel:  []
btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel:  []
btrfs_clear_bit_hook+0x296/0x380 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel:  []
clear_state_bit+0x55/0x1d0 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel:  []
__clear_extent_bit+0x13d/0x3f0 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel:  []
extent_clear_unlock_delalloc+0x62/0x280 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel:  []
run_delalloc_nocow+0x962/0xba0 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel:  []
run_delalloc_range+0x35f/0x3b0 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel:  []
writepage_delalloc.isra.40+0x100/0x170 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel:  []
__extent_writepage+0xc3/0x340 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel:  []
extent_write_cache_pages.isra.36.constprop.53+0x23b/0x350 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel:  []
extent_writepages+0x4e/0x60 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel:  []
__writeback_single_inode+0x3d/0x3b0
Ago 29 10:25:33 ronanarraes-osd kernel:  []
writeback_sb_inodes+0x20a/0x440
Ago 29 10:25:33 ronanarraes-osd kernel:  []
__writeback_inodes_wb+0x87/0xb0
Ago 29 10:25:33 ronanarraes-osd kernel:  []
wb_writeback+0x28d/0x330
Ago 29 10:25:33 ronanarraes-osd kernel:  []
wb_workfn+0x222/0x3f0
Ago 29 10:25:33 ronanarraes-osd kernel:  []
process_one_work+0x1ed/0x4e0
Ago 29 10:25:33 r

Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-29 Thread Ronan Arraes Jardim Chagas
Hi!

On Mon, 2016-08-29 at 20:12 +0800, Wang Xiaoguang wrote:
> When strange ENOSPC errors occur, I think "btrfs fi usage"
> or "btrfs di df" do not help too much. Their output do not
> reflect btrfs kernel current status :)
> 
> Would you please provide attribute files' values in 
> /sys/fs/btrfs/$UUID/allocation/data
> and /sys/fs/btrfs/$UUID/allocation/metadata when ENOSPC error occurs.
> 

Sure! As soon as I see the error again, I will send these results. For
now, I see that if I move my jail directory to an ext4 partition, I do
not see the problem anymore, but I need more tests to validate this
assumption.
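
One possible way to collect the requested attribute files in a single
pass is sketched below in Python. The helper name is hypothetical, and
the directory to pass in (for example the allocation/metadata directory
under /sys/fs/btrfs/$UUID mentioned above) has to be supplied by the
caller; nothing here is specific to btrfs beyond reading small text
files.

```python
import os


def read_allocation_counters(alloc_dir):
    """Return {relative_path: contents} for every file under alloc_dir."""
    values = {}
    for root, _dirs, files in os.walk(alloc_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            rel = os.path.relpath(path, alloc_dir)
            try:
                with open(path) as fh:
                    values[rel] = fh.read().strip()
            except OSError:
                # Some sysfs attributes may not be readable; keep going.
                values[rel] = "<unreadable>"
    return values


def format_counters(values):
    """Render the counters one per line, sorted by path."""
    return "\n".join(f"{key}: {values[key]}" for key in sorted(values))
```

Calling `print(format_counters(read_allocation_counters(path)))` right
after the error appears would produce a listing similar to the ones
posted in this thread.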

Best regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-22 Thread Ronan Arraes Jardim Chagas
On Mon, 2016-08-22 at 14:49 -0600, Chris Murphy wrote:
> This is really weird. I'm running 4.7.0 (Fedora) and I'm not
> experiencing problems, let alone this. What is this kernel's
> provenance? Is it a plain mainline 4.7.0 that you built? I'm not
> really sure what to recommend except maybe going back to 4.5.7 or
> 4.6.7 as it's a production machine. Heck even 4.4.19 is OK for me in
> this regard.
> 

Well, I'm using the default openSUSE kernel here. And I have been seeing
these errors for some time. When I reported it, I was using v4.6.1.
Hence, I think the version of btrfs-progs is not the problem.


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-22 Thread Ronan Arraes Jardim Chagas
The same thing just happened again! And now it was also fixed
automatically, but now I have:

Metadata,DUP: Size:33.50GiB, Used:812.78MiB


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-22 Thread Ronan Arraes Jardim Chagas
New information guys! I formatted using the latest Tumbleweed snapshot
(btrfs-progs v4.7+20160729) and I still have the same problem.

I noticed two things. First, when I see the "No space left on device"
error, it goes away once the allocated Metadata space increases **a
lot**. For example, when the error first occurred, I had:

Metadata, DUP: total=2.00GiB, used=811.52MiB

After waiting a while (could not run balance), it was automatically
fixed and then I have:

Metadata, DUP: total=9.50GiB, used=811.52MiB
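
To put the two lines above side by side, here is a small Python sketch
that parses lines in the `btrfs fi df` format shown in this thread and
reports how full the allocated metadata space is. The regular expression
is an assumption based only on the output pasted here, so it may not
cover every variant that btrfs-progs can print.

```python
import re

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3, "TiB": 1024 ** 4}

# Matches lines such as: "Metadata, DUP: total=2.00GiB, used=811.52MiB"
LINE_RE = re.compile(
    r"^(?P<type>\w+),\s*(?P<profile>\w+):\s*"
    r"total=(?P<total>[\d.]+)(?P<tunit>[KMGT]iB|B),\s*"
    r"used=(?P<used>[\d.]+)(?P<uunit>[KMGT]iB|B)$"
)


def parse_df_line(line):
    """Return (type, profile, total_bytes, used_bytes) for one df line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    total = float(match["total"]) * UNITS[match["tunit"]]
    used = float(match["used"]) * UNITS[match["uunit"]]
    return match["type"], match["profile"], total, used


before = parse_df_line("Metadata, DUP: total=2.00GiB, used=811.52MiB")
after = parse_df_line("Metadata, DUP: total=9.50GiB, used=811.52MiB")

for label, (_type, _profile, total, used) in (("before", before), ("after", after)):
    print(f"{label}: {used / total:.1%} of allocated metadata in use")
```

In both cases the used fraction is far below full, which fits the
observation that the error is not caused by metadata actually running
out.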

During the error, when I ran the balance command, I see these messages
in `dmesg`:

Ago 22 16:00:03 ronanarraes-osd kernel: BTRFS info (device sda6):
relocating block group 9323937792 flags 34
Ago 22 16:00:04 ronanarraes-osd kernel: BTRFS info (device sda6): found
1 extents
Ago 22 16:00:04 ronanarraes-osd kernel: BTRFS info (device sda6): 1
enospc errors during balance
Ago 22 16:00:24 ronanarraes-osd kernel: BTRFS info (device sda6):
relocating block group 36201037824 flags 34
Ago 22 16:00:24 ronanarraes-osd kernel: BTRFS info (device sda6): 2
enospc errors during balance
Ago 22 16:00:45 ronanarraes-osd kernel: BTRFS info (device sda6):
relocating block group 36234592256 flags 34
Ago 22 16:00:46 ronanarraes-osd kernel: BTRFS info (device sda6): found
1 extents
Ago 22 16:00:46 ronanarraes-osd kernel: BTRFS info (device sda6): 4
enospc errors during balance
Ago 22 16:01:20 ronanarraes-osd kernel: BTRFS info (device sda6):
relocating block group 38415630336 flags 34
Ago 22 16:01:21 ronanarraes-osd kernel: BTRFS info (device sda6): found
1 extents
Ago 22 16:01:21 ronanarraes-osd kernel: BTRFS info (device sda6): 8
enospc errors during balance

Does it add anything relevant to the problem?
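
A side note on the "flags" values in these balance messages: they are a
bit field describing the block group type and profile. The Python sketch
below decodes them using the bit assignments as I understand them from
the kernel's btrfs headers (treat the table as an assumption and verify
it against your kernel source); the function name is made up for
illustration.

```python
# Bit assignments assumed from the kernel's btrfs block group definitions.
BLOCK_GROUP_FLAGS = {
    1 << 0: "DATA",
    1 << 1: "SYSTEM",
    1 << 2: "METADATA",
    1 << 3: "RAID0",
    1 << 4: "RAID1",
    1 << 5: "DUP",
    1 << 6: "RAID10",
}


def decode_flags(value):
    """Return the names of all known bits set in a block group flags value."""
    names = [name for bit, name in BLOCK_GROUP_FLAGS.items() if value & bit]
    return "|".join(names) if names else "NONE"


# 34 appears in the log above; 36 appears in other balance logs in this thread.
for flags in (34, 36):
    print(flags, "=", decode_flags(flags))
```

Under these assumptions, flags 34 corresponds to a System,DUP block group
and flags 36 to a Metadata,DUP block group, which matches the profiles
shown by `btrfs fi usage`.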

Regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-16 Thread Ronan Arraes Jardim Chagas
On Mon, 2016-08-15 at 17:24 -0600, Chris Murphy wrote:
> On Mon, Aug 15, 2016 at 5:12 PM, Ronan Chagas 
> wrote:
> > 
> > Hi guys!
> > 
> > It happened again. The computer was completely unusable. The only
> > useful
> > message I saw was this one:
> > 
> > http://img.ctrlv.in/img/16/08/16/57b24b0bb2243.jpg
> > 
> > Does it help?
> > 
> > I decided to format and reinstall tomorrow. This is a production
> > machine and
> > I have to fix this ASAP.
> 
> Looks similar to this:
> https://lkml.org/lkml/2016/3/28/230
> 
> Can you describe the workload happening at the time?

I was copying my /home using rsync when this happened. Unfortunately, I
needed to format this machine because it is a production system. If I
see any problems related to that, I will report them to this mailing list.

Regards,
Ronan Arraes


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-12 Thread Ronan Arraes Jardim Chagas
On Fri, 2016-08-12 at 12:02 -0600, Chris Murphy wrote:
> Tons of unallocated space. What kernel messages do you get for the
> enospc? It sounds like this will be one of the mystery -28 error file
> systems. So far as I recall the only work around is recreating the
> file system. There are two additional things you can try: mount with
> enospc_debug mount option and see if you can gather more information
> about the problem. Or try a 4.8rc1 kernel which as a large number of
> enospc changes.
> 
> 

Unfortunately no log was written due to the lack of space :)
Next time it happens, I will take a screenshot of the message. Do you
think that if I reinstall my openSUSE it will be fixed?

Regards,
Ronan Arraes


BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-12 Thread Ronan Arraes Jardim Chagas
Hi guys,

I'm facing a daily problem with BTRFS. Almost every day, I get the
message "No space left on device". Sometimes I can recover by balancing
the file system, but sometimes even balancing does not work due to the
lack of space. In this case, only a hard reset works if I can't delete
some files. The problem is that I have a huge amount of unallocated
space, as you can see here:

# btrfs fi usage /
Overall:
Device size:   1.26TiB
Device allocated:    119.07GiB
Device unallocated:    1.14TiB
Device missing:  0.00B
Used:    115.08GiB
Free (estimated):      1.14TiB  (min: 586.21GiB)
Data ratio:   1.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,single: Size:113.01GiB, Used:111.19GiB
   /dev/sda6 113.01GiB

Metadata,DUP: Size:3.00GiB, Used:1.94GiB
   /dev/sda6   6.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/sda6  64.00MiB

Unallocated:
   /dev/sda6   1.14TiB

It is not easy to trigger the problem. But I do find some correlation
between two things:

1) When I started to create jails to build openSUSE packages locally,
the problem started to happen more often. In these jails, some
directories, like /dev/, /dev/pts and /proc, are mounted inside the jail.

2) When I open my KVM, I also see this problem more often. Notice,
however, that the KVM disk is stored in another EXT4 partition.

I would be glad if anyone can help me to fix it. In the following, I'm
providing more information about my system:

# uname -a
Linux ronanarraes-osd 4.7.0-1-default #1 SMP PREEMPT Mon Jul 25
08:42:47 UTC 2016 (89a2ada) x86_64 x86_64 x86_64 GNU/Linux

# btrfs --version
btrfs-progs v4.6.1+20160714

# btrfs fi show
Label: none  uuid: 80381f7f-8cef-4bd8-bdbc-3487253ee566
Total devices 1 FS bytes used 113.13GiB
devid1 size 1.26TiB used 119.07GiB path /dev/sda6

# btrfs fi df /
Data, single: total=113.01GiB, used=111.19GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=3.00GiB, used=1.94GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

Regards,
Ronan Arraes