Re: Entirely unexpected ENOSPC?

2009-03-09 Thread Yien Zheng

 This is just a hunch, but maybe the handling of sparse files (such as
 .vdi) is not ideal or not what we're used to with extN. Normally
 skipped blocks do not count towards the disk full size, but given
 btrfs' nature they may count as a reservation that would certainly
 cause an early ENOSPC.


I don't think this is the case, since my .vdi is a dynamically
expanding file, and isn't a sparse file that wants to reserve more
space than it actually is taking up.  So the vdi file is reported to
be 5G by du, and it is indeed taking up 5G (and not 3G).
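
Whether a file is sparse can be checked by comparing its apparent size with the space actually allocated to it. The sketch below is only an illustration: `size_report` and `looks_sparse` are made-up helper names, and it assumes Linux, where `st_blocks` is counted in 512-byte units. Freshly written data may not be allocated yet, so sync before trusting the result.

```python
import os


def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for the file at path."""
    st = os.stat(path)
    # st_size is the length a reader sees; st_blocks is what the
    # filesystem has allocated, in 512-byte units on Linux.
    return st.st_size, st.st_blocks * 512


def looks_sparse(path):
    """True when fewer bytes are allocated than the file's apparent size."""
    apparent, allocated = size_report(path)
    return allocated < apparent
```

For the .vdi described above, both numbers would be expected to come out at about 5G.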

 You can try to narrow down the problem using qemu-img or dd. Example:

 qemu-img create -f raw test.img 16G

 See if it counts towards the disk fill count and ENOSPC threshold. See
 if it even completes at all :P
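
The same check can be scripted. This is a rough sketch, assuming that `qemu-img create -f raw` produces a sparse file much as `ftruncate` does; `free_bytes` and `sparse_create_cost` are hypothetical names. It measures how far the df-style free space drops when a sparse file is created, then removes the file again.

```python
import os
import tempfile


def free_bytes(directory):
    """Free space available to unprivileged users, as df would report it."""
    st = os.statvfs(directory)
    return st.f_bavail * st.f_frsize


def sparse_create_cost(directory, size_bytes):
    """Create a sparse file of size_bytes and return the drop in free space."""
    before = free_bytes(directory)
    fd, path = tempfile.mkstemp(dir=directory, suffix=".img")
    try:
        # Extend the file without writing any data blocks.
        os.ftruncate(fd, size_bytes)
        os.fsync(fd)
        after = free_bytes(directory)
    finally:
        os.close(fd)
        os.unlink(path)
    return before - after
```

A result near zero for a 16G file would suggest that skipped blocks are not being counted as a reservation. Other writers on the same filesystem will add noise to the number.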


I tried a test like this for kicks after deleting my vdi file.  I
tried a while loop:

while [ 1 ]; do dd if=/dev/zero of=file.`date +%s` count=2097152; done

My machine subsequently froze.  Repeating the experiment unveiled that
eventually the system gets stuck with pdflush taking up 100% CPU.  At
this point, I had to turn off my laptop, as a soft reset was not
possible, even with a halt command.
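
A gentler variant of the experiment is to cap the amount written and stop at the first ENOSPC instead of looping forever. The sketch below is an assumption about how one might do that, not a tested reproducer for the hang; `fill_until_enospc` is a made-up name. With dd's default 512-byte block size, count=2097152 writes exactly 1 GiB per file, which is the default cap used here.

```python
import errno
import os


def fill_until_enospc(path, chunk_bytes=1 << 20, max_bytes=1 << 30):
    """Write zeros to path until ENOSPC or max_bytes is reached.

    Returns (bytes_written, hit_enospc).
    """
    chunk = bytes(chunk_bytes)
    written = 0
    hit_enospc = False
    with open(path, "wb", buffering=0) as f:
        try:
            while written < max_bytes:
                written += f.write(chunk[: max_bytes - written])
                # Force writeback so a delayed-allocation failure shows
                # up here rather than much later.
                os.fsync(f.fileno())
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise
            hit_enospc = True
    return written, hit_enospc
```

Calling it repeatedly with a fresh file name and summing the byte counts gives the point at which the filesystem first refuses a write, which can then be compared with what df reports.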

At this point I'm wondering if this is an anomaly or if it has anything
to do with using an SSD.  It seems the pre-2.6.29-rc7 code had a hard
stop at 85%.  But the recent patch doesn't seem to have solved the
issue for me.  Is there another issue that makes btrfs want to reserve
2G free?  I see another email with someone growing their filesystem
from 48G to 70G because they ran out of space on their 50G disk, which
should still have 2G free.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Entirely unexpected ENOSPC?

2009-03-09 Thread Hugo Mills
On Mon, Mar 09, 2009 at 07:08:16AM -0600, Yien Zheng wrote:
 At this point I'm wondering if this is an anomaly or if it has anything
 to do with using an SSD.  It seems the pre-2.6.29-rc7 code had a hard
 stop at 85%.  But the recent patch doesn't seem to have solved the
 issue for me.  Is there another issue that makes btrfs want to reserve
 2G free?  I see another email with someone growing their filesystem
 from 48G to 70G because they ran out of space on their 50G disk, which
 should still have 2G free.

   Not quite -- I had some 5G free on a 50G filesystem, without
errors. I expanded the filesystem online to 70G because I knew I would
run out within the next few hours. Despite the expansion, it still ran
out at (just short of) 50G.

   Unless you've resized your filesystem online, I think we're seeing
different problems.

   Hugo.

-- 
=== Hugo Mills: h...@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Do not meddle in the affairs of system administrators,  for ---   
  they are subtle,  and quick to anger.  




Re: Entirely unexpected ENOSPC?

2009-03-07 Thread Yien Zheng
On Wed, 04 Mar 2009 14:48:43 -0600, Hugo Mills hugo-l...@carfax.org.uk wrote:

 On Wed, Mar 04, 2009 at 01:50:53PM -0500, Josef Bacik wrote:
 On Wed, Mar 04, 2009 at 06:06:19PM +, Hugo Mills wrote:
 Last night, this event jammed up a good chunk of my server:
 
  Mar  4 01:51:36 vlad kernel: btrfs searching for 1716224 bytes,
 num_bytes 1716224, loop 2, allowed_alloc 1
  Mar  4 01:51:36 vlad kernel: btrfs searching for 860160 bytes,
 num_bytes 860160, loop 2, allowed_alloc 1
  [lots of this...]
  Mar  4 01:55:52 vlad kernel: btrfs searching for 4096 bytes,
 num_bytes 4096, loop 2, allowed_alloc 1
  Mar  4 01:55:52 vlad kernel: btrfs allocation failed flags 1, wanted
 4096
  Mar  4 01:55:52 vlad kernel: space_info has 0 free, is full
  Mar  4 01:55:52 vlad kernel: block group 12582912 has 8388608 bytes,
 8388608 used 0 pinned 0 reserved
  Mar  4 01:55:52 vlad kernel: 0 blocks of free space at or bigger than
 bytes is
  Mar  4 01:55:52 vlad kernel: block group 1103101952 has 1073741824
 bytes, 1073741824 used 0 pinned 0 reserved
  Mar  4 01:55:52 vlad kernel: 0 blocks of free space at or bigger than
 bytes is
  [30 more lines of this]

 So yeah, that's expected, you ran out of space.  The key thing is this:

 Mar  4 01:55:52 vlad kernel: space_info has 0 free, is full

 If space_info has 0 free and is full, then there is no space to
 allocate for it and it's completely used.  I'd recommend switching to
 the -rc7 kernel since that has things in place to keep this from
 happening as often.  Thanks,

I'll do that.

However, what's confusing me is that the filesystem was reported as
 less than half full (17/41GiB used) at the time that it decided it had
 no space. Is there any likely explanation for that behaviour?

I've used btrfsctl to resize it online several times: shrink by
 1GiB, then enlarge by 12, 10, 10GiB. Might that have been a factor?

Hugo.


I just started playing with btrfs on my SSD drive last week and
encountered the out of space problem using VirtualBox .vdi disks on
the btrfs partition.  I initially used the backport to ubuntu posted
by Filip Brčić with my 2.6.27-7-generic kernel (from Linux Mint 6 KDE
CE RC1).

I downloaded and compiled the latest git version (2.6.29-rc7) with
the ENOSPC patches, but still run out of disk space quite prematurely.
 With 2.6.27-7 based on btrfs 0.17, I was running out of disk space
with 1.9G free.  Now with the patched git in 2.6.29-rc7, it's running
out with 1.7G free:

df -h /mnt/btrfs/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdc1        13G   11G  1.7G  87% /mnt/btrfs
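
For reference, the 87% is consistent with df dividing Used by Used plus Avail and rounding up, which is how GNU df appears to compute the column. The snippet is a small sketch of that arithmetic using the rounded figures printed above, so it is only approximate; `df_use_percent` is a made-up name.

```python
import math


def df_use_percent(used, avail):
    """Use% as used / (used + avail), rounded up to a whole percent."""
    return math.ceil(100 * used / (used + avail))


# Rounded values from the df -h line above, in G.
percent = df_use_percent(11, 1.7)
```

Note that Used plus Avail comes to about 12.7G, a little under the 13G size, so some space is already unavailable to ordinary writes.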

This is the same result as from the btrfs unstable repository based on
2.6.29-rc3 which I also tried from git clone
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git.
 It looks like the btrfs code from this repository is identical to
rc7, but I was hoping some other kernel changes in rc7 made the
situation better as Josef implied.

I suppose this is not really an ENOSPC, but it's running out of space
much earlier than I would expect.

Here's my dmesg output:

[  884.445441] no space left, need 8192, 2760704 delalloc bytes,
10717552640 bytes_used, 0 bytes_reserved, 0 bytes_pinned, 0
bytes_readonly, 0 may use10720313344 total
[  912.026372] btrfs searching for 524288 bytes, num_bytes 524288,
loop 2, allowed_alloc 1
[  912.026389] btrfs searching for 262144 bytes, num_bytes 262144,
loop 2, allowed_alloc 1
[  912.026403] btrfs searching for 131072 bytes, num_bytes 131072,
loop 2, allowed_alloc 1
[  912.026426] btrfs searching for 458752 bytes, num_bytes 458752,
loop 2, allowed_alloc 1
[  912.026439] btrfs searching for 229376 bytes, num_bytes 229376,
loop 2, allowed_alloc 1
[...more lines like this]
[ 1363.318175] no space left, need 8192, 81920 delalloc bytes,
10720231424 bytes_used, 0 bytes_reserved, 0 bytes_pinned, 0
bytes_readonly, 0 may use10720313344 total
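
The counters in the two "no space left" lines above can be checked against each other. The sketch below parses such a line and subtracts every reported counter from the total. This is a plain subtraction for illustration, not necessarily the check the kernel performs, and `parse_no_space` and `headroom` are hypothetical helpers.

```python
import re

PATTERN = re.compile(
    r"no space left, need (?P<need>\d+), (?P<delalloc>\d+) delalloc bytes, "
    r"(?P<used>\d+) bytes_used, (?P<reserved>\d+) bytes_reserved, "
    r"(?P<pinned>\d+) bytes_pinned, (?P<readonly>\d+) bytes_readonly, "
    r"(?P<may_use>\d+) may use\s*(?P<total>\d+) total"
)


def parse_no_space(text):
    """Parse one 'no space left' message, tolerating wrapped lines."""
    flat = " ".join(text.split())
    match = PATTERN.search(flat)
    if match is None:
        raise ValueError("not a 'no space left' line")
    return {key: int(value) for key, value in match.groupdict().items()}


def headroom(fields):
    """Total minus every counter the message reports as committed."""
    committed = sum(
        fields[key]
        for key in ("delalloc", "used", "reserved", "pinned", "readonly", "may_use")
    )
    return fields["total"] - committed


FIRST = """[  884.445441] no space left, need 8192, 2760704 delalloc bytes,
10717552640 bytes_used, 0 bytes_reserved, 0 bytes_pinned, 0
bytes_readonly, 0 may use10720313344 total"""
```

In both messages bytes_used plus delalloc adds up exactly to the total of 10720313344 bytes (just under 10 GiB), so the headroom is zero and the request for 8192 bytes cannot be met from this space_info.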


Re: Entirely unexpected ENOSPC?

2009-03-07 Thread Yien Zheng
I just tried umounting the partition and got this:

[ 1395.028651] btrfs searching for 69632 bytes, num_bytes 69632, loop
2, allowed_alloc 1
[ 1395.028661] btrfs allocation failed flags 1, wanted 69632
[ 1395.028667] space_info has 81920 free, is full
[ 1395.028673] space_info total=10720313344, pinned=2678784,
delalloc=0, may_use=0, used=10717552640
[ 1395.028681] block group 12582912 has 8388608 bytes, 8388608 used 0
pinned 0 reserved
[ 1395.028687] 0 blocks of free space at or bigger than bytes is
[ 1395.028694] block group 1103101952 has 1073741824 bytes, 1072877568
used 864256 pinned 0 reserved
[ 1395.028700] 0 blocks of free space at or bigger than bytes is
[ 1395.028706] block group 2176843776 has 1073741824 bytes, 1073741824
used 0 pinned 0 reserved
[ 1395.028712] 0 blocks of free space at or bigger than bytes is
[ 1395.028718] block group 3250585600 has 1073741824 bytes, 1073709056
used 32768 pinned 0 reserved
[ 1395.028724] 0 blocks of free space at or bigger than bytes is
[ 1395.028730] block group 4324327424 has 1073741824 bytes, 1073741824
used 0 pinned 0 reserved
[ 1395.028736] 0 blocks of free space at or bigger than bytes is
[ 1395.028742] block group 5398069248 has 1073741824 bytes, 1073741824
used 0 pinned 0 reserved
[ 1395.028748] 0 blocks of free space at or bigger than bytes is
[ 1395.028754] block group 6471811072 has 1073741824 bytes, 1073741824
used 0 pinned 0 reserved
[ 1395.028760] 0 blocks of free space at or bigger than bytes is
[ 1395.028767] block group 7545552896 has 1073741824 bytes, 1073618944
used 122880 pinned 0 reserved
[ 1395.028772] 0 blocks of free space at or bigger than bytes is
[ 1395.028779] block group 8619294720 has 1073741824 bytes, 1073639424
used 102400 pinned 0 reserved
[ 1395.028785] 0 blocks of free space at or bigger than bytes is
[ 1395.028791] block group 9693036544 has 1073741824 bytes, 1073549312
used 192512 pinned 0 reserved
[ 1395.028797] 0 blocks of free space at or bigger than bytes is
[ 1395.028804] block group 10766778368 has 1048248320 bytes,
1046802432 used 1363968 pinned 0 reserved
[ 1395.028811] 0 blocks of free space at or bigger than bytes is
[ 1395.028858] [ cut here ]
[ 1395.028863] kernel BUG at fs/btrfs/extent-tree.c:3412!
[ 1395.028869] invalid opcode:  [#1] SMP
[ 1395.028876] last sysfs file:
/sys/devices/pci:00/:00:1e.0/:0b:02.0/rf_kill
[ 1395.028882] Modules linked in: arc4 ecb lib80211_crypt_wep
af_packet radeon drm i2c_core sco bridge rfcomm stp bnep l2cap
bluetooth vboxnetflt vboxdrv ppdev acpi_cpufreq cpufreq_ondemand
cpufreq_conservative cpufreq_userspace cpufreq_powersave cpufreq_stats
freq_table sbs container sbshc pci_slot iptable_filter ip_tables
x_tables loop btrfs zlib_deflate crc32c libcrc32c lp pcmcia joydev
thinkpad_acpi rfkill led_class nvram snd_intel8x0 snd_ac97_codec
ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy evdev
snd_seq_oss psmouse snd_seq_midi snd_rawmidi serio_raw
snd_seq_midi_event pcspkr snd_seq snd_timer snd_seq_device
yenta_socket rsrc_nonstatic pcmcia_core ipw2200 libipw lib80211 snd
iTCO_wdt iTCO_vendor_support soundcore snd_page_alloc battery ac video
output parport_pc parport nsc_ircc irda crc_ccitt button shpchp
pci_hotplug intel_agp agpgart ext3 jbd mbcache sd_mod crc_t10dif
usb_storage libusual sg ahci pata_acpi ata_generic ata_piix libata
scsi_mod ehci_hcd uhci_hcd usbcore tg3 libphy thermal processor fan
fuse
[ 1395.029058]
[ 1395.029064] Pid: 4015, comm: btrfs-delalloc- Not tainted
(2.6.29-rc7-custom #2) 2686DHU
[ 1395.029071] EIP: 0060:[f8055289] EFLAGS: 00010257 CPU: 0
[ 1395.029108] EIP is at __btrfs_reserve_extent+0x2d9/0x450 [btrfs]
[ 1395.029114] EAX: f6e6535c EBX: f5879300 ECX:  EDX: 0001
[ 1395.029120] ESI: 00011000 EDI:  EBP: f5a9fe8c ESP: f5a9fe04
[ 1395.029126]  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
[ 1395.029132] Process btrfs-delalloc- (pid: 4015, ti=f5a9e000
task=f6f7b100 task.ti=f5a9e000)
[ 1395.029137] Stack:
[ 1395.029141]  f80960dc 81c0 0002 3e7b  3e64f000
 0014d000
[ 1395.029154]       
 0001
[ 1395.029168]       00011000
 0001
[ 1395.029183] Call Trace:
[ 1395.029202]  [f8055472] ? btrfs_reserve_extent+0x72/0xb0 [btrfs]
[ 1395.029240]  [f80659dd] ? submit_compressed_extents+0x17d/0x4c0 [btrfs]
[ 1395.029282]  [f8065db3] ? async_cow_submit+0x93/0xa0 [btrfs]
[ 1395.029322]  [f8087f3c] ? run_ordered_completions+0x6c/0xc0 [btrfs]
[ 1395.029363]  [f8088110] ? worker_loop+0x90/0x1e0 [btrfs]
[ 1395.029401]  [f8088080] ? worker_loop+0x0/0x1e0 [btrfs]
[ 1395.029436]  [c0146dcc] ? kthread+0x3c/0x70
[ 1395.029448]  [c0146d90] ? kthread+0x0/0x70
[ 1395.029456]  [c0103fa7] ? kernel_thread_helper+0x7/0x10
[ 1395.029465] Code: 53 50 8d 82 74 ff ff ff 89 45 e8 8b 80 8c 00 00
00 0f 18 00 90 83 c3 5039 d3 89 5d ec 0f 85 da 00 00 00 8b 45 f0 e8 f7
5e 0f c8 0f 0b eb fe 8d 76 00 
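
Adding up the block group lines in the dump above gives a useful cross-check against the space_info line. This is only a sketch: the tuples are copied by hand from the log, and `summarize` is a made-up helper, not part of any btrfs tool.

```python
# (bytes, used, pinned) for each block group, in the order logged above.
BLOCK_GROUPS = [
    (8388608, 8388608, 0),
    (1073741824, 1072877568, 864256),
    (1073741824, 1073741824, 0),
    (1073741824, 1073709056, 32768),
    (1073741824, 1073741824, 0),
    (1073741824, 1073741824, 0),
    (1073741824, 1073741824, 0),
    (1073741824, 1073618944, 122880),
    (1073741824, 1073639424, 102400),
    (1073741824, 1073549312, 192512),
    (1048248320, 1046802432, 1363968),
]


def summarize(groups):
    """Return (total, used, pinned, free) summed over all block groups."""
    total = sum(size for size, _, _ in groups)
    used = sum(used for _, used, _ in groups)
    pinned = sum(pinned for _, _, pinned in groups)
    return total, used, pinned, total - used - pinned


total, used, pinned, free = summarize(BLOCK_GROUPS)
```

The sums come to total=10720313344, used=10717552640 and pinned=2678784, matching the space_info line, with 81920 bytes free, all of it in the last block group. That is more than the 69632 bytes wanted, so the failure is presumably about not finding a single free extent of that size rather than about the raw total. That reading is an assumption; the log does not state it.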

Re: Entirely unexpected ENOSPC?

2009-03-04 Thread Josef Bacik
On Wed, Mar 04, 2009 at 06:06:19PM +, Hugo Mills wrote:
Last night, this event jammed up a good chunk of my server:
 
 Mar  4 01:51:36 vlad kernel: btrfs searching for 1716224 bytes, num_bytes 
 1716224, loop 2, allowed_alloc 1
 Mar  4 01:51:36 vlad kernel: btrfs searching for 860160 bytes, num_bytes 
 860160, loop 2, allowed_alloc 1
 [lots of this...]
 Mar  4 01:55:52 vlad kernel: btrfs searching for 4096 bytes, num_bytes 4096, 
 loop 2, allowed_alloc 1
 Mar  4 01:55:52 vlad kernel: btrfs allocation failed flags 1, wanted 4096
 Mar  4 01:55:52 vlad kernel: space_info has 0 free, is full
 Mar  4 01:55:52 vlad kernel: block group 12582912 has 8388608 bytes, 8388608 
 used 0 pinned 0 reserved
 Mar  4 01:55:52 vlad kernel: 0 blocks of free space at or bigger than bytes is
 Mar  4 01:55:52 vlad kernel: block group 1103101952 has 1073741824 bytes, 
 1073741824 used 0 pinned 0 reserved
 Mar  4 01:55:52 vlad kernel: 0 blocks of free space at or bigger than bytes is
 [30 more lines of this]

So yeah, that's expected, you ran out of space.  The key thing is this:

Mar  4 01:55:52 vlad kernel: space_info has 0 free, is full

If space_info has 0 free and is full, then there is no space to allocate for it
and it's completely used.  I'd recommend switching to the -rc7 kernel since that
has things in place to keep this from happening as often.  Thanks,

Josef


Re: Entirely unexpected ENOSPC?

2009-03-04 Thread Hugo Mills
On Wed, Mar 04, 2009 at 01:50:53PM -0500, Josef Bacik wrote:
 On Wed, Mar 04, 2009 at 06:06:19PM +, Hugo Mills wrote:
 Last night, this event jammed up a good chunk of my server:
  
  Mar  4 01:51:36 vlad kernel: btrfs searching for 1716224 bytes, num_bytes 
  1716224, loop 2, allowed_alloc 1
  Mar  4 01:51:36 vlad kernel: btrfs searching for 860160 bytes, num_bytes 
  860160, loop 2, allowed_alloc 1
  [lots of this...]
  Mar  4 01:55:52 vlad kernel: btrfs searching for 4096 bytes, num_bytes 
  4096, loop 2, allowed_alloc 1
  Mar  4 01:55:52 vlad kernel: btrfs allocation failed flags 1, wanted 4096
  Mar  4 01:55:52 vlad kernel: space_info has 0 free, is full
  Mar  4 01:55:52 vlad kernel: block group 12582912 has 8388608 bytes, 
  8388608 used 0 pinned 0 reserved
  Mar  4 01:55:52 vlad kernel: 0 blocks of free space at or bigger than bytes 
  is
  Mar  4 01:55:52 vlad kernel: block group 1103101952 has 1073741824 bytes, 
  1073741824 used 0 pinned 0 reserved
  Mar  4 01:55:52 vlad kernel: 0 blocks of free space at or bigger than bytes 
  is
  [30 more lines of this]
 
 So yeah, that's expected, you ran out of space.  The key thing is this:
 
 Mar  4 01:55:52 vlad kernel: space_info has 0 free, is full
 
 If space_info has 0 free and is full, then there is no space to
 allocate for it and it's completely used.  I'd recommend switching to
 the -rc7 kernel since that has things in place to keep this from
 happening as often.  Thanks,

   I'll do that.

   However, what's confusing me is that the filesystem was reported as
less than half full (17/41GiB used) at the time that it decided it had
no space. Is there any likely explanation for that behaviour?

   I've used btrfsctl to resize it online several times: shrink by
1GiB, then enlarge by 12, 10, 10GiB. Might that have been a factor?

   Hugo.

-- 
=== Hugo Mills: h...@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- How do you become King?  You stand in the marketplace and ---
  announce you're going to tax everyone. If you get out  
   alive, you're King.   

