
Re: ENOSPC errors during raid1 rebalance

2014-03-13 Thread Eugene Crosser

Hello,

I want to report that I have the same problem as Michael Russo, except in my
case there is definitely *a lot* of free space.

I had ext4 on a 1TB LVM mirror. The filesystem was 96% full, with many multi-GB
files. I successfully converted it to btrfs and removed the ext2_saved
subvolume, but did *not* defragment or balance.

Then I added two fresh 4TB disks to the filesystem, and tried to convert it to
raid1. My plan was to then delete the original LVM disk and have all my data
migrated to the new 4TB disks under a btrfs mirror.

But the balance cannot complete, with the same symptoms:

[...]
[12746.391828] block group has cluster?: no
[12746.391830] 0 blocks of free space at or bigger than bytes is
[12747.420098] btrfs: 35 enospc errors during balance

root@pccross:~# btrfs fi sh
Label: 'export'  uuid: 02f39e9d-9115-4a79-9015-a3a9decb87cf
Total devices 3 FS bytes used 798.15GB
devid3 size 3.64TB used 855.03GB path /dev/sdd1
devid2 size 3.64TB used 855.00GB path /dev/sdc1
devid1 size 891.51GB used 175.48GB path /dev/md5

Btrfs v0.20-rc1

root@pccross:~# btrfs fi df /export
Data, RAID1: total=849.00GB, used=650.25GB
Data: total=175.48GB, used=145.64GB
System: total=32.00MB, used=136.00KB
Metadata, RAID1: total=6.00GB, used=2.21GB

root@pccross:~# uname -a
Linux pccross 3.13.0-17-generic #37-Ubuntu SMP Mon Mar 10 21:44:01 UTC 2014
x86_64 x86_64 x86_64 GNU/Linux

An attempted "btrfs device delete" fails with the same "no space" diagnostic.

I am now running a defragmentation on all files bigger than 1GB, and will see
what happens. If that does not help, is there any other advice? I can collect
debugging data if needed.
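
A minimal sketch of that selective defragmentation, assuming GNU find (for
the "G" and "k" size suffixes). The function name defrag_large_files and the
DEFRAG_CMD override are illustrative, not part of btrfs-progs. Others in this
thread found they had to add a -t value before large files were touched, so
the command is left overridable rather than fixed.

```shell
#!/bin/sh
# Defragment only the files above a size threshold.
# DEFRAG_CMD is overridable, so the selection can be previewed with "echo"
# or extended with options such as -t.
defrag_large_files() (
    root=$1
    min_size=${2:-+1G}        # GNU find -size syntax; +1G = larger than 1 GiB
    : "${DEFRAG_CMD:=btrfs filesystem defragment -v}"
    # -xdev keeps find on this one filesystem; "{} +" batches file names
    # safely. DEFRAG_CMD is deliberately unquoted so that it splits into
    # a command plus its options.
    find "$root" -xdev -type f -size "$min_size" -exec $DEFRAG_CMD {} +
)
```

Running "DEFRAG_CMD=echo defrag_large_files /export" first lists what would be
defragmented without changing anything.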

Thanks,

Eugene




RE: ENOSPC errors during raid1 rebalance

2014-03-07 Thread Mike Russo

Alright! After doing:

cd /mymedia; find . -type f | while read file; do mv -v "$file" /dev/shm; 
f2=`basename "$file"`; mv -v "/dev/shm/$f2" "$file"; done 

I finally moved whatever files out of the "single" allocation and back onto the 
new RAID1 profile:

root@ossy:~# /usr/src/btrfs-progs/btrfs fi df /mymedia
Data, RAID1: total=1.23TiB, used=1000.43GiB
Data, single: total=10.00GiB, used=0.00
System, RAID1: total=32.00MiB, used=184.00KiB
Metadata, RAID1: total=3.00GiB, used=1.34GiB

And then the rebalance finally was able to move those two block groups and I 
got no errors:

root@ossy:~# btrfs balance start -dconvert=raid1,soft /mymedia
Done, had to relocate 2 out of 1264 chunks

And now I'm totally RAID1:

root@ossy:~# /usr/src/btrfs-progs/btrfs fi df /mymedia
Data, RAID1: total=1.23TiB, used=1000.43GiB
System, RAID1: total=32.00MiB, used=184.00KiB
Metadata, RAID1: total=3.00GiB, used=1.34GiB
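
A small sketch for checking conversion progress from output like the above:
it prints any "btrfs fi df" line whose profile is not RAID1. The function
name unconverted_profiles is made up for illustration. It also assumes that a
line with no profile name at all (as older btrfs-progs print, e.g.
"Data: total=...") means the single profile.

```shell
#!/bin/sh
# Read `btrfs filesystem df` output on stdin and print every allocation
# line that is not RAID1. No output means nothing is left to convert.
unconverted_profiles() {
    awk -F: '/total=/ {
        # $1 looks like "Data, RAID1" or "Data, single" or just "Data".
        n = split($1, parts, /, */)
        profile = (n > 1) ? parts[2] : "single"   # assumption: no name = single
        if (profile != "RAID1") print
    }'
}
```

Usage would be along the lines of "btrfs fi df /mymedia | unconverted_profiles".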

Yay!! If I want to get back the space I can do another defragment but I don't 
really care right now.  Thanks everyone for all your help on this! Hopefully 
this thread will assist anyone else in the future if this occurs again. 
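
For anyone repeating the move-out-and-back loop from the top of this message,
here is a more defensive sketch of the same idea. It is not the exact command
that was run: rewrite_via_staging is a hypothetical helper name, and the
staging directory (/dev/shm in this thread) is assumed to be a different
filesystem with room for the largest file. Unlike the original loop it keeps
the original file in place until a verified copy is ready. Note that rewriting
a file this way breaks hard links and any extent sharing with snapshots.

```shell
#!/bin/sh
# Rewrite every regular file under ROOT by copying it out to STAGING
# (a different filesystem, e.g. a tmpfs) and back again.
# rewrite_via_staging is an illustrative name, not a btrfs tool.
rewrite_via_staging() (
    root=$1
    staging=$2
    export staging
    # "-exec sh -c ... {} +" hands file names over as arguments, so
    # spaces, backslashes and newlines in names survive intact.
    find "$root" -xdev -type f -exec sh -c '
        for file do
            tmp=$(mktemp "$staging/rewrite.XXXXXX") || exit 1
            new="$file.rewrite.$$"
            # Copy out and verify, copy back under a temporary name,
            # then rename over the original, which stays untouched
            # until that final step.
            if cp -p -- "$file" "$tmp" &&
               cmp -s -- "$file" "$tmp" &&
               cp -p -- "$tmp" "$new" &&
               mv -f -- "$new" "$file"
            then
                rm -f -- "$tmp"
            else
                echo "skipped: $file" >&2
                rm -f -- "$tmp" "$new"
            fi
        done
    ' sh {} +
)
```

It would be called as "rewrite_via_staging /mymedia /dev/shm".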

Sincerely,

Michael Russo, Systems Engineer
PaperSolve, Inc.
268 Watchogue Road
Staten Island, NY 10314

Randomly generated quote of the last 5 minutes:
A good plan today is better than a perfect plan tomorrow.
-- Patton
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

RE: ENOSPC errors during raid1 rebalance

2014-03-07 Thread Mike Russo

Thanks Hugo, that makes sense, and maybe leads to a possible way to fix the 
issue in future versions of btrfs-convert or a way to handle it in the balance 
code. 
What I did to find files with multiple extents:

cd /mymedia
find . -type f -print0 | xargs -0 filefrag | grep -v 1\ extent |
  grep -v 0\ extent | awk -F: '{print $1}' > /tmp/extent
cat /tmp/extent | while read file; do mv -v "$file" /dev/shm;
  f2=`basename "$file"`; mv -v "/dev/shm/$f2" "$file"; done

I no longer care that even after doing this I still have a file with multiple 
extents, I just don't want them inside those block groups with >1GB extents. 
(BTW my terminology may be all off here, I just know that in my syslog btrfs 
tries to relocate two particular "block groups" and gets 2 enospc errors for 
them.)

But since I'm still getting the errors the files I care about probably aren't 
in this list so I'll do it for all the files on the system (since I can't find 
out what files are in those block groups).  
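
One caveat about the filefrag pipeline earlier in this message: "grep -v 1\
extent" drops every line containing the text "1 extent", which also removes
files reported as "11 extents found", "21 extents found" and so on. A sketch
of a stricter filter follows. It assumes filefrag's usual "NAME: N extents
found" summary line, and the function name multi_extent_files is made up for
illustration.

```shell
#!/bin/sh
# Read filefrag summary lines on stdin ("NAME: N extents found") and
# print the names of files that have more than one extent.
multi_extent_files() {
    awk '{
        # Anchor on the trailing "N extent(s) found" so that colons
        # inside file names do not confuse the split.
        if (match($0, /: [0-9]+ extents? found$/)) {
            name = substr($0, 1, RSTART - 1)
            count = substr($0, RSTART + 2) + 0   # leading digits only
            if (count > 1) print name
        }
    }'
}
```

It would slot in as
"find . -type f -print0 | xargs -0 filefrag | multi_extent_files > /tmp/extent".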


-----Original Message-----
From: Hugo Mills [mailto:h...@carfax.org.uk] 
Sent: Friday, March 07, 2014 3:02 AM
To: Mike Russo
Cc: linux-btrfs@vger.kernel.org
Subject: Re: ENOSPC errors during raid1 rebalance

   The defrag operation, by its nature, _doesn't_ preserve extents, and thus 
can act to break up the large extents, making it possible to balance the chunks 
that the offending extents live on.

   Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
 --- If the first-ever performance is the première,  is the --- 
  last-ever performance the derrière?   

Re: ENOSPC errors during raid1 rebalance

2014-03-07 Thread Duncan

Hugo Mills posted on Fri, 07 Mar 2014 08:02:13 + as excerpted:

> On Fri, Mar 07, 2014 at 01:13:53AM +, Michael Russo wrote:
>> Duncan <1i5t5.duncan  cox.net> writes:
>> 
>> > But if you're not using compression, /that/ can't explain it...
>> > 
>> > 
>> Ha! Well while that was an interesting discussion of fragmentation, I
>> am only using the default mount options here and so no compression.
> 
>I _think_ the problem here is that there may have been some extents
> created during the conversion which were over 1 GiB in size (or at least
> which run across two or more chunks). This causes problems, because
> there's nowhere that they can be written to by the balance --
> which preserves extents -- because none of the allocation units (chunks)
> are big enough.
> 
>The defrag operation, by its nature, _doesn't_ preserve extents,
> and thus can act to break up the large extents, making it possible to
> balance the chunks that the offending extents live on.

Now /that/ would explain the issue, or at least all I've read of it from 
here.  Nice job! =:^)

Obviously >1 GB files break/fragment on 1-gig data-chunk lines when 
(re)written on btrfs, since that's the biggest allocation unit btrfs has, 
and at least once the filesystem has been (near) full once, btrfs is 
extremely unlikely to be able to place those gig-chunks contiguously, 
thus triggering the issue.

Meanwhile, the move-to-tmpfs-and-back should indeed break up those >1-gig 
blocks too, since that would force a rewrite and thus a reallocation, 
which would then follow btrfs 1-gig-chunk rules.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


Re: ENOSPC errors during raid1 rebalance

2014-03-07 Thread Hugo Mills

On Fri, Mar 07, 2014 at 01:13:53AM +, Michael Russo wrote:
> Duncan <1i5t5.duncan  cox.net> writes:
> 
> > But if you're not using compression, /that/ can't explain it...
> > 
> 
> Ha! Well while that was an interesting discussion of fragmentation,
> I am only using the default mount options here and so no compression. 
> The only reason I'm really even looking at the fragmentation 
> issue is because running the defragment (with varying sizes of
> "-t" which I'm not sure why that's even necessary) seemed
> to force btrfs to move segments around and let me rebalance
> more block groups.  I don't even care if the files are defragged,
> I just want them all in the RAID1 profile.  Hopefully if I move
> each file out to some other FS like /dev/shm and then back
> it will work, I just gotta hack a script together to do so.

   I _think_ the problem here is that there may have been some extents
created during the conversion which were over 1 GiB in size (or at
least which run across two or more chunks). This causes problems,
because there's nowhere that they can be written to by the balance --
which preserves extents -- because none of the allocation units
(chunks) are big enough.

   The defrag operation, by its nature, _doesn't_ preserve extents,
and thus can act to break up the large extents, making it possible to
balance the chunks that the offending extents live on.
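
The "wanted" figures in the kernel messages quoted elsewhere in this thread
(1368420352 and 1219825664 bytes) are both larger than 1 GiB (1073741824
bytes), at about 1.27 GiB and 1.14 GiB. That fits this explanation, assuming
"wanted" reflects the size of the extent being relocated, which is a guess
and not something the log states. A sketch for picking such lines out of a
log, with oversized_allocations as an illustrative name:

```shell
#!/bin/sh
# Read kernel log lines on stdin and report failed allocations that
# asked for more than one 1 GiB data chunk.
oversized_allocations() {
    awk -v chunk=1073741824 '
        /allocation failed/ && match($0, /wanted [0-9]+/) {
            # Skip the 7 characters of "wanted " to get the byte count.
            bytes = substr($0, RSTART + 7, RLENGTH - 7)
            if (bytes + 0 > chunk)
                printf "%s bytes = %.2f GiB, larger than one chunk\n", bytes, bytes / chunk
        }'
}
```

For example: "dmesg | oversized_allocations".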

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
 --- If the first-ever performance is the première,  is the --- 
  last-ever performance the derrière?   


Re: ENOSPC errors during raid1 rebalance

2014-03-06 Thread Michael Russo

Duncan <1i5t5.duncan  cox.net> writes:

> But if you're not using compression, /that/ can't explain it...
> 

Ha! Well while that was an interesting discussion of fragmentation,
I am only using the default mount options here and so no compression. 
The only reason I'm really even looking at the fragmentation 
issue is because running the defragment (with varying sizes of
"-t" which I'm not sure why that's even necessary) seemed
to force btrfs to move segments around and let me rebalance
more block groups.  I don't even care if the files are defragged,
I just want them all in the RAID1 profile.  Hopefully if I move
each file out to some other FS like /dev/shm and then back
it will work, I just gotta hack a script together to do so.


Re: ENOSPC errors during raid1 rebalance

2014-03-06 Thread Chris Murphy


On Mar 5, 2014, at 3:13 PM, Michael Russo  wrote:

> Chris Murphy  colorremedies.com> writes:
> 
>> Did you do a defrag and balance after ext4>btrfs conversion, 
>> but before data/metadata profile conversion?
> 
> No I didn't, as I thought it was only optional and didn't realize 
> it might later affect my ability to change profiles. 


It's possible it's more necessary in certain situations than others.


Chris Murphy


Re: ENOSPC errors during raid1 rebalance

2014-03-05 Thread Duncan

Michael Russo posted on Wed, 05 Mar 2014 22:13:10 + as excerpted:

> Chris Murphy  colorremedies.com> writes:
> 
>> You could also try a full defragment by specifying -r on the mount
>> point with a small -t value to effectively cause everything to be
>> subject to defragmenting. If this still doesn't permit soft rebalance,
>> then maybe filefrag can find files that have more than 1 extent and
>> just copy them (make duplicates, delete the original). Any copy will be
>> allocated into chunks with the new profile.
> 
> I would think so too.  But it doesn't seem to be happening.
> Here is an example with one file:
> 
> root@ossy:/mymedia# filefrag output.wav
> output.wav: 2 extents found
> root@ossy:/mymedia# /usr/src/btrfs-progs/btrfs fi de -t 1 /mymedia/output.wav
> root@ossy:/mymedia# filefrag output.wav
> output.wav: 2 extents found
> 
> btrfs does not defrag the file. And copying the file usually doesn't
> defrag it either:
> 
> root@ossy:/mymedia# cp output.wav output.wav.bak
> root@ossy:/mymedia# filefrag output.wav.bak
> output.wav.bak: 2 extents found
> 
> I even tried copying a large file to another filesystem (/dev/shm),
>  removing the original, and copying it back, and more often than not
> it still had more than 1 extent.

This was covered in one thread recently, but looking back in this one I 
didn't find it covered here, so...

What are your mount options?  Do they include compress and/or compress-
force?  Because filefrag doesn't understand btrfs compression yet and 
counts each 128 KiB (I believe) compression block as a separate extent.  
I'm not sure whether that's 128 KiB pre-compression or post-compression 
size, and I'm not even positive it's 128 KiB, but certainly, if the file 
is large enough and btrfs is compressing it, filefrag will false-positive 
report multiple extents.  That's a known issue with it ATM.

Meanwhile, there's ongoing work to teach filefrag about btrfs compression 
so it can report fragmentation accurately, but from what I've read 
they're working on a general kernel-VFS-level API for that so the same 
general API can be used by other filesystems, and getting proper 
agreement on that API, and having both the kernel and filefrag implement 
it isn't a simple single-kernel-cycle project.  There's a lot of 
filesystems other than btrfs that could potentially use this sort of 
thing, and getting a solution that will work well for all of them is hard 
work, both technically and politically.  But once it's implemented, 
/correctly/, the entire Linux kernel filesystem space will benefit, just 
as btrfs is getting the benefit of the filefrag tool that ships with 
e2fsprogs, and the filesystem testing that ships as xfstests. =:^)

But if you're not using compression, /that/ can't explain it...

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


Re: ENOSPC errors during raid1 rebalance

2014-03-05 Thread Michael Russo

Chris Murphy  colorremedies.com> writes:

> Did you do a defrag and balance after ext4>btrfs conversion, 
> but before data/metadata profile conversion?

No I didn't, as I thought it was only optional and didn't realize 
it might later affect my ability to change profiles. 


Re: ENOSPC errors during raid1 rebalance

2014-03-05 Thread Michael Russo

Chris Murphy  colorremedies.com> writes:

> You could also try a full defragment by specifying -r on the mount point
> with a small -t value to effectively cause everything to be subject
> to defragmenting. If this still doesn't permit soft rebalance, then maybe
> filefrag can find files that have more than 1 extent and just copy 
> them (make duplicates, delete the original). Any copy will be
> allocated into chunks with the new profile.

I would think so too.  But it doesn't seem to be happening. 
Here is an example with one file:

root@ossy:/mymedia# filefrag output.wav
output.wav: 2 extents found
root@ossy:/mymedia# /usr/src/btrfs-progs/btrfs fi de -t 1 /mymedia/output.wav 
root@ossy:/mymedia# filefrag output.wav
output.wav: 2 extents found

btrfs does not defrag the file. And copying the file usually
doesn't defrag it either:

root@ossy:/mymedia# cp output.wav output.wav.bak
root@ossy:/mymedia# filefrag output.wav.bak
output.wav.bak: 2 extents found

I even tried copying a large file to another filesystem (/dev/shm),
 removing the original, and copying it back, and more often than not 
it still had more than 1 extent. 

If I copy each file out to another filesystem and then back, will btrfs 
not use any of the space on the "single" and just re-allocate space 
on the RAID1 like I want it to?


Re: ENOSPC errors during raid1 rebalance

2014-03-04 Thread Chris Murphy

On Mar 4, 2014, at 5:27 PM, Mike Russo  wrote:

> I'm sure this is due to the ext4 conversion, but that means the utility is 
> making a btrfs filesystem that later can't be converted to another profile 
> for some reason. 

https://btrfs.wiki.kernel.org/index.php/Conversion_from_Ext3

"But still, the new filesystem inherits the block placement and file data 
fragmentation. It is highly recommended to do full defragmentation and full 
rebalance before first use. It's not required, but will have impact on 
performance."

Did you do a defrag and balance after ext4>btrfs conversion, but before 
data/metadata profile conversion?

Chris Murphy


Re: ENOSPC errors during raid1 rebalance

2014-03-04 Thread Chris Murphy

On Mar 4, 2014, at 5:27 PM, Mike Russo  wrote:

> Chris Murphy  colorremedies.com> writes:
> 
>>> How can I find out what file is on the block group that 
>>> it's having a problem with?
>> 
>> I think that's btrfs-debug-tree -b ? But don't hold me to that. 
>> I haven't done enough debugging to find files
>> from block numbers. Another that might be relevant is 
>> btrfs inspect-internal logical-resolve.
>> 
> 
> Nah, I don't think they're talking about the same thing. The messages I get 
> in syslog for the 2 blocks are:
> 
> Mar  4 07:59:23 ossy kernel: [124731.175116] BTRFS info (device sdd1): 
> relocating block group 894460493824 flags 1
> Mar  4 07:59:32 ossy kernel: [124739.613467] BTRFS error (device sdd1): 
> allocation failed flags 17, wanted 1368420352
> Mar  4 07:59:32 ossy kernel: [124739.613473] BTRFS: space_info 1 has 
> 105551204352 free, is not full
> Mar  4 07:59:32 ossy kernel: [124739.613476] BTRFS: space_info 
> total=1312112508928, used=1201166741504, pinned=0, reserved=565596160, 
> may_use=1368420352, readonly=4828966912
> 
> Mar  4 10:49:33 ossy kernel: [134932.248146] BTRFS info (device sdd1): 
> relocating block group 846142111744 flags 1
> Mar  4 10:49:39 ossy kernel: [134938.440347] BTRFS error (device sdd1): 
> allocation failed flags 17, wanted 1219825664
> Mar  4 10:49:39 ossy kernel: [134938.440353] BTRFS: space_info 1 has 
> 116232978432 free, is not full
> Mar  4 10:49:39 ossy kernel: [134938.440356] BTRFS: space_info 
> total=1322849927168, used=1201311391744, pinned=0, reserved=476590080, 
> may_use=1224540160, readonly=4828966912
> 
> But if I try to examine either of those block groups I get an error message 
> that makes me think the 2 numbers are not really the same thing. 

You could also try a full defragment by specifying -r on the mount point with a 
small -t value to effectively cause everything to be subject to defragmenting. 
If this still doesn't permit soft rebalance, then maybe filefrag can find files 
that have more than 1 extent and just copy them (make duplicates, delete the 
original). Any copy will be allocated into chunks with the new profile.
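
The sequence suggested here can be sketched as follows, using only the two
commands that appear in this thread. The wrapper name defrag_then_convert and
the RUN switch are illustrative. The value passed to -t is left to the caller,
since how it is interpreted depends on the btrfs-progs version in use.

```shell
#!/bin/sh
# Print (by default) or run the defragment-then-soft-convert sequence.
# Set RUN=1 to execute the commands instead of printing them.
defrag_then_convert() (
    mnt=$1
    target=$2      # value for "defragment -t"; check your man page for units
    run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }
    # Recursive defragment first, then convert only the data chunks that
    # are not already RAID1 (that is what the "soft" filter skips).
    run btrfs filesystem defragment -r -v -t "$target" "$mnt" &&
    run btrfs balance start -dconvert=raid1,soft "$mnt"
)
```

Calling "defrag_then_convert /mymedia 5" prints the two commands for review,
and "RUN=1 defrag_then_convert /mymedia 5" runs them.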

> 
> If anyone knows why that allocation should fail (what is "flags 17"?), or how 
> to force something more to happen, please reply. I'm sure this is due to the 
> ext4 conversion, but that means the utility is making a btrfs filesystem that 
> later can't be converted to another profile for some reason. 

The conversion description sounds like the main thing that's occurring is 
adding Btrfs metadata. The ext4 data and metadata are left intact. I'm not sure 
what the ext4 allocation looks like compared to data allocated into chunks by 
Btrfs directly. That could be a contributing factor, but it's also possible 
older Btrfs filesystems get fragmented in a similar way that they could have 
profile conversion problems too. Answering that would help determine if 
btrfs-convert or the rebalancing code needs to account for this possibility. A 
reproducer would be useful.

Chris Murphy


RE: ENOSPC errors during raid1 rebalance

2014-03-04 Thread Mike Russo

Chris Murphy  colorremedies.com> writes:

> > How can I find out what file is on the block group that 
> > it's having a problem with?
> 
> I think that's btrfs-debug-tree -b ? But don't hold me to that. 
> I haven't done enough debugging to find files
> from block numbers. Another that might be relevant is 
> btrfs inspect-internal logical-resolve.
> 

Nah, I don't think they're talking about the same thing. The messages I get in 
syslog for the 2 blocks are:

Mar  4 07:59:23 ossy kernel: [124731.175116] BTRFS info (device sdd1): 
relocating block group 894460493824 flags 1
Mar  4 07:59:32 ossy kernel: [124739.613467] BTRFS error (device sdd1): 
allocation failed flags 17, wanted 1368420352
Mar  4 07:59:32 ossy kernel: [124739.613473] BTRFS: space_info 1 has 
105551204352 free, is not full
Mar  4 07:59:32 ossy kernel: [124739.613476] BTRFS: space_info 
total=1312112508928, used=1201166741504, pinned=0, reserved=565596160, 
may_use=1368420352, readonly=4828966912

Mar  4 10:49:33 ossy kernel: [134932.248146] BTRFS info (device sdd1): 
relocating block group 846142111744 flags 1
Mar  4 10:49:39 ossy kernel: [134938.440347] BTRFS error (device sdd1): 
allocation failed flags 17, wanted 1219825664
Mar  4 10:49:39 ossy kernel: [134938.440353] BTRFS: space_info 1 has 
116232978432 free, is not full
Mar  4 10:49:39 ossy kernel: [134938.440356] BTRFS: space_info 
total=1322849927168, used=1201311391744, pinned=0, reserved=476590080, 
may_use=1224540160, readonly=4828966912

But if I try to examine either of those block groups I get an error message 
that makes me think the 2 numbers are not really the same thing. 

root@ossy:~# /usr/src/btrfs-progs/btrfs-debug-tree -b 894460493824 /dev/sdc1
Check tree block failed, want=894460493824, have=7159859086769936959
Check tree block failed, want=894460493824, have=7159859086769936959
Check tree block failed, want=894460493824, have=7159859086769936959
read block failed check_tree_block
Check tree block failed, want=894460493824, have=7159859086769936959
Check tree block failed, want=894460493824, have=7159859086769936959
Check tree block failed, want=894460493824, have=7159859086769936959
read block failed check_tree_block
failed to read 894460493824

Sigh. So I'm stuck at 10GB on single, but I don't know which 10GB it is. 

root@ossy:~# /usr/src/btrfs-progs/btrfs fi df /mymedia
Data, RAID1: total=1.21TiB, used=1.09TiB
Data, single: total=10.00GiB, used=5.50GiB
System, RAID1: total=32.00MiB, used=180.00KiB
Metadata, RAID1: total=2.00GiB, used=1.45GiB

If anyone knows why that allocation should fail (what is "flags 17"?), or how 
to force something more to happen, please reply. I'm sure this is due to the 
ext4 conversion, but that means the utility is making a btrfs filesystem that 
later can't be converted to another profile for some reason. 
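
On the "flags 17" question: the value is a bitmask of btrfs block group type
and profile bits (DATA=1, SYSTEM=2, METADATA=4, RAID0=8, RAID1=16, DUP=32,
RAID10=64, RAID5=128, RAID6=256), so 17 is DATA plus RAID1, and the "flags 1"
on the block group being relocated is plain DATA with no profile bit, i.e.
single. A small decoder sketch, with decode_bg_flags as a made-up name:

```shell
#!/bin/sh
# Decode a btrfs block group flags value into its type/profile names.
decode_bg_flags() (
    flags=$1
    out=""
    bit=1
    # Names are listed in bit order, starting at bit 0 (value 1).
    for name in DATA SYSTEM METADATA RAID0 RAID1 DUP RAID10 RAID5 RAID6; do
        if [ $(( flags & bit )) -ne 0 ]; then
            out="${out:+$out|}$name"
        fi
        bit=$(( bit * 2 ))
    done
    printf '%s\n' "${out:-none}"
)
```

"decode_bg_flags 17" prints DATA|RAID1 and "decode_bg_flags 1" prints DATA.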


Re: ENOSPC errors during raid1 rebalance

2014-03-04 Thread Chris Murphy

On Mar 4, 2014, at 11:54 AM, Michael Russo  wrote:

> Chris Murphy  colorremedies.com> writes:
>> Based on my reading of the man page, I think it's expected. 
>> You either need -s -l or -t.
> 
> Ok, although the man page uses [ ] instead of < > and something
> does happen if I don't add them. But if I use "-t 1" wouldn't that
> get everything?

Yeah you're right, you do get a default behavior but I think it's really 
permissive. I've thrown 3 extent files at it, and it's still 3 extents 
afterward.

> 
>> 
>>> Now I've gotten 
>>> it down to only 2 5GB segments that won't move.
>> 
>> Another way would be to just copy it (not reflink) and delete the original.
>> 
> 
> How can I find out what file is on the block group that 
> it's having a problem with?

I think that's btrfs-debug-tree -b ? But don't hold me to that. I haven't done 
enough debugging to find files from block numbers. Another that might be 
relevant is btrfs inspect-internal logical-resolve.

Chris Murphy


Re: ENOSPC errors during raid1 rebalance

2014-03-04 Thread Michael Russo

Chris Murphy  colorremedies.com> writes:
> Based on my reading of the man page, I think it's expected. 
>You either need -s -l or -t.

Ok, although the man page uses [ ] instead of < > and something
does happen if I don't add them. But if I use "-t 1" wouldn't that
get everything?

> 
> >  Now I've gotten 
> > it down to only 2 5GB segments that won't move.
> 
> Another way would be to just copy it (not reflink) and delete the original.
> 

How can I find out what file is on the block group that 
it's having a problem with?


Re: ENOSPC errors during raid1 rebalance

2014-03-04 Thread Chris Murphy


On Mar 4, 2014, at 8:55 AM, Michael Russo  wrote:

> Hugo Mills  carfax.org.uk> writes:
> 
>> This is just a guess, but you might have some large (>1GB) extents
>> in there that span across multiple chunks. I'd suggest running a btrfs
>> defrag on any particularly big files and see if that helps the situation.
>> 
> 
> Doing this is definitely helping, but I have to run the defrag multiple 
> times with different values for "-t". If I don't include -t the defrag 
> runs really quickly but doesn't seem to do anything.

Based on my reading of the man page, I think it's expected. You either need -s 
-l or -t.

>  Now I've gotten 
> it down to only 2 5GB segments that won't move.

Another way would be to just copy it (not reflink) and delete the original.


Chris Murphy


Re: ENOSPC errors during raid1 rebalance

2014-03-04 Thread Michael Russo

Hugo Mills  carfax.org.uk> writes:

> This is just a guess, but you might have some large (>1GB) extents
> in there that span across multiple chunks. I'd suggest running a btrfs
> defrag on any particularly big files and see if that helps the situation.
> 

Doing this is definitely helping, but I have to run the defrag multiple 
times with different values for "-t". If I don't include -t the defrag 
runs really quickly but doesn't seem to do anything.  Now I've gotten 
it down to only 2 5GB segments that won't move.  I'm going to 
extract the very big log generated when I run 
"btrfs ba start -dconvert=raid1,soft /mymedia" 
with enospc_debug and send it as a file, hopefully it will help.  
The first couple lines are:

Mar  4 10:49:23 ossy kernel: [134922.806764] BTRFS info (device sdd1): 
relocating block group 894460493824 flags 1
Mar  4 10:49:32 ossy kernel: [134931.650504] BTRFS error (device sdd1): 
allocation failed flags 17, wanted 1368420352
Mar  4 10:49:32 ossy kernel: [134931.650507] BTRFS: space_info 1 has 
113996488704 free, is not full
Mar  4 10:49:32 ossy kernel: [134931.650509] BTRFS: space_info 
total=1320702443520, used=1201311391744, pinned=0, 
reserved=565596160, may_use=1368420352, readonly=4828966912


Re: ENOSPC errors during raid1 rebalance

2014-03-03 Thread Chris Murphy

On Mar 3, 2014, at 1:50 PM, Michael Russo  wrote:

> Oh yeah, it was definitely a problem with either the drives 
> or the external enclosure, which was converting USB to SATA 
> and mirroring the drives internally (it was a WD MyBook 
> Mirror Edition). There was a problem with one of the drives 
> and I replaced it, but before I did it screwed up some 
> data. I think one drive kept trying to do a retry on a read 
> and somehow the logic decided to just give me a "nearby" 
> sector to satisfy the read (I can't really figure out how else 
> 10 seconds of a random but "nearby" in alphabetical order 
> MP3 could get stuck into the middle of another MP3).   

If it's a persistent problem with a particular file, that has the same 10 
seconds from some other song, it's actually been written to disk wrong. That's 
a misdirected write.

10 seconds of some other file playing translates into how many blocks of data 
roughly? That's probably a pretty big misdirect. So I don't know that this 
would be more likely a drive problem, or possibly memory or controller. I'd 
burn a memtest86+ run for a few days if you haven't done this recently to 
remove that as a factor.
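
As a rough answer to the "how many blocks" question, assuming a constant
128 kbit/s MP3 (the bitrate is an assumption, the thread does not give one):
10 seconds is 160000 bytes, which is about 40 blocks of 4 KiB or 313 sectors
of 512 bytes. The helper name audio_blocks below is illustrative.

```shell
#!/bin/sh
# Number of blocks covered by a stretch of constant-bitrate audio.
# Usage: audio_blocks DURATION_SECONDS KBIT_PER_SEC BLOCK_SIZE_BYTES
audio_blocks() (
    secs=$1; kbps=$2; block=$3
    bytes=$(( secs * kbps * 1000 / 8 ))
    # Round up: a partially filled block still counts as one block.
    echo $(( (bytes + block - 1) / block ))
)
```

"audio_blocks 10 128 4096" gives 40, and a higher bitrate scales the figure
proportionally.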

> So I removed the entire enclosure and put them inside 
> my case and decided I never wanted it to happen again so I 
> converted to btrfs. :)  Very happy it exists!

The USB to SATA chipsets are all over the map quality wise, a lot of them suck. 
So this sounds like you've eliminated the controller as a factor going forward, 
by changing back to a direct SATA connection. That's a good idea.

Chris Murphy


Re: ENOSPC errors during raid1 rebalance

2014-03-03 Thread Michael Russo

Chris Murphy  colorremedies.com> writes:

> 
> > Chris Murphy  colorremedies.com> writes:
> Gotcha. I think Hugo has the best next step. Defragment.

I think this is going to work. I cancelled a partial defrag 
and did another move attempt, and this time 
5GB got moved! So I'm going to let the whole thing 
finish and try again and that will probably fix it. 
Interestingly I had to specify the -t option to get some
 big files to move, it wouldn't work without it 
(I'm doing "btrfs fi defrag -r -v /mymedia -t 5"). 

> 
> That's not good. It sounds to me like misdirected writes. 
> Or file system corruption. Anyone else?

Oh yeah, it was definitely a problem with either the drives 
or the external enclosure, which was converting USB to SATA 
and mirroring the drives internally (it was a WD MyBook 
Mirror Edition). There was a problem with one of the drives 
and I replaced it, but before I did it screwed up some 
data. I think one drive kept trying to do a retry on a read 
and somehow the logic decided to just give me a "nearby" 
sector to satisfy the read (I can't really figure out how else 
10 seconds of a random but "nearby" in alphabetical order 
MP3 could get stuck into the middle of another MP3).   
So I removed the entire enclosure and put them inside 
my case and decided I never wanted it to happen again so I 
converted to btrfs. :)  Very happy it exists!


Re: ENOSPC errors during raid1 rebalance

2014-03-03 Thread Chris Murphy

On Mar 3, 2014, at 12:24 PM, Michael Russo  wrote:

> Chris Murphy  colorremedies.com> writes:
> 
>> It might be worth adding enospc_debug as a mount option 
> 
> These messages only appear when I mount with enospc_debug.

Gotcha. I think Hugo has the best next step. Defragment.

> As for whether btrfs-convert should be recommended, I 
> mean probably not, but when you've got a big media drive filled 
> with music, pictures, and movies, with no other place to store it, 

Sure, I understand the use case. And I think it should work. But I'm wondering 
whether it's regarded as anything more than merely functional.

> and you've experienced corruption where 5% of your MP3 files
> have 10 seconds of another MP3 stuck in the middle of them
> somehow,  and suddenly realize you NEED a checksumming 
> next-generation file system like btrfs, you're going to convert!

That's not good. It sounds to me like misdirected writes. Or file system 
corruption. Anyone else?

Btrfs can detect and (with data redundancy) correct torn and 
misdirected writes, among other problems. But 5% of 1TB is 50GB. That's a lot 
of corrupt data. I think what you need is to replace one or more drives, or 
look for problems elsewhere. Using Btrfs is not bad in this case, but it's also 
a crutch for a bigger problem that shouldn't be happening in the first place.

Chris Murphy


Re: ENOSPC errors during raid1 rebalance

2014-03-03 Thread Michael Russo

Chris Murphy  colorremedies.com> writes:

> It might be worth adding enospc_debug as a mount option 

These messages only appear when I mount with enospc_debug. I 
included them as examples but I can post the full output later if 
needed. As for whether btrfs-convert should be recommended, I 
mean probably not, but when you've got a big media drive filled 
with music, pictures, and movies, with no other place to store it, 
and you've experienced corruption where 5% of your MP3 files
have 10 seconds of another MP3 stuck in the middle of them
somehow,  and suddenly realize you NEED a checksumming 
next-generation file system like btrfs, you're going to convert!





Re: ENOSPC errors during raid1 rebalance

2014-03-03 Thread Chris Murphy

On Mar 3, 2014, at 11:24 AM, Michael Russo  wrote:

> Duncan <1i5t5.duncan  cox.net> writes:
>> 
>> That allows rollback if desired, but does tie up some space with the 
>> automatically created btrfs "snapshot" that contains the ext3/4 metadata 
>> and untouched data.  
> 
> Nope, I definitely deleted the snapshots, running btrfs sub list 
> gives me nothing back:
> 
> root@ossy:/usr/src/btrfs-progs# /usr/src/btrfs-progs/btrfs sub list /mymedia
> root@ossy:/usr/src/btrfs-progs# 
> 
> Thanks for the detailed reply though. While doing this operation I 
> wanted to not have any snapshots so that I needed the minimum 
> amount of space to do the rebalance.  
> 
> But this all does seem strange right? Why the heck would it refuse
> to move these 70GB and think there are 0 blocks of free space? 

It might be worth adding enospc_debug as a mount option and then retrying the 
balance -dconvert=raid1,soft which should only try to balance unconverted data 
chunks. The metadata all looks converted to raid1. The debug output probably 
only means something to a developer but maybe it'll be enlightening.
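To see which chunk types a "soft" convert would still have to touch, the `btrfs fi df` output can be checked for profiles other than raid1. A minimal sketch, assuming the output format quoted in this thread; the function name `unconverted` is hypothetical, and treating a line with no profile (as older btrfs-progs print it) as "single" is an assumption:

```python
import re

# Sample text copied from the `btrfs fi df /mymedia` output quoted in this thread.
SAMPLE_FI_DF = """\
Data, RAID1: total=1.04TiB, used=1.03TiB
Data, single: total=70.00GiB, used=64.98GiB
System, RAID1: total=32.00MiB, used=160.00KiB
Metadata, RAID1: total=2.00GiB, used=1.33GiB
"""

# Matches lines such as "Data, single: total=70.00GiB, used=64.98GiB".
# The profile part is optional (assumption: no profile means "single").
LINE_RE = re.compile(
    r"^(?P<kind>\w+)(?:, (?P<profile>\w+))?: "
    r"total=(?P<total>\S+), used=(?P<used>\S+)$"
)


def unconverted(fi_df_text, target="RAID1"):
    """Return (kind, profile, total) for each chunk type not yet at `target`."""
    leftovers = []
    for line in fi_df_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a "Type, profile: total=..., used=..." line
        profile = match.group("profile") or "single"
        if profile != target:
            leftovers.append((match.group("kind"), profile, match.group("total")))
    return leftovers


if __name__ == "__main__":
    for kind, profile, total in unconverted(SAMPLE_FI_DF):
        print(f"{kind} still {profile}: {total}")
```

On the sample above only the 70.00GiB of single data is reported, which matches the observation that the metadata and system chunks are already raid1.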

Slightly off-topic to your question is whether btrfs-convert is expected to be 
a recommended method of migrating from ext3/4 or if it's more proof of concept 
that ought to work? The current version uses the ext block size for btrfs leaf 
size, rather than the 16KB default leaf size used by mkfs.btrfs. David Sterba was 
working on this at one point. Also I don't think btrfs-convert yet enables 
extref, although it can be enabled with btrfs-tune after conversion. In any 
case, the conversion isn't quite the same thing as you get with a new file 
system.

Chris Murphy


Re: ENOSPC errors during raid1 rebalance

2014-03-03 Thread Hugo Mills

On Mon, Mar 03, 2014 at 05:23:43PM +, Mike Russo wrote:
> Hi guys -
> I'm trying to convert a disk from single (/dev/sdc1) to RAID1 (/dev/sdd1), and 
> the filesystem was previously ext4 but the conversion seemed to go just fine, 
> and I have no snapshots. System and metadata convert, and almost all my data 
> converts, but there are 70 stubborn GB (14 blocks of 5GB each) that refuse to 
> convert and I get ENOSPC errors when trying to reallocate them.

   This is just a guess, but you might have some large (>1GB) extents
in there that span across multiple chunks. I'd suggest running a btrfs
defrag on any particularly big files and see if that helps the situation.
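One way to pick the candidates is to list files above a size threshold and print a defragment command for each. This is only a rough sketch: file size is used as a proxy for "may contain large extents", which is an assumption, and the helper name `large_files` is hypothetical. It prints commands rather than running them.

```python
import os

ONE_GIB = 1024 ** 3


def large_files(root, min_bytes=ONE_GIB):
    """Yield (path, size) for regular files under `root` of at least `min_bytes`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue  # skip symlinks; only real files matter here
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= min_bytes:
                yield path, size


if __name__ == "__main__":
    import shlex
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "."
    # Largest files first; the command is only printed, not executed.
    for path, size in sorted(large_files(root), key=lambda item: -item[1]):
        print(f"btrfs filesystem defragment -v {shlex.quote(path)}"
              f"  # {size / ONE_GIB:.1f} GiB")
```

Run with the mount point as the only argument (for example /mymedia) and review the printed list before executing anything.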

   Hugo.

> Here's my vital stats: (on kernel 3.14-rc4):
> 
> root@ossy:/usr/src/btrfs-progs# /usr/src/btrfs-progs/btrfs fi show
> Label: MyBook  uuid: c99cbefb-6a56-41e2-8f99-19f8b5f67884
> Total devices 2 FS bytes used 1.09TiB
> devid1 size 1.82TiB used 1.12TiB path /dev/sdc1
> devid2 size 1.82TiB used 1.05TiB path /dev/sdd1
> 
> Btrfs v3.12
> root@ossy:/usr/src/btrfs-progs# /usr/src/btrfs-progs/btrfs fi df /mymedia
> Data, RAID1: total=1.04TiB, used=1.03TiB
> Data, single: total=70.00GiB, used=64.98GiB
> System, RAID1: total=32.00MiB, used=160.00KiB
> Metadata, RAID1: total=2.00GiB, used=1.33GiB
> 
> 
> I've already done a -dusage=0 and -dusage=5 balance and that cleans up 
> anything that gets left around when these 14 blocks fail to get moved. When 
> mounting with enospc_debug I get messages like this for each block with the 
> problem:
> 
> 
> Mar  3 11:58:16 ossy kernel: [52724.423024] BTRFS: block group 935262683136 
> has 5368709120 bytes, 5321801728 used 0 pinned 0 reserved [readonly]
> Mar  3 11:58:16 ossy kernel: [52724.423025] BTRFS info (device sdd1): block 
> group has cluster?: no
> Mar  3 11:58:16 ossy kernel: [52724.423026] BTRFS info (device sdd1): 0 
> blocks of free space at or bigger than bytes is
> Mar  3 11:58:16 ossy kernel: [52724.423027] BTRFS: block group 942778875904 
> has 5368709120 bytes, 5309296640 used 0 pinned 0 reserved [readonly]
> Mar  3 11:58:16 ossy kernel: [52724.423028] BTRFS info (device sdd1): block 
> group has cluster?: no
> Mar  3 11:58:16 ossy kernel: [52724.423029] BTRFS info (device sdd1): 0 
> blocks of free space at or bigger than bytes is
> 
> I don't have the space to reformat and if this is a genuine bug I'm sure you 
> guys want to fix it too. What else can I do to resolve or help? 
> 
> 
> Sincerely,
> 
> Michael Russo, Systems Engineer
> PaperSolve, Inc.
> 268 Watchogue Road
> Staten Island, NY 10314
> 
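For reference, the numbers in the quoted enospc_debug lines can be pulled apart mechanically. A minimal sketch, assuming the log format shown above with each wrapped line joined back into one; the function name `block_group_usage` is hypothetical. It only restates what the kernel printed and does not by itself explain the ENOSPC.

```python
import re

# One line from the enospc_debug output quoted above, unwrapped.
SAMPLE = (
    "Mar  3 11:58:16 ossy kernel: [52724.423024] BTRFS: block group 935262683136 "
    "has 5368709120 bytes, 5321801728 used 0 pinned 0 reserved [readonly]"
)

BLOCK_GROUP_RE = re.compile(
    r"block group (?P<start>\d+) has (?P<size>\d+) bytes, "
    r"(?P<used>\d+) used (?P<pinned>\d+) pinned (?P<reserved>\d+) reserved"
)


def block_group_usage(line):
    """Parse one 'block group ... has ... bytes' line into integer fields."""
    match = BLOCK_GROUP_RE.search(line)
    if match is None:
        return None
    fields = {key: int(value) for key, value in match.groupdict().items()}
    # Bytes in the block group that are neither used, pinned, nor reserved.
    fields["unused"] = (
        fields["size"] - fields["used"] - fields["pinned"] - fields["reserved"]
    )
    return fields


if __name__ == "__main__":
    info = block_group_usage(SAMPLE)
    print(f"size   : {info['size'] / 1024 ** 3:.2f} GiB")
    print(f"unused : {info['unused'] / 1024 ** 2:.1f} MiB")
    print(f"filled : {100 * info['used'] / info['size']:.1f}%")
```

For the first quoted block group this gives a 5 GiB group with 46907392 bytes unused, so each of the stubborn block groups is almost completely full of data.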

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Nothing wrong with being written in Perl... Some of my best ---   
  friends are written in Perl.   



Re: ENOSPC errors during raid1 rebalance

2014-03-03 Thread Michael Russo

Duncan <1i5t5.duncan  cox.net> writes:
> 
> That allows rollback if desired, but does tie up some space with the 
> automatically created btrfs "snapshot" that contains the ext3/4 metadata 
> and untouched data.  

Nope, I definitely deleted the snapshots, running btrfs sub list 
gives me nothing back:

root@ossy:/usr/src/btrfs-progs# /usr/src/btrfs-progs/btrfs sub list /mymedia
root@ossy:/usr/src/btrfs-progs# 

Thanks for the detailed reply though. While doing this operation I 
wanted to not have any snapshots so that I needed the minimum 
amount of space to do the rebalance.  

But this all does seem strange right? Why the heck would it refuse
to move these 70GB and think there are 0 blocks of free space? 


Re: ENOSPC errors during raid1 rebalance

2014-03-03 Thread Duncan

Mike Russo posted on Mon, 03 Mar 2014 17:23:43 + as excerpted:

> I'm trying to convert a disk from single (/dev/sdc1) to RAID1
> (/dev/sdd1), and the filesystem was previously ext4 but the conversion
> seemed to go just fine, and I have no snapshots. System and metadata
> convert, and almost all my data converts, but there are 70 stubborn GB
> (14 blocks of 5GB each) that refuse to convert and I get ENOSPC errors
> when trying to reallocate them.

I created entirely new btrfs filesystems here and copied everything 
over rather than converting, so I've not had personal experience with the 
conversion process, but...

The wiki[1] says[2] that while the conversion process uses the same data 
blocks for both ext3/4 and btrfs, it duplicates the ext3/4 metadata, 
creating a new btrfs copy (or two, for default metadata dup mode), 
leaving the original ext3/4 copy untouched.  Btrfs modifications are then 
done using standard btrfs COW (copy-on-write) methods, so the ext3/4 
data, while originally shared, remains untouched as well.

That allows rollback if desired, but does tie up some space with the 
automatically created btrfs "snapshot" that contains the ext3/4 metadata 
and untouched data.  While you say you have no snapshots, it's unclear 
whether you mean none that you've created /since/ the conversion, but you 
didn't delete that original snapshot so still have it, or whether you 
deleted that automatically created btrfs snapshot of the old ext3/4 
filesystem and simply didn't specifically mention it.

I'm guessing that it's the former, and that btrfs is refusing to balance/
restripe that old ext3/4 snapshot albeit with a very confusing ENOSPC 
error message, since it'd kill the old ext3/4 filesystem and you could no 
longer rollback.

If that's the case, the page at [2] explains how you get rid of the old 
ext3/4 snapshot once you're sure you won't be rolling back and thus no 
longer need it.  With a bit of luck, that's all you need to do, and after 
deleting that, you can finish your balance/restripe. =:^)

If you've already btrfs subvol delete-ed the ext2_saved subvolume, or if 
you hadn't but doing so doesn't solve the problem, well, I went for the 
low-hanging-fruit solution but obviously that wasn't it. =:^(  Hopefully 
someone else can help further.

---
[1] https://btrfs.wiki.kernel.org  Bookmark it! =:^)

[2] https://btrfs.wiki.kernel.org/index.php/Conversion_from_Ext3

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

